Hi Timo, Thanks for the FLIP. It looks good to me overall. I have two questions. 1. When should we use a resolved schema and when to use an unresolved one? 2. The FLIP mentions only resolved tables/views can be stored into a catalog. Does that mean the getTable method should also return a resolved object?
On Tue, Feb 9, 2021 at 6:29 PM Timo Walther <twal...@apache.org> wrote: > Hi Jark, > > thanks for your feedback. Let me answer some of your comments: > > 1) Since we decided to build an entire new stack, we can also introduce > better names for columns, constraints, and watermark spec. My goal was > to shorten the names during this refactoring. Therefore, `TableSchema` > becomes `Schema` and `TableColumn` becomes `Column`. This also fits > better to a `CatalogView` that has a schema but is actually not a table > but a view. So `Column` is very generic. What do you think? > > 2) `ComputedColumn` and `WatermarkSpec` of the new generation will store > `ResolvedExpression`. > > 3) I adopted most of the methods from `TableSchema` in `ResolvedSchema`. > However, I skipped `getColumnDataTypes()` because the behavior is not > clear to me. Should it include computed columns or virtual metadata > columns? I think we should force users to think about what they require. > Otherwise we implicitly introduce bugs. > > Regards, > Timo > > On 09.02.21 10:56, Jark Wu wrote: > > Hi Timo, > > > > The messy TableSchema confuses many developers. > > It's great to see we can finally come up with a clean interface hierarchy > > and still backward compatible. > > > > Thanks for preparing the nice FLIP. It looks good to me. I have some > minor > > comments: > > > > 1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn` instead > of > > `Column`? > > > > 2) You mentioned ResolvedSchema should store ResolvedExpression, should > we > > extend > > `ComputedColumn` and `WatermarkSpec` to allow `ResolvedExpression`? > > > > 3) `ResolvedSchema` aims to replace `TableSchema`, it would be better to > > add un-deprecated > > methods of `TableSchema` into `ResolvedSchema` > > (e.g. `getColumnDataTypes()`). > > Then users can have a smooth migration. > > > > Best, > > Jark > > > > On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz <dwysakow...@apache.org> > > wrote: > > > >> Hi Timo, > >> > >> From my perspective the proposed changes look good. I agree it is an > >> important step towards FLIP-129 and FLIP-136. Personally I feel > >> comfortable voting on the document. > >> > >> Best, > >> > >> Dawid > >> > >> On 05/02/2021 16:09, Timo Walther wrote: > >>> Hi everyone, > >>> > >>> you might have seen that we discussed a better schema API in past as > >>> part of FLIP-129 and FLIP-136. We also discussed this topic during > >>> different releases: > >>> > >>> https://issues.apache.org/jira/browse/FLINK-17793 > >>> > >>> Jark and I had an offline discussion how we can finally fix this > >>> shortcoming and maintain backwards compatibile for a couple of > >>> releases to give people time to update their code. > >>> > >>> I would like to propose the following FLIP: > >>> > >>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs > >>> > >>> > >>> The FLIP updates the class hierarchy to achieve the following goals: > >>> > >>> - make it visible whether a schema is resolved or unresolved and when > >>> the resolution happens > >>> - offer a unified API for FLIP-129, FLIP-136, and catalogs > >>> - allow arbitrary data types and expressions in the schema for > >>> watermark spec or columns > >>> - have access to other catalogs for declaring a data type or > >>> expression via CatalogManager > >>> - a cleaned up TableSchema > >>> - remain backwards compatible in the persisted properties and API > >>> > >>> Looking forward to your feedback. > >>> > >>> Thanks, > >>> Timo > >> > >> > > > > -- Best regards! Rui Li