My PoC for merging the hint options is here [1]. Personally, I have no strong objection to separating the hints from the CatalogTable; the only downside is a more complex implementation, but the concept is clearer. I have updated the wiki accordingly.
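To make the merge idea concrete, here is a minimal sketch. It is not the code from [1]: the class name `HintOptionsMerger`, the method signature, and the option keys in the usage note are hypothetical, and plain string maps are assumed rather than ConfigOption. The factory's declared set of supported hint options acts as a whitelist, and the catalog table's own options are never mutated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a Flink class): overlays query hint options on table options. */
final class HintOptionsMerger {

    private HintOptionsMerger() {}

    /**
     * Returns a new map containing the table options overridden by the hint options.
     * Only keys listed in supportedHintOptions may be overridden; the inputs are left untouched.
     */
    static Map<String, String> merge(
            Map<String, String> tableOptions,
            Map<String, String> hintOptions,
            Set<String> supportedHintOptions) {
        // Copy first so the options stored in the catalog stay unchanged.
        Map<String, String> merged = new HashMap<>(tableOptions);
        for (Map.Entry<String, String> hint : hintOptions.entrySet()) {
            if (!supportedHintOptions.contains(hint.getKey())) {
                throw new IllegalArgumentException(
                        "Option '" + hint.getKey() + "' cannot be overridden by a hint");
            }
            merged.put(hint.getKey(), hint.getValue());
        }
        return merged;
    }
}
```

For example, with a made-up key such as "format.ignore-parse-errors" in the supported set, a hint value of "true" would replace the table's "false" in the merged map only, and a hint on any other key would be rejected.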
I think it would be nice to support the format's "ignore-parse-error" option key. The CSV source already has such a key [2], and we can expose it through supportedHintOptions; we can also support it for the common CSV and JSON formats. This is the only kind of format key that (arguably) does not change the semantics. What do you think?

[1] https://github.com/danny0405/flink/commit/5d925fa16c3c553423c4b7d93001521b8e6e6bee#diff-6e569a6dd124fd2091c18e2790fb49c5
[2] https://github.com/apache/flink/blob/b83060dff6d403b6994b6646b3f29a374f599530/flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/table/sources/CsvTableSourceFactoryBase.java#L92

Best,
Danny Chan

On March 18, 2020 at 9:10 PM (+0800), Timo Walther <twal...@apache.org> wrote:
> Hi everyone,
> +1 to Kurt's suggestion. Let's just have it in source and sink factories for now. We can still move this method up in the future. Currently, I don't see a need for catalogs or formats. Because how would you target a format in the query?
> @Danny: Can you send a link to your PoC? I'm very skeptical about creating a new CatalogTable in planner. Actually CatalogTable should be immutable between Catalog and Factory. Because a catalog can return its own factory and fully control the instantiation. Depending on the implementation, that means it can be possible that the catalog has encoded more information in a concrete subclass implementing the interface. I vote for separating the concerns of catalog information and hints in the factory explicitly.
> Regards,
> Timo
> On 18.03.20 05:41, Jingsong Li wrote:
> > Hi,
> > I am thinking we can provide hints to *table* related instances.
> > - TableFormatFactory: of course we need hints support, there are many format options in DDL too.
> > - catalog and module: I don't know, maybe in future we can provide some hints for them.
> > Best,
> > Jingsong Lee
> > On Wed, Mar 18, 2020 at 12:28 PM Danny Chan <yuzhao....@gmail.com> wrote:
> > > Yes, I think we should move `supportedHintOptions` from TableFactory to TableSourceFactory, and we also need to add the interface to TableSinkFactory, because the sink target table may also have hints attached.
> > > Best,
> > > Danny Chan
> > > On March 18, 2020 at 11:08 AM (+0800), Kurt Young <ykt...@gmail.com> wrote:
> > > > I have one question about adding the `supportedHintOptions` method to `TableFactory`. It seems `TableFactory` is a base factory interface for all *table module* related instances, such as catalog, module, format and so on. It's not created only for *table*. Is it possible to move it to `TableSourceFactory`?
> > > > Best,
> > > > Kurt
> > > > On Wed, Mar 18, 2020 at 10:59 AM Danny Chan <yuzhao....@gmail.com> wrote:
> > > > > Thanks Timo ~
> > > > > For the naming itself, I also think PROPERTIES is not that concise, so +1 for OPTIONS (I had thought about that, but much of the current Flink code calls them properties, e.g. DescriptorProperties, #getSupportedProperties). Let's use OPTIONS if this is our new preference.
> > > > > +1 to `Set<ConfigOption> supportedHintOptions()` because the ConfigOption can carry more info. AFAIK, Spark also calls them table options instead of properties. [1]
> > > > > In my local PoC, I did create a new CatalogTable, and it works well for the current connectors: all the DDL tables finally yield a CatalogTable instance and we can apply the options to that (in the CatalogSourceTable when we generate the TableSource). The pro is that we do not need to modify the code of the connectors themselves. If we split the options from CatalogTable, we may need to add some additional logic in each connector factory in order to merge these properties (and the logic is almost the same). What do you think about this?
> > > > > [1] https://docs.databricks.com/spark/latest/spark-sql/language-manual/create-table.html
> > > > > Best,
> > > > > Danny Chan
> > > > > On March 17, 2020 at 10:10 PM (+0800), Timo Walther <twal...@apache.org> wrote:
> > > > > > Hi Danny,
> > > > > > thanks for updating the FLIP. I think your current design is sufficient to separate hints from result-related properties.
> > > > > > One remark to the naming itself: I would vote for calling the hints around table scan `OPTIONS('k'='v')`. We used the term "properties" in the past but since we want to unify the Flink configuration experience, we should use consistent naming and classes around `ConfigOptions`.
> > > > > > It would be nice to use `Set<ConfigOption> supportedHintOptions();` to start using config options instead of pure string properties. This will also allow us to generate documentation in the future around supported data types, ranges, etc. for options. At some point we would also like to drop `DescriptorProperties` class. "Options" is also used in the documentation [1] and in the SQL/MED standard [2].
> > > > > > Furthermore, I would still vote for separating CatalogTable and hint options. Otherwise the planner would need to create a new CatalogTable instance which might not always be easy. We should offer them via:
> > > > > > org.apache.flink.table.factories.TableSourceFactory.Context#getHints: ReadableConfig
> > > > > > What do you think?
> > > > > > Regards,
> > > > > > Timo
> > > > > > [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/create.html#create-table
> > > > > > [2] https://wiki.postgresql.org/wiki/SQL/MED
> > > > > > On 12.03.20 15:06, Stephan Ewen wrote:
> > > > > > > @Danny sounds good.
> > > > > > > Maybe it is worth listing all the classes of problems that you want to address and then look at each class and see if hints are a good default solution or a good optional way of simplifying things?
> > > > > > > The discussion has grown a lot and it is starting to be hard to distinguish the parts where everyone agrees from the parts where there are concerns.
> > > > > > > On Thu, Mar 12, 2020 at 2:31 PM Danny Chan <danny0...@apache.org> wrote:
> > > > > > > > Thanks Stephan ~
> > > > > > > > We can remove the support for properties that may change the semantics of the query if you think that is trouble.
> > > > > > > > How about we support the /*+ properties() */ hint only for optimization parameters, such as the fetch size of a source or something like that? Does that make sense?
> > > > > > > > On Thu, Mar 12, 2020 at 7:45 PM, Stephan Ewen <se...@apache.org> wrote:
> > > > > > > > > I think Bowen has actually put it very well.
> > > > > > > > > (1) Hints that change semantics looks like trouble waiting to happen. For example Kafka offset handling should be in filters. The Kafka source should support predicate pushdown.
> > > > > > > > > (2) Hints should not be a workaround for current shortcomings.
> > > > > > > > > A lot of what is suggested above sounds exactly like that. Working around catalog/DDL shortcomings, missing exposure of metadata (offsets), missing predicate pushdown in Kafka. Abusing a feature like hints now as a quick fix for these issues, rather than fixing the root causes, will very likely come back to bite us badly in the future.
> > > > > > > > > Best,
> > > > > > > > > Stephan
> > > > > > > > > On Thu, Mar 12, 2020 at 10:43 AM Kurt Young <ykt...@gmail.com> wrote:
> > > > > > > > > > It seems this FLIP's name is somewhat misleading. From my understanding, this FLIP is trying to address the dynamic parameter issue, and table hints is the way we want to choose. I think we should focus on "what's the right way to solve dynamic property" instead of discussing "whether table hints can affect query semantics".
> > > > > > > > > > For now, there are three proposed ways to achieve dynamic property:
> > > > > > > > > > 1. FLIP-110: create temporary table xx like xx with (xxx)
> > > > > > > > > > 2. use custom "from t with (xxx)" syntax
> > > > > > > > > > 3. "Borrow" the table hints to have a special PROPERTIES hint.
> > > > > > > > > > The first one doesn't break anything; the only problem I see is that it is a little more verbose than the table hint approach.
> > > > > > > > > > I can imagine that when someone uses the SQL CLI to try out SQL, they will quite often modify the table properties. Some use cases I can think of:
> > > > > > > > > > 1. the source contains some corrupted data, and I want to turn on the "ignore-error" flag for certain formats.
> > > > > > > > > > 2. I have a kafka table and want to see some sample data from the beginning, so I change the offset to "earliest", and then I want to observe the latest data which keeps coming in. I would write another query to select from the latest table.
> > > > > > > > > > 3. I want my jdbc sink to flush data more eagerly so that I can observe the data from the database side.
> > > > > > > > > > Most of such use cases are quite ad-hoc. If every time I want to have a different experience, I need to create a temporary table and then also modify my query, it doesn't feel smooth. Embedding such dynamic properties into the query would give a better user experience.
> > > > > > > > > > Both 2 & 3 can make this happen. The con of #2 is breaking SQL compliance, and #3 only breaks some unwritten rules, but we can have an explanation for that. And I really doubt whether users would complain about this when they actually have a flexible and good experience using it.
> > > > > > > > > > My tendency would be #3 > #1 > #2, what do you think?
> > > > > > > > > > Best,
> > > > > > > > > > Kurt
> > > > > > > > > > On Thu, Mar 12, 2020 at 1:11 PM Danny Chan <yuzhao....@gmail.com> wrote:
> > > > > > > > > > > Thanks Aljoscha ~
> > > > > > > > > > > I agree that most query hints are optional optimizer instructions, especially in traditional RDBMSs.
> > > > > > > > > > > But, just like Benchao said, Flink as a computation engine has many different kinds of data sources; thus, dynamic parameters like start_offset can only be bound to each table scope. We cannot set a session config like KSQL does, because there everything is about Kafka:
> > > > > > > > > > > > SET 'auto.offset.reset'='earliest';
> > > > > > > > > > > Thus the most flexible way to set up these dynamic params is to bind them to the table scope in the query when we want to override something, so we have the solutions above (with pros and cons from my side):
> > > > > > > > > > > • 1. Select * from t(offset=123) (from Timo)
> > > > > > > > > > > Pros:
> > > > > > > > > > > - Easy to add
> > > > > > > > > > > - Parameters are part of the main query
> > > > > > > > > > > Cons:
> > > > > > > > > > > - Not SQL compliant
> > > > > > > > > > > • 2. Select * from t /*+ PROPERTIES(offset=123) */ (from me)
> > > > > > > > > > > Pros:
> > > > > > > > > > > - Easy to add
> > > > > > > > > > > - SQL compliant because it is nested in the comments
> > > > > > > > > > > Cons:
> > > > > > > > > > > - Parameters are not part of the main query
> > > > > > > > > > > - Cryptic syntax for new users
> > > > > > > > > > > The biggest problem with the hints approach may be whether "hints must be optional". Actually we thought about 1 for a while but abandoned it because it breaks the SQL standard too much. We replaced it with 2, because the hint syntax does not break the SQL standard (it is nested in comments).
> > > > > > > > > > > What if we have a special /*+ PROPERTIES */ hint that allows overriding some properties of a table dynamically? It does not break anything, at least for current Flink use cases.
> > > > > > > > > > > Planner hints are optional just because they are naturally enforcers of the planner, and most of them aim to instruct the optimizer. Table hints are a little different: they can specify table metadata like the index column, and they are very convenient for specifying table properties.
> > > > > > > > > > > Or shall we not call /*+ PROPERTIES(offset=123) */ a table hint at all; we can call it table dynamic parameters.
> > > > > > > > > > > Best,
> > > > > > > > > > > Danny Chan
> > > > > > > > > > > On March 11, 2020 at 9:20 PM (+0800), Aljoscha Krettek <aljos...@apache.org> wrote:
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > I don't understand this discussion. Hints, as I understand them, should work like this:
> > > > > > > > > > > > - hints are *optional* advice for the optimizer to try and help it to find a good execution strategy
> > > > > > > > > > > > - hints should not change query semantics, i.e. they should not change connector properties; executing a query while taking the hints into account *must* produce the same result as executing the query without taking the hints into account
> > > > > > > > > > > > From these simple requirements you can derive a solution that makes sense. I don't have a strong preference for the syntax but we should strive to be in line with prior work.
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Aljoscha
> > > > > > > > > > > > On 11.03.20 11:53, Danny Chan wrote:
> > > > > > > > > > > > > Thanks Timo for summarizing the 3 options ~
> > > > > > > > > > > > > I agree with Kurt that option 2 is too complicated to use because:
> > > > > > > > > > > > > • As a Kafka topic consumer, the user must define the virtual column for the start offset and must also apply a special filter predicate after each query
> > > > > > > > > > > > > • For the internal implementation, metadata column pushdown is another hard topic: each kind of message queue may have its own offset attribute, so we need to consider the expression type for each kind; the source also needs to recognize the constant column as a config option (which is weird because usually what we push down is a table column)
> > > > > > > > > > > > > For option 1 and option 3, I think there is no difference; option 1 is also a hint syntax, which was introduced in Sybase, then adopted and later deprecated by MS-SQL in the 1990s because of its ambiguity.
> > > > > > > > > > > > > Personally I prefer the /*+ */ style table hint over the WITH keyword for these reasons:
> > > > > > > > > > > > > • We do not break standard SQL; the hints are nested in SQL comments
> > > > > > > > > > > > > • We do not need to introduce an additional WITH keyword, which could appear anywhere in a query because a table can be referenced in all kinds of SQL contexts: INSERT/DELETE/FROM/JOIN …. That would make our SQL diverge too much from the standard
> > > > > > > > > > > > > • We would have a uniform syntax for table hints and query hints: one syntax fits all and is easier to use
> > > > > > > > > > > > > And here is the reason why we chose a uniform Oracle-style query hint syntax, as given by Julian Hyde when we designed the syntax in the Calcite community:
> > > > > > > > > > > > > I don’t much like the MSSQL-style syntax for table hints. It adds a new use of the WITH keyword that is unrelated to the use of WITH for common-table expressions.
> > > > > > > > > > > > > A historical note. Microsoft SQL Server inherited its hint syntax from Sybase a very long time ago. (See “Transact SQL Programming”[1], page 632, “Optimizer hints”. The book was written in 1999, and covers Microsoft SQL Server 6.5 / 7.0 and Sybase Adaptive Server 11.5, but the syntax very likely predates Sybase 4.3, from which Microsoft SQL Server was forked in 1993.)
> > > > > > > > > > > > > Microsoft later added the WITH keyword to make it less ambiguous, and has now deprecated the syntax that does not use WITH.
> > > > > > > > > > > > > They are forced to keep the syntax for backwards compatibility but that doesn’t mean that we should shoulder their burden.
> > > > > > > > > > > > > I think formatted comments are the right container for hints because it allows us to change the hint syntax without changing the SQL parser, and makes clear that we are at liberty to ignore hints entirely.
> > > > > > > > > > > > > Julian
> > > > > > > > > > > > > [1] https://www.amazon.com/s?k=9781565924017
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > On March 11, 2020 at 4:03 PM (+0800), Timo Walther <twal...@apache.org> wrote:
> > > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > > > it is true that our DDL is not standard compliant by using the WITH clause.
> > > > > > > > > > > > > > Nevertheless, we aim for not diverging too much and the LIKE clause is an example of that. It will solve things like overwriting WATERMARKs, add additional/modifying properties and inherit schema.
> > > > > > > > > > > > > > Bowen is right that Flink's DDL is mixing 3 types definition together. We are not the first ones that try to solve this. There is also the SQL MED standard [1] that tried to tackle this problem. I think it was not considered when designing the current DDL.
> > > > > > > > > > > > > > Currently, I see 3 options for handling Kafka offsets.
> > > > > > > > > > > > > > I will give some examples and look forward to feedback here:
> > > > > > > > > > > > > > *Option 1* Runtime and semantic params as part of the query
> > > > > > > > > > > > > > `SELECT * FROM MyTable('offset'=123)`
> > > > > > > > > > > > > > Pros:
> > > > > > > > > > > > > > - Easy to add
> > > > > > > > > > > > > > - Parameters are part of the main query
> > > > > > > > > > > > > > - No complicated hinting syntax
> > > > > > > > > > > > > > Cons:
> > > > > > > > > > > > > > - Not SQL compliant
> > > > > > > > > > > > > > *Option 2* Use metadata in query
> > > > > > > > > > > > > > `CREATE TABLE MyTable (id INT, offset AS SYSTEM_METADATA('offset'))`
> > > > > > > > > > > > > > `SELECT * FROM MyTable WHERE offset > TIMESTAMP '2012-12-12 12:34:22'`
> > > > > > > > > > > > > > Pros:
> > > > > > > > > > > > > > - SQL compliant in the query
> > > > > > > > > > > > > > - Access of metadata in the DDL which is required anyway
> > > > > > > > > > > > > > - Regular pushdown rules apply
> > > > > > > > > > > > > > Cons:
> > > > > > > > > > > > > > - Users need to add an additional column in the DDL
> > > > > > > > > > > > > > *Option 3*: Use hints for properties
> > > > > > > > > > > > > > `SELECT * FROM MyTable /*+ PROPERTIES('offset'=123) */`
> > > > > > > > > > > > > > Pros:
> > > > > > > > > > > > > > - Easy to add
> > > > > > > > > > > > > > Cons:
> > > > > > > > > > > > > > - Parameters are not part of the main query
> > > > > > > > > > > > > > - Cryptic syntax for new users
> > > > > > > > > > > > > > - Not standard compliant.
> > > > > > > > > > > > > > If we go with this option, I would suggest to make it available in a separate map and don't mix it with statically defined properties. Such that the factory can decide which properties have the right to be overwritten by the hints:
> > > > > > > > > > > > > > TableSourceFactory.Context.getQueryHints(): ReadableConfig
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > Timo
> > > > > > > > > > > > > > [1] https://en.wikipedia.org/wiki/SQL/MED
> > > > > > > > > > > > > > On 11.03.20 07:21, Danny Chan wrote:
> > > > > > > > > > > > > > > Thanks Bowen ~
> > > > > > > > > > > > > > > I agree we should somehow categorize our connector parameters.
> > > > > > > > > > > > > > > For type 1, I'm already preparing a solution like the Confluent schema registry + Avro schema inference thing, so this may not be a problem in the near future.
> > > > > > > > > > > > > > > For type 3, I have some questions:
> > > > > > > > > > > > > > > > "SELECT * FROM mykafka WHERE offset > 12pm yesterday"
> > > > > > > > > > > > > > > Where does the offset column come from? Is it a virtual column from the table schema? You said that
> > > > > > > > > > > > > > > > They change almost every time a query starts and have nothing to do with metadata, thus should not be part of table definition/DDL
> > > > > > > > > > > > > > > But then why can you reference it in the query? I'm confused about that; can you elaborate a little?
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > On March 11, 2020 at 12:52 PM (+0800), Bowen Li <bowenl...@gmail.com> wrote:
> > > > > > > > > > > > > > > > Thanks Danny for kicking off the effort
> > > > > > > > > > > > > > > > The root cause of too much manual work is Flink DDL has mixed 3 types of params together and doesn't handle each of them very well. Below are how I categorize them and corresponding solutions in my mind:
> > > > > > > > > > > > > > > > - type 1: Metadata of external data, like external endpoint/url, username/pwd, schemas, formats.
> > > > > > > > > > > > > > > > Such metadata are mostly already accessible in external system as long as endpoints and credentials are provided. Flink can get it thru catalogs, but we haven't had many catalogs yet and thus Flink just hasn't been able to leverage that. So the solution should be building more catalogs. Such params should be part of a Flink table DDL/definition, and not overridable in any means.
> > > > > > > > > > > > > > > > - type 2: Runtime params, like jdbc connector's fetch size, elasticsearch connector's bulk flush size.
> > > > > > > > > > > > > > > > Such params don't affect query results, but affect how results are produced (eg. fast or slow, aka performance) - they are essentially execution and implementation details. They change often in exploration or development stages, but not quite frequently in well-defined long-running pipelines. They should always have default values and can be missing in query.
> > > > > > > > > > > > > > > > They can be part of a table DDL/definition, but should also be replaceable in a query - *this is what table "hints" in FLIP-113 should cover*.
> > > > > > > > > > > > > > > > - type 3: Semantic params, like kafka connector's start offset.
> > > > > > > > > > > > > > > > Such params affect query results - the semantics. They'd better be as filter conditions in WHERE clause that can be pushed down. They change almost every time a query starts and have nothing to do with metadata, thus should not be part of table definition/DDL, nor be persisted in catalogs. If they will, users should create views to keep such params around (note this is different from variable substitution).
> > > > > > > > > > > > > > > > Take Flink-Kafka as an example.
> Once we get these params right, here are the steps users need to take
> to develop and run a Flink job:
> - configure a Flink ConfluentSchemaRegistry with url, username, and
>   password
> - run "SELECT * FROM mykafka WHERE offset > 12pm yesterday" (simplified
>   timestamp) in SQL CLI; Flink automatically retrieves all metadata of
>   schema, file format, etc. and starts the job
> - users want to make the job read the Kafka topic faster, so it becomes
>   "SELECT * FROM mykafka /* faster_read_key=value*/ WHERE offset > 12pm
>   yesterday"
> - done and satisfied, users submit it to production
>
> Regarding "CREATE TABLE t LIKE with (k1=v1, k2=v2)", I think it's a
> nice-to-have feature, but not a strategically critical, long-term
> solution, because
> 1) It may seem promising at the current stage to solve the
> too-much-manual-work problem, but that's only because Flink hasn't
> leveraged catalogs well and handled the 3 types of params above
> properly.
> Once we get the param types right, the LIKE syntax won't be that
> important, and will be just an easier way to create tables without
> retyping long fields like username and pwd.
> 2) Note that only some rare types of catalog can store k-v property
> pairs, so tables created this way often cannot be persisted. In the
> foreseeable future, such a catalog will only be HiveCatalog, and not
> everyone has a Hive metastore. To be honest, without persistence,
> recreating tables every time this way is still a lot of keyboard typing.
>
> Cheers,
> Bowen
>
> On Tue, Mar 10, 2020 at 8:07 PM Kurt Young <ykt...@gmail.com> wrote:
>
> If a specific connector wants to have such a parameter and read it out
> of configuration, then that's fine.
> If we are talking about a configuration for all kinds of sources, I
> would be super careful about that.
> It's true it can solve maybe 80% of cases, but it will also make the
> remaining 20% feel weird.
>
> Best,
> Kurt
>
> On Wed, Mar 11, 2020 at 11:00 AM Jark Wu <imj...@gmail.com> wrote:
>
> Hi Kurt,
>
> #3 Regarding global offset:
> I'm not saying the planner should use the global configuration to
> override connector properties.
> But the connector should take this configuration and translate it into
> its client API.
> AFAIK, almost all message queues support earliest, latest, and a
> timestamp value as the start point.
> So we can support 3 options for this configuration: "earliest",
> "latest" and a timestamp string value.
> Of course, this can't solve 100% of cases, but I guess it can solve 80%
> or 90% of cases.
> And the remaining cases, which I guess are not very common, can be
> resolved by the LIKE syntax.
>
> Best,
> Jark
>
> On Wed, 11 Mar 2020 at 10:33, Kurt Young <ykt...@gmail.com> wrote:
>
> Good to have such lovely discussions. I also want to share some of my
> opinions.
>
> #1 Regarding error handling: I also think ignoring invalid hints would
> be dangerous; maybe the simplest solution is to just throw an exception.
>
> #2 Regarding property replacement: I don't think we should constrain
> ourselves to the meaning of the word "hint", and forbid it from
> modifying any properties which can affect query results.
> IMO `PROPERTIES` is one of the table hints, and a powerful one. It can
> modify properties located in the DDL's WITH block. But I also see the
> harm if we make it too flexible, like changing the kafka topic name
> with a hint. Such a use case is not common and sounds very dangerous to
> me. I would propose we have a map of hintable properties for each
> connector, and validate that all passed-in properties are actually
> hintable. And combining with #1 error handling, we can throw an
> exception once we receive an invalid property.
>
> #3 Regarding global offset: I'm not sure it's feasible.
> Different connectors will have totally different properties to
> represent offset: some might be timestamps, some might be string
> literals like "earliest", and others might be just integers.
>
> Best,
> Kurt
>
> On Tue, Mar 10, 2020 at 11:46 PM Jark Wu <imj...@gmail.com> wrote:
>
> Hi everyone,
>
> I want to jump into the discussion about the "dynamic start offset"
> problem.
> First of all, I share the same concern as Timo and Fabian, that the
> "start offset" affects the query semantics, i.e. the query result.
> But shouldn't "hints" only be used for optimization, without affecting
> the result?
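
Kurt's #2 above (a per-connector set of hintable properties, validated up front, with an exception for anything else) might look roughly like the sketch below. This is not Flink code: the class `HintableProperties`, the method `applyHints`, and the whitelist contents are made up for illustration, and the property keys are only examples.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not a Flink class): validates hint options against a per-connector whitelist. */
final class HintableProperties {

    // Assumed example whitelist; in a real design each connector factory would declare its own set.
    static final Map<String, Set<String>> HINTABLE = Map.of(
            "kafka", Set.of("connector.startup-mode", "connector.startup-timestamp-millis"),
            "jdbc", Set.of("connector.read.fetch-size"));

    /**
     * Returns the table properties with the hint options applied on top.
     * Throws instead of silently ignoring a property that is not hintable.
     */
    static Map<String, String> applyHints(
            String connector, Map<String, String> tableProps, Map<String, String> hints) {
        Set<String> allowed = HINTABLE.getOrDefault(connector, Set.of());
        for (String key : hints.keySet()) {
            if (!allowed.contains(key)) {
                throw new IllegalArgumentException(
                        "Property '" + key + "' is not hintable for connector '" + connector + "'");
            }
        }
        // Copy first so the original table definition is never mutated.
        Map<String, String> merged = new LinkedHashMap<>(tableProps);
        merged.putAll(hints);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> table = Map.of("connector.topic", "orders");

        // A whitelisted runtime option is merged over the DDL properties.
        System.out.println(applyHints("kafka", table,
                Map.of("connector.startup-mode", "earliest-offset")));

        // Changing the topic through a hint is rejected with an exception.
        try {
            applyHints("kafka", table, Map.of("connector.topic", "other-topic"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here follows the #1 point in the same mail: an invalid or non-hintable property is an error, not a warning.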
> I think the "dynamic start offset" is a very important usability
> problem which will be faced by many streaming platforms.
> I also agree "CREATE TEMPORARY TABLE Temp (LIKE t) WITH
> ('connector.startup-timestamp-millis' = '1578538374471')" is verbose;
> what if we have 10 tables to join?
>
> However, what I want to propose (should be another thread) is a global
> configuration to reset start offsets of all the source connectors in
> the query session, e.g. "table.sources.start-offset". This is possible
> now because `TableSourceFactory.Context` has a `getConfiguration`
> method to get the session configuration, which can be used to create an
> adapted TableSource.
> Then we can also expose it to SQL CLI via the SET command, e.g.
> `SET 'table.sources.start-offset'='earliest';`, which is pretty simple
> and straightforward.
>
> This is very similar to KSQL's `SET 'auto.offset.reset'='earliest'`
> which is very helpful IMO.
>
> Best,
> Jark
>
> On Tue, 10 Mar 2020 at 22:29, Timo Walther <twal...@apache.org> wrote:
>
> Hi Danny,
>
> compared to the hints, FLIP-110 is fully compliant to the SQL standard.
>
> I don't think that `CREATE TEMPORARY TABLE Temp (LIKE t) WITH (k=v)` is
> too verbose or awkward for the power of basically changing the entire
> connector. Usually, this statement would just precede the query in a
> multiline file.
> So it can be changed "in-place" like the hints you proposed.
>
> Many companies have a well-defined set of tables that should be used.
> It would be dangerous if users could change the path or topic in a
> hint. The catalog/catalog manager should be the entity that controls
> which tables exist and how they can be accessed.
>
> > what's the problem there if we use the table hints to support
> > "start offset"?
>
> IMHO it violates the meaning of a hint. According to the dictionary, a
> hint is "a statement that expresses indirectly what one prefers not to
> say explicitly". But offsets are a property that is very explicit.
> If we go with the hint approach, it should be expressible in the
> TableSourceFactory which properties are supported for hinting. Or do
> you plan to offer those hints in a separate Map<String, String> that
> cannot overwrite existing properties? I think this would be a different
> story...
>
> Regards,
> Timo
>
> On 10.03.20 10:34, Danny Chan wrote:
>
> Thanks Timo ~
>
> Personally I would say that offset > 0 and start offset = 10 does not
> have the same semantic, so from the SQL aspect, we can not implement a
> "starting offset" hint for query with such a syntax.
> And the CREATE TABLE LIKE syntax is a DDL, which is just verbose for
> defining such dynamic parameters even if it could do that. Shall we
> force users to define a temporary table for each query with dynamic
> params? I would say it's an awkward solution.
>
> "Hints should give "hints" but not affect the actual produced result."
> You mentioned that multiple times; could you give a reason? What's the
> problem if we use the table hints to support "start offset"?
> From my side I see some benefits for that:
>
> • It's very convenient to set up these parameters; the syntax is very
>   much like the DDL definition
> • Its scope is very clear: right on the table it is attached to
> • It does not affect the table schema, which means in order to specify
>   the offset, there is no need to define an offset column, which is
>   weird actually; offset should never be a column, it's more like
>   metadata or a start option.
> So in total, FLIP-110 uses the offset more like a Hive partition prune.
> We can do that if we have an offset column, but in most cases we do not
> define one, so there is actually no conflict or overlap.
>
> Best,
> Danny Chan
>
> On March 10, 2020 at 4:28 PM +0800, Timo Walther <twal...@apache.org>
> wrote:
>
> Hi Danny,
>
> shouldn't FLIP-110[1] solve most of the problems we have around
> defining table properties more dynamically without manual schema work?
> Also offset definition is easier with such a syntax. They must not be
> defined in catalog but could be temporary tables that extend from the
> original table.
> In general, we should aim to keep the syntax concise and not provide
> too many ways of doing the same thing. Hints should give "hints" but
> not affect the actual produced result.
>
> Some connector properties might also change the plan or schema in the
> future. E.g. they might also define whether a table source supports
> certain push-downs (e.g. predicate push-down).
>
> Dawid is currently working on a draft that might make it possible to
> expose a Kafka offset via the schema such that `SELECT * FROM Topic
> WHERE offset > 10` would become possible and could be pushed down. But
> this is, of course, not planned initially.
> Regards,
> Timo
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE
>
> On 10.03.20 08:34, Danny Chan wrote:
>
> Thanks Wenlong ~
>
> For PROPERTIES Hint Error handling
>
> Actually we have no way to figure out whether an erroneous hint is a
> PROPERTIES hint. For example, if a user writes a hint like
> 'PROPERTIAS', we do not know if this hint is a PROPERTIES hint; what we
> know is that the hint name was not registered in our Flink.
>
> If the user writes the hint name correctly (i.e. PROPERTIES), we can
> indeed enforce the validation of the hint options through the pluggable
> HintOptionChecker.
>
> For PROPERTIES Hint Option Format
>
> For a key value style hint option, the key can be either a simple
> identifier or a string literal, which means that it's compatible with
> our DDL syntax. We support simple identifiers because many other hints
> do not have complex compound keys like the table properties, and we
> want to unify the parse block.
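
The option format Danny describes (a key that is either a simple identifier or a string literal) can be illustrated with a small parser. This is a rough sketch only, under several assumptions: values are always single-quoted string literals, pairs are comma-separated, and there is no escape handling. `HintOptionParser` is a hypothetical name and is not the parser used in the PoC.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: parses hint options such as k1='v1', 'connector.type'='kafka'. */
final class HintOptionParser {

    // One pair: key is a quoted literal (group 1) or a simple identifier (group 2),
    // value is a quoted literal (group 3), followed by a comma or the end of input.
    private static final Pattern PAIR = Pattern.compile(
            "\\s*(?:'([^']*)'|([A-Za-z_][A-Za-z0-9_]*))\\s*=\\s*'([^']*)'\\s*(?:,|$)");

    static Map<String, String> parse(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(options);
        int pos = 0;
        while (pos < options.length()) {
            // Each pair must start exactly where the previous one ended.
            m.region(pos, options.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "Invalid hint option near position " + pos + ": " + options);
            }
            String key = m.group(1) != null ? m.group(1) : m.group(2);
            result.put(key, m.group(3));
            pos = m.end();
        }
        return result;
    }

    public static void main(String[] args) {
        // Mixes an identifier key with a DDL-style string-literal key.
        System.out.println(parse("k1='v1', 'connector.type'='kafka'"));
    }
}
```

Accepting both key forms lets dotted table property keys, which are not valid simple identifiers, share one parse path with simpler hints.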
> Best,
> Danny Chan
>
> On March 10, 2020 at 3:19 PM +0800, wenlong.lwl
> <wenlong88....@gmail.com> wrote:
>
> Hi Danny, thanks for the proposal. +1 for adding table hints, it is
> really a necessary feature for flink sql to integrate with a catalog.
>
> For error handling, I think it would be more natural to throw an
> exception when an erroneous table hint is provided, because the
> properties in the hint will be merged and used to find the table
> factory, which would cause an exception when erroneous properties are
> provided, right?
> On the other hand, unlike other hints which just affect the way the
> query is executed, the property table hint actually affects the result
> of the query, so we should never ignore the given property hints.
>
> For the format of property hints: currently, in sql client, we accept
> properties only in string format in DDL: 'connector.type'='kafka'. I
> think the format of properties in a hint should be the same as the
> format we defined in DDL. What do you think?
Bests,
Wenlong Lyu

On Tue, 10 Mar 2020 at 14:22, Danny Chan <yuzhao....@gmail.com> wrote:

To Weike: About the Error Handling

To be consistent with other SQL vendors, the default is to log warnings, and if there is any error (invalid hint name or options), the hint is just ignored. I have already addressed this in the wiki.
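A minimal sketch of this lenient default, written in Python purely for illustration (it is not Flink code, and `apply_hint_leniently`, `supported_keys` and the logger name are hypothetical). The thread does not spell out whether one bad option invalidates only that option or the whole hint; the sketch assumes the whole hint is dropped.

```python
import logging

logger = logging.getLogger("table_hints")  # hypothetical logger name


def apply_hint_leniently(catalog_props, hint_name, hint_props, supported_keys):
    """Merge a PROPERTIES hint into the table properties, or ignore it.

    An unknown hint name or any unsupported option only produces a
    warning; the catalog properties are then returned unchanged.
    """
    if hint_name != "PROPERTIES":
        logger.warning("ignoring unknown table hint %s", hint_name)
        return dict(catalog_props)
    unknown = sorted(key for key in hint_props if key not in supported_keys)
    if unknown:
        logger.warning("ignoring hint with unsupported options: %s", unknown)
        return dict(catalog_props)
    merged = dict(catalog_props)
    merged.update(hint_props)  # valid hint: its values override the catalog
    return merged
```

The trade-off against throwing an exception is visible here: a typo in a hint key leaves the query running with the catalog's original properties, and only the log shows that the hint had no effect.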
To Timo: About the PROPERTIES Table Hint

• The properties hint is also optional: the user can pass in an option to override the table properties, but this does not mean it is required.
• "They should not include semantics": do the properties belong to the semantics? I don't think so; the plan does not change, right?
  The result set may be affected, but there are already hints that do so, for example the MS-SQL MAXRECURSION and SNAPSHOT hints [1].
• `SELECT * FROM t(k=v, k=v)`: this grammar breaks the SQL standard, unlike the hint approach (where hints are embedded in comments).
• I actually didn't find any vendor that supports such a grammar, and there is no way to override table-level properties dynamically.
  For a normal RDBMS, I think there are no requests for such dynamic parameters, because all the tables have the same storage and computation and they are almost all batch tables.
• Flink, as a computation engine, has many connectors. Especially for message queues like Kafka, we would have a start_offset that is different each time we start the query. Such parameters cannot be persisted to the catalog because they are not static. This is actually the background for proposing table hints to specify such properties dynamically.

To Jark and Jingsong: I have removed the query hints part and changed the title.

[1] https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15

Best,
Danny Chan

On March 9, 2020 at 5:46 PM +0800, Timo Walther <twal...@apache.org> wrote:

Hi Danny,

thanks for the proposal. I agree with Jark and Jingsong.
Planner hints and table hints are orthogonal topics that should be discussed separately.

I share Jingsong's opinion that we should not use planner hints for passing connector properties. Planner hints should be optional at any time. They should not include semantics but only affect execution time. Connector properties are an important part of the query itself.

Have you thought about options such as `SELECT * FROM t(k=v, k=v)`? How do other vendors deal with this problem?
Regards,
Timo

On 09.03.20 10:37, Jingsong Li wrote:

Hi Danny, +1 for table hints, thanks for driving.

I took a look at the FLIP; most of the content is about query hints. That makes it hard to discuss and vote on. So +1 to split it as Jark said.

Another thing is the configuration that is described as suitable for table hints: "connector.path" and "connector.topic". Are they really suitable for table hints? It looks weird to me.
Because I think these properties are the core of the table.

Best,
Jingsong Lee

On Mon, Mar 9, 2020 at 5:30 PM Jark Wu <imj...@gmail.com> wrote:

Thanks Danny for starting the discussion.
+1 for this feature.

If we just focus on the table hints and not the query hints in this release, could you split the FLIP into two FLIPs? Because it's hard to vote on only part of a FLIP.
You can keep the table hints proposal in FLIP-113 and move the query hints into another FLIP, so that we can focus on the table hints in this FLIP.

Thanks,
Jark

On Mon, 9 Mar 2020 at 17:14, DONG, Weike <kyled...@connect.hku.hk> wrote:

Hi Danny,

This is a nice feature, +1.
One thing I am interested in but that is not mentioned in the proposal is error handling. It is quite common for users to write inappropriate hints in SQL code; if illegal or "bad" hints are given, would the system simply ignore them or throw exceptions?
Thanks : )

Best,
Weike

On Mon, Mar 9, 2020 at 5:02 PM Danny Chan <yuzhao....@gmail.com> wrote:

Note:
we only plan to support table hints in Flink release 1.11, so please focus mainly on the table hints part and just ignore the planner hints. Sorry for that mistake ~

Best,
Danny Chan

On March 9, 2020 at 4:36 PM +0800, Danny Chan <yuzhao....@gmail.com> wrote:

Hi, fellows ~

I would like to propose support for SQL hints in Flink SQL.

We would support the following hint syntax:

select /*+ NO_HASH_JOIN, RESOURCE(mem='128mb', parallelism='24') */
from
emp /*+ INDEX(idx1, idx2) */
join
dept /*+ PROPERTIES(k1='v1', k2='v2') */
on
emp.deptno = dept.deptno

Basically we would support both query hints (after the SELECT keyword) and table
hints (after the referenced table name). For 1.11, we plan to only support table hints, with a hint probably named PROPERTIES:

table_name /*+ PROPERTIES(k1='v1', k2='v2') */

I am looking forward to your comments.

You can access the FLIP here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+SQL+and+Planner+Hints

Best,
Danny Chan
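To make the proposed table hint syntax concrete, here is a rough Python sketch that extracts the hint name and its key/value options from a `table_name /*+ PROPERTIES(k1='v1', k2='v2') */` fragment. It is an illustration only, under the assumption that values are single-quoted strings and keys are either bare identifiers or DDL-style quoted strings; `parse_table_hint` and both regular expressions are hypothetical and unrelated to the actual Flink parser.

```python
import re

# A hint is a comment of the form /*+ NAME(options) */.
_HINT_RE = re.compile(r"/\*\+\s*(\w+)\s*\((.*?)\)\s*\*/", re.DOTALL)
# One option: key='value', where the key may itself be quoted as in DDL.
_OPTION_RE = re.compile(r"\s*([\w.\-]+|'[^']*')\s*=\s*'([^']*)'\s*(?:,|$)")


def parse_table_hint(text):
    """Return (hint_name, options) for the first hint in text, or None."""
    match = _HINT_RE.search(text)
    if match is None:
        return None
    name, body = match.group(1), match.group(2).strip()
    options = {}
    pos = 0
    while pos < len(body):
        option = _OPTION_RE.match(body, pos)
        if option is None:
            raise ValueError("malformed hint options: " + body[pos:])
        options[option.group(1).strip("'")] = option.group(2)
        pos = option.end()  # continue after the consumed option
    return name, options
```

For example, `parse_table_hint("dept /*+ PROPERTIES(k1='v1', k2='v2') */")` yields the hint name `PROPERTIES` together with a dictionary of the two options, which could then be merged over the table's catalog properties.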