Hi Julian, Viliam,

Thanks for the advice in CALCITE-4337 <https://issues.apache.org/jira/browse/CALCITE-4337>. Please have another look at the issue. If there are no objections, I will continue the development work based on the conclusions of the discussion.

Best wishes,
JING ZHANG

JING ZHANG <[email protected]> wrote on Tue, Oct 12, 2021 at 5:59 PM:

> Hi Julian,
> Thanks very much for the expert comments in CALCITE-4337.
> I have checked the SQL standard and the behavior of some database vendors
> on PTF in order to answer your questions. I left a message in
> https://issues.apache.org/jira/browse/CALCITE-4337.
> Please correct me if I'm wrong. Thanks a lot.
>
> Best,
> JING ZHANG
>
> JING ZHANG <[email protected]> wrote on Mon, Oct 11, 2021 at 11:25 AM:
>
>> Sorry for the late reply; we were on vacation.
>>
>> @Julian
>>
>>> Thanks for the examples. The PARTITION BY syntax is a clear improvement
>>> for the SESSION function and I think we should do it, even though it is
>>> breaking.
>>
>> Thanks for the great suggestion.
>>
>>> I’ll make further comments against
>>> https://issues.apache.org/jira/browse/CALCITE-4337
>>
>> The further comments in JIRA are great and very thorough. I need to
>> double-check some points in the SQL standard. Once I finish, I will
>> reply in the JIRA as soon as possible.
>>
>> @Viliam
>>
>>> the table argument, according to the sql standard, must be in
>>> parentheses, like this:
>>>
>>> SELECT *
>>> FROM TABLE(SESSION(TABLE(input_table), ...
>>
>> Good point, I will keep it in mind.
>>
>> Best,
>> JING ZHANG
>>
>> Viliam Durina <[email protected]> wrote on Sun, Oct 3, 2021 at 2:21 AM:
>>
>>> Btw, the table argument, according to the sql standard, must be in
>>> parentheses, like this:
>>>
>>> SELECT *
>>> FROM TABLE(SESSION(TABLE(input_table), ...
>>>
>>> When doing a breaking change, we should also consider this.
>>>
>>> Viliam
>>>
>>> On Thu, 30 Sept 2021 at 18:11, Julian Hyde <[email protected]> wrote:
>>>
>>> > Thanks for the examples. The PARTITION BY syntax is a clear improvement
>>> > for the SESSION function and I think we should do it, even though it is
>>> > breaking.
>>> >
>>> > I’ll make further comments against
>>> > https://issues.apache.org/jira/browse/CALCITE-4337.
>>> >
>>> > > On Sep 29, 2021, at 9:58 PM, JING ZHANG <[email protected]> wrote:
>>> > >
>>> > > Hi Julian,
>>> > > Thanks for your feedback, the suggestion is very helpful.
>>> > > I've added the discussion content to CALCITE-4337
>>> > > <https://issues.apache.org/jira/browse/CALCITE-4337> [1]. I will
>>> > > continue the discussion in the JIRA case.
>>> > > As an example of a query before and after the syntax change, I will
>>> > > use the example in the session table function document
>>> > > <https://calcite.apache.org/docs/reference.html#session> [2].
>>> > > Old syntax demo:
>>> > >
>>> > >> SELECT * FROM TABLE(
>>> > >>   SESSION(
>>> > >>     TABLE orders,
>>> > >>     DESCRIPTOR(rowtime),
>>> > >>     DESCRIPTOR(product),
>>> > >>     INTERVAL '20' MINUTE));
>>> > >>
>>> > >> -- or with the named params
>>> > >> -- note: the DATA param must be the first
>>> > >> SELECT * FROM TABLE(
>>> > >>   SESSION(
>>> > >>     DATA => TABLE orders,
>>> > >>     TIMECOL => DESCRIPTOR(rowtime),
>>> > >>     KEY => DESCRIPTOR(product),
>>> > >>     SIZE => INTERVAL '20' MINUTE));
>>> > >
>>> > > The new syntax demo is as follows; the difference is that a
>>> > > PARTITION BY clause replaces the KEY DESCRIPTOR.
>>> > >
>>> > >> SELECT * FROM TABLE(
>>> > >>   SESSION(
>>> > >>     TABLE orders PARTITION BY product,
>>> > >>     DESCRIPTOR(rowtime),
>>> > >>     INTERVAL '20' MINUTE));
>>> > >>
>>> > >> -- or with the named params
>>> > >> -- note: the DATA param must be the first
>>> > >> SELECT * FROM TABLE(
>>> > >>   SESSION(
>>> > >>     DATA => TABLE orders PARTITION BY product,
>>> > >>     TIMECOL => DESCRIPTOR(rowtime),
>>> > >>     SIZE => INTERVAL '20' MINUTE));
>>> > >
>>> > > Best,
>>> > > JING ZHANG
>>> > >
>>> > > Julian Hyde <[email protected]> wrote on Thu, Sep 30, 2021 at 4:55 AM:
>>> > >
>>> > >> Regarding changes to the syntax of the SESSION table function. I am
>>> > >> open to this, even though it would be a breaking change. Can you
>>> > >> give an example of a query before and after the syntax change?
>>> > >>
>>> > >> I would like to support the new PARTITIONED BY clause for table
>>> > >> functions. I encourage you to make the change for table functions
>>> > >> in general, before and separately from the change to the SESSION
>>> > >> function and window functions.
>>> > >>
>>> > >> Please ensure that the discussion gets added to the JIRA case. It
>>> > >> might be best if we continue discussion in the JIRA case.
>>> > >>
>>> > >> Julian
>>> > >>
>>> > >>> On Sep 28, 2021, at 10:28 PM, JING ZHANG <[email protected]> wrote:
>>> > >>>
>>> > >>> Hi community,
>>> > >>> I'm now working on CALCITE-4337
>>> > >>> <https://issues.apache.org/jira/browse/CALCITE-4337> [1], which
>>> > >>> aims to support the PARTITION BY clause for table function
>>> > >>> arguments.
>>> > >>> I've submitted a pull request
>>> > >>> <https://github.com/apache/calcite/pull/2524> [2];
>>> > >>> thanks very much to @Danny for the review.
>>> > >>> Two points remain that need more discussion, so I am starting this
>>> > >>> thread to gather broader suggestions.
>>> > >>> 1. The SQL standard's Polymorphic Table Functions
>>> > >>> <https://standards.iso.org/ittf/PubliclyAvailableStandards/c069776_ISO_IEC_TR_19075-7_2017.zip>
>>> > >>> [3] states:
>>> > >>>
>>> > >>>> Input tables have either row semantics or set semantics, as
>>> > >>>> follows:
>>> > >>>> a) Row semantics means that the result of the PTF is decided on a
>>> > >>>> row-by-row basis. As an extreme example, the DBMS could atomize
>>> > >>>> the input table into individual rows, and send each single row to
>>> > >>>> a different virtual processor.
>>> > >>>> b) Set semantics means that the outcome of the function depends
>>> > >>>> on how the data is partitioned. A partition may not be split
>>> > >>>> across virtual processors, nor may a virtual processor handle
>>> > >>>> more than one partition.
>>> > >>>
>>> > >>> A SESSION window has an input table with set semantics, which
>>> > >>> means it requires a PARTITION BY clause.
>>> > >>> The new syntax conflicts with the current session window table
>>> > >>> function syntax; please take a look at the session table function
>>> > >>> <https://calcite.apache.org/docs/reference.html#session> [4].
>>> > >>> *Could we replace the old syntax directly, or should we take
>>> > >>> compatibility into consideration?*
>>> > >>> 2. Based on the SQL standard, only input tables with set semantics
>>> > >>> may be partitioned, while input tables with row semantics may not
>>> > >>> be partitioned.
>>> > >>> *Should we have a separate branch in Parser.jj for set-semantics
>>> > >>> input tables of table functions? (Currently, only the input table
>>> > >>> of the session window table function has set semantics.)*
>>> > >>>
>>> > >>> Any suggestion is appreciated. Thanks in advance.
>>> > >>> [1] https://issues.apache.org/jira/browse/CALCITE-4337
>>> > >>> [2] https://github.com/apache/calcite/pull/2524
>>> > >>> [3]
>>> > >>> https://standards.iso.org/ittf/PubliclyAvailableStandards/c069776_ISO_IEC_TR_19075-7_2017.zip
>>> > >>> [4] https://calcite.apache.org/docs/reference.html#session
>>> > >>>
>>> > >>> Best
>>> > >>> JING ZHANG
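
For readers following the thread, here is a rough Python sketch of why a SESSION window needs set semantics: each row's session depends on the other rows in the same partition, so the input has to be grouped by the PARTITION BY key first. This is only an illustration and not Calcite code. The names `session_windows` and `_close` are hypothetical, and the rule that a row starts a new session when it arrives at least `gap` after the previous row in its partition is an assumption about boundary handling.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def _close(session, timecol, gap):
    """Stamp every row of a finished session with its window bounds."""
    start = session[0][timecol]
    end = session[-1][timecol] + gap
    return [dict(row, window_start=start, window_end=end) for row in session]


def session_windows(rows, key, timecol, gap):
    """Assign session windows independently within each partition.

    Hypothetical helper: `key` plays the role of PARTITION BY,
    `timecol` the role of DESCRIPTOR(rowtime), `gap` the role of SIZE.
    """
    # Set semantics: a partition is processed as a whole, never split.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)

    out = []
    for part in partitions.values():
        part.sort(key=lambda r: r[timecol])
        session = []
        for row in part:
            # Assumed rule: a pause of at least `gap` ends the session.
            if session and row[timecol] - session[-1][timecol] >= gap:
                out.extend(_close(session, timecol, gap))
                session = []
            session.append(row)
        if session:
            out.extend(_close(session, timecol, gap))
    return out
```

With orders for product A at 10:00, 10:10 and 10:45 and one order for product B at 10:05, a 20-minute gap yields two sessions for A and one for B. Without partitioning, the 10:05 row for B would be merged into A's first session, which is why the outcome depends on how the data is partitioned.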
