Re: [ANNOUNCE] New committer: Jin Xing

2020-04-30 Thread XING JIN
Hi, Julian ~
As an introduction to my company, see [1]. It is part of Alibaba Group and
works in the financial business.
Regarding the "powered by" page [2], it would be great if we could add an entry
and logo to it ;D
Thanks a lot !

Jin

[1]  https://en.wikipedia.org/wiki/Ant_Financial

[2]  https://calcite.apache.org/docs/powered_by.html


On Fri, May 1, 2020 at 10:54 AM, XING JIN wrote:

> Thanks a lot, Julian ~
> I'm not from MaxCompute team, but from big data platform in Alibaba Ant
> Financial Group.
> Actually we cooperate a lot with MaxCompute, it's our sister team.
>
> Jin
>
> On Fri, May 1, 2020 at 1:48 AM, Julian Hyde wrote:
>
>> Welcome Jin! Thanks for your contributions so far, looking forward to
>> more!
>>
>> Are you on the MaxCompute project? It’s already on our “powered by”
>> page[1], so I think people are familiar with it.
>>
>> Julian
>>
>> [1] https://calcite.apache.org/docs/powered_by.html#alibaba-maxcompute
>>
>>
>> > On Apr 29, 2020, at 5:06 AM, XING JIN  wrote:
>> >
>> > Thanks a lot ~
>> > Calcite is a great project and it's great honor for me to work with you
>> > guys. I really appreciate the help from community.
>> > I'm working in Alibaba. My team builds big data system to optimize batch
>> > and streaming jobs. We use Calcite to process Sql queries and
>> > accommodate to different physical engines.
>> > I'm very excited to become Calcite committer and looking forward to make
>> > more contributions.
>> >
>> > Best regards,
>> > Jin
>> >
>> >
>> > On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich wrote:
>> >
>> >> Congrats!
>> >>
>> >> On 4/29/20 7:32 AM, Enrico Olivelli wrote:
>> >>> Congratulations!
>> >>>
>> >>> Enrico
>> >>>
>> >>> On Wed, Apr 29, 2020, 04:51 Feng Zhu wrote:
>> >>>
>>   Congratulations!
>> 
>>  best,
>>  Feng
>> 
>>  On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei wrote:
>> 
>> > Congrats, Jin!
>> >
>> >
>> > Best,
>> > Chunwei
>> >
>> >
>> > On Wed, Apr 29, 2020 at 10:07 AM Forward Xu wrote:
>> >
>> >> Congrats
>> >>
>> >>
>> >> best,
>> >>
>> >> Forward
>> >>
>> >> On Wed, Apr 29, 2020 at 8:21 AM, 953396112 <953396...@qq.com> wrote:
>> >>
>> >>> Congrats, Jin Xing!
>> >>>
>> >>>
>> >>> ---Original---
>> >>> From: "Stamatis Zampetakis"
>> >>> Date: Wed, Apr 29, 2020 05:47 AM
>> >>> To: "dev"
>> >>> Subject: [ANNOUNCE] New committer: Jin Xing
>> >>>
>> >>>
>> >>> Apache Calcite's Project Management Committee (PMC) has invited Jin Xing
>> >>> to become a committer, and we are pleased to announce that he has
>> >>> accepted.
>> >>>
>> >>> Jin has contributed a lot of code in the project and many recent
>> >>> improvements in materialized view matching have his signature on them.
>> >>> Apart from code contributions, Jin provides valuable help to the
>> >>> community by doing reviews and answering questions in the devlist.
>> >>>
>> >>> Jin, welcome, thank you for your contributions, and we look forward to
>> >>> your further interactions with the community! If you wish, please feel
>> >>> free to tell us more about yourself and what you are working on.
>> >>>
>> >>> Stamatis (on behalf of the Apache Calcite PMC)
>> >>
>> >
>> 
>> >>>
>> >>
>>
>>


Re: "calcite" and "calcite-examples" missing from release 1.22?

2020-04-30 Thread Vladimir Sitnikov
The artifacts have no purpose, they do not exist, so they are not published.

Vladimir


Re: [ANNOUNCE] New committer: Jin Xing

2020-04-30 Thread XING JIN
Thanks a lot, Julian ~
I'm not on the MaxCompute team; I'm on the big data platform team in Alibaba's
Ant Financial Group.
We actually cooperate a lot with MaxCompute; it is our sister team.

Jin

On Fri, May 1, 2020 at 1:48 AM, Julian Hyde wrote:

> Welcome Jin! Thanks for your contributions so far, looking forward to more!
>
> Are you on the MaxCompute project? It’s already on our “powered by”
> page[1], so I think people are familiar with it.
>
> Julian
>
> [1] https://calcite.apache.org/docs/powered_by.html#alibaba-maxcompute
>
>
> > On Apr 29, 2020, at 5:06 AM, XING JIN  wrote:
> >
> > Thanks a lot ~
> > Calcite is a great project and it's great honor for me to work with you
> > guys. I really appreciate the help from community.
> > I'm working in Alibaba. My team builds big data system to optimize batch
> > and streaming jobs. We use Calcite to process Sql queries and accommodate
> > to different physical engines.
> > I'm very excited to become Calcite committer and looking forward to make
> > more contributions.
> >
> > Best regards,
> > Jin
> >
> >
> > On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich wrote:
> >
> >> Congrats!
> >>
> >> On 4/29/20 7:32 AM, Enrico Olivelli wrote:
> >>> Congratulations!
> >>>
> >>> Enrico
> >>>
> >>> On Wed, Apr 29, 2020, 04:51 Feng Zhu wrote:
> >>>
>   Congratulations!
> 
>  best,
>  Feng
> 
>  On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei wrote:
> 
> > Congrats, Jin!
> >
> >
> > Best,
> > Chunwei
> >
> >
> > On Wed, Apr 29, 2020 at 10:07 AM Forward Xu 
> > wrote:
> >
> >> Congrats
> >>
> >>
> >> best,
> >>
> >> Forward
> >>
> >> On Wed, Apr 29, 2020 at 8:21 AM, 953396112 <953396...@qq.com> wrote:
> >>
> >>> Congrats, Jin Xing!
> >>>
> >>>
> >>> ---Original---
> >>> From: "Stamatis Zampetakis"
> >>> Date: Wed, Apr 29, 2020 05:47 AM
> >>> To: "dev"
> >>> Subject: [ANNOUNCE] New committer: Jin Xing
> >>>
> >>>
> >>> Apache Calcite's Project Management Committee (PMC) has invited Jin Xing
> >>> to become a committer, and we are pleased to announce that he has
> >>> accepted.
> >>>
> >>> Jin has contributed a lot of code in the project and many recent
> >>> improvements in materialized view matching have his signature on them.
> >>> Apart from code contributions, Jin provides valuable help to the
> >>> community by doing reviews and answering questions in the devlist.
> >>>
> >>> Jin, welcome, thank you for your contributions, and we look forward to
> >>> your further interactions with the community! If you wish, please feel
> >>> free to tell us more about yourself and what you are working on.
> >>>
> >>> Stamatis (on behalf of the Apache Calcite PMC)
> >>
> >
> 
> >>>
> >>
>
>


[jira] [Created] (CALCITE-3965) Excessive time waiting on DiffRepository lock

2020-04-30 Thread Laurent Goujon (Jira)
Laurent Goujon created CALCITE-3965:
---

 Summary: Excessive time waiting on DiffRepository lock
 Key: CALCITE-3965
 URL: https://issues.apache.org/jira/browse/CALCITE-3965
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Laurent Goujon
Assignee: Laurent Goujon


When running the whole test suite from the command line, tests are parallelized
and Gradle/JUnit tries to use as many cores as possible (16 on my machine). But
the tests take a very long time, approximately 90 minutes on my machine, and
several of them failed because they took too long to complete.

Using jstack to look at the thread states while tests are running shows that
most of them are waiting on {{DiffRepository}} methods
({{DiffRepository#expand}} in most cases) while one of the threads holds the
lock (and is usually flushing data to disk).
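
The contention pattern described above can be reproduced in miniature. The
sketch below is not Calcite's DiffRepository: CoarseLockRepo, expand and
runWorkers are hypothetical names, and the 5 ms sleep merely stands in for the
disk flush. It only illustrates that when slow work runs inside a single shared
monitor, at most one worker makes progress at a time, however many cores the
test runner uses.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a repository guarded by one lock; not Calcite code. */
final class CoarseLockRepo {
  private final AtomicInteger inside = new AtomicInteger();
  private final AtomicInteger maxInside = new AtomicInteger();
  private final AtomicInteger calls = new AtomicInteger();

  /** Every caller contends on the same monitor, and the slow work happens while it is held. */
  synchronized String expand(String tag, String text) {
    int now = inside.incrementAndGet();
    maxInside.accumulateAndGet(now, Math::max);
    try {
      Thread.sleep(5); // simulates flushing data to disk under the lock
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      inside.decrementAndGet();
    }
    calls.incrementAndGet();
    return tag + ":" + text;
  }

  int maxConcurrentCallers() {
    return maxInside.get();
  }

  int completedCalls() {
    return calls.get();
  }

  /** Starts the given number of workers, each calling expand once, and waits for them. */
  static CoarseLockRepo runWorkers(int threads) {
    CoarseLockRepo repo = new CoarseLockRepo();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      final int id = i;
      pool.execute(() -> {
        repo.expand("plan" + id, "text");
      });
    }
    pool.shutdown();
    try {
      pool.awaitTermination(30, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return repo;
  }

  public static void main(String[] args) {
    CoarseLockRepo repo = runWorkers(16);
    System.out.println("completed=" + repo.completedCalls()
        + " maxConcurrent=" + repo.maxConcurrentCallers());
  }
}
```

In this toy setup the usual remedies are to move the slow I/O outside the
synchronized section or to lock per resource rather than globally; whether
either applies to DiffRepository is not something this sketch can establish.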



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Haisheng Yuan
Hi all,

As planned in my proposal, I opened the pull request [1] for CALCITE-3896 to 
achieve:
1. Top-down trait request
2. Bottom-up trait derivation
3. Trait enforcement without AbstractConverter

The feature can be turned on or off by a flag, either through the property
config file or through a VolcanoPlanner setter method. Since Calcite did not
turn on AbstractConverter until the latest master, AbstractConverter is simply
disabled whenever top-down trait request is turned on; all tests now pass.

In our system, test results for the 99 TPC-DS queries show almost no plan diff,
but the number of RelNodes created during optimization is reduced by 10~15% on
average (even without space pruning). I believe that other systems using
VolcanoPlanner can expect a reduction of more than 20%.

The design also has top-down rule application in mind, and can later evolve to
support top-down rule application and space pruning, e.g. by integrating code
from Jinpeng and Roman. But the interface exposed to the user, as described in
the proposal, can remain the same.

Haisheng

[1] https://github.com/apache/calcite/pull/1953


On 2020/04/30 18:01:26, Julian Hyde  wrote: 
> If your test cases are SQL scripts, it might be fairly straightforward to 
> port them to Quidem (.iq) files. Plenty of examples in 
> https://github.com/apache/calcite/tree/master/core/src/test/resources/sql 
> .
> 
> Quidem files are basically SQL scripts. Expected output is embedded in the 
> script. You can run the script once, and if the output looks right, overwrite 
> the input file with the output.
> 
> Julian
> 
> 
> > On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:
> > 
> > Sure. I will add more cases to my PR.
> > 
> > I did not design more cases because our own product has a test frameworks,
> > which contains thousands of actual user queries.
> > Calcite's code base is quite different. I cannot just migrate cases to
> > calcite.  So it may take some time.
> > 
> > On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
> > wrote:
> > 
> >> Hi Jinpeng,
> >> 
> >> I went through your PR and it seemed very impressive to me. It is very
> >> similar to what I did, but you've reused many existing logic from the
> >> Volcano planner. We should definitely stay in sync in our experiments. I
> >> believe the future Cascades planner will be the kind combination of our
> >> works.
> >> 
> >> Is there any way to run tests that are close to the real system query
> >> execution? May be with Enumerable convention, or, better, with
> >> convention that supports distribution trait? I just want to look through
> >> your planner's optimization steps more thoroughly. I've found some tests
> >> in org.apache.calcite.plan.volcano package, but they use synthetic
> >> conventions and nodes. May be I missed something.
> >> 
> >> Thank you for sharing your work!
> >> 
> >> --
> >> Kind Regards
> >> Roman Kondakov
> >> 
> >> 
> >> On 28.04.2020 15:19, Jinpeng Wu wrote:
> >>> Hi, Roman. It's great to see your proposal. Actually my team has also
> >> been
> >>> working on a cascade planner based on calcite.  And we already have some
> >>> outcome as well.  Maybe we can combine our works.
> >>> 
> >>> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> >>> 
> >>> Our works have many places in common. We both developed a new
> >>> CascadePlanner and avoid modifying the old VolcanoPlanner directly. We
> >>> both implemented the top-down search strategy according to the
> >>> Columnbia optimizer
> >>> generator
> >>> <
> >> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> >>> 。But
> >>> we also have some differences.
> >>> 
> >>> The first difference is that I try to reuse the existing VolcanoPlanner
> >> as
> >>> much as possible. My CascadePlanner inherits from the existing
> >>> VolcanoPlanner. Except that it overwrites ruleQueue and findBestPlan
> >> method
> >>> to rearrange rule applies, most logic generally inherit from
> >>> VolcanoPlanner. For example,
> >>>  - It reuses the RelSet and RelSubset class and the register method
> >>>  - Rules are fired as soon as a RelNode is registered (In the
> >>> Columnbia optimizer generator, rules are not fired until exploring). The
> >>> ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
> >>> This design have a benefit that we do not need to worry about missing a
> >>> rule or firing a rule multiple times.
> >>>  - It leverages AbstractConverter to pass traits requirements down.  So
> >>> currently AC is still essential in my code.
> >>> This makes the new planner highly compatible with the old VolcanoPlanner.
> >>> Features like MV and Hints can apply to it directly.  And I tried to
> >> change
> >>> VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
> >>> Several cases did fail. I know the reason and how to fix them. But I am
> >>> still thinking about making them as "won't fix" as the ruleset violates
> >>> 

Re: "calcite" and "calcite-examples" missing from release 1.22?

2020-04-30 Thread Julian Hyde
Rather than opening up a new debate, can we just carry on doing what we always 
did?


> On Apr 28, 2020, at 12:03 PM, Vladimir Sitnikov  
> wrote:
> 
>> If you search for Calcite on Maven Central [1] you will see that the
> latest version of the “calcite” and “calcite-examples” artifacts is 1.21,
> whereas the latest version of everything else is 1.22.
> 
> What is the purpose of "calcite" artifact?
> Was
> org.apache.calcite:calcite
> useful?
> 
> A missing bit is the publishing of calcite-bom (bill-of-material).
> It would be helpful for the consumers so they get all the versions of
> -linq4j, -core, etc, by adding a single dependency on the bom artifact.
> Here's the bom for JUnit5:
> https://repo1.maven.org/maven2/org/junit/junit-bom/5.7.0-M1/junit-bom-5.7.0-M1.pom
> 
> Vladimir



Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Julian Hyde
If your test cases are SQL scripts, it might be fairly straightforward to port 
them to Quidem (.iq) files. Plenty of examples in 
https://github.com/apache/calcite/tree/master/core/src/test/resources/sql 
.

Quidem files are basically SQL scripts. Expected output is embedded in the 
script. You can run the script once, and if the output looks right, overwrite 
the input file with the output.
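
The "run once, then overwrite the input with the output" workflow can be
sketched in a few lines. This is a toy under stated assumptions, not Quidem's
syntax or API: GoldenScript, the "> " command prefix and the executor function
are all hypothetical. A script passes when regenerating it from its own
commands reproduces it exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Toy golden-file runner; the "> " command prefix is an invented format, not Quidem's. */
final class GoldenScript {
  /** Re-runs every command line and rebuilds the script with freshly produced output. */
  static String regenerate(String script, Function<String, String> run) {
    List<String> out = new ArrayList<>();
    for (String line : script.split("\n")) {
      if (line.startsWith("> ")) {
        out.add(line);
        out.add(run.apply(line.substring(2)));
      }
      // Any other line is previously recorded output and is discarded.
    }
    return String.join("\n", out) + "\n";
  }

  /** The script passes when the regenerated text equals the stored text. */
  static boolean passes(String script, Function<String, String> run) {
    return regenerate(script, run).equals(script);
  }

  public static void main(String[] args) {
    Function<String, String> toyEngine = String::toUpperCase; // stands in for a SQL engine
    String firstDraft = "> select 1\n";                       // no expected output yet
    String recorded = regenerate(firstDraft, toyEngine);      // review, then save over the input
    System.out.print(recorded);
    System.out.println(passes(recorded, toyEngine));
  }
}
```

The design choice worth noting is that the expected output lives next to the
command that produced it, so a reviewer can judge a plan or result change from
the diff of a single file.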

Julian


> On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:
> 
> Sure. I will add more cases to my PR.
> 
> I did not design more cases because our own product has a test frameworks,
> which contains thousands of actual user queries.
> Calcite's code base is quite different. I cannot just migrate cases to
> calcite.  So it may take some time.
> 
> On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
> wrote:
> 
>> Hi Jinpeng,
>> 
>> I went through your PR and it seemed very impressive to me. It is very
>> similar to what I did, but you've reused many existing logic from the
>> Volcano planner. We should definitely stay in sync in our experiments. I
>> believe the future Cascades planner will be the kind combination of our
>> works.
>> 
>> Is there any way to run tests that are close to the real system query
>> execution? May be with Enumerable convention, or, better, with
>> convention that supports distribution trait? I just want to look through
>> your planner's optimization steps more thoroughly. I've found some tests
>> in org.apache.calcite.plan.volcano package, but they use synthetic
>> conventions and nodes. May be I missed something.
>> 
>> Thank you for sharing your work!
>> 
>> --
>> Kind Regards
>> Roman Kondakov
>> 
>> 
>> On 28.04.2020 15:19, Jinpeng Wu wrote:
>>> Hi, Roman. It's great to see your proposal. Actually my team has also
>> been
>>> working on a cascade planner based on calcite.  And we already have some
>>> outcome as well.  Maybe we can combine our works.
>>> 
>>> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
>>> 
>>> Our works have many places in common. We both developed a new
>>> CascadePlanner and avoid modifying the old VolcanoPlanner directly. We
>>> both implemented the top-down search strategy according to the
>>> Columnbia optimizer
>>> generator
>>> <
>> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
>>> 。But
>>> we also have some differences.
>>> 
>>> The first difference is that I try to reuse the existing VolcanoPlanner
>> as
>>> much as possible. My CascadePlanner inherits from the existing
>>> VolcanoPlanner. Except that it overwrites ruleQueue and findBestPlan
>> method
>>> to rearrange rule applies, most logic generally inherit from
>>> VolcanoPlanner. For example,
>>>  - It reuses the RelSet and RelSubset class and the register method
>>>  - Rules are fired as soon as a RelNode is registered (In the
>>> Columnbia optimizer generator, rules are not fired until exploring). The
>>> ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
>>> This design have a benefit that we do not need to worry about missing a
>>> rule or firing a rule multiple times.
>>>  - It leverages AbstractConverter to pass traits requirements down.  So
>>> currently AC is still essential in my code.
>>> This makes the new planner highly compatible with the old VolcanoPlanner.
>>> Features like MV and Hints can apply to it directly.  And I tried to
>> change
>>> VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
>>> Several cases did fail. I know the reason and how to fix them. But I am
>>> still thinking about making them as "won't fix" as the ruleset violates
>>> some basic principles of top-down trait requests.
>>> 
>>> The second difference is that our design have the ability for space
>>> pruning. Currently it contains a simply LowerBoundCost metadata to
>> compute
>>> the lower bound of a RelNdoe. Because logical properties like cardinality
>>> of a RelSet is not stable across exploring, it is required that a group
>> to
>>> be fully explored (implementation rules and enforcement rules should
>> never
>>> modify the logical properties) before it can provide a valid lower bound
>>> cost. Because of that, logical search space pruning is not supported now.
>>> It can only pruned out implementation rules and enforcement rules.
>> Testing
>>> with cases in our own product, the new planner saves about 10% rule
>>> applies. I am still considering how to support logical space pruning,
>>> looking forwards to have more improvements.
>>> 
>>> Hope my code will help.
>>> 
>>> Thanks,
>>> Jinpeng
>>> 
>>> 
>>> On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai 
>> wrote:
>>> 
 For #1, aside from that we need to be able to build physical nodes based
 on a convention. For example, if we merge two EnumerableProject, we
>> would
 want to create an EnumerableProject as a result, instead of
>> LogicalProject.
 The RelBuilder change 

Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Rui Wang
The polymorphic table function (PTF) work is logged at [1]. I assigned it to
myself because I started to implement DESCRIPTOR (a part of PTF in the SQL
standard). Anyone who wants to help accelerate the implementation is very
welcome. (But please use another thread to discuss it.)


Back to this thread's topic.

Timo:
>Once Calcite supports polymorphic table functions
I am guessing you were actually talking about the TUMBLE/HOP/SESSION work as PTF
in Calcite? PTF itself has many more features beyond what is needed for
table function windowing.
Regarding TUMBLE/HOP/SESSION as PTF in Calcite, I believe the basic but most
important functionality will very likely be in the 1.23.0 release. There
will be a few small patches added after that (in 1.24.0).
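
For readers less familiar with tumbling windows, here is a minimal sketch of
what assigning a row to a fixed-size window and then grouping by
(product_id, window_start) amounts to. It is plain Java, not Calcite code;
TumbleSketch, windowStart and countPerWindow are hypothetical names, and
timestamps are assumed to be epoch milliseconds.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative tumbling-window arithmetic; names are invented for this sketch. */
final class TumbleSketch {
  /** Inclusive start of the fixed-size window that contains the given row time. */
  static long windowStart(long rowtimeMillis, long sizeMillis) {
    // floorDiv keeps windows aligned for negative timestamps as well.
    return Math.floorDiv(rowtimeMillis, sizeMillis) * sizeMillis;
  }

  /** Counts rows per (productId, windowStart); each row is {productId, rowtimeMillis}. */
  static Map<String, Integer> countPerWindow(long[][] orders, long sizeMillis) {
    Map<String, Integer> counts = new TreeMap<>();
    for (long[] order : orders) {
      String key = order[0] + "@" + windowStart(order[1], sizeMillis);
      counts.merge(key, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    long hour = 3_600_000L;
    long[][] orders = {{1, 0}, {1, 1_000}, {1, hour}, {2, hour + 5}};
    System.out.println(countPerWindow(orders, hour));
  }
}
```

Because the window start is an ordinary derived column here, the grouping step
is a plain GROUP BY over two keys, which is the property the table function
syntax relies on.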

Julian:
>I suggest that Calcite continues to support the SQL syntax in the parser
(and the SqlNode AST) but deprecates and removes the support in the algebra
(RelNode) within one or two releases (3 - 6 months).

+1 on this compromise. We could leave the syntax support in place for a longer
time to make sure user queries keep running (leaving downstream projects the
possibility of doing a translation if they cannot migrate off the syntax in the
near or mid term). Deprecation starting from the algebra can happen faster (in
1.25.0 or 1.26.0).


[1]: https://jira.apache.org/jira/browse/CALCITE-2270


-Rui

On Thu, Apr 30, 2020 at 10:38 AM Julian Hyde  wrote:

> I understand that you need to continue to support the SQL syntax while
> your users still want it. I suggest that Calcite continues to support the
> SQL syntax in the parser (and the SqlNode AST) but deprecates and removes
> the support in the algebra (RelNode) within one or two releases (3 - 6
> months).
>
> I’m not familiar with a requirement for polymorphic table functions. Is
> there a JIRA case logged? Is it possible to do this feature without them?
>
> Julian
>
>
> > On Apr 30, 2020, at 7:16 AM, Timo Walther  wrote:
> >
> > Thanks for considering our needs.
> >
> > I'm pretty sure that windows are in almost every streaming pipeline with
> aggregations. Unlike regular Java API, SQL syntax is very difficult to
> deprecate.
> >
> > We usually give Flink user 1-2 releases time to update their code. Once
> Calcite supports polymorphic table functions, I think 6 months would be
> helpful otherwise we need to maintain our own fork which we could mostly
> prevent so far.
> >
> > Regards,
> > Timo
> >
> > On 29.04.20 00:49, Rui Wang wrote:
> >> Agreed. I would like to get more feedback to have a
> >> reasonable accommodation for users.
> >> -Rui
> >> On Mon, Apr 27, 2020 at 11:50 AM Julian Hyde  wrote:
> >>> Changing my +1 to +0. We have to make reasonable accommodations for our
> >>> users. Glad we had this discussion.
> >>>
>  On Apr 24, 2020, at 11:10 AM, Rui Wang  wrote:
> 
>  Hi Timo,
> 
>  My intention is to fully drop concepts such as
> SqlGroupedWindowFunction
> >>> and
>  auxiliary group functions, which include relevant code in
> parser/syntax,
>  operator, planner, etc.
> 
>  Since you mentioned the need for more time to migrate. How many
> Calcite
>  releases that you think can probably leave enough buffer time?
> (Calcite
>  schedules 4 releases a year. So say 2 releases will give 6 months)
> 
> 
>  -Rui
> 
>  On Fri, Apr 24, 2020 at 1:50 AM Timo Walther 
> wrote:
> 
> > Hi everyone,
> >
> > so far Apache Flink depends on this feature. We are fine with
> improving
> > the SQL compliance and eventually dropping GROUP BY
> TUMBLE/HOP/SESSION
> > in the future. However, we would like to give our users some time to
> > migrate their existing pipelines.
> >
> > What does dropping mean for Calcite? Will users of Calcite be able to
> > still support this syntax? In particular, are you intending to also
> drop
> > concepts such as SqlGroupedWindowFunction and auxiliary group
> functions?
> > Or are you intending to just remove entries from Calcite's default
> > operator table?
> >
> > Regards,
> > Timo
> >
> >
> > On 24.04.20 10:30, Julian Hyde wrote:
> >> +1
> >>
> >> Let’s remove TUMBLE etc from the GROUP BY clause. Since this is a
> SQL
> > change, not an API change, I don’t think we need to give notice. Let’s just
> >>> do it.
> >>
> >> Julian
> >>
> >>> On Apr 22, 2020, at 4:05 PM, Rui Wang 
> wrote:
> >>>
> >>> Made a mistake on the example above, and update it as follows:
> >>>
> >>> // Table function windowing syntax.
> >>> SELECT
> >>>product_id, count(*), window_start
> >>> FROM TABLE(TUMBLE(order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
> >>> GROUP BY product_id, window_start
> >>>
>  On Wed, Apr 22, 2020 at 2:31 PM Rui Wang 
> >>> wrote:
> 
>  Hi community,
> 
>  I want to kick off a discussion about deprecating grouped window
> > functions
>  (GROUP BY TUMBLE/HOP/SESSION) as the 

Re: [ANNOUNCE] New committer: Forward Xu

2020-04-30 Thread Julian Hyde
Congratulations and welcome, Forward! Thank you for your contributions.

It would be great to add TBDS (and its logo) to the “powered by” page[1]. What 
do you think?

Julian

[1] https://calcite.apache.org/docs/powered_by.html 


> On Apr 29, 2020, at 5:34 AM, Forward Xu  wrote:
> 
> Thank you everyone for your warm welcome!
> I'm working in the TBDS team of Tencent in Shenzhen. TBDS (Tencent Big Data
> Suite) is similar to Alibaba's EMR, TBDS is a big data ecosystem. I am
> responsible for Oceanus(flink streaming jobs) and Tdbank (Tencent real-time
> data collection system). I‘m very happy to become calcite committer and
> looking forward to make more contributions.
> 
> Best,
> Forward
> 
> On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich wrote:
> 
>> Congratulations!
>> 
>> On 4/29/20 7:31 AM, Enrico Olivelli wrote:
>>> Congrats!
>>> 
>>> Enrico
>>> 
>>> On Wed, Apr 29, 2020, 04:52 Feng Zhu wrote:
>>> 
  Congratulations! Forward!
 
 best,
 Feng
 
 On Wed, Apr 29, 2020 at 10:17 AM, Chunwei Lei wrote:
 
> Congrats, Forward!
> 
> 
> 
> Best,
> Chunwei
> 
> 
> On Wed, Apr 29, 2020 at 6:46 AM Rui Wang  wrote:
> 
>> Congrats!
>> 
>> 
>> -Rui
>> 
>> On Tue, Apr 28, 2020 at 3:04 PM Francis Chuang <
 francischu...@apache.org
>> 
>> wrote:
>> 
>>> Congrats, Forward!
>>> 
>>> Francis
>>> 
>>> On 29/04/2020 7:53 am, Stamatis Zampetakis wrote:
 Apache Calcite's Project Management Committee (PMC) has invited Forward Xu
 to become a committer, and we are pleased to announce that he has
 accepted.
 
 Forward has been helping the project for some time now. He added many new
 SQL functions to the project and is one of our JSON experts. On top of
 that, and other fixes, he is the one who added the Redis adapter to the
 project.
 
 Forward, welcome, thank you for your contributions, and we look forward to
 your further interactions with the community! If you wish, please feel
 free to tell us more about yourself and what you are working on.
 
 Stamatis (on behalf of the Apache Calcite PMC)
 
 
>>> 
>> 
> 
 
>>> 
>> 



Re: [ANNOUNCE] New committer: Wang Yanlin

2020-04-30 Thread Julian Hyde
Welcome, Yanlin!

Thanks for your contributions so far, and thanks for introducing yourself. I 
often learn so much from committers’ self-introductions about how Calcite is 
being used.

I know we have other Alibaba-related projects on the “powered by” page [1] 
(Flink/Ververica, MaxCompute) but it seems that Ant Financial is a distinct 
business, so deserves its own entry on the page, and logo. What do you think?

Julian

[1] https://calcite.apache.org/docs/powered_by.html#alibaba-maxcompute 


> On Apr 29, 2020, at 5:50 AM, Wang Yanlin <1989yanlinw...@163.com> wrote:
> 
> Hi, guys, thanks for your warm welcome.
> 
> 
> 
> I'm working at Ant Financial, Alibaba Group. Currently my team is working on
> building a system to process big data in the form of SQL.
> We use Calcite to parse SQL, optimize RelNode and rewrite SqlNode to execute
> on different engines, like Spark, MaxCompute, HBase and so on.
> Calcite is really a great community, and it's really an honor for me to
> become a Calcite committer; I hope to make more contributions to Calcite.
> 
> 
> Thanks again.
> 
> --
> 
> Best,
> Wang Yanlin
> 
> 
> 
> 
> 
> At 2020-04-29 13:58:35, "Zoltan Haindrich" wrote:
>> Congratulations!
>> 
>> On 4/29/20 7:32 AM, Enrico Olivelli wrote:
>>> Congrats!
>>> 
>>> Enrico
>>> 
>>> On Wed, Apr 29, 2020, 04:51 Feng Zhu wrote:
>>> 
  Congratulations! Yanlin!
 
 best,
 Feng
 
 On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei wrote:
 
> Congrats, Yanlin!
> 
> 
> Best,
> Chunwei
> 
> 
> On Wed, Apr 29, 2020 at 10:07 AM Forward Xu 
> wrote:
> 
>> Congrats
>> 
>> 
>> Best,
>> 
>> Forward
>> 
>> On Wed, Apr 29, 2020 at 8:26 AM, 953396112 <953396...@qq.com> wrote:
>> 
>>> Congrats, Wang Yanlin!
>>> 
>>> 
>>> 
>>> 
>>> ---Original---
>>> From: "Stamatis Zampetakis"
>>> Date: Wed, Apr 29, 2020 05:51 AM
>>> To: "dev"
>>> Subject: [ANNOUNCE] New committer: Wang Yanlin
>>> 
>>> 
>>> Apache Calcite's Project Management Committee (PMC) has invited Wang Yanlin
>>> to become a committer, and we are pleased to announce that he has
>>> accepted.
>>>
>>> Wang has pushed numerous fixes and improvements to the project, landing in
>>> total the impressive number of 30 commits to the master. Among other
>>> things, he contributed some important features in the Interpreter.
>>>
>>> Wang, welcome, thank you for your contributions, and we look forward to
>>> your further interactions with the community! If you wish, please feel
>>> free to tell us more about yourself and what you are working on.
>>>
>>> Stamatis (on behalf of the Apache Calcite PMC)
>> 
> 
 
>>> 



Re: [ANNOUNCE] New committer: Jin Xing

2020-04-30 Thread Julian Hyde
Welcome Jin! Thanks for your contributions so far, looking forward to more!

Are you on the MaxCompute project? It’s already on our “powered by” page[1], so 
I think people are familiar with it.

Julian

[1] https://calcite.apache.org/docs/powered_by.html#alibaba-maxcompute 



> On Apr 29, 2020, at 5:06 AM, XING JIN  wrote:
> 
> Thanks a lot ~
> Calcite is a great project and it's great honor for me to work with you
> guys. I really appreciate the help from community.
> I'm working in Alibaba. My team builds big data system to optimize batch
> and streaming jobs. We use Calcite to process Sql queries and accommodate
> to different physical engines.
> I'm very excited to become Calcite committer and looking forward to make
> more contributions.
> 
> Best regards,
> Jin
> 
> 
> On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich wrote:
> 
>> Congrats!
>> 
>> On 4/29/20 7:32 AM, Enrico Olivelli wrote:
>>> Congratulations!
>>> 
>>> Enrico
>>> 
>>> On Wed, Apr 29, 2020, 04:51 Feng Zhu wrote:
>>> 
  Congratulations!
 
 best,
 Feng
 
 On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei wrote:
 
> Congrats, Jin!
> 
> 
> Best,
> Chunwei
> 
> 
> On Wed, Apr 29, 2020 at 10:07 AM Forward Xu 
> wrote:
> 
>> Congrats
>> 
>> 
>> best,
>> 
>> Forward
>> 
>> On Wed, Apr 29, 2020 at 8:21 AM, 953396112 <953396...@qq.com> wrote:
>> 
>>> Congrats, Jin Xing!
>>> 
>>> 
>>> ---Original---
>>> From: "Stamatis Zampetakis"
>>> Date: Wed, Apr 29, 2020 05:47 AM
>>> To: "dev"
>>> Subject: [ANNOUNCE] New committer: Jin Xing
>>> 
>>> 
>>> Apache Calcite's Project Management Committee (PMC) has invited Jin Xing
>>> to become a committer, and we are pleased to announce that he has
>>> accepted.
>>>
>>> Jin has contributed a lot of code in the project and many recent
>>> improvements in materialized view matching have his signature on them.
>>> Apart from code contributions, Jin provides valuable help to the
>>> community by doing reviews and answering questions in the devlist.
>>>
>>> Jin, welcome, thank you for your contributions, and we look forward to
>>> your further interactions with the community! If you wish, please feel
>>> free to tell us more about yourself and what you are working on.
>>>
>>> Stamatis (on behalf of the Apache Calcite PMC)
>> 
> 
 
>>> 
>> 



Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Julian Hyde
I understand that you need to continue to support the SQL syntax while your 
users still want it. I suggest that Calcite continues to support the SQL syntax 
in the parser (and the SqlNode AST) but deprecates and removes the support in 
the algebra (RelNode) within one or two releases (3 - 6 months).

I’m not familiar with a requirement for polymorphic table functions. Is there a 
JIRA case logged? Is it possible to do this feature without them?

Julian


> On Apr 30, 2020, at 7:16 AM, Timo Walther  wrote:
> 
> Thanks for considering our needs.
> 
> I'm pretty sure that windows are in almost every streaming pipeline with 
> aggregations. Unlike regular Java API, SQL syntax is very difficult to 
> deprecate.
> 
> We usually give Flink user 1-2 releases time to update their code. Once 
> Calcite supports polymorphic table functions, I think 6 months would be 
> helpful otherwise we need to maintain our own fork which we could mostly 
> prevent so far.
> 
> Regards,
> Timo
> 

Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Julian Hyde



> On Apr 30, 2020, at 8:16 AM, Viliam Durina  wrote:
> 
> What is the status of polymorphic table functions? We'd like to use them.

Off topic. Can you start this discussion in a new thread?

Julian



Re: Building a Calcite Adapter

2020-04-30 Thread Jon Pither
We went down the route of wrapping Calcite with our own JDBC driver that
strips the `VALIDTIME AS OF (...)` prefix from a statement such as
`VALIDTIME AS OF (...) SELECT * FROM FOO`. We do this by overriding
CalcitePrepareImpl and adding internalParameters to the CalciteSignature,
which our enumerator then uses when executing the actual query. Any feedback
on this approach is welcome :-)
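
For readers following along, here is a minimal sketch of the prefix-stripping
step only. It is not the Crux driver code: the class name ValidTimePrefix and
its fields are hypothetical, and it assumes the expression inside the
parentheses contains no nested parentheses.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Calcite or Crux. */
final class ValidTimePrefix {
  // Leading "VALIDTIME AS OF (<expr>)", where <expr> has no nested parentheses.
  private static final Pattern PREFIX = Pattern.compile(
      "^\\s*VALIDTIME\\s+AS\\s+OF\\s*\\(([^()]*)\\)\\s*",
      Pattern.CASE_INSENSITIVE);

  /** Raw text found between the parentheses, if the prefix was present. */
  final Optional<String> validTime;
  /** Remaining statement, which would be handed on to Calcite. */
  final String sql;

  private ValidTimePrefix(Optional<String> validTime, String sql) {
    this.validTime = validTime;
    this.sql = sql;
  }

  /** Splits a statement into the optional valid-time expression and plain SQL. */
  static ValidTimePrefix parse(String statement) {
    Matcher m = PREFIX.matcher(statement);
    if (m.find()) {
      return new ValidTimePrefix(
          Optional.of(m.group(1).trim()), statement.substring(m.end()));
    }
    // No prefix: pass the statement through unchanged.
    return new ValidTimePrefix(Optional.empty(), statement);
  }
}
```

The extracted expression would then have to be carried to the enumerator by
some other channel, which is what the internalParameters mentioned above are
used for in our case.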

Another question. Our underlying DB supports datetime fields and returns
java.util.Date values from queries on datetime columns. I think we ought to
be able to map these Dates through to a column we define using
SqlTypeName/TIMESTAMP. To get this to work, though, our enumerator has to
convert our Dates into millis, only for Calcite to convert them back into
java.util.Dates. I feel like I am missing something obvious that would let us
skip this conversion?
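
To make the question concrete, this is roughly the conversion the enumerator
does today, written as a small sketch. The class and method names are
hypothetical, and it assumes a row is a plain Object[] in which TIMESTAMP
columns are expected as epoch milliseconds.

```java
import java.util.Date;

/** Hypothetical row converter for illustration; not Calcite or Crux code. */
final class TimestampRows {
  /** Converts a java.util.Date to epoch milliseconds, preserving null. */
  static Long toEpochMillis(Date value) {
    return value == null ? null : value.getTime();
  }

  /** Returns a copy of the row with every Date replaced by epoch millis. */
  static Object[] convertRow(Object[] row) {
    Object[] out = new Object[row.length];
    for (int i = 0; i < row.length; i++) {
      Object v = row[i];
      out[i] = v instanceof Date ? toEpochMillis((Date) v) : v;
    }
    return out;
  }
}
```

It works, but it is this per-row pass over the data that we would like to
avoid if there is a supported way to hand Dates over directly.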

On Mon, 27 Apr 2020 at 16:53, Jon Pither  wrote:

> Hi,
>
> Another route we're looking at is to use `ALTER SESSION SET VALID_TIME =
> date('2010')`. When we experiment with this, hoping to trigger
> `SqlSetOption`, we get a java.lang.UnsupportedOperationException:
>
>CalcitePrepareImpl.java:  369
>  org.apache.calcite.prepare.CalcitePrepareImpl/executeDdl
>
> How could we make use of SqlSetOption? Do we need to extend the parser or
> is there a simpler way?
>
> Regards,
>
> Jon.
>
>
> On Mon, 27 Apr 2020 at 13:30, Jon Pither  wrote:
>
>> Hi Stamatis & Calcite team,
>>
>> Thanks for your response. We've made some good progress since - following
>> JdbcConvention as you suggest - and now we've got the Crux adapter handling
>> joins, sorts and more. We're in a good place I feel, and it's exciting to
>> see Calcite providing a SQL layer on top of our Datalog. Thanks again :-)
>>
>> One Q: is it possible to extend the Calcite parser to do the following:
>> `VALIDTIME AS OF date('2010...') SELECT * FROM FOO`. So far I've played
>> with extending the parser using fmpp & javacc and it certainly feels
>> doable, but I can't quite grok what the extension point would be in Calcite
>> to add this - for example you can hang off arbitrary extensions from
>> subtrees such as CREATE and DROP (by extending SqlCreate and SqlDrop
>> respectively)... where might an arbitrary precursor command such as
>> `VALIDTIME AS OF date()` fit in?
>>
>> Regards,
>>
>> Jon.
>>
>>
>> On Tue, 21 Apr 2020 at 22:43, Stamatis Zampetakis 
>> wrote:
>>
>>> Hi Jon,
>>>
>>> Thanks for your kind words. I'm sure people working on the project are very
>>> happy to receive some positive feedback for their work from time to time :)
>>>
>>> I had a quick look at your project and it definitely looks interesting.
>>>
>>> If your engine (Crux) uses better join algorithms than the ones provided by
>>> Calcite, and if you have an optimizer that can apply join re-ordering and
>>> other optimization techniques efficiently, then I guess going further and
>>> pushing joins and other things to Crux is a good idea.
>>>
>>> Having said that, I am not sure the TranslatableTable approach will get
>>> you much further in this direction.
>>> I would suggest having a look at JdbcConvention [1] to see how the notion
>>> of Convention, along with the respective rules and relational expressions,
>>> helps to push operations into traditional RDBMSs. The Cassandra, Mongo, and
>>> Elastic adapters are not very good examples since the underlying engines
>>> do not support joins.
>>>
>>> I am not aware of people offering consulting services for Calcite,
>>> but I guess if there are, you will know already.
>>> Apart from that, the project has many volunteers willing to help, so if you
>>> have more questions don't hesitate to send them to this list.
>>>
>>> Best,
>>> Stamatis
>>>
>>> [1]
>>>
>>> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcConvention.java
>>>
>>>
>>> On Tue, Apr 7, 2020, 12:22 PM Jon Pither  wrote:
>>>
>>> > Hi Calcite Devs,
>>> >
>>> > Firstly, thank you to all of you for building this fantastic tool.
>>> >
>>> > I'm currently experimenting with using Calcite on top of our document
>>> > database Crux (opencrux.com), which offers bitemporal features using a
>>> > Datalog query language. You can see our efforts here, written in Clojure!
>>> >
>>> > https://github.com/juxt/crux/blob/jp/calcite/crux-calcite/src/crux/calcite.clj
>>> >
>>> > https://github.com/juxt/crux/blob/jp/calcite/crux-test/test/crux/calcite_test.clj
>>> >
>>> > So far we've been impressed by the power Calcite gives, with so little
>>> > integration code needed.
>>> >
>>> > We now have an initial MVP working using the ProjectableFilterableTable
>>> > route. The adapter basically constructs a Datalog query that we then
>>> > execute against our DB.
>>> >
>>> > So far so good, and now I have some initial questions:
>>> >
>>> > Firstly, in this code we're making use of ProjectableFilterableTable to get
>>> > us up and running. I've looked at the Mongo and 

Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Viliam Durina
What is the status of polymorphic table functions? We'd like to use them.

Viliam


On Thu, 30 Apr 2020 at 16:16, Timo Walther  wrote:

> Thanks for considering our needs.
>
> I'm pretty sure that windows are in almost every streaming pipeline with
> aggregations. Unlike a regular Java API, SQL syntax is very difficult to
> deprecate.
>
> We usually give Flink users 1-2 releases to update their code. Once
> Calcite supports polymorphic table functions, I think 6 months would be
> helpful; otherwise we would need to maintain our own fork, which we have
> mostly managed to avoid so far.
>
> Regards,
> Timo
>

Re: [ANNOUNCE] New committer: Vineet Garg

2020-04-30 Thread Michael Mior
Congratulations Vineet!
--
Michael Mior
mm...@apache.org

On Sun, Apr 26, 2020 at 17:38, Vineet G  wrote:
>
> Thanks a lot guys!
>
> Just to briefly introduce myself: I work at Cloudera (Hortonworks before)
> on Hive, and I am a Hive PMC member. As Stamatis noted, I have been involved in
> Calcite since 2017. It is a great honor to be part of this community. I am very
> excited to become a committer and I look forward to contributing more.
>
> Regards,
> Vineet Garg
>
> > On Apr 26, 2020, at 2:26 PM, Jesus Camacho Rodriguez  
> > wrote:
> >
> > Congrats Vineet, well deserved!
> >
> > -Jesús
> >
> > On Sun, Apr 26, 2020 at 3:09 AM Leonard Xu  wrote:
> >
> >> Congratulations, Vineet!
> >>
> >> Best,
> >> Leonard Xu
> >>> On Apr 26, 2020, at 18:07, xu  wrote:
> >>>
> >>> Congrats, Vineet!
> >>>
> >>> On Sun, Apr 26, 2020 at 4:52 PM, Danny Chan  wrote:
> >>>
>  Congrats, Vineet!
> 
>  Best,
>  Danny Chan
>  On Apr 26, 2020 (+0800) at 1:55 PM, dev@calcite.apache.org wrote:
> >
> > Congrats, Vineet!
> 
> >>>
> >>>
> >>> --
> >>>
> >>> Best regards,
> >>>
> >>> Xu
> >>
> >>
>


Re: [ANNOUNCE] New committer: Jin Xing

2020-04-30 Thread Michael Mior
Congrats Jin!
--
Michael Mior
mm...@apache.org

On Wed, Apr 29, 2020 at 08:07, XING JIN  wrote:
>
> Thanks a lot ~
> Calcite is a great project and it's a great honor for me to work with you
> guys. I really appreciate the help from the community.
> I'm working at Alibaba. My team builds a big data system to optimize batch
> and streaming jobs. We use Calcite to process SQL queries and adapt them
> to different physical engines.
> I'm very excited to become a Calcite committer and am looking forward to
> making more contributions.
>
> Best regards,
> Jin
>
>
> On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich  wrote:
>
> > Congrats!
> >
> > On 4/29/20 7:32 AM, Enrico Olivelli wrote:
> > > Congratulations!
> > >
> > > Enrico
> > >
> > > On Wed, Apr 29, 2020 at 04:51, Feng Zhu  wrote:
> > >
> > >>   Congratulations!
> > >>
> > >> best,
> > >> Feng
> > >>
> > >> On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei  wrote:
> > >>
> > >>> Congrats, Jin!
> > >>>
> > >>>
> > >>> Best,
> > >>> Chunwei
> > >>>
> > >>>
> > >>> On Wed, Apr 29, 2020 at 10:07 AM Forward Xu 
> > >>> wrote:
> > >>>
> >  Congrats
> > 
> > 
> >  best,
> > 
> >  Forward
> > 
> >  On Wed, Apr 29, 2020 at 8:21 AM, 953396112 <953396...@qq.com> wrote:
> > 
> > > Congrats, Jin Xing!
> > >
> > >
> > > ---Original---
> > > From: "Stamatis Zampetakis"
> > > Date: Wed, Apr 29, 2020 05:47 AM
> > > To: "dev"
> > > Subject: [ANNOUNCE] New committer: Jin Xing
> > >
> > >
> > > Apache Calcite's Project Management Committee (PMC) has invited Jin Xing to
> > > become a committer, and we are pleased to announce that he has accepted.
> > >
> > > Jin has contributed a lot of code in the project and many
> > > recent improvements in materialized view matching have his signature on
> > > them. Apart from code contributions, Jin provides valuable help to the
> > > community by doing reviews and answering questions in the devlist.
> > >
> > > Jin, welcome, thank you for your contributions, and we look forward to your
> > > further interactions with the community! If you wish, please feel free to
> > > tell us more about yourself and what you are working on.
> > >
> > > Stamatis (on behalf of the Apache Calcite PMC)
> > 
> > >>>
> > >>
> > >
> >


Re: Re: [ANNOUNCE] New committer: Wang Yanlin

2020-04-30 Thread Michael Mior
Congratulations Wang!
--
Michael Mior
mm...@apache.org

On Wed, Apr 29, 2020 at 08:50, Wang Yanlin <1989yanlinw...@163.com> wrote:
>
> Hi, guys, thanks for your warm welcome.
>
> I'm working at Ant Financial, Alibaba Group. Currently my team is building
> a system to process big data using SQL.
> We use Calcite to parse SQL, optimize RelNode trees, and rewrite SqlNode
> trees to execute on different engines, such as Spark, MaxCompute, HBase and
> so on.
> Calcite is really a great community, and it's really an honor for me to
> become a Calcite committer. I hope to make more contributions to Calcite.
>
>
> Thanks again.
>
> --
>
> Best,
> Wang Yanlin
>
>
>
>
>
> On 2020-04-29 13:58:35, "Zoltan Haindrich"  wrote:
> >Congratulations!
> >
> >On 4/29/20 7:32 AM, Enrico Olivelli wrote:
> >> Congrats!
> >>
> >> Enrico
> >>
> >> On Wed, Apr 29, 2020 at 04:51, Feng Zhu  wrote:
> >>
> >>>   Congratulations! Yanlin!
> >>>
> >>> best,
> >>> Feng
> >>>
> >>> On Wed, Apr 29, 2020 at 10:16 AM, Chunwei Lei  wrote:
> >>>
>  Congrats, Yanlin!
> 
> 
>  Best,
>  Chunwei
> 
> 
>  On Wed, Apr 29, 2020 at 10:07 AM Forward Xu 
>  wrote:
> 
> > Congrats
> >
> >
> > Best,
> >
> > Forward
> >
> > On Wed, Apr 29, 2020 at 8:26 AM, 953396112 <953396...@qq.com> wrote:
> >
> >> Congrats, Wang Yanlin!
> >>
> >>
> >>
> >>
> >> ---Original---
> >> From: "Stamatis Zampetakis"
> >> Date: Wed, Apr 29, 2020 05:51 AM
> >> To: "dev"
> >> Subject: [ANNOUNCE] New committer: Wang Yanlin
> >>
> >>
> >> Apache Calcite's Project Management Committee (PMC) has invited Wang Yanlin
> >> to become a committer, and we are pleased to announce that he has accepted.
> >>
> >> Wang has pushed numerous fixes and improvements to the project, landing
> >> an impressive total of 30 commits on master. Among other things, he
> >> contributed some important features in the Interpreter.
> >>
> >> Wang, welcome, thank you for your contributions, and we look forward to your
> >> further interactions with the community! If you wish, please feel free to
> >> tell us more about yourself and what you are working on.
> >>
> >> Stamatis (on behalf of the Apache Calcite PMC)
> >
> 
> >>>
> >>


Re: [ANNOUNCE] New committer: Forward Xu

2020-04-30 Thread Michael Mior
Congratulations Forward!
--
Michael Mior
mm...@apache.org

On Wed, Apr 29, 2020 at 08:34, Forward Xu  wrote:
>
> Thank you everyone for your warm welcome!
> I'm working in the TBDS team at Tencent in Shenzhen. TBDS (Tencent Big Data
> Suite) is a big data ecosystem similar to Alibaba's EMR. I am
> responsible for Oceanus (Flink streaming jobs) and Tdbank (Tencent's real-time
> data collection system). I'm very happy to become a Calcite committer and
> am looking forward to making more contributions.
>
> Best,
> Forward
>
> On Wed, Apr 29, 2020 at 1:58 PM, Zoltan Haindrich  wrote:
>
> > Congratulations!
> >
> > On 4/29/20 7:31 AM, Enrico Olivelli wrote:
> > > Congrats!
> > >
> > > Enrico
> > >
> > > On Wed, Apr 29, 2020 at 04:52, Feng Zhu  wrote:
> > >
> > >>   Congratulations! Forward!
> > >>
> > >> best,
> > >> Feng
> > >>
> > >> On Wed, Apr 29, 2020 at 10:17 AM, Chunwei Lei  wrote:
> > >>
> > >>> Congrats, Forward!
> > >>>
> > >>>
> > >>>
> > >>> Best,
> > >>> Chunwei
> > >>>
> > >>>
> > >>> On Wed, Apr 29, 2020 at 6:46 AM Rui Wang  wrote:
> > >>>
> >  Congrats!
> > 
> > 
> >  -Rui
> > 
> >  On Tue, Apr 28, 2020 at 3:04 PM Francis Chuang <francischu...@apache.org> wrote:
> > 
> > > Congrats, Forward!
> > >
> > > Francis
> > >
> > > On 29/04/2020 7:53 am, Stamatis Zampetakis wrote:
> > > >> Apache Calcite's Project Management Committee (PMC) has invited Forward Xu
> > > >> to become a committer, and we are pleased to announce that he has accepted.
> > > >>
> > > >> Forward has been helping the project for some time now. He added many new
> > > >> SQL functions to the project and is one of our JSON experts. On top of that,
> > > >> and other fixes, he is the one who added the Redis adapter to the project.
> > > >>
> > > >> Forward, welcome, thank you for your contributions, and we look forward to
> > > >> your further interactions with the community! If you wish, please feel free
> > > >> to tell us more about yourself and what you are working on.
> > > >>
> > > >> Stamatis (on behalf of the Apache Calcite PMC)
> > >>
> > >
> > 
> > >>>
> > >>
> > >
> >


Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Timo Walther

Thanks for considering our needs.

I'm pretty sure that windows are in almost every streaming pipeline with
aggregations. Unlike a regular Java API, SQL syntax is very difficult to
deprecate.

We usually give Flink users 1-2 releases to update their code. Once
Calcite supports polymorphic table functions, I think 6 months would be
helpful; otherwise we would need to maintain our own fork, which we have
mostly managed to avoid so far.


Regards,
Timo

On 29.04.20 00:49, Rui Wang wrote:

Agreed. I would like to get more feedback to have a
reasonable accommodation for users.


-Rui

On Mon, Apr 27, 2020 at 11:50 AM Julian Hyde  wrote:


Changing my +1 to +0. We have to make reasonable accommodations for our
users. Glad we had this discussion.


On Apr 24, 2020, at 11:10 AM, Rui Wang  wrote:

Hi Timo,

My intention is to fully drop concepts such as SqlGroupedWindowFunction and
auxiliary group functions, including the relevant code in the parser/syntax,
operators, planner, etc.

Since you mentioned the need for more time to migrate: how many Calcite
releases do you think would leave enough buffer time? (Calcite schedules
4 releases a year, so 2 releases would give 6 months.)


-Rui

On Fri, Apr 24, 2020 at 1:50 AM Timo Walther  wrote:


Hi everyone,

so far Apache Flink depends on this feature. We are fine with improving
the SQL compliance and eventually dropping GROUP BY TUMBLE/HOP/SESSION
in the future. However, we would like to give our users some time to
migrate their existing pipelines.

What does dropping mean for Calcite? Will users of Calcite be able to
still support this syntax? In particular, are you intending to also drop
concepts such as SqlGroupedWindowFunction and auxiliary group functions?
Or are you intending to just remove entries from Calcite's default
operator table?

Regards,
Timo


On 24.04.20 10:30, Julian Hyde wrote:

+1

Let’s remove TUMBLE etc. from the GROUP BY clause. Since this is a SQL
change, not an API change, I don’t think we need to give notice. Let’s just
do it.


Julian


On Apr 22, 2020, at 4:05 PM, Rui Wang  wrote:

I made a mistake in the example above; the corrected version is as follows:

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
GROUP BY product_id, window_start
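
As an illustration of what the corrected query groups by, here is a minimal
sketch of how a tumbling window is usually assigned to a row. It is not
Calcite's implementation: the class TumbleSketch is hypothetical, and it
assumes timestamps are epoch milliseconds and windows are aligned to the
epoch.

```java
/** Hypothetical illustration of tumbling-window assignment; not Calcite code. */
final class TumbleSketch {
  /** Start of the fixed-size window that contains the given row time. */
  static long windowStart(long rowtimeMillis, long sizeMillis) {
    // floorMod keeps the result correct for timestamps before the epoch.
    return rowtimeMillis - Math.floorMod(rowtimeMillis, sizeMillis);
  }

  /** Exclusive end of the window that contains the given row time. */
  static long windowEnd(long rowtimeMillis, long sizeMillis) {
    return windowStart(rowtimeMillis, sizeMillis) + sizeMillis;
  }
}
```

With INTERVAL '1' hour the size is 3,600,000 ms, so every row whose rowtime
falls in the same hour gets the same window_start and lands in the same
group.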


On Wed, Apr 22, 2020 at 2:31 PM Rui Wang  wrote:


Hi community,

I want to kick off a discussion about deprecating grouped window functions
(GROUP BY TUMBLE/HOP/SESSION) now that table function windowing support
(FROM TABLE(TUMBLE/HOP/SESSION)) is becoming available [1]. Currently, TUMBLE
support for table function windowing is checked in; HOP and SESSION support
is likely to be merged in 1.23.0.

A brief example of the two windowing syntaxes:

// Grouped window functions.
SELECT
   product_id, count(*), TUMBLE_START() as window_start
FROM order
GROUP BY product_id, TUMBLE(rowtime, INTERVAL '1' hour); // an hour-long fixed window size.

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(.rowtime), INTERVAL '1' hour)
GROUP BY product_id

Here is a short, selective comparison:

Where table function windowing behaves better:
1) No GROUPING/GROUP BY is enforced. Enforced grouping becomes a problem in
streaming JOIN. For example, one use case is to apply a JOIN on two streams
for each hour. In this case, no GROUP BY is needed.
2) Grouped window functions allow multiple calls in GROUP BY. For example,
from a SQL syntax perspective, GROUP BY TUMBLE(...), HOP(...), SESSION(...)
is not wrong, but it is an illegal query.
3) Calcite includes an Enumerable implementation of table function
windowing, while grouped window functions do not have one.


Where table function windowing behaves worse:
1) Table function windowing adds "window_start" and "window_end" columns to
the table directly, which increases the volume of data (number of rows *
sizeof(timestamp) * 2).
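
To put a rough number on that overhead, here is a small sketch of the
arithmetic. The class name is hypothetical, and the 8-byte timestamp in the
usage note is an assumption for illustration, not a statement about how
Calcite stores timestamps.

```java
/** Hypothetical back-of-the-envelope helper; not Calcite code. */
final class WindowColumnOverhead {
  /** Extra bytes from adding window_start and window_end to every row. */
  static long extraBytes(long rows, long timestampBytes) {
    // rows * sizeof(timestamp) * 2; multiplyExact throws on overflow
    // instead of silently wrapping.
    return Math.multiplyExact(Math.multiplyExact(rows, timestampBytes), 2L);
  }
}
```

For example, assuming 8-byte timestamps, one million rows would carry
1,000,000 * 8 * 2 = 16,000,000 additional bytes.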


I want to focus on discussing two questions in this thread:
1) Do people support deprecating grouped window functions?
2) If the answer to 1) is yes, by which version would people prefer grouped
window functions to be completely removed?



[1]: https://jira.apache.org/jira/browse/CALCITE-3271


-Rui













Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Jinpeng Wu
Sure. I will add more cases to my PR.

I did not design more cases because our own product has a test framework
that contains thousands of actual user queries.
Calcite's code base is quite different, so I cannot just migrate those cases
to Calcite. It may take some time.

On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
wrote:

> Hi Jinpeng,
>
> I went through your PR and it seemed very impressive to me. It is very
> similar to what I did, but you've reused much of the existing logic from the
> Volcano planner. We should definitely stay in sync in our experiments. I
> believe the future Cascades planner will be some kind of combination of our
> work.
>
> Is there any way to run tests that are close to real system query
> execution? Maybe with the Enumerable convention or, better, with a
> convention that supports the distribution trait? I just want to look through
> your planner's optimization steps more thoroughly. I've found some tests
> in the org.apache.calcite.plan.volcano package, but they use synthetic
> conventions and nodes. Maybe I missed something.
>
> Thank you for sharing your work!
>
> --
> Kind Regards
> Roman Kondakov
>
>
> On 28.04.2020 15:19, Jinpeng Wu wrote:
> > Hi, Roman. It's great to see your proposal. Actually my team has also been
> > working on a cascade planner based on Calcite, and we already have some
> > results as well. Maybe we can combine our work.
> >
> > I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> >
> > Our work has a lot in common. We both developed a new
> > CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
> > both implemented the top-down search strategy according to the
> > Columbia optimizer generator
> > <https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
> > But we also have some differences.
> >
> > The first difference is that I try to reuse the existing VolcanoPlanner as
> > much as possible. My CascadePlanner inherits from the existing
> > VolcanoPlanner. Except that it overrides ruleQueue and the findBestPlan
> > method to rearrange rule applications, most logic is inherited from
> > VolcanoPlanner. For example,
> >   - It reuses the RelSet and RelSubset classes and the register method.
> >   - Rules are fired as soon as a RelNode is registered (in the
> > Columbia optimizer generator, rules are not fired until exploring). The
> > ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
> > This design has the benefit that we do not need to worry about missing a
> > rule or firing a rule multiple times.
> >   - It leverages AbstractConverter to pass trait requirements down, so
> > currently AbstractConverter (AC) is still essential in my code.
> > This makes the new planner highly compatible with the old VolcanoPlanner.
> > Features like MV and Hints can apply to it directly. I also tried
> > replacing VolcanoPlanner with the new CascadePlanner in tests. Most tests
> > passed. Several cases did fail. I know the reason and how to fix them, but
> > I am still considering marking them as "won't fix", as the rule set
> > violates some basic principles of top-down trait requests.
> >
> > The second difference is that our design has the ability to prune the
> > search space. Currently it contains a simple LowerBoundCost metadata to
> > compute the lower bound of a RelNode. Because logical properties like the
> > cardinality of a RelSet are not stable across exploring, a group must
> > be fully explored (implementation rules and enforcement rules should never
> > modify the logical properties) before it can provide a valid lower bound
> > cost. Because of that, logical search space pruning is not supported now;
> > only implementation rules and enforcement rules can be pruned. Testing
> > with cases in our own product, the new planner saves about 10% of rule
> > applications. I am still considering how to support logical space pruning
> > and look forward to making more improvements.
> >
> > Hope my code will help.
> >
> > Thanks,
> > Jinpeng
> >
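
The lower-bound pruning idea described in the message above can be sketched
in a few lines. This is not the code from the pull request: the classes
below are hypothetical and only show the general branch-and-bound check,
assuming each candidate exposes a cheap lower bound and a more expensive
exact cost.

```java
import java.util.List;

/** Hypothetical branch-and-bound illustration; not the CascadePlanner code. */
final class LowerBoundPruning {
  /** A candidate plan alternative with a cheap lower bound and an exact cost. */
  static final class Candidate {
    final String name;
    final double lowerBound;
    final double actualCost;

    Candidate(String name, double lowerBound, double actualCost) {
      this.name = name;
      this.lowerBound = lowerBound;
      this.actualCost = actualCost;
    }
  }

  /** Outcome of the search: best cost found and how many candidates were costed. */
  static final class Result {
    final double bestCost;
    final int fullyCosted;

    Result(double bestCost, int fullyCosted) {
      this.bestCost = bestCost;
      this.fullyCosted = fullyCosted;
    }
  }

  /** Skips any candidate whose lower bound cannot beat the best cost so far. */
  static Result search(List<Candidate> candidates) {
    double best = Double.POSITIVE_INFINITY;
    int costed = 0;
    for (Candidate c : candidates) {
      if (c.lowerBound >= best) {
        continue; // pruned: even its best case is no better than what we have
      }
      costed++; // stands in for the expensive implementation and costing step
      best = Math.min(best, c.actualCost);
    }
    return new Result(best, costed);
  }
}
```

The check is only sound when the lower bound is valid, which is why the
message requires a group to be fully explored before its bound is trusted.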
> >
> > On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai 
> wrote:
> >
> >> For #1, aside from that, we need to be able to build physical nodes based
> >> on a convention. For example, if we merge two EnumerableProjects, we would
> >> want to create an EnumerableProject as a result, instead of a LogicalProject.
> >> The RelBuilder change I am working on would help in this case.
> >>
> >> For #2, I don’t think it’s just a bug. If the physical cost cannot be
> >> reliable before transformation is finished, we should probably delay the
> >> physical cost calculation, or we risk doing it over again. The other way is
> >> to complete RelSet transformation before implementing it, which is a
> >> common practice in industry, including Orca.
> >>
> >> Multi-convention is a key scenario, and I agree we should support it. My
> >> thinking is more about separating the logical one (Conventions.NONE) from
> >> the others.
> >>
> >>> On Apr 27, 

Re: Re: [ANNOUNCE] New committer: Forward Xu

2020-04-30 Thread Danny Chan
Congratulations! Forward!

Best,
Danny Chan
On Apr 30, 2020 (+0800) at 12:14 PM, Fan Liya  wrote:
> Congratulations, Forward!
>
> Best,
> Liya Fan
>
> On Wed, Apr 29, 2020 at 8:51 PM Wang Yanlin <1989yanlinw...@163.com> wrote:
>
> > Congratulations! Forward!
> >
> > --
> >
> > Best,
> > Wang Yanlin
> >
> >
> >
> >
> >
> > At 2020-04-29 10:52:25, "Feng Zhu"  wrote:
> > > Congratulations! Forward!
> > >
> > > best,
> > > Feng
> > >
> > > On Wed, Apr 29, 2020 at 10:17 AM, Chunwei Lei  wrote:
> > >
> > > > Congrats, Forward!
> > > >
> > > >
> > > >
> > > > Best,
> > > > Chunwei
> > > >
> > > >
> > > > On Wed, Apr 29, 2020 at 6:46 AM Rui Wang  wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > >
> > > > > -Rui
> > > > >
> > > > > On Tue, Apr 28, 2020 at 3:04 PM Francis Chuang <francischu...@apache.org> wrote:
> > > > >
> > > > > > Congrats, Forward!
> > > > > >
> > > > > > Francis
> > > > > >
> > > > > > On 29/04/2020 7:53 am, Stamatis Zampetakis wrote:
> > > > > > > Apache Calcite's Project Management Committee (PMC) has invited Forward Xu
> > > > > > > to become a committer, and we are pleased to announce that he has accepted.
> > > > > > >
> > > > > > > Forward has been helping the project for some time now. He added many new
> > > > > > > SQL functions to the project and is one of our JSON experts. On top of that,
> > > > > > > and other fixes, he is the one who added the Redis adapter to the project.
> > > > > > >
> > > > > > > Forward, welcome, thank you for your contributions, and we look forward to
> > > > > > > your further interactions with the community! If you wish, please feel free
> > > > > > > to tell us more about yourself and what you are working on.
> > > > > > >
> > > > > > > Stamatis (on behalf of the Apache Calcite PMC)
> > > > > > >
> > > > > >
> > > > >
> > > >
> >


Calcite-Master - Build # 1726 - Still Failing

2020-04-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built Calcite-Master (build #1726)

Status: Still Failing

Check console output at https://builds.apache.org/job/Calcite-Master/1726/ to 
view the results.