Re: How to contribute to Streaming Table API and StreamSQL

Jark Wu Tue, 14 Jun 2016 02:03:57 -0700

Hi Fabian, 

It’s great to hear that we are going to start it!


I’m glad to share our current Streaming Table API [1]. I find that all 
aggregation functions are scoped to the defined window in the Flink Stream Table 
API design [2] and the Calcite StreamSQL design [3]. I wonder whether we also need 
global aggregation. By global aggregation I mean an aggregation that is applied 
per grouping key only, without any window, which DataStream already supports via 
`datastream.keyBy(f1).sum(f2)`.
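
To make the distinction concrete, here is a minimal sketch in plain Python. It is 
not Flink or Table API code; the function names, the sample events and the 
10-second window size are all made up for illustration. It only shows the 
intended semantics: a global aggregation keeps one running total per key over 
the whole stream, while a windowed aggregation keeps one total per key and window.

```python
from collections import defaultdict

# Hypothetical sample events: (timestamp_in_seconds, key, value)
events = [(1, "a", 10), (2, "b", 5), (4, "a", 7), (11, "a", 1), (12, "b", 2)]


def global_sum(events):
    """Global (non-windowed) aggregation: a running sum per key.

    Emits the updated total for the key after every incoming event.
    """
    totals = defaultdict(int)
    emitted = []
    for _, key, value in events:
        totals[key] += value
        emitted.append((key, totals[key]))
    return emitted


def tumbling_window_sum(events, size):
    """Windowed aggregation: one sum per (window start, key).

    Each event belongs to the tumbling window [start, start + size),
    where start = ts - ts % size.
    """
    sums = defaultdict(int)
    for ts, key, value in events:
        window_start = ts - ts % size
        sums[(window_start, key)] += value
    return dict(sums)


print(global_sum(events))
print(tumbling_window_sum(events, 10))
```

In the first function the state per key never expires, which is the main 
practical difference from the windowed case.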

Since the window syntax of StreamSQL is not implemented yet, will we help the 
Calcite community with that first, or work on the code for window + agg in the 
Table API first?


[1] https://docs.google.com/document/d/1KMUzvBAWSyQ39T8MyxUi0zNHyvLUnyGMPA7_RLSDpFw/edit?usp=sharing
[2] https://docs.google.com/document/d/19kSOAOINKCSWLBCKRq2WvNtmuaA9o3AyCh2ePqr3V5E/edit#
[3] https://calcite.apache.org/docs/stream.html#tumbling-windows


- Jark Wu 

> On Jun 14, 2016, at 1:10 AM, Fabian Hueske <[email protected]> wrote:
> 
> Hi Jark,
> 
> wow, that's good news!
> You are right, the streaming Table API is currently very limited. In the
> last months, we changed the internal architecture and put everything on top
> of Apache Calcite.
> For the upcoming 1.1 release, we won't add new features to the Table API /
> SQL. However, for the 1.2 release, we plan to focus on the streaming
> Table API and Stream SQL to add support for windowed aggregates and joins,
> which corresponds to Tasks 7 and 9 in the design document. You are
> completely right that we should start to add tickets to JIRA for this. I
> will do that tomorrow.
> 
> It is great that you already have working code for windowed aggregates and
> joins! Here is a link to our current API draft for windows in the Table API
> [1]. It would be great if you could share what your API looks like. As you
> said, the internals have changed a lot by now, but we might want to reuse
> your API for Table API windows and maybe the code of the runtime. However,
> we need to go through Calcite for optimization and SQL support, so some
> parts definitely need to be changed. Stream SQL is also on the roadmap of
> the Calcite community, but it might be that some features that we will need
> are not completed yet. So maybe we can help the Calcite community with that
> as well.
> 
> If you want to contribute, you should first read the how-to-contribute
> guide [2] and the guide for code contributions [3].
> The general rule is to first open a JIRA and later a pull request. Changes
> that are extensive or modify current behavior (except bugs) should be
> discussed before starting to work on them.
> 
> Looking forward to working with you on Flink,
> Fabian
> 
> [1]
> https://docs.google.com/document/d/19kSOAOINKCSWLBCKRq2WvNtmuaA9o3AyCh2ePqr3V5E/edit#heading=h.3iw7frfjdcb2
> [2] http://flink.apache.org/how-to-contribute.html
> [3] http://flink.apache.org/contribute-code.html
> 
> 
> 2016-06-13 11:31 GMT+02:00 Jark Wu <[email protected]>:
> 
>> Hi,
>> 
>> We have a great interest in the new Table API & SQL. At Alibaba, we have
>> added a lot of features (groupBy, agg, window, join, UDF …) to the Streaming
>> Table API (based on Flink 1.0). Many jobs now run on the Table API in our
>> production environment. But we want to keep pace with the community, and we
>> have noticed that the Flink community reworked the Table API and also added
>> SQL support. That is really cool. However, the Streaming Table API is still
>> quite limited. So we want to contribute and help accelerate the growth of
>> the Streaming Table API and StreamSQL.
>> 
>> It seems that we have completed Task-5 and Task-6 mentioned in the Work
>> Plan <
>> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#>.
>> So can we start Task-7 and Task-9 now? Are there any more specific plans? I
>> think it’s better to create an umbrella JIRA like FLINK-3221 to make the
>> development plan clearer.
>> 
>> If I want to contribute code for the groupBy and agg functions, what should
>> I do? As I didn’t find any related JIRAs, can I create a JIRA and open a
>> pull request directly?
>> 
>> Sorry for so many questions at a time.
>> 
>> 
>> 
>> - Jark Wu (wuchong)
>> 
>>
