Re: SQL for Flink

Timo Walther Thu, 15 Sep 2016 02:18:00 -0700

Hi Radu,

Thanks for continuing the discussion we had at the conference here. Your
proposals are all valid. If you have a look at the initial design
document [1] for Table API/SQL, we plan to add a SQL client at some
point, but first we should focus on extending the set of supported
operations. A first step regarding windows and aggregations on streams
can be found in the current FLIP-11 [2]. However, it only describes the
Table API so far. What Stream SQL should look like exactly is still up
for discussion (together with the Calcite guys). In the long term, the
Table API could become a DataSet++ or DataStream++. We could add support
for UDFs and operations such as map/reduce. If customization/replacement
of existing rules is required, we can add a Jira issue for that.

The development has just started, so there is a lot to improve and to add.
New contributions, discussions on certain features, and design documents
are always welcome.

Btw. this discussion should actually be continued on the dev mailing list.

Timo

[1] https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.4vdi2v1tlg8h
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



On 14/09/16 at 15:07, Deepak Sharma wrote:


Thanks, Greg.
I will start picking some of them.

Thanks
Deepak

On 14 Sep 2016 6:31 pm, "Greg Hogan" <c...@greghogan.com> wrote:


    Hi Deepak,

    There are many open tickets for Flink's SQL API. Documentation is
    at
    https://ci.apache.org/projects/flink/flink-docs-master/dev/table_api.html

    https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20%22Table%20API%20%26%20SQL%22%20ORDER%20BY%20priority%20DESC

    Greg

    On Wed, Sep 14, 2016 at 12:27 PM, Deepak Sharma
    <deepakmc...@gmail.com> wrote:

        +1
        Yes. I agree with having SQL for Flink.
        I can take up some tasks as well once this starts.

        Thanks
        Deepak

        On Wed, Sep 14, 2016 at 3:47 PM, Radu Tudoran
        <radu.tudo...@huawei.com> wrote:

            Hi,

            As a follow-up to the multiple discussions that took place
            during Flink Forward about how SQL should be supported by
            Flink, I would like to make a couple of proposals.

            Disclaimer: I do not claim to have synthesized all the
            discussions, and a great deal is probably still missing.

            *Why support SQL for Flink?*

            - A goal of supporting SQL for Flink should be to enable
            larger adoption of Flink, particularly among data
            scientists / data engineers who might not want to, or know
            how to, program against the existing APIs

            - The main implication, as I see it, is that SQL should
            serve as a tool for translating the data processing flow
            into a stream topology that will be executed by Flink

            - This would require supporting an SQL client for Flink
            rather soon

            *How many features should be supported?*

            - In order to get (close to) the full benefit of the
            processing capabilities of Flink, I believe most of the
            processing types should be supported. This includes all
            the different types of windows, aggregations,
            transformations, joins, ...

            - I would propose that UDFs should also be supported, so
            that one can easily add more complex computation if needed

            - In the spirit of the extensibility that Flink offers for
            operators, functions, ..., custom operators should be
            allowed to replace the default implementations of the
            SQL logical operators

            *How much customization should be enabled?*

            - Customization could be provided through configuration
            files. Such a configuration can cover the policies for how
            triggers, evictors, parallelization, ... are applied when
            a specific SQL query is translated into Flink code

            - In order to support the integration of custom operators
            for specific SQL logical operators, users should also be
            able to provide translation RULES that replace the default
            ones (e.g. if a user wants to define their own
            CUSTOM_TABLE_SCAN, they should be able to provide
            something like
            configuration.replaceRule(DataStreamScanRule.INSTANCE,
            CUSTOM_TABLE_SCAN_Rule.INSTANCE), or, if the selection of
            the new translation rule can be handled by the cost model,
            then simply
            configuration.addRule(CUSTOM_TABLE_SCAN_Rule.INSTANCE))
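
The replaceRule/addRule idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink or Calcite API: RuleSetConfig is a hypothetical class, and rules are stood in for by plain strings instead of planner rule instances such as DataStreamScanRule.INSTANCE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for the translation rules a planner would apply. */
final class RuleSetConfig {
    // Rules are modelled as names only; a real planner would hold rule objects.
    private final List<String> rules;

    RuleSetConfig(List<String> defaults) {
        this.rules = new ArrayList<>(defaults);
    }

    /** Swap a default rule for a custom one, keeping its position in the set. */
    RuleSetConfig replaceRule(String existing, String replacement) {
        int index = rules.indexOf(existing);
        if (index < 0) {
            throw new IllegalArgumentException("No such rule: " + existing);
        }
        rules.set(index, replacement);
        return this;
    }

    /** Add a rule alongside the defaults; selection would be left to the cost model. */
    RuleSetConfig addRule(String rule) {
        if (!rules.contains(rule)) {
            rules.add(rule);
        }
        return this;
    }

    /** Read-only view of the rules currently configured. */
    List<String> rules() {
        return Collections.unmodifiableList(rules);
    }

    public static void main(String[] args) {
        RuleSetConfig configuration = new RuleSetConfig(
                Arrays.asList("DataStreamScanRule", "DataStreamAggregateRule"));
        configuration.replaceRule("DataStreamScanRule", "CUSTOM_TABLE_SCAN_Rule");
        System.out.println(configuration.rules());
    }
}
```

Replacing in place keeps the rule order stable, while addRule only extends the set, which matches the two cases described above: forcing a custom translation versus letting cost-based selection decide.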

            What do you think?

            Dr. Radu Tudoran

            Senior Research Engineer - Big Data Expert

            IT R&D Division

            HUAWEI TECHNOLOGIES Duesseldorf GmbH

            European Research Center

            Riesstrasse 25, 80992 München

            E-mail: radu.tudo...@huawei.com

            Mobile: +49 15209084330

            Telephone: +49 891588344173

            HUAWEI TECHNOLOGIES Duesseldorf GmbH
            Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
            Registered Office: Düsseldorf, Register Court Düsseldorf,
            HRB 56063,
            Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN

        --
        Thanks
        Deepak
        www.bigdatabig.com
        www.keosha.net



--
Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr
