There are still a few comments and thoughts outstanding. Is this proposal good to go for a first pass to implement?
Anyone should be able to comment with this link.
https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing

On Mon, Feb 26, 2018 at 10:07 PM Yaliang Wang <[email protected]> wrote:

> Josh,
>
> I totally agree with your concern. I was bringing that idea into the
> conversation and thought of it as a backup solution. Since Heron is
> getting more and more popular, it would be really nice to have SQL
> support. I think having a built-in Heron SQL can shorten the development
> iteration, since we will have fewer concerns about abstraction and
> generalization in the implementation.
>
> Best,
> Yaliang
>
> > On Feb 26, 2018, at 7:17 PM, Josh Fischer <[email protected]> wrote:
> >
> > Yaliang,
> >
> > I think this is a fantastic idea and I agree about the code maintenance
> > being a cost. I have a concern that a smaller project may get
> > abandoned, especially if it has a smaller following. One of the nice
> > things about Heron is the large community and list of core contributors
> > behind it. But, I don't want to abandon this idea. I think, for me at
> > least, that it would make sense to get Storm SQL running in Heron and
> > take what we learned from that experience and apply it to a third-party
> > project if there is a need/demand for it. What do you think?
> >
> > -Josh
> >
> > On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <[email protected]>
> > wrote:
> >
> >> Sounds like a great feature to have. A question I have: will it be
> >> feasible to start a separate project to support SQL on Heron-like
> >> streaming?
> >>
> >> - I’m imagining that there will be a lot of code similar or identical
> >>   to Storm SQL.
> >> - Of the three steps you mentioned (parse SQL -> logical/physical
> >>   plan -> Heron topology), only the last is specific to Heron. The
> >>   first two steps can be shared with other Heron-like streaming
> >>   vendors.
> >> - Native support for SQL inside the Heron project will give an extra
> >>   advertising/marketing bonus, but with an increase in the code
> >>   maintenance cost, especially if it requires APIs that are not very
> >>   popular and may change over time. However, a separate project can
> >>   target a specific version of Heron.
> >>
> >> Best,
> >> Yaliang
> >>
> >>> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari
> >>> <[email protected]> wrote:
> >>>
> >>> +1 for Heron SQL Support. Thanks Josh.
> >>>
> >>> On 26 February 2018 at 18:42, Karthik Ramasamy <[email protected]>
> >>> wrote:
> >>>
> >>>> Thanks Josh for initiating this. It will be a great feature to add
> >>>> for Heron.
> >>>>
> >>>> cheers
> >>>> /karthik
> >>>>
> >>>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>> Jerry,
> >>>>>
> >>>>> Great point. Let's keep things simple for the migration to make
> >>>>> sure the implementation is correct. Then we can modify from there.
> >>>>>
> >>>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> Thanks Josh for taking the initiative to get this started! SQL on
> >>>>>> Heron will be a great feature! The plan sounds great to me. Let's
> >>>>>> first get an initial version of Heron SQL out and then we can
> >>>>>> worry about custom / user-defined sources and sinks. We can even
> >>>>>> start talking about UDFs (user-defined functions) at that point!
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Jerry
> >>>>>>
> >>>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <[email protected]>
> >>>>>> wrote:
> >>>>>>> Please see this Google Drive link for adding comments. I will
> >>>>>>> copy and paste the drive doc below as well.
> >>>>>>>
> >>>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> >>>>>>>
> >>>>>>>
> >>>>>>> Proposal Below
> >>>>>>>
> >>>>>>> I am writing this document to propose changes and to start
> >>>>>>> conversations on adding functionality similar to Storm SQL to
> >>>>>>> Heron. We would call it Heron SQL. After reviewing how the code
> >>>>>>> is structured in Storm I have some suggestions and questions
> >>>>>>> relating to the implementation in the Heron code base.
> >>>>>>>
> >>>>>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
> >>>>>>>   - We would parse the SQL with Calcite to create the logical
> >>>>>>>     and physical plans
> >>>>>>>   - We would then convert the logical and physical plans to a
> >>>>>>>     Heron Topology
> >>>>>>>   - We would then submit the Heron Topology into the Heron
> >>>>>>>     System
> >>>>>>> - Some thoughts on code structure and overall functionality
> >>>>>>>   - I think we should place the Heron SQL code base as a top
> >>>>>>>     level directory in the repo.
> >>>>>>>   - I will have to add the command “sql” to the Heron command
> >>>>>>>     line code in Python.
> >>>>>>>   - As a first pass implementation, users can interact with
> >>>>>>>     Heron SQL via the following command
> >>>>>>>     - heron sql <sql-file> <topology-name>
> >>>>>>>   - We will also support the explain command for displaying the
> >>>>>>>     query plan; this will not deploy the topology
> >>>>>>>     - heron sql <sql-file> --explain
> >>>>>>>   - After the first pass implementation is working smoothly, we
> >>>>>>>     can then add an interactive command line interface to accept
> >>>>>>>     SQL on the fly by omitting the SQL file argument
> >>>>>>>     - heron sql <topology-name>
> >>>>>>>   - We would support all of the existing functionality in Storm
> >>>>>>>     SQL today, with the exception of being dependent on Trident.
> >>>>>>>     We would use Storm SQL as a way to deploy topologies into
> >>>>>>>     Heron, similar to how you deploy topologies with the
> >>>>>>>     Streamlet, Topology, and ECO APIs
> >>>>>>> - Questions
> >>>>>>>   - Do we see any issue with this plan to implement?
> >>>>>>>   - I believe we would have to supply an external jar at times
> >>>>>>>     to connect to external data sources, such as reuse of Kafka
> >>>>>>>     libraries or database drivers. I see that Storm has a few
> >>>>>>>     external connectors for Mongo, Kafka, Redis and HDFS. Do we
> >>>>>>>     want to limit users to what we decide to build as connectors,
> >>>>>>>     or do we want to give them the ability to load external jars
> >>>>>>>     at submit time? I don’t think Heron offers the ability to
> >>>>>>>     pass extra jars via the “--jars” or “--artifacts” flags like
> >>>>>>>     Storm does today. Would this be the correct way to pull in
> >>>>>>>     external jars? Does anyone have a different idea? I’m
> >>>>>>>     thinking that this might be a v2 feature after we get Heron
> >>>>>>>     SQL working well. Ideas, thoughts or concerns?
> >>>>>>>   - Is there anything I missed?

-- Sent from A Mobile Device
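
For reference, the two first-pass command forms described in the proposal (heron sql <sql-file> <topology-name> and heron sql <sql-file> --explain) could be sketched with Python's standard argparse module roughly as below. This is only an illustration under stated assumptions: the function names build_sql_parser and plan_action, the returned tuples, and the rule that topology-name is required unless --explain is given are all hypothetical and are not taken from Heron's actual CLI code.

```python
import argparse


def build_sql_parser():
    """Build a stand-alone parser for a hypothetical `heron sql` command."""
    parser = argparse.ArgumentParser(prog="heron sql")
    # Path to the file holding the SQL statements.
    parser.add_argument("sql_file", metavar="sql-file",
                        help="file containing the SQL statements")
    # Optional here so that `--explain` can be used without a topology name.
    parser.add_argument("topology_name", metavar="topology-name", nargs="?",
                        help="name of the topology to submit")
    parser.add_argument("--explain", action="store_true",
                        help="print the query plan without deploying")
    return parser


def plan_action(argv):
    """Return (action, sql_file, topology_name) for the given arguments.

    `action` is "explain" or "submit"; nothing is actually deployed here.
    """
    parser = build_sql_parser()
    args = parser.parse_args(argv)
    if args.explain:
        # Explain mode only displays the plan, so no topology name is needed.
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        # Assumption: submitting requires a topology name in the first pass.
        parser.error("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

Calling plan_action(["query.sql", "my-topology"]) selects the submit path, while plan_action(["query.sql", "--explain"]) selects the explain path; the interactive mode mentioned for a later pass is deliberately left out of the sketch.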
