Good for a first pass. If anyone has further comments, please add them in the document.

On Thu, Mar 1, 2018 at 7:53 AM, Josh Fischer <[email protected]> wrote:

> There are still a few comments and thoughts outstanding. Is this proposal
> good to go for a first pass to implement?
>
> Anyone should be able to comment with this link.
> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>
> On Mon, Feb 26, 2018 at 10:07 PM Yaliang Wang <[email protected]> wrote:
>
> > Josh,
> >
> > Totally agree with your concern. I was bringing that idea into the
> > conversation and thought of it as a backup solution. Since Heron is
> > getting more and more popular, it would be really nice to have SQL
> > support. I think having a built-in Heron SQL can shorten the development
> > iteration, since we will have less concern about abstraction and
> > generalization in the implementation.
> >
> > Best,
> > Yaliang
> >
> > > On Feb 26, 2018, at 7:17 PM, Josh Fischer <[email protected]> wrote:
> > >
> > > Yaliang,
> > >
> > > I think this is a fantastic idea and I agree about the code maintenance
> > > being a cost. I have a concern that a smaller project may get
> > > abandoned, especially if it had a smaller following. One of the nice
> > > things about Heron is the large community and list of core contributors
> > > behind it. But, I don't want to abandon this idea. I think, for me at
> > > least, that it would make sense to get Storm SQL running in Heron and
> > > take what we learned from that experience and apply it to a third-party
> > > project if there is a need/demand for it. What do you think?
> > >
> > > -Josh
> > >
> > > On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <[email protected]>
> > > wrote:
> > >
> > >> Sounds like a very great feature to have. A question I have: will it be
> > >> feasible to start a separate project to support SQL on Heron-like
> > >> streaming?
> > >>
> > >> - I’m imagining that there will be a lot of code similar/same to
> > >>   Storm SQL.
> > >> - Only the last step of the three steps (parse sql -> logical/physical
> > >>   plan -> heron topology) you mentioned is specific to Heron. The first
> > >>   two steps can be shared with other Heron-like streaming vendors.
> > >> - The native support for SQL inside the Heron project will give an extra
> > >>   advertising/marketing bonus but with an increase of the code
> > >>   maintenance cost, especially if it requires APIs that are not very
> > >>   popular and may be changed over time. However, a separate project can
> > >>   target a specific version of Heron.
> > >>
> > >> Best,
> > >> Yaliang
> > >>
> > >>> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari
> > >>> <[email protected]> wrote:
> > >>>
> > >>> +1 for Heron SQL Support. Thanks Josh.
> > >>>
> > >>> On 26 February 2018 at 18:42, Karthik Ramasamy <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Thanks Josh for initiating this. It will be a great feature to add for
> > >>>> Heron.
> > >>>>
> > >>>> cheers
> > >>>> /karthik
> > >>>>
> > >>>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>> Jerry,
> > >>>>>
> > >>>>> Great point. Let's keep things simple for the migration to make sure
> > >>>>> the implementation is correct. Then we can modify from there.
> > >>>>>
> > >>>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Thanks Josh for taking the initiative to get this started! SQL on
> > >>>>>> Heron will be a great feature! The plan sounds great to me. Let's
> > >>>>>> first get an initial version of the Heron SQL out and then we can
> > >>>>>> worry about custom / user defined sources and sinks. We can even
> > >>>>>> start talking about UDFs (User defined functions) at that point!
> > >>>>>>
> > >>>>>> Best,
> > >>>>>>
> > >>>>>> Jerry
> > >>>>>>
> > >>>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <[email protected]>
> > >>>>>> wrote:
> > >>>>>>> Please see this google drive link for adding comments. I will copy
> > >>>>>>> and paste the drive doc below as well.
> > >>>>>>>
> > >>>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> > >>>>>>>
> > >>>>>>> Proposal Below
> > >>>>>>>
> > >>>>>>> I am writing this document to propose changes and to start
> > >>>>>>> conversations on adding functionality similar to Storm SQL to
> > >>>>>>> Heron. We would call it Heron SQL. After reviewing how the code is
> > >>>>>>> structured in Storm I have some suggestions and questions relating
> > >>>>>>> to the implementation into the Heron code base.
> > >>>>>>>
> > >>>>>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
> > >>>>>>>   - We would parse the sql with calcite to create the logical and
> > >>>>>>>     physical plans
> > >>>>>>>   - We would then convert the logical and physical plans to a
> > >>>>>>>     Heron Topology
> > >>>>>>>   - We would then submit the Heron Topology into the Heron System
> > >>>>>>> - Some thoughts on code structure and overall functionality
> > >>>>>>>   - I think we should place the Heron SQL code base as a top level
> > >>>>>>>     directory in the repo.
> > >>>>>>>   - I will have to add the command “sql” to the Heron command line
> > >>>>>>>     code in python.
> > >>>>>>>   - As a first pass implementation users can interact with Heron
> > >>>>>>>     SQL via the following command
> > >>>>>>>     - heron sql <sql-file> <topology-name>
> > >>>>>>>   - We will also support the explain command for displaying the
> > >>>>>>>     query plan; this will not deploy the topology
> > >>>>>>>     - heron sql <sql-file> --explain
> > >>>>>>>   - After the first pass implementation is working smoothly, we
> > >>>>>>>     can then add an interactive command line interface to accept
> > >>>>>>>     sql on the fly by omitting the sql file argument
> > >>>>>>>     - heron sql <topology-name>
> > >>>>>>>   - We would support all of the existing functionality in Storm
> > >>>>>>>     SQL today with the exception of being dependent on trident. We
> > >>>>>>>     would use Storm SQL as a way to deploy topologies into Heron,
> > >>>>>>>     similar to how you deploy topologies with the Streamlet,
> > >>>>>>>     Topology, and ECO APIs
> > >>>>>>> - Questions
> > >>>>>>>   - Do we see any issue with this plan to implement?
> > >>>>>>>   - I believe we would have to supply an external jar at times to
> > >>>>>>>     connect to external data sources, such as reuse of kafka
> > >>>>>>>     libraries or database drivers. I see that Storm has a few
> > >>>>>>>     external connectors for mongo, kafka, redis and hdfs. Do we
> > >>>>>>>     want to limit users to what we decide to build as connectors
> > >>>>>>>     or do we want to give them the ability to load external jars
> > >>>>>>>     at submit time? I don’t think heron offers the ability to pass
> > >>>>>>>     extra jars via the “--jars” or “--artifacts” flags like Storm
> > >>>>>>>     does today. Would this be the correct way to pull in external
> > >>>>>>>     jars? Does anyone have a different idea? I’m thinking that
> > >>>>>>>     this might be a v2 feature after we get Heron sql working
> > >>>>>>>     well. Ideas, thoughts or concerns?
> > >>>>>>>   - Is there anything I missed?
>
> --
> Sent from A Mobile Device
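
For reference, the first-pass command shape described in the proposal (`heron sql <sql-file> <topology-name>` and `heron sql <sql-file> --explain`) could be sketched roughly as below. This is only an illustration using Python's standard `argparse`, not the actual Heron CLI code: the function names `build_parser` and `plan_action` are hypothetical, and treating `topology-name` as optional when `--explain` is given is an assumption drawn from the two usage lines in the proposal. The later interactive mode (omitting the sql file) is not covered.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the proposed `heron sql` usage."""
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command", required=True)

    sql = subparsers.add_parser("sql", help="run a SQL file as a topology")
    sql.add_argument("sql_file", metavar="sql-file")
    # Assumption: topology-name may be omitted only because --explain
    # prints the plan and does not deploy anything.
    sql.add_argument("topology_name", metavar="topology-name", nargs="?")
    sql.add_argument("--explain", action="store_true",
                     help="print the query plan without deploying")
    return parser


def plan_action(argv):
    """Return (action, sql_file, topology_name); performs no real work."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.explain:
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        # parser.error prints usage and exits with status 2
        parser.error("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

Calling `plan_action(["sql", "query.sql", "my-topology"])` selects the submit path, while `plan_action(["sql", "query.sql", "--explain"])` selects the explain path without needing a topology name.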
