Re: Proposing Changes To Heron

Josh Fischer Thu, 01 Mar 2018 07:54:01 -0800

There are still a few comments and thoughts outstanding.  Is this proposal
good to go for a first-pass implementation?


Anyone should be able to comment with this link.
https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing

On Mon, Feb 26, 2018 at 10:07 PM Yaliang Wang <[email protected]> wrote:

> Josh,
>
> Totally agree with your concern. I was bringing that idea into the
> conversation and thought of it as a backup solution. Since Heron is getting
> more and more popular, it would be really nice to have SQL support. I think
> having a built-in Heron SQL can shorten the development iteration since we
> will have fewer concerns about abstraction and generalization in the
> implementation.
>
> Best,
> Yaliang
>
> > On Feb 26, 2018, at 7:17 PM, Josh Fischer <[email protected]> wrote:
> >
> > Yaliang,
> >
> > I think this is a fantastic idea, and I agree that code maintenance is a
> > cost.  My concern is that a smaller project may get abandoned,
> > especially if it has a smaller following.  One of the nice things about
> > Heron is the large community and list of core contributors behind it.
> > But I don't want to abandon this idea.  I think, for me at least, it
> > would make sense to get Storm SQL running in Heron, then take what we
> > learned from that experience and apply it to a third-party project if
> > there is a need/demand for it.  What do you think?
> >
> > -Josh
> >
> > On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <[email protected]> wrote:
> >
> >> Sounds like a great feature to have. A question I have: will it be
> >> feasible to start a separate project to support SQL on Heron-like
> >> streaming?
> >>
> >> - I’m imagining that there will be a lot of code similar or identical
> >> to Storm SQL.
> >> - Only the last of the three steps you mentioned (parse sql ->
> >> logical/physical plan -> heron topology) is specific to Heron. The first
> >> two steps can be shared with other Heron-like streaming vendors.
> >> - Native support for SQL inside the Heron project will give an extra
> >> advertising/marketing bonus, but with an increase in code maintenance
> >> cost, especially if it requires APIs that are not very popular and may
> >> change over time. A separate project, however, can target a specific
> >> version of Heron.
> >>
> >> Best,
> >> Yaliang
> >>
> >>> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <[email protected]> wrote:
> >>>
> >>> +1 for Heron SQL Support. Thanks Josh.
> >>>
> >>> On 26 February 2018 at 18:42, Karthik Ramasamy <[email protected]> wrote:
> >>>
> >>>> Thanks Josh for initiating this. It will be a great feature to add for
> >>>> Heron.
> >>>>
> >>>> cheers
> >>>> /karthik
> >>>>
> >>>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <[email protected]> wrote:
> >>>>>
> >>>>> Jerry,
> >>>>>
> >>>>> Great point.  Let's keep things simple for the migration to make sure
> >>>>> the implementation is correct.  Then we can modify from there.
> >>>>>
> >>>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <[email protected]> wrote:
> >>>>>
> >>>>>> Thanks Josh for taking the initiative to get this started!  SQL on
> >>>>>> Heron will be a great feature! The plan sounds great to me.  Let's
> >>>>>> first get an initial version of Heron SQL out and then we can worry
> >>>>>> about custom / user-defined sources and sinks.  We can even start
> >>>>>> talking about UDFs (user-defined functions) at that point!
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Jerry
> >>>>>>
> >>>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <[email protected]> wrote:
> >>>>>>> Please see this google drive link for adding comments.  I will copy
> >>>>>>> and paste the drive doc below as well.
> >>>>>>>
> >>>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> >>>>>>>
> >>>>>>>
> >>>>>>> Proposal Below
> >>>>>>>
> >>>>>>> I am writing this document to propose changes and to start
> >>>>>>> conversations on adding functionality similar to Storm SQL to Heron.
> >>>>>>> We would call it Heron SQL.  After reviewing how the code is
> >>>>>>> structured in Storm I have some suggestions and questions relating
> >>>>>>> to the implementation in the Heron code base.
> >>>>>>>
> >>>>>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
> >>>>>>>   - We would parse the sql with calcite to create the logical and
> >>>>>>>     physical plans
> >>>>>>>   - We would then convert the logical and physical plans to a Heron
> >>>>>>>     Topology
> >>>>>>>   - We would then submit the Heron Topology into the Heron System
> >>>>>>>
> >>>>>>> - Some thoughts on code structure and overall functionality
> >>>>>>>   - I think we should place the Heron SQL code base as a top level
> >>>>>>>     directory in the repo.
> >>>>>>>   - I will have to add the command “sql” to the Heron command line
> >>>>>>>     code in python.
> >>>>>>>   - As a first pass implementation users can interact with Heron
> >>>>>>>     SQL via the following command
> >>>>>>>     - heron sql <sql-file> <topology-name>
> >>>>>>>   - We will also support the explain command for displaying the
> >>>>>>>     query plan; this will not deploy the topology
> >>>>>>>     - heron sql <sql-file> --explain
> >>>>>>>   - After the first pass implementation is working smoothly, we can
> >>>>>>>     then add an interactive command line interface to accept sql on
> >>>>>>>     the fly by omitting the sql file argument
> >>>>>>>     - heron sql <topology-name>
> >>>>>>>   - We would support all of the existing functionality in Storm SQL
> >>>>>>>     today with the exception of being dependent on trident.  We
> >>>>>>>     would use Storm SQL as a way to deploy topologies into Heron,
> >>>>>>>     similar to how you deploy topologies with the Streamlet,
> >>>>>>>     Topology, and ECO APIs
> >>>>>>>
> >>>>>>> - Questions
> >>>>>>>   - Do we see any issue with this plan to implement?
> >>>>>>>   - I believe we would have to supply an external jar at times to
> >>>>>>>     connect to external data sources, such as reuse of kafka
> >>>>>>>     libraries or database drivers.  I see that Storm has a few
> >>>>>>>     external connectors for mongo, kafka, redis and hdfs.  Do we
> >>>>>>>     want to limit users to what we decide to build as connectors or
> >>>>>>>     do we want to give them the ability to load external jars at
> >>>>>>>     submit time? I don’t think heron offers the ability to pass
> >>>>>>>     extra jars via the “--jars” or “--artifacts” flags like Storm
> >>>>>>>     does today.  Would this be the correct way to pull in external
> >>>>>>>     jars?  Does anyone have a different idea?  I’m thinking that
> >>>>>>>     this might be a v2 feature after we get Heron sql working well.
> >>>>>>>     Ideas, thoughts or concerns?
> >>>>>>>   - Is there anything I missed?
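
The first-pass command-line behaviour described in the proposal above (submit with a sql file and topology name, or show the plan with --explain without deploying) could be sketched roughly as follows. This is only an illustrative sketch using Python's standard argparse module; it is not Heron's actual CLI code, and the names build_sql_parser and plan_action are hypothetical helpers invented for this example.

```python
import argparse


def build_sql_parser():
    """Build a parser for the proposed `heron sql` subcommand (hypothetical)."""
    parser = argparse.ArgumentParser(prog="heron sql")
    # File containing the SQL statements to turn into a topology.
    parser.add_argument("sql_file", metavar="sql-file")
    # Topology name is only needed when the topology is actually submitted.
    parser.add_argument("topology_name", metavar="topology-name", nargs="?")
    # --explain prints the query plan and must not deploy anything.
    parser.add_argument("--explain", action="store_true")
    return parser


def plan_action(argv):
    """Decide what the command should do; returns (action, sql_file, topology)."""
    args = build_sql_parser().parse_args(argv)
    if args.explain:
        # Explain mode: show the plan only, no topology is deployed.
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        raise ValueError("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)


if __name__ == "__main__":
    print(plan_action(["query.sql", "my-topology"]))
    print(plan_action(["query.sql", "--explain"]))
```

The later interactive mode (omitting the sql file argument) is left out here, since with two positional arguments it would need a separate way to tell a file from a topology name.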
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
> --
Sent from A Mobile Device
