Re: Proposing Changes To Heron

Karthik Ramasamy Thu, 01 Mar 2018 09:53:32 -0800

Good for a first pass. Otherwise, if you have comments, please address them
in the document.


On Thu, Mar 1, 2018 at 7:53 AM, Josh Fischer <[email protected]> wrote:

> There are still a few comments and thoughts outstanding.  Is this proposal
> good to go for a first pass to implement?
>
> Anyone should be able to comment with this link.
> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>
> On Mon, Feb 26, 2018 at 10:07 PM Yaliang Wang <[email protected]> wrote:
>
> > Josh,
> >
> > Totally agree with your concern. I was bringing that idea into the
> > conversation and thought of it as a backup solution. Since Heron is getting
> > more and more popular, it would be really nice to have SQL support. I think
> > having a built-in Heron SQL can shorten the development iteration since we
> > will have fewer concerns about abstraction and generalization in the
> > implementation.
> >
> > Best,
> > Yaliang
> >
> > > On Feb 26, 2018, at 7:17 PM, Josh Fischer <[email protected]> wrote:
> > >
> > > Yaliang,
> > >
> > > I think this is a fantastic idea and I agree about the code maintenance
> > > being a cost.  I have a concern that a smaller project may get
> > > abandoned, especially if it has a smaller following.  One of the nice
> > > things about Heron is the large community and list of core contributors
> > > behind it.  But, I don't want to abandon this idea.  I think, for me at
> > > least, that it would make sense to get Storm SQL running in Heron and take
> > > what we learned from that experience and apply it to a third-party project
> > > if there is a need/demand for it.  What do you think?
> > >
> > > -Josh
> > >
> > > On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <[email protected]> wrote:
> > >
> > >> Sounds like a great feature to have. A question I have: will it be
> > >> feasible to start a separate project to support SQL on Heron-like streaming?
> > >>
> > >> - I’m imagining that there will be a lot of code similar/same to Storm SQL.
> > >> - Only the last step of the three steps (parse sql -> logical/physical plan
> > >> -> heron topology) you mentioned is specific to Heron. The first two
> > >> steps can be shared with other heron-like streaming vendors.
> > >> - The native support for SQL inside the Heron project will give an extra
> > >> advertising/marketing bonus but with an increase in the code maintenance
> > >> cost, especially if it requires APIs that are not very popular and may be
> > >> changed over time. However, a separate project can target a specific
> > >> version of Heron.
> > >>
> > >> Best,
> > >> Yaliang
> > >>
> > >>> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <[email protected]> wrote:
> > >>>
> > >>> +1 for Heron SQL Support. Thanks Josh.
> > >>>
> > >>> On 26 February 2018 at 18:42, Karthik Ramasamy <[email protected]> wrote:
> > >>>
> > >>>> Thanks Josh for initiating this. It will be a great feature to add for
> > >>>> Heron.
> > >>>>
> > >>>> cheers
> > >>>> /karthik
> > >>>>
> > >>>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <[email protected]> wrote:
> > >>>>>
> > >>>>> Jerry,
> > >>>>>
> > >>>>> Great point.  Let's keep things simple for the migration to make sure the
> > >>>>> implementation is correct.  Then we can modify from there.
> > >>>>>
> > >>>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <[email protected]> wrote:
> > >>>>>
> > >>>>>> Thanks Josh for taking the initiative to get this started!  SQL on Heron
> > >>>>>> will be a great feature! The plan sounds great to me.  Let's first get
> > >>>>>> an initial version of the Heron SQL out and then we can worry about
> > >>>>>> custom / user defined sources and sinks.  We can even start talking
> > >>>>>> about UDFs (User defined functions) at that point!
> > >>>>>>
> > >>>>>> Best,
> > >>>>>>
> > >>>>>> Jerry
> > >>>>>>
> > >>>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <[email protected]> wrote:
> > >>>>>>> Please see this google drive link for adding comments.  I will copy and
> > >>>>>>> paste the drive doc below as well.
> > >>>>>>>
> > >>>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Proposal Below
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I am writing this document to propose changes and to start conversations
> > >>>>>>> on adding functionality similar to Storm SQL to Heron.  We would call it
> > >>>>>>> Heron SQL.  After reviewing how the code is structured in Storm I have some
> > >>>>>>> suggestions and questions relating to the implementation into the Heron
> > >>>>>>> code base.
> > >>>>>>>
> > >>>>>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
> > >>>>>>>   - We would parse the sql with calcite to create the logical and
> > >>>>>>>     physical plans
> > >>>>>>>   - We would then convert the logical and physical plans to a
> > >>>>>>>     Heron Topology
> > >>>>>>>   - We would then submit the Heron Topology into the Heron System
> > >>>>>>>
> > >>>>>>> - Some thoughts on code structure and overall functionality
> > >>>>>>>   - I think we should place the Heron SQL code base as a top level
> > >>>>>>>     directory in the repo.
> > >>>>>>>   - I will have to add the command “sql” to the Heron command line
> > >>>>>>>     code in python.
> > >>>>>>>   - As a first pass implementation users can interact with Heron SQL
> > >>>>>>>     via the following command
> > >>>>>>>     - heron sql <sql-file> <topology-name>
> > >>>>>>>   - We will also support the explain command for displaying the query
> > >>>>>>>     plan; this will not deploy the topology
> > >>>>>>>     - heron sql <sql-file> --explain
> > >>>>>>>   - After the first pass implementation is working smoothly, we can
> > >>>>>>>     then add an interactive command line interface to accept sql on
> > >>>>>>>     the fly by omitting the sql file argument
> > >>>>>>>     - heron sql <topology-name>
> > >>>>>>>   - We would support all of the existing functionality in Storm SQL
> > >>>>>>>     today with the exception of being dependent on trident.  We would
> > >>>>>>>     use Storm SQL as a way to deploy topologies into Heron, similar to
> > >>>>>>>     how you deploy topologies with the Streamlet, Topology, and ECO
> > >>>>>>>     APIs
> > >>>>>>>
> > >>>>>>> - Questions
> > >>>>>>>   - Do we see any issue with this plan to implement?
> > >>>>>>>   - I believe we would have to supply an external jar at times to
> > >>>>>>>     connect to external data sources, such as reuse of kafka libraries
> > >>>>>>>     or database drivers.  I see that Storm has a few external
> > >>>>>>>     connectors for mongo, kafka, redis and hdfs.  Do we want to limit
> > >>>>>>>     users to what we decide to build as connectors or do we want to
> > >>>>>>>     give them the ability to load external jars at submit time?  I
> > >>>>>>>     don’t think heron offers the ability to pass extra jars via the
> > >>>>>>>     “--jars” or “--artifacts” flags like Storm does today.  Would this
> > >>>>>>>     be the correct way to pull in external jars?  Does anyone have a
> > >>>>>>>     different idea?  I’m thinking that this might be a v2 feature
> > >>>>>>>     after we get Heron sql working well.  Ideas, thoughts or concerns?
> > >>>>>>>   - Is there anything I missed?
> > >>>>>>
> > >>>>
> > >>>>
> > >>
> > >>
> >
> > --
> Sent from A Mobile Device
>
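
For reference, the first-pass command line described in the quoted proposal (heron sql <sql-file> <topology-name>, plus --explain, which shows the query plan without deploying) could be sketched roughly as below. This is only an illustration using Python's standard argparse module, not Heron's actual CLI code: the names build_parser and plan_action and the returned tuples are hypothetical, and the interactive mode from the proposal is left out.

```python
import argparse


def build_parser():
    """Stand-in for the top-level `heron` parser with a hypothetical `sql` subcommand."""
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command")
    sql = subparsers.add_parser("sql", help="run a SQL file as a topology")
    sql.add_argument("sql_file", metavar="sql-file")
    # The topology name is optional at parse time because --explain does not need it.
    sql.add_argument("topology_name", metavar="topology-name", nargs="?")
    sql.add_argument("--explain", action="store_true",
                     help="show the query plan without deploying")
    return parser


def plan_action(argv):
    """Return (action, sql_file, topology_name) for a `heron sql ...` argument list."""
    args = build_parser().parse_args(argv)
    if args.command != "sql":
        raise ValueError("only the sql subcommand is sketched here")
    if args.explain:
        # Explain mode only displays the plan; nothing is submitted.
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        raise ValueError("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

With this sketch, plan_action(["sql", "query.sql", "wordcount"]) selects the submit path and plan_action(["sql", "query.sql", "--explain"]) selects the explain path; the actual parsing, planning and submission steps would hang off those two branches.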
