Re: Proposing Changes To Heron

Karthik Ramasamy Mon, 26 Feb 2018 10:42:35 -0800

Thanks Josh for initiating this. It will be a great feature to add to Heron.


cheers
/karthik

> On Feb 26, 2018, at 11:11 AM, Josh Fischer <[email protected]> wrote:
> 
> Jerry,
> 
> Great point.  Let's keep things simple for the migration to make sure the
> implementation is correct.  Then we can modify from there.
> 
> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <[email protected]>
> wrote:
> 
>> Thanks Josh for taking the initiative to get this started!  SQL on Heron
>> will be a great feature! The plan sounds great to me.  Let's first get
>> an initial version of the Heron SQL out and then we can worry about
>> custom / user defined sources and sinks.  We can even start talking
>> about UDFs (User defined functions) at that point!
>> 
>> Best,
>> 
>> Jerry
>> 
>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <[email protected]> wrote:
>>> Please see this google drive link for adding comments.  I will copy and
>>> paste the drive doc below as well.
>>> 
>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>>> 
>>> 
>>> Proposal Below
>>> 
>>> I am writing this document to propose changes and to start conversations
>>> on adding functionality similar to Storm SQL to Heron.  We would call it
>>> Heron SQL.  After reviewing how the code is structured in Storm I have
>>> some suggestions and questions relating to the implementation in the
>>> Heron code base.
>>>
>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
>>>   - We would parse the SQL with Calcite to create the logical and
>>>     physical plans
>>>   - We would then convert the logical and physical plans to a Heron
>>>     Topology
>>>   - We would then submit the Heron Topology into the Heron System
>>>
>>> - Some thoughts on code structure and overall functionality
>>>   - I think we should place the Heron SQL code base as a top level
>>>     directory in the repo.
>>>   - I will have to add the command “sql” to the Heron command line code
>>>     in Python.
>>>   - As a first pass implementation users can interact with Heron SQL
>>>     via the following command
>>>     - heron sql <sql-file> <topology-name>
>>>   - We will also support the explain command for displaying the query
>>>     plan; this will not deploy the topology
>>>     - heron sql <sql-file> --explain
>>>   - After the first pass implementation is working smoothly, we can
>>>     then add an interactive command line interface to accept SQL on the
>>>     fly by omitting the SQL file argument
>>>     - heron sql <topology-name>
>>>   - We would support all of the existing functionality in Storm SQL
>>>     today with the exception of being dependent on Trident.  We would
>>>     use Storm SQL as a way to deploy topologies into Heron, similar to
>>>     how you deploy topologies with the Streamlet, Topology, and ECO
>>>     APIs.
>>>
>>> - Questions
>>>   - Do we see any issue with this plan to implement?
>>>   - I believe we would have to supply an external jar at times to
>>>     connect to external data sources, such as reuse of Kafka libraries
>>>     or database drivers.  I see that Storm has a few external
>>>     connectors for Mongo, Kafka, Redis and HDFS.  Do we want to limit
>>>     users to what we decide to build as connectors or do we want to
>>>     give them the ability to load external jars at submit time?  I
>>>     don’t think Heron offers the ability to pass extra jars via the
>>>     “--jars” or “--artifacts” flags like Storm does today.  Would this
>>>     be the correct way to pull in external jars?  Does anyone have a
>>>     different idea?  I’m thinking that this might be a v2 feature after
>>>     we get Heron SQL working well.  Ideas, thoughts or concerns?
>>>   - Is there anything I missed?
>>
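The first-pass command line described in the proposal could be sketched
roughly as below.  This is only an illustration using Python's standard
argparse module, not the actual Heron CLI code: the function names
build_parser and plan_action, the argument names sql_file and
topology_name, and the "submit" / "explain" return values are all
assumptions made for the example.  The interactive mode (omitting the SQL
file) is left out, since the proposal defers it until after the first pass.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for the proposed `heron sql` command."""
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command")

    sql = subparsers.add_parser("sql", help="run a SQL file as a topology")
    # Path to the file containing the SQL statements.
    sql.add_argument("sql_file")
    # Optional at parse time because --explain does not deploy anything.
    sql.add_argument("topology_name", nargs="?", default=None)
    # Show the query plan instead of deploying the topology.
    sql.add_argument("--explain", action="store_true")
    return parser


def plan_action(args):
    """Decide what the command should do; returns 'explain' or 'submit'."""
    if args.explain:
        return "explain"
    if args.topology_name is None:
        raise ValueError("a topology name is required unless --explain is given")
    return "submit"


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print(plan_action(parsed))
```

With this sketch, "heron sql query.sql my-topology" would resolve to the
submit path and "heron sql query.sql --explain" to the explain path; the
Calcite planning and topology submission themselves are not shown.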
