Thanks, Josh, for initiating this. It will be a great feature to add to Heron.
cheers
/karthik

> On Feb 26, 2018, at 11:11 AM, Josh Fischer <j...@joshfischer.io> wrote:
>
> Jerry,
>
> Great point. Let's keep things simple for the migration to make sure the
> implementation is correct. Then we can modify from there.
>
> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <jerry.boyang.p...@gmail.com> wrote:
>
>> Thanks Josh for taking the initiative to get this started! SQL on Heron
>> will be a great feature! The plan sounds great to me. Let's first get
>> an initial version of Heron SQL out and then we can worry about
>> custom / user defined sources and sinks. We can even start talking
>> about UDFs (user defined functions) at that point!
>>
>> Best,
>>
>> Jerry
>>
>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <j...@joshfischer.io> wrote:
>>> Please see this Google Drive link for adding comments. I will copy and
>>> paste the drive doc below as well.
>>>
>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>>>
>>> Proposal Below
>>>
>>> I am writing this document to propose changes and to start conversations
>>> on adding functionality similar to Storm SQL to Heron. We would call it
>>> Heron SQL. After reviewing how the code is structured in Storm I have some
>>> suggestions and questions relating to the implementation in the Heron
>>> code base.
>>>
>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
>>>   - We would parse the SQL with Calcite to create the logical and
>>>     physical plans
>>>   - We would then convert the logical and physical plans to a Heron
>>>     Topology
>>>   - We would then submit the Heron Topology into the Heron System
>>>
>>> - Some thoughts on code structure and overall functionality
>>>   - I think we should place the Heron SQL code base as a top level
>>>     directory in the repo.
>>>   - I will have to add the command “sql” to the Heron command line code
>>>     in Python.
>>>   - As a first pass implementation users can interact with Heron SQL via
>>>     the following command
>>>     - heron sql <sql-file> <topology-name>
>>>   - We will also support the explain command for displaying the query
>>>     plan; this will not deploy the topology
>>>     - heron sql <sql-file> --explain
>>>   - After the first pass implementation is working smoothly, we can then
>>>     add an interactive command line interface to accept SQL on the fly by
>>>     omitting the sql file argument
>>>     - heron sql <topology-name>
>>>   - We would support all of the existing functionality in Storm SQL today
>>>     with the exception of being dependent on Trident. We would use Heron
>>>     SQL as a way to deploy topologies into Heron, similar to how you
>>>     deploy topologies with the Streamlet, Topology, and ECO APIs
>>>
>>> - Questions
>>>   - Do we see any issue with this plan to implement?
>>>   - I believe we would have to supply an external jar at times to connect
>>>     to external data sources, such as reuse of Kafka libraries or
>>>     database drivers. I see that Storm has a few external connectors for
>>>     Mongo, Kafka, Redis and HDFS. Do we want to limit users to what we
>>>     decide to build as connectors or do we want to give them the ability
>>>     to load external jars at submit time? I don’t think Heron offers the
>>>     ability to pass extra jars via the “--jars” or “--artifacts” flags
>>>     like Storm does today. Would this be the correct way to pull in
>>>     external jars? Does anyone have a different idea? I’m thinking that
>>>     this might be a v2 feature after we get Heron SQL working well.
>>>     Ideas, thoughts or concerns?
>>>   - Is there anything I missed?
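
For the command-line portion of the proposal, here is a minimal Python sketch of how the two first-pass invocations (`heron sql <sql-file> <topology-name>` and `heron sql <sql-file> --explain`) could be parsed. It uses only the standard-library `argparse` module and is not taken from Heron's actual CLI code. The names `build_parser` and `plan_action`, and the tuples they return, are hypothetical and exist only for illustration. The later interactive mode is left out.

```python
import argparse


def build_parser():
    """Build a stand-in parser for the proposed `heron sql` subcommand.

    Hypothetical sketch: this is not how the real Heron CLI is wired up.
    """
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command")

    sql = subparsers.add_parser("sql", help="run a SQL file as a topology")
    sql.add_argument("sql_file", metavar="sql-file")
    # Optional so that `--explain` can be used without a topology name.
    sql.add_argument("topology_name", metavar="topology-name", nargs="?")
    sql.add_argument(
        "--explain",
        action="store_true",
        help="print the query plan without deploying a topology",
    )
    return parser


def plan_action(argv):
    """Map command-line arguments to an (action, sql_file, topology) tuple."""
    args = build_parser().parse_args(argv)
    if args.command != "sql":
        raise ValueError("only the 'sql' subcommand is sketched here")
    if args.explain:
        # Explain never deploys, so no topology name is needed.
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        raise ValueError("a topology name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

Calling `plan_action(["sql", "query.sql", "my-topology"])` yields a submit action, while `plan_action(["sql", "query.sql", "--explain"])` yields an explain action with no topology name. Keeping the parsing separate from the plan generation and submission steps would let the explain path be exercised without touching a cluster.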