Re: Support SqlStreaming in spark

2018-12-21 Thread JackyLee
Hi wenchen, I have been working on SQLStreaming for a year, and I have promoted it within my company. I have seen the designs for Kafka and Calcite, and I believe my design is better than theirs. They support pure SQL but not a table API for streaming. Users can only use the specified Streaming

Re: Support SqlStreaming in spark

2018-12-21 Thread JackyLee
Hi wenchen and Arun Mahadevan, thanks for your replies. SQLStreaming is not just a way to support pure SQL, but also a way to define a table API for streaming. I have redefined SQLStreaming to make it support the table API. Users can use SQL or the table API to run SQLStreaming. I will
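The point being made in this thread is that one streaming query should be expressible both as pure SQL and through a table API, with both front ends lowering to the same logical plan. A minimal, purely illustrative Python sketch of that equivalence (none of these names are Spark's actual API):

```python
# Hypothetical sketch: a SQL string and a fluent table API producing the same
# logical plan. A toy model of the SQL-vs-table-API equivalence, not Spark code.
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    source: str
    projection: tuple

def parse_sql(sql: str) -> Plan:
    # Toy parser handling only "SELECT a, b FROM t", enough to illustrate.
    head, source = sql.split(" FROM ")
    cols = tuple(c.strip() for c in head[len("SELECT "):].split(","))
    return Plan(source.strip(), cols)

class Table:
    def __init__(self, source: str):
        self._plan = Plan(source, ())
    def select(self, *cols: str) -> "Table":
        self._plan = Plan(self._plan.source, cols)
        return self
    def plan(self) -> Plan:
        return self._plan

sql_plan = parse_sql("SELECT user, count FROM events")
api_plan = Table("events").select("user", "count").plan()
assert sql_plan == api_plan  # both front ends yield the same logical plan
```

The design choice under discussion is exactly this: the SQL parser and the table API are two surfaces over one shared plan representation, so streaming semantics are defined once.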

Re: Support SqlStreaming in spark

2018-12-21 Thread Arun Mahadevan
There have been efforts to come up with a unified syntax for streaming (see [1] [2]), but I guess there will be differences based on the streaming features supported by a system. Agree it needs a detailed design, and it can be as close to the Spark batch SQL syntax as possible. Also I am not sure

Re: Support SqlStreaming in spark

2018-12-21 Thread Wenchen Fan
It will be great to add pure-SQL support to structured streaming. I think it goes without saying how important SQL support is, but we should make a complete design first. Looking at the Kafka streaming syntax, it has

Re: [DISCUSS] Default values and data sources

2018-12-21 Thread Ryan Blue
I agree with Reynold's sentiment here. We don't want to create too many capabilities because it makes everything more complicated for both sources and Spark. Let's just go with the capability to read missing columns for now and we can add support for default values if and when Spark DDL begins to

Re: [DISCUSS] Default values and data sources

2018-12-21 Thread Ryan Blue
Alessandro, yes. This was one of the use cases that motivated the capability API I proposed. After this discussion, I think we probably need a couple of capabilities. First, a capability that indicates reads will fill in some default value for missing columns. That way, Spark allows writes to
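The capability Ryan describes gates write-side validation: Spark would only permit a write that omits some table columns when the source declares it will fill those columns with defaults on read. A hedged sketch of that check (names are illustrative, not Spark's actual DataSourceV2 API):

```python
# Hypothetical capability check: a write omitting columns is rejected unless
# the source advertises that reads fill missing columns with defaults.
FILL_MISSING_COLUMNS_WITH_DEFAULT = "fill-missing-columns-with-default"

class Source:
    def __init__(self, schema, capabilities=frozenset()):
        self.schema = set(schema)
        self.capabilities = frozenset(capabilities)

def validate_write(source: Source, write_columns) -> None:
    missing = source.schema - set(write_columns)
    if missing and FILL_MISSING_COLUMNS_WITH_DEFAULT not in source.capabilities:
        raise ValueError(
            f"write omits columns {sorted(missing)} and the source "
            "cannot fill defaults on read")

lenient = Source({"id", "name", "flag"}, {FILL_MISSING_COLUMNS_WITH_DEFAULT})
strict = Source({"id", "name", "flag"})

validate_write(lenient, ["id", "name"])   # allowed: source fills "flag" on read
try:
    validate_write(strict, ["id", "name"])  # rejected: no such capability
except ValueError as e:
    print("rejected:", e)
```

This keeps the engine-side logic simple, as Reynold and Ryan argue: Spark only needs to ask whether the capability is present, and all default-value policy stays inside the source.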

Re: [DISCUSS] Default values and data sources

2018-12-21 Thread Reynold Xin
I'd only do any of the schema evolution things as an add-on on top. This is an extremely complicated area, and we could risk never shipping anything because there would be a lot of different requirements. On Fri, Dec 21, 2018 at 9:46 AM, Russell Spitzer <russell.spit...@gmail.com> wrote: > > I

Re: [DISCUSS] Default values and data sources

2018-12-21 Thread Russell Spitzer
I definitely would like to have a "column can be missing" capability, allowing the underlying data source to fill in a default if it wants to (or not). On Fri, Dec 21, 2018 at 1:40 AM Alessandro Solimando <alessandro.solima...@gmail.com> wrote: > Hello, > I agree that Spark should check whether