It's more about how the State API could be introduced in SQL. A snapshot of state converts a stream to a table, which is very helpful. The SQL keyword INSERT INTO may be one way to do that, but I'm not confident about it so far.
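
To make the stream-to-table idea concrete, here is a minimal sketch in plain Python. It is not Beam or Calcite code; `group_stream` and the sample click events are made up purely for illustration. It folds a stream of (key, value) events into a table by grouping on the key, and each yielded snapshot is the kind of table state that something like INSERT INTO might persist, assuming that keyword turns out to be the right hook.

```python
from collections import defaultdict


def group_stream(events):
    """Fold a stream of (key, value) events into a table keyed by `key`.

    After every event, yield a snapshot (an independent dict copy) of the
    table as it stands at that point in the stream.
    """
    table = defaultdict(int)  # hypothetical state table: key -> running sum
    for key, value in events:
        table[key] += value  # grouping step: update the row for this key
        yield dict(table)    # snapshot of the table after this event


# Hypothetical input stream: one event per click.
clicks = [("alice", 1), ("bob", 1), ("alice", 1)]

snapshots = list(group_stream(clicks))
final_table = snapshots[-1]
print(final_table)  # {'alice': 2, 'bob': 1}
```

The table only exists because of the grouping; without a key to group on there would be nothing to snapshot.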

On Fri, Apr 14, 2017 at 3:03 PM, Tyler Akidau <taki...@apache.org> wrote:
> Tarush: I don't think it depends upon the time frame (although you may be
> interested in only a specific timeframe materialized within the table).
> Stream to table conversion is purely a byproduct of grouping a stream. I
> have a doc I'm getting some initial reviews on currently that I hope to
> send out next week to hopefully give some more background here. And
> windowing is really just an additional dimension in grouping. An important
> one, to be sure, but still just grouping.
>
> Mingmin: can you expand upon those statements? I'm not sure I fully
> understand what you're saying.
>
> -Tyler
>
> On Wed, Apr 12, 2017 at 9:38 PM Mingmin Xu <mingm...@gmail.com> wrote:
>
> > Expose streaming snapshot via STATE is attractive in Beam model, but
> > doubt it's the right way in SQL. IMO,there's 'INSERT INTO' to persistent
> > streaming output.
> >
> > On Wed, Apr 12, 2017 at 8:37 PM, tarush grover <tarushappt...@gmail.com>
> > wrote:
> >
> > > Hi Tyler,
> > >
> > > Transforming stream into a table will also depend on the time frame in
> > > the stream or what windows we choose for the stream.
> > >
> > > Regards,
> > > Tarush
> > >
> > > On Tue, 11 Apr 2017 at 11:29 PM, Tyler Akidau
> > > <taki...@google.com.invalid> wrote:
> > >
> > > > Hi 陈竞,
> > > >
> > > > I'm doubtful there will be an explicit equivalent of the State API in
> > > > SQL, at least not in the SQL portion of the DSL itself (it might make
> > > > sense to expose one within UDFs). The State API is an imperative
> > > > interface for accessing an underlying persistent state table, whereas
> > > > SQL operates more functionally. There's no good way I'm aware of to
> > > > expose the characteristics provided by the State API (logic-driven,
> > > > fine- and coarse-grained reads/writes of potentially multiple fields
> > > > of state utilizing potentially multiple data types) in raw SQL
> > > > cleanly.
> > > >
> > > > On the upside, SQL has the advantage of making it very easy to
> > > > materialize new state tables very naturally. In the proposal I'll be
> > > > sharing for how I think we should integrate streaming into SQL
> > > > robustly, any time you perform some grouping operation (GROUP BY,
> > > > JOIN, CUBE, etc) you're transforming your stream into a table. That
> > > > table is effectively a persistent state table. So there exists a
> > > > large suite of functionality in standard SQL that gives you a lot of
> > > > powerful tools for creating state.
> > > >
> > > > It may also be possible for the different access patterns of more
> > > > complicated data structures (e.g., bags or lists) to be captured by
> > > > different data types supported by the underlying systems. But I don't
> > > > expect there to be an imperative State access API built into SQL
> > > > itself.
> > > >
> > > > All that said, I'm curious to hear ideas otherwise if anyone has
> > > > them. :-)
> > > >
> > > > -Tyler
> > > >
> > > > On Mon, Apr 10, 2017 at 10:19 PM 陈竞 <cj.mag...@gmail.com> wrote:
> > > >
> > > > > i just want to know what the SQL State API equivalent is for SQL,
> > > > > since beam has already support stateful processing using state DoFn
> > > > >
> > > > > 2017-04-11 2:12 GMT+08:00 Tyler Akidau <taki...@google.com.invalid>:
> > > > >
> > > > > > 陈竞, what are you specifically curious about regarding state? Are
> > > > > > you wanting to know what the SQL State API equivalent is for SQL?
> > > > > > Or are you asking an operational question about where the state
> > > > > > for a given SQL pipeline will live?
> > > > > >
> > > > > > -Tyler
> > > > > >
> > > > > > On Sun, Apr 9, 2017 at 12:39 PM Mingmin Xu <mingm...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks @JB, will come out the initial PR soon.
> > > > > > >
> > > > > > > On Sun, Apr 9, 2017 at 12:28 PM, Jean-Baptiste Onofré
> > > > > > > <j...@nanthrax.net> wrote:
> > > > > > >
> > > > > > > > As discussed, I created the DSL_SQL branch with the skeleton.
> > > > > > > > Mingmin is rebasing on this branch to submit the PR.
> > > > > > > >
> > > > > > > > Regards
> > > > > > > > JB
> > > > > > > >
> > > > > > > > On 04/09/2017 08:02 PM, Mingmin Xu wrote:
> > > > > > > >
> > > > > > > >> State is not touched yet, welcome to add it.
> > > > > > > >>
> > > > > > > >> On Sun, Apr 9, 2017 at 2:40 AM, 陈竞 <cj.mag...@gmail.com> wrote:
> > > > > > > >>
> > > > > > > >>> how will this sql support state both in streaming and batch
> > > > > > > >>> mode
> > > > > > > >>>
> > > > > > > >>> 2017-04-07 4:54 GMT+08:00 Mingmin Xu <mingm...@gmail.com>:
> > > > > > > >>>
> > > > > > > >>>> @Tyler, there's no big change in the previous design doc, I
> > > > > > > >>>> added some details in chapter 'Part 2. DML( [INSERT] SELECT )' ,
> > > > > > > >>>> describing steps to process a query, feel free to leave a
> > > > > > > >>>> comment.
> > > > > > > >>>>
> > > > > > > >>>> Come through your doc of 'EMIT', it's awesome from my
> > > > > > > >>>> perspective. I've some tests on GroupBy with default
> > > > > > > >>>> triggers/allowed_lateness now. EMIT syntax can be added to
> > > > > > > >>>> fill the gap.
> > > > > > > >>>>
> > > > > > > >>>> On Thu, Apr 6, 2017 at 1:04 PM, Tyler Akidau
> > > > > > > >>>> <taki...@apache.org> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> I'm very excited by this development as well, thanks for
> > > > > > > >>>>> continuing to push this forward, Mingmin. :-)
> > > > > > > >>>>>
> > > > > > > >>>>> I noticed you'd made some changes to your design doc
> > > > > > > >>>>> <https://docs.google.com/document/d/1Uc5xYTpO9qsLXtT38OfuoqSLimH_0a1Bz5BsCROMzCU/edit>.
> > > > > > > >>>>> Is it ready for another review? How reflective is it
> > > > > > > >>>>> currently of the work that going into the feature branch?
> > > > > > > >>>>>
> > > > > > > >>>>> In parallel, I'd also like to continue helping push forward
> > > > > > > >>>>> the definition of unified model semantics for SQL so we can
> > > > > > > >>>>> get Calcite to a point where it supports the full Beam
> > > > > > > >>>>> model. I added a comment
> > > > > > > >>>>> <https://issues.apache.org/jira/browse/BEAM-301?focusedCommentId=15959621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15959621>
> > > > > > > >>>>> on the JIRA suggesting I create a doc with a specification
> > > > > > > >>>>> proposal for EMIT (and any other necessary semantic changes)
> > > > > > > >>>>> that we can then iterate on in public with the Calcite
> > > > > > > >>>>> folks. I already have most of the content written (and
> > > > > > > >>>>> there's a significant amount of background needed to justify
> > > > > > > >>>>> some aspects of the proposal), so it'll mostly be a matter
> > > > > > > >>>>> of pulling it all together into something coherent. Does
> > > > > > > >>>>> that sound reasonable to everyone?
> > > > > > > >>>>>
> > > > > > > >>>>> -Tyler
> > > > > > > >>>>>
> > > > > > > >>>>> On Thu, Apr 6, 2017 at 10:26 AM Kenneth Knowles
> > > > > > > >>>>> <k...@google.com.invalid> wrote:
> > > > > > > >>>>>
> > > > > > > >>>>>> Very cool! I'm really excited about this integration.
> > > > > > > >>>>>>
> > > > > > > >>>>>> On Thu, Apr 6, 2017 at 9:39 AM, Jean-Baptiste Onofré
> > > > > > > >>>>>> <j...@nanthrax.net> wrote:
> > > > > > > >>>>>>
> > > > > > > >>>>>>> Hi,
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Mingmin and I prepared a new branch to have the SQL DSL in
> > > > > > > >>>>>>> dsls/sql location.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Any help is welcome !
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Thanks,
> > > > > > > >>>>>>> Regards
> > > > > > > >>>>>>> JB
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> On 04/06/2017 06:36 PM, Mingmin Xu wrote:
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>> @Tarush, you're very welcome to join the effort.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On Thu, Apr 6, 2017 at 7:22 AM, tarush grover
> > > > > > > >>>>>>>> <tarushappt...@gmail.com> wrote:
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>>> Hi,
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> Can I be also part of this feature development.
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> Regards,
> > > > > > > >>>>>>>>> Tarush Grover
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Thu, Apr 6, 2017 at 3:17 AM, Ted Yu
> > > > > > > >>>>>>>>> <yuzhih...@gmail.com> wrote:
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>>> I compiled BEAM-301 branch with calcite 1.12 - passed.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Julian tries to not break existing things, but he will
> > > > > > > >>>>>>>>>> if there's a reason to do so :-)
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> On Wed, Apr 5, 2017 at 2:36 PM, Mingmin Xu
> > > > > > > >>>>>>>>>> <mingm...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> @Ted, thanks for the note. I intend to stick with one
> > > > > > > >>>>>>>>>>> version, Beam 0.6.0 and Calcite 1.11 so far, unless
> > > > > > > >>>>>>>>>>> impacted by API change. Before it's merged back to
> > > > > > > >>>>>>>>>>> master, will upgrade to the latest version.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> On Wed, Apr 5, 2017 at 2:14 PM, Ted Yu
> > > > > > > >>>>>>>>>>> <yuzhih...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Working in feature branch is good - you may want to
> > > > > > > >>>>>>>>>>>> periodically sync up with master.
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I noticed that you are using 1.11.0 of calcite.
> > > > > > > >>>>>>>>>>>> 1.12 is out, FYI
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> On Wed, Apr 5, 2017 at 2:05 PM, Mingmin Xu
> > > > > > > >>>>>>>>>>>> <mingm...@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Hi all,
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> I'm working on
> > > > > > > >>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-301
> > > > > > > >>>>>>>>>>>>> (Add a Beam SQL DSL). The skeleton is already in
> > > > > > > >>>>>>>>>>>>> https://github.com/XuMingmin/beam/tree/BEAM-301,
> > > > > > > >>>>>>>>>>>>> using Java SDK in the back-end. The goal is to
> > > > > > > >>>>>>>>>>>>> provide a SQL interface over Beam, based on Calcite,
> > > > > > > >>>>>>>>>>>>> including:
> > > > > > > >>>>>>>>>>>>> 1). a translator to create Beam pipeline from SQL,
> > > > > > > >>>>>>>>>>>>> (SELECT/INSERT/FILTER/GROUP-BY/JOIN/...);
> > > > > > > >>>>>>>>>>>>> 2). an interactive client to submit queries;
> > > > > > > >>>>>>>>>>>>> (All-SQL mode)
> > > > > > > >>>>>>>>>>>>> 3). a SQL API which reduce the work to create a
> > > > > > > >>>>>>>>>>>>> Pipeline; (Semi-SQL mode)
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> As we see many folks are interested in this feature,
> > > > > > > >>>>>>>>>>>>> would like to create a feature branch to have more
> > > > > > > >>>>>>>>>>>>> involvement.
> > > > > > > >>>>>>>>>>>>> Looking for comments and feedback.
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Thanks!
> > > > > > > >>>>>>>>>>>>> ----
> > > > > > > >>>>>>>>>>>>> Mingmin
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> --
> > > > > > > >>>>>>>>>>> ----
> > > > > > > >>>>>>>>>>> Mingmin
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> --
> > > > > > > >>>>>>> Jean-Baptiste Onofré
> > > > > > > >>>>>>> jbono...@apache.org
> > > > > > > >>>>>>> http://blog.nanthrax.net
> > > > > > > >>>>>>> Talend - http://www.talend.com
> > > > > > > >>>>
> > > > > > > >>>> --
> > > > > > > >>>> ----
> > > > > > > >>>> Mingmin
> > > > > > > >>>
> > > > > > > >>> --
> > > > > > > >>> 陈竞,中科院计算技术研究所,高性能计算机中心
> > > > > > > >>> Jing Chen HPCC.ICT.AC China
> > > > > > > >
> > > > > > > > --
> > > > > > > > Jean-Baptiste Onofré
> > > > > > > > jbono...@apache.org
> > > > > > > > http://blog.nanthrax.net
> > > > > > > > Talend - http://www.talend.com
> > > > > > >
> > > > > > > --
> > > > > > > ----
> > > > > > > Mingmin
> > > > >
> > > > > --
> > > > > 陈竞,中科院计算技术研究所,高性能计算机中心
> > > > > Jing Chen HPCC.ICT.AC China
> >
> > --
> > ----
> > Mingmin

--
----
Mingmin