Hi 陈竞,

I'm doubtful there will be an explicit equivalent of the State API in SQL, at least not in the SQL portion of the DSL itself (it might make sense to expose one within UDFs). The State API is an imperative interface for accessing an underlying persistent state table, whereas SQL operates more functionally. I'm not aware of a clean way to expose the characteristics provided by the State API (logic-driven, fine- and coarse-grained reads/writes of potentially multiple fields of state, utilizing potentially multiple data types) in raw SQL.
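
To make that contrast concrete, here is a rough sketch in plain Python. It is not Beam's State API and not the SQL DSL; the event data, the function names (`imperative_with_state`, `functional_group_by`), and the `flag_over` threshold are all made up for illustration. The first function mimics the imperative style, where the logic decides when to read and write individual fields of per-key state. The second mimics what a grouping query along the lines of SELECT user, SUM(amount) ... GROUP BY user expresses declaratively.

```python
from collections import defaultdict

# Hypothetical (user, amount) events; not taken from any real pipeline.
events = [("alice", 3), ("bob", 5), ("alice", 4), ("bob", 1), ("alice", 10)]


def imperative_with_state(events, flag_over=10):
    """Imperative style: the logic reads and writes per-key state cells explicitly."""
    # Plain dict standing in for a persistent state table (an assumption, not Beam).
    state = defaultdict(lambda: {"total": 0, "flagged": False})
    for user, amount in events:
        cell = state[user]
        cell["total"] += amount          # fine-grained read-modify-write of one field
        if cell["total"] > flag_over:    # logic-driven write to a second field
            cell["flagged"] = True
    return dict(state)


def functional_group_by(events):
    """Functional style: the grouped table is derived from the input in one expression."""
    users = {user for user, _ in events}
    return {u: sum(a for k, a in events if k == u) for u in users}
```

Both end up with per-key state. The difference is that in the first the logic drives each read and write explicitly, while in the second the table falls out of the grouping itself.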
On the upside, SQL makes it very easy and natural to materialize new state tables. In the proposal I'll be sharing for how I think we should integrate streaming into SQL robustly, any time you perform some grouping operation (GROUP BY, JOIN, CUBE, etc.) you're transforming your stream into a table. That table is effectively a persistent state table. So there exists a large suite of functionality in standard SQL that gives you a lot of powerful tools for creating state. It may also be possible for the different access patterns of more complicated data structures (e.g., bags or lists) to be captured by different data types supported by the underlying systems. But I don't expect there to be an imperative State access API built into SQL itself.

All that said, I'm curious to hear ideas otherwise if anyone has them. :-)

-Tyler

On Mon, Apr 10, 2017 at 10:19 PM 陈竞 <[email protected]> wrote:

> i just want to know what the SQL State API equivalent is for SQL, since beam has already support stateful processing using state DoFn
>
> 2017-04-11 2:12 GMT+08:00 Tyler Akidau <[email protected]>:
>
> > 陈竞, what are you specifically curious about regarding state? Are you wanting to know what the SQL State API equivalent is for SQL? Or are you asking an operational question about where the state for a given SQL pipeline will live?
> >
> > -Tyler
> >
> > On Sun, Apr 9, 2017 at 12:39 PM Mingmin Xu <[email protected]> wrote:
> >
> > > Thanks @JB, will come out the initial PR soon.
> > >
> > > On Sun, Apr 9, 2017 at 12:28 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > >
> > > > As discussed, I created the DSL_SQL branch with the skeleton. Mingmin is rebasing on this branch to submit the PR.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 04/09/2017 08:02 PM, Mingmin Xu wrote:
> > > >
> > > >> State is not touched yet, welcome to add it.
> > > >>
> > > >> On Sun, Apr 9, 2017 at 2:40 AM, 陈竞 <[email protected]> wrote:
> > > >>
> > > >>> how will this sql support state both in streaming and batch mode
> > > >>>
> > > >>> 2017-04-07 4:54 GMT+08:00 Mingmin Xu <[email protected]>:
> > > >>>
> > > >>>> @Tyler, there's no big change in the previous design doc, I added some details in chapter 'Part 2. DML( [INSERT] SELECT )' , describing steps to process a query, feel free to leave a comment.
> > > >>>>
> > > >>>> Come through your doc of 'EMIT', it's awesome from my perspective. I've some tests on GroupBy with default triggers/allowed_lateness now. EMIT syntax can be added to fill the gap.
> > > >>>>
> > > >>>> On Thu, Apr 6, 2017 at 1:04 PM, Tyler Akidau <[email protected]> wrote:
> > > >>>>
> > > >>>>> I'm very excited by this development as well, thanks for continuing to push this forward, Mingmin. :-)
> > > >>>>>
> > > >>>>> I noticed you'd made some changes to your design doc <https://docs.google.com/document/d/1Uc5xYTpO9qsLXtT38OfuoqSLimH_0a1Bz5BsCROMzCU/edit>. Is it ready for another review? How reflective is it currently of the work that going into the feature branch?
> > > >>>>>
> > > >>>>> In parallel, I'd also like to continue helping push forward the definition of unified model semantics for SQL so we can get Calcite to a point where it supports the full Beam model. I added a comment <https://issues.apache.org/jira/browse/BEAM-301?focusedCommentId=15959621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15959621> on the JIRA suggesting I create a doc with a specification proposal for EMIT (and any other necessary semantic changes) that we can then iterate on in public with the Calcite folks. I already have most of the content written (and there's a significant amount of background needed to justify some aspects of the proposal), so it'll mostly be a matter of pulling it all together into something coherent. Does that sound reasonable to everyone?
> > > >>>>>
> > > >>>>> -Tyler
> > > >>>>>
> > > >>>>> On Thu, Apr 6, 2017 at 10:26 AM Kenneth Knowles <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Very cool! I'm really excited about this integration.
> > > >>>>>>
> > > >>>>>> On Thu, Apr 6, 2017 at 9:39 AM, Jean-Baptiste Onofré <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> Hi,
> > > >>>>>>>
> > > >>>>>>> Mingmin and I prepared a new branch to have the SQL DSL in dsls/sql location.
> > > >>>>>>>
> > > >>>>>>> Any help is welcome !
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Regards
> > > >>>>>>> JB
> > > >>>>>>>
> > > >>>>>>> On 04/06/2017 06:36 PM, Mingmin Xu wrote:
> > > >>>>>>>
> > > >>>>>>>> @Tarush, you're very welcome to join the effort.
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Apr 6, 2017 at 7:22 AM, tarush grover <[email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> Can I be also part of this feature development.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Tarush Grover
> > > >>>>>>>>>
> > > >>>>>>>>> On Thu, Apr 6, 2017 at 3:17 AM, Ted Yu <[email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> I compiled BEAM-301 branch with calcite 1.12 - passed.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Julian tries to not break existing things, but he will if there's a reason to do so :-)
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Wed, Apr 5, 2017 at 2:36 PM, Mingmin Xu <[email protected]> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> @Ted, thanks for the note. I intend to stick with one version, Beam 0.6.0 and Calcite 1.11 so far, unless impacted by API change. Before it's merged back to master, will upgrade to the latest version.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Apr 5, 2017 at 2:14 PM, Ted Yu <[email protected]> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Working in feature branch is good - you may want to periodically sync up with master.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I noticed that you are using 1.11.0 of calcite.
> > > >>>>>>>>>>>> 1.12 is out, FYI
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Wed, Apr 5, 2017 at 2:05 PM, Mingmin Xu <[email protected]> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I'm working on https://issues.apache.org/jira/browse/BEAM-301 (Add a Beam SQL DSL). The skeleton is already in https://github.com/XuMingmin/beam/tree/BEAM-301, using Java SDK in the back-end. The goal is to provide a SQL interface over Beam, based on Calcite, including:
> > > >>>>>>>>>>>>> 1). a translator to create Beam pipeline from SQL, (SELECT/INSERT/FILTER/GROUP-BY/JOIN/...);
> > > >>>>>>>>>>>>> 2). an interactive client to submit queries; (All-SQL mode)
> > > >>>>>>>>>>>>> 3). a SQL API which reduce the work to create a Pipeline; (Semi-SQL mode)
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> As we see many folks are interested in this feature, would like to create a feature branch to have more involvement.
> > > >>>>>>>>>>>>> Looking for comments and feedback.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks!
> > > >>>>>>>>>>>>> ----
> > > >>>>>>>>>>>>> Mingmin
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> --
> > > >>>>>>>>>>> ----
> > > >>>>>>>>>>> Mingmin
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>> Jean-Baptiste Onofré
> > > >>>>>>> [email protected]
> > > >>>>>>> http://blog.nanthrax.net
> > > >>>>>>> Talend - http://www.talend.com
> > > >>>>
> > > >>>> --
> > > >>>> ----
> > > >>>> Mingmin
> > > >>>
> > > >>> --
> > > >>> 陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center
> > > >>> Jing Chen HPCC.ICT.AC China
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > [email protected]
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > >
> > > --
> > > ----
> > > Mingmin
>
> --
> 陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center
> Jing Chen HPCC.ICT.AC China
