[PROPOSAL] design of DSL SQL interface

2017-05-12 Thread Mingmin Xu
Hi all, As you may know, we're working on BeamSQL to execute SQL queries as a Beam pipeline. This is a valuable feature, shipped not only as a packaged CLI but also as part of the SDK for assembling a pipeline. I have prepared a document [1] that lists the high-level APIs, to show how SQL queries can be

Re: First stable release: release candidate #3

2017-05-12 Thread Davor Bonaci
This RC is now obsolete -- see RC #4 information with a formal vote in a separate thread [1]. Davor [1] https://lists.apache.org/thread.html/8cf5d7583111f7d67bd21918eb35a29823572ad9ebc1d6df9656bc79@%3Cdev.beam.apache.org%3E On Fri, May 12, 2017 at 12:57 AM, Davor Bonaci

[VOTE] First stable release: release candidate #4

2017-05-12 Thread Davor Bonaci
Hi everyone -- After going through several release candidates, setting and validating acceptance criteria, running a hackathon, and polishing the release, now is the time to vote! Please review and vote on the release candidate #4 for the version 2.0.0, as follows: [ ] +1, Approve the release [ ]

Re: Towards a spec for robust streaming SQL, Part 1

2017-05-12 Thread Shaoxuan Wang
Tyler, Yes, a dynamic table changes over time. You can find more details about dynamic tables in this Flink blog post (https://flink.apache.org/news/2017/04/04/dynamic-tables.html). Fabian, Xiaowei, and I posted it a week before the flink-forward@SF. "A dynamic table is a table that is continuously

Re: First stable release: Acceptance criteria

2017-05-12 Thread Chamikara Jayalath
Validated the same examples Vikas mentioned for Windows. Updated the doc on acceptance criteria. Thanks, Cham On Fri, May 12, 2017 at 4:43 PM Vikas RK wrote: > Just validated wordcount and mobile gaming examples for Python SDK on > Direct and Dataflow runner. Mostly looks good

Re: First stable release: Acceptance criteria

2017-05-12 Thread Vikas RK
Just validated the wordcount and mobile gaming examples for the Python SDK on the Direct and Dataflow runners. Mostly looks good to me, with minor changes that could be made to improve the user experience, but none is a blocker for FSR (filed BEAM-2286). On 12 May

Re: [PROPOSAL] Apache Hive connector

2017-05-12 Thread Eugene Kirpichov
Hi! Why do you need to override methods like computeSplitsIfNecessary at all? Is HCatalogIO so substantially different from other HadoopInputFormats that it cannot be handled by the generic code of HadoopInputFormatIO? I looked at the implementation in your commit and it seems identical, except

Re: [PROPOSAL] Apache Hive connector

2017-05-12 Thread Seshadri Raghunathan
Hi Eugene, In order to reuse HadoopInputFormatIO, this is what I am thinking: 1. Extend HadoopInputFormatBoundedSource to create HCatalogBoundedSource. 2. Override the necessary methods in HCatalogBoundedSource to perform HCatalog-specific steps (overriding computeSplitsIfNecessary() method
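
The extend-and-override approach described in this message can be illustrated with a minimal sketch. It is written in Python for brevity and does not use the Beam API: `GenericBoundedSource` and `CatalogBoundedSource` are hypothetical stand-ins for HadoopInputFormatBoundedSource and HCatalogBoundedSource, and the split naming and the `num_splits` / `table` configuration keys are assumptions made only for illustration.

```python
class GenericBoundedSource:
    """Hypothetical stand-in for a generic input-format source."""

    def __init__(self, conf):
        self.conf = dict(conf)
        self._splits = None  # computed lazily, then cached

    def compute_splits_if_necessary(self):
        # Generic behaviour: build the split list once and reuse it.
        if self._splits is None:
            n = int(self.conf.get("num_splits", 1))
            self._splits = [f"split-{i}" for i in range(n)]
        return self._splits


class CatalogBoundedSource(GenericBoundedSource):
    """Hypothetical stand-in for a catalog-specific subclass."""

    def compute_splits_if_necessary(self):
        # Catalog-specific step (assumed): qualify each generic split
        # with the table name, reusing the parent's split computation.
        if self._splits is None:
            table = self.conf["table"]
            generic = super().compute_splits_if_necessary()
            self._splits = [f"{table}/{s}" for s in generic]
        return self._splits
```

Only the overridden method differs between the two classes, which is the situation the preceding reply in this thread asks about: if the override adds nothing source-specific, the generic class may be enough.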

Re: Towards a spec for robust streaming SQL, Part 1

2017-05-12 Thread Tyler Akidau
Being able to support an EMIT config independent of the query itself sounds great for compatible use cases (which should be many :-). Shaoxuan, can you please refresh my memory on what a dynamic table means in Flink? It's basically just a state table, right? The "dynamic" part of the name is to

Re: [DISCUSSION] using NexMark for Beam

2017-05-12 Thread Lukasz Cwik
I think these are valuable enough that we should get them into apache/master On Fri, May 12, 2017 at 4:34 AM, Jean-Baptiste Onofré wrote: > Hi, > > PR or even a feature branch could work. Up to you. > > Regards > JB > > > On 05/12/2017 10:55 AM, Etienne Chauchot wrote: > >>

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Reuven Lax
Can we simply fail WWW if windowed writes is not set? Or at least warn? On May 12, 2017 6:14 PM, "Dan Halperin" wrote: DefaultFilenamePolicy as currently written only accepts a single shard name template. If that template is windowed, it won't work for unwindowed

Re: Pull request - power function

2017-05-12 Thread Mingmin Xu
Thanks @Tarush, will also take a look. On Fri, May 12, 2017 at 7:19 AM, Jean-Baptiste Onofré wrote: > Thanks, > > we gonna take a look. > > Regards > JB > > > On 05/12/2017 04:12 PM, tarush grover wrote: > >> Hi Team, >> >> I have opened a pull request Beam-2171 power

Re: Pull request - power function

2017-05-12 Thread Jean-Baptiste Onofré
Thanks, we're going to take a look. Regards JB On 05/12/2017 04:12 PM, tarush grover wrote: Hi Team, I have opened a pull request Beam-2171 power function #3092. Kindly review and verify. Regards, Tarush -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend -

Pull request - power function

2017-05-12 Thread tarush grover
Hi Team, I have opened a pull request Beam-2171 power function #3092. Kindly review and verify. Regards, Tarush

Re: First stable release: Acceptance criteria

2017-05-12 Thread Jean-Baptiste Onofré
Enjoy Kenn ! Regards JB On 05/12/2017 03:38 PM, Kenneth Knowles wrote: I wanted to let this thread & list know that I'll be offline camping starting ~now through the weekend. So if you see any open thread with me on it, please do take it over. Things are looking pretty good, so I might not see

Re: First stable release: Acceptance criteria

2017-05-12 Thread Kenneth Knowles
I wanted to let this thread & list know that I'll be offline camping starting ~now through the weekend. So if you see any open thread with me on it, please do take it over. Things are looking pretty good, so I might not see any more action. I have a lot of faith in this community to validate the

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Reuven Lax
DefaultFilenamePolicy already contains a windowedFilename override (today it throws an exception), so I don't think there's any need for a new class. We can simply fill out the existing method. On May 12, 2017 11:34 AM, "Borisa Zivkovic" wrote: +1 for

Re: Towards a spec for robust streaming SQL, Part 1

2017-05-12 Thread Shaoxuan Wang
Thanks to Tyler and Fabian for sharing your thoughts. Regarding the early/late update control in Flink: IMO, each dynamic table can have an EMIT config. For the Flink Table API, this can easily be implemented in different ways, case by case. For instance, in window aggregate, we could define

Re: [DISCUSSION] using NexMark for Beam

2017-05-12 Thread Etienne Chauchot
Hi guys, I wanted to let you know that I have just submitted a PR around NexMark. This is a port of the NexMark queries to Beam, to be used as integration tests. It can also be used for A/B testing (non-regression or performance comparison between 2 versions of the same engine or of the same

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Borisa Zivkovic
Great... created this https://issues.apache.org/jira/browse/BEAM-2276 On Fri, 12 May 2017 at 09:38 Jean-Baptiste Onofré wrote: > +1 > > Borisa, if you want, we can work together on this. > > Thanks ! > Regards > JB > > On 05/12/2017 10:33 AM, Borisa Zivkovic wrote: > > +1

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Jean-Baptiste Onofré
+1 Borisa, if you want, we can work together on this. Thanks ! Regards JB On 05/12/2017 10:33 AM, Borisa Zivkovic wrote: +1 for DefaultFilenamePolicy being able to understand basic windowing... probably the most user-friendly way that would cover most of needs... in case of special needs

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Borisa Zivkovic
+1 for DefaultFilenamePolicy being able to understand basic windowing. That is probably the most user-friendly way and would cover most needs. In case of special needs, users can provide their own policy. Another alternative would be to have a new class called DefaultWindowedFilenamePolicy in

Re: First stable release: release candidate #2

2017-05-12 Thread Davor Bonaci
This RC is now obsolete -- see RC #3 information in a separate thread [1]. Davor [1] https://lists.apache.org/thread.html/25493fded77367299ad185c308d658eeb10e4f03be725154b0e5b08d@%3Cdev.beam.apache.org%3E On Wed, May 10, 2017 at 10:28 PM, Davor Bonaci wrote: > The release

First stable release: release candidate #3

2017-05-12 Thread Davor Bonaci
The release candidate #3 for the version 2.0.0 has been built. I think this candidate could be it -- we have cleared the entire list of blocking changes [8]! I'd like to ask everyone to give this candidate a try -- this is the last opportunity we have to fix any critical issues if the final

Re: TextIO and .withWindowedWrites() - filenamepolicy

2017-05-12 Thread Reuven Lax
I believe that for most windows there is a standard stringification. However, I think we could allow the user to inject a window formatter for cases where there is no good default (e.g. where the window is a complicated user-defined type and toString() isn't good enough). Alternatively, if we
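
The idea of a default stringification with an injectable window formatter can be sketched as follows. This is a minimal illustration in Python, not Beam's DefaultFilenamePolicy: the `windowed_filename` function, the `IntervalWindow` class, and the `prefix-window-shard-of-total` naming scheme are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class IntervalWindow:
    """Hypothetical window type with a start and end timestamp."""
    start: int
    end: int


def default_window_format(window: Any) -> str:
    # Default: rely on the window's own string form.
    return str(window)


def windowed_filename(
    prefix: str,
    window: Any,
    shard: int,
    num_shards: int,
    window_formatter: Callable[[Any], str] = default_window_format,
) -> str:
    # Combine prefix, formatted window, and shard numbering (assumed layout).
    return f"{prefix}-{window_formatter(window)}-{shard:05d}-of-{num_shards:05d}"
```

A caller whose window type has no useful string form could pass its own formatter, for example `windowed_filename("out", IntervalWindow(0, 60), 1, 3, lambda w: f"{w.start}_{w.end}")`, while other callers keep the default.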