Re: [DISCUSS] Making consistent use of Optionals

2019-08-21 Thread Kenneth Knowles
As mentioned on the PR, I'm not convinced by Flink's discussion, and all the evidence I know of has shown it to have no measurable performance impact. I'm OK either way, though, at this point. Whatever the consensus, let us set up checkstyle/analysis so that we stay consistent across the codebase.

[Discuss] Propose Calcite Vendor Release

2019-08-21 Thread Kai Jiang
Hi Community, As part of the effort to unblock vendoring Calcite in the SQL module, we split the work out into pull/9333 so that it can go through the vendored dependencies release process separately. I want to propose a Calcite vendor release and am looking for a release manager to help

Re: [DISCUSS] Multiple-triggering SQL Join with retractions support

2019-08-21 Thread Rui Wang
Kenn - Yep, totally agree the first phrase should not include EMIT, although it would be interesting to explore EMIT support in Calcite as R&D work. Mingmin - Thanks for your example query, which is an interesting use case in which the two inputs are aggregations with different modes. Retraction won't

Re: [VOTE] Release 2.15.0, release candidate #2

2019-08-21 Thread Kenneth Knowles
The website build does still depend on being within a git repository. On Wed, Aug 21, 2019 at 3:13 PM Kenneth Knowles wrote: > +1 > > My mistake. I was on the wrong git commit and read the wrong gradle > version from the spec for the wrapper. > > On Wed, Aug 21, 2019 at 3:08 PM Kenneth Knowles

Re: Brief of interactive Beam

2019-08-21 Thread GMAIL
Thanks for the input, Robert! > On Aug 21, 2019, at 11:49 AM, Robert Bradshaw wrote: > > On Wed, Aug 14, 2019 at 11:29 AM Ning Kang > wrote: > Ahmet, thanks for forwarding! > > My main concern at this point is the introduction of new concepts, even > though these

Re: [VOTE] Release 2.15.0, release candidate #2

2019-08-21 Thread Kenneth Knowles
+1 My mistake. I was on the wrong git commit and read the wrong gradle version from the spec for the wrapper. On Wed, Aug 21, 2019 at 3:08 PM Kenneth Knowles wrote: > Trying to build from the actual source release, and having challenges. The > gradle wrapper is absent due to licensing, so I

Re: Query about JdbcIO.readRows()

2019-08-21 Thread Kenneth Knowles
Hi Kishor, If you could not find a Jira, would you file one? Your contribution would be much appreciated. Kenn On Tue, Aug 20, 2019 at 10:04 PM Kishor Joshi wrote: > Hi, > > This fix is still not available in the Beam 2.15.0. Is there any Jira that > has been created for this issue ? I am

Re: [Discuss] Retractions in Beam

2019-08-21 Thread Rui Wang
Thanks Kenn. Some points to answer some concerns: 1. Adding retraction won't break existing users (because it is a new accumulation mode). 2. Adding retraction won't affect existing pipelines' performance (this could be done by avoiding calls to the retracting mode's core components, e.g. ReduceFn, by

Re: [VOTE] Release 2.15.0, release candidate #2

2019-08-21 Thread Kenneth Knowles
Trying to build from the actual source release, and having challenges. The gradle wrapper is absent due to licensing, so I built with fresh Gradle 4.10.3. This did not succeed, with errors that I am still looking into. I copied in the wrapper from the git repo and confirmed the build does succeed

Re: [DISCUSS] Multiple-triggering SQL Join with retractions support

2019-08-21 Thread Mingmin Xu
@Rui In my case, we have some complex queries like SELECT ... FROM ( SELECT ... FROM PRE_A GROUP BY id, TUMBLE(1 HOUR) ) A JOIN ( SELECT ... FROM PRE_B GROUP BY id, TUMBLE(1 HOUR) ) B ON A.id=B.id // A emits every minute in accumulating mode and B emits every minute in discarding mode. Would be

[RESULT] [VOTE] Release 2.15.0, release candidate #2

2019-08-21 Thread Yifan Zou
Hi all, I'm happy to announce that we have unanimously approved this release. There are 4 approving votes, 3 of which are binding (in order): * Ahmet (al...@google.com); * Pablo (pabl...@google.com); * Lukasz (lc...@google.com); There are no disapproving votes. Thanks everyone! Next step is

Re: Java 11 compatibility question

2019-08-21 Thread Kenneth Knowles
On Tue, Aug 20, 2019 at 8:37 AM Elliotte Rusty Harold wrote: > > > On Tue, Aug 20, 2019 at 7:51 AM Ismaël Mejía wrote: > >> a per case approach (the exception could be portable runners not based on >> Java). >> >> Of course other definitions of being Java 11 compatible are interesting >> but

Re: Support ZetaSQL as a new SQL dialect in BeamSQL

2019-08-21 Thread Rui Wang
Thanks everyone! Beam ZetaSQL is now merged into the Beam repo! -Rui On Mon, Aug 19, 2019 at 8:36 AM Ahmet Altay wrote: > Thank you both! > > On Mon, Aug 19, 2019 at 8:01 AM Kenneth Knowles wrote: > >> The i.p. clearance is complete: >>

Re: Write-through-cache in State logic

2019-08-21 Thread Maximilian Michels
> There is probably a misunderstanding here: I'm suggesting to use a worker ID instead of cache tokens, not additionally. Ah! Misread that. We need a changing token to indicate that the cache is stale, e.g. checkpoint has failed / restoring from an old checkpoint. If the _Runner_ generates a new

[DISCUSS] Making consistent use of Optionals

2019-08-21 Thread Jan Lukavský
Hi, sorry if this discussion has already taken place, but I'd like to know others' opinions about how we use Optionals. The state in current master is as follows: $ git grep "import" | grep "java.util.Optional" | wc -l 85 $ git grep "import" | grep "Optional" | grep guava | wc -l 45 I'd like

Re: [DISCUSS] Making consistent use of Optionals

2019-08-21 Thread Jan Lukavský
Sorry, I forgot to add the link to the Flink discussion [1]. [1] https://lists.apache.org/thread.html/f5f8ce92f94c9be6774340fbd7ae5e4afe07386b6765ad3cfb13aec0@%3Cdev.flink.apache.org%3E On 8/21/19 10:08 PM, Jan Lukavský wrote: Hi, sorry if this discussion have been already taken, but I'd like to

Re: Write-through-cache in State logic

2019-08-21 Thread Maximilian Michels
> There is probably a misunderstanding here: I'm suggesting to use a worker ID > instead of cache tokens, not additionally. Ah! Misread that. We need a changing token to indicate that the cache is stale, e.g. checkpoint has failed / restoring from an old checkpoint. If the _Runner_ generates a

Re: [DISCUSS] Multiple-triggering SQL Join with retractions support

2019-08-21 Thread Kenneth Knowles
These all sound useful. One thing is that the EMIT syntax is an earlier-stage idea and more likely to be subject to change. The problem with EMIT anywhere except the top level is that it is not very composable. It really belongs as part of an INSERT statement, just like sink triggers. Maybe a

Re: [Discuss] Retractions in Beam

2019-08-21 Thread Kenneth Knowles
I reviewed your PR (https://github.com/apache/beam/pull/9199) and Anton's as another reference (https://github.com/apache/beam/pull/4742). Nice work. I thought I would summarize for the list a little bit. I think we have not done too much with retractions because it seems like a big job. You both

Re: Brief of interactive Beam

2019-08-21 Thread Robert Bradshaw
On Wed, Aug 14, 2019 at 11:29 AM Ning Kang wrote: > Ahmet, thanks for forwarding! > > >> My main concern at this point is the introduction of new concepts, even >> though these are not changing other parts of the Beam SDKs. It would be >> good to see at least an alternative option covered in the

Re: Python Beam pipelines on Flink on Kubernetes

2019-08-21 Thread Thomas Weise
The changes to containerize the Python SDK worker pool are nearly complete. I also updated the document with the next implementation steps. The favored approach (the initially targeted option) for pipeline submission is support for the (externally created) fat jar. It will keep changes to the operator to

Re: [Discuss] Propose Calcite Vendor Release

2019-08-21 Thread Rui Wang
I can be the release manager to help release vendored Calcite. Per [1], we have to reach consensus before starting a release. [1]: https://s.apache.org/beam-release-vendored-artifacts -Rui On Wed, Aug 21, 2019 at 5:00 PM Kai Jiang wrote: > Hi Community, > > As a part

Re: Write-through-cache in State logic

2019-08-21 Thread Reuven Lax
Dataflow does something like this; however, since work is load balanced across workers, a per-worker id doesn't work very well. Dataflow divides the keyspace up into lexicographic ranges, and creates a cache token per range. On Tue, Aug 20, 2019 at 8:35 PM Thomas Weise wrote: > Commenting here
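
The per-range token idea in the message above can be sketched roughly as follows. This is only an illustration and not Dataflow's or Beam's actual implementation: the class name RangeCacheTokens, its methods, and the use of random UUIDs as tokens are all hypothetical. It assumes keys are compared as plain Java strings and that the empty string is always one of the range starts, so every key falls into some range.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical sketch: one cache token per lexicographic key range. */
public class RangeCacheTokens {
  // Maps the inclusive lower bound of each key range to that range's current token.
  private final TreeMap<String, String> tokenByRangeStart = new TreeMap<>();

  /** rangeStarts must contain "" so that every key belongs to a range. */
  public RangeCacheTokens(List<String> rangeStarts) {
    if (!rangeStarts.contains("")) {
      throw new IllegalArgumentException("rangeStarts must include the empty string");
    }
    for (String start : rangeStarts) {
      tokenByRangeStart.put(start, newToken());
    }
  }

  // Assumption: any fresh, unique value works as a token; a UUID is used here.
  private static String newToken() {
    return UUID.randomUUID().toString();
  }

  /** Returns the token of the range whose start is the greatest start <= key. */
  public String tokenFor(String key) {
    return tokenByRangeStart.floorEntry(key).getValue();
  }

  /** Replaces the token of the range containing key, marking cached state for that range stale. */
  public void invalidateRangeContaining(String key) {
    String start = tokenByRangeStart.floorKey(key);
    tokenByRangeStart.put(start, newToken());
  }

  public static void main(String[] args) {
    RangeCacheTokens tokens = new RangeCacheTokens(List.of("", "h", "p"));
    String before = tokens.tokenFor("apple");
    tokens.invalidateRangeContaining("apple");
    // Keys in ["", "h") now see a new token; keys in other ranges are unaffected.
    System.out.println(before.equals(tokens.tokenFor("banana")));
  }
}
```

In this sketch a cached entry would be considered valid only while the token it was stored under still equals tokenFor(key), so invalidating one range leaves caches for the other ranges untouched.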

Re: SqlTransform Metadata

2019-08-21 Thread Reza Rokni
@Kenn / @Rob have there been any other discussions on how the timestamp value can be accessed from within the SQL since this thread in May? If not, my vote is for a convenience method that gives access to the timestamp as a function call within the SQL statement. Reza On Wed, 22 May 2019 at

Re: Write-through-cache in State logic

2019-08-21 Thread Maximilian Michels
Appreciate all your comments! Replying below. @Luke: > Having cache tokens per key would be very expensive indeed and I believe we > should go with a single cache token "per" bundle. Thanks for your comments on the PR. I was thinking of proposing something along these lines of having cache

Re: Java 11 compatibility question

2019-08-21 Thread Ismaël Mejía
Thanks again, Elliotte, for the clear information and references. It seems that being compatible with Java 11 modules will be more elusive than expected considering the transitive dependencies. Do you (or someone else) know if there is a plugin or easy way to discover this? I think that

Re: Java 11 compatibility question

2019-08-21 Thread Łukasz Gajowy
Thank you, Elliotte, for showing us the issues with JPMS. So maybe we should just announce to end users that they can run Beam pipelines on Java 11 but that for the moment Beam modules cannot be used in Java 11 module style. I know that there is already a lot of fear around Java 8 not being

Re: Write-through-cache in State logic

2019-08-21 Thread Thomas Weise
--> On Wed, Aug 21, 2019, 2:16 AM Maximilian Michels wrote: > Appreciate all your comments! Replying below. > > > @Luke: > > > Having cache tokens per key would be very expensive indeed and I believe > we should go with a single cache token "per" bundle. > > Thanks for your comments on the PR.

Re: Write-through-cache in State logic

2019-08-21 Thread Reuven Lax
On Wed, Aug 21, 2019 at 2:16 AM Maximilian Michels wrote: > Appreciate all your comments! Replying below. > > > @Luke: > > > Having cache tokens per key would be very expensive indeed and I believe > we should go with a single cache token "per" bundle. > > Thanks for your comments on the PR. I

Re: Query about JdbcIO.readRows()

2019-08-21 Thread Kishor Joshi
Hi, This fix is still not available in Beam 2.15.0. Has a Jira been created for this issue? I am interested in contributing to it. Thanks & Regards, Kishor On Friday, August 2, 2019, 10:19:17 PM GMT+5:30, Jean-Baptiste Onofré wrote: Agree. I will fix that.

Re: Java 11 compatibility question

2019-08-21 Thread Łukasz Gajowy
https://issues.apache.org/jira/browse/BEAM-8024 I created the other issue regarding importing Beam into a Java 11 project that uses JPMS. I confirmed* in a pet project that this is happening (linked in the issue). *no shock here, I just wanted to play with it. Łukasz On Wed, Aug 21, 2019 at 14:53

Re: [VOTE] Release 2.15.0, release candidate #2

2019-08-21 Thread Lukasz Cwik
+1 (binding) I validated the signatures against the key dist/release/KEYS and hashes of the source distributions and release artifacts. I also ran some of the quickstarts for Java. On Tue, Aug 20, 2019 at 3:59 PM Pablo Estrada wrote: > +1 > > I've installed from the source in apache/dist. >