pytest migration progress

2019-04-12 Thread Udi Meiri
Hi, I'm making progress on the pytest migration here: https://github.com/apache/beam/pull/7949 The PR does not replace nose (yet) - that would require more work and a verification effort to make sure no test gets left behind. - Udi

Re: Accessing CSV headers

2019-04-12 Thread tejanahmedhu
I am using the latest SDK (Apache Beam 2.11.0). I have been trying to use a variety of code found online; however, I am aware that com.google.cloud.dataflow.sdk has been deprecated, but updating this library did not enable the code I found to be recognised. Deprecated:

Re: [ANNOUNCE] New committer announcement: Boyuan Zhang

2019-04-12 Thread Thomas Weise
Congrats! On Thu, Apr 11, 2019 at 6:03 PM Reuven Lax wrote: > Congratulations Boyuan! > > On Thu, Apr 11, 2019 at 4:53 PM Ankur Goenka wrote: > >> Congrats Boyuan! >> >> On Thu, Apr 11, 2019 at 4:52 PM Mark Liu wrote: >> >>> Congrats Boyuan! >>> >>> On Thu, Apr 11, 2019 at 9:53 AM Alexey

Re: [DISCUSS] change the encoding scheme of Python StrUtf8Coder

2019-04-12 Thread Lukasz Cwik
This is a minor point, Robert Burke, but having access to the "stream" when decoding/encoding could mean that you're reading/writing from the underlying transport channel directly and don't need to copy the bytes into/from memory. On Wed, Apr 10, 2019 at 3:45 PM Kenneth Knowles wrote: > On Mon,

Re: [review?] WordCount in Kotlin

2019-04-12 Thread Ankur Goenka
Absolutely :) I took this opportunity for a general reminder. Thanks again for taking this Kotlin example to completion. On Fri, Apr 12, 2019 at 1:24 PM Pablo Estrada wrote: > I've merged via a squashed commit that references Jira and the PR. That > should be reasonable? > Best > -P. > > On

Re: [review?] WordCount in Kotlin

2019-04-12 Thread Pablo Estrada
I've merged via a squashed commit that references Jira and the PR. That should be reasonable? Best -P. On Fri, Apr 12, 2019, 12:22 PM Ankur Goenka wrote: > Thanks Pablo and Harshit. > > Just a quick reminder, please squash the "fixup" sort of commits in the PR > based on the prior discussion on

Re: [DOC] Portable Spark Runner

2019-04-12 Thread Lukasz Cwik
Thanks for the doc. On Fri, Apr 12, 2019 at 11:34 AM Kyle Weaver wrote: > Hi everyone, > > As some of you know, I've been piggybacking on the existing Spark and > Flink runners to create a portable version of the Spark runner. I wrote up > a summary of the work I've done so far and what remains

Re: [review?] WordCount in Kotlin

2019-04-12 Thread Ankur Goenka
Thanks Pablo and Harshit. Just a quick reminder, please squash the "fixup" sort of commits in the PR based on the prior discussion on the mailing list https://lists.apache.org/thread.html/6d922820d6fc352479f88e5c8737f2c8893ddb706a1e578b50d28948@%3Cdev.beam.apache.org%3E On Fri, Apr 12, 2019 at

Re: [review?] WordCount in Kotlin

2019-04-12 Thread Pablo Estrada
I've merged this here: https://github.com/apache/beam/pull/8291 Thanks to all who took a look, and to Harshit for the contribution. : ) On Thu, Apr 4, 2019 at 10:30 PM Jean-Baptiste Onofré wrote: > Thanks for the update Pablo. > > I will try to take a look during the week end. > > Regards >

[DOC] Portable Spark Runner

2019-04-12 Thread Kyle Weaver
Hi everyone, As some of you know, I've been piggybacking on the existing Spark and Flink runners to create a portable version of the Spark runner. I wrote up a summary of the work I've done so far and what remains to be done. I'll keep updating this going forward to provide a reasonably

Re: [PROPOSAL] Custom JVM initialization for Beam workers

2019-04-12 Thread Lukasz Cwik
+1 on the use cases that Ahmet pointed out and the solution that Brian put forth. I like how the change is being applied to the Beam Java SDK harness and not just Dataflow so all portable runner users get this as well. On Wed, Apr 10, 2019 at 9:03 PM Kenneth Knowles wrote: > > > On Wed, Apr 10,

Re: [DISCUSS] Side input consistency guarantees for triggers with multiple firings

2019-04-12 Thread Kenneth Knowles
The thing I dislike about this all is that a main value that Beam (& similar) bring to users is removing the concerns of classical concurrent programming. But Luke's example is convincing that we might need to have a discussion around a causality-based consistency model. On Fri, Apr 12, 2019 at

Re: [DISCUSS] Side input consistency guarantees for triggers with multiple firings

2019-04-12 Thread Lukasz Cwik
Yes, if we had such a pipeline: ParDo(A) --> PCollectionView S ... | ParDo(C) <-(side input)- PCollectionView S | ... | ParDo(D) <-(side input)- PCollectionView S | ... We could reason that ParDo(D) should see at least the same or newer contents of PCollectionView S than when

Re: [VOTE] Release 2.12.0, release candidate #3

2019-04-12 Thread Andrew Pilloud
The vote on RC3 is canceled. It seems like I'm moving a little too fast on this release, so I'll wait until Monday to cut the next RC to give more time to find and fix issues. Andrew On Wed, Apr 10, 2019 at 7:45 AM Ismaël Mejía wrote: > -1 due to dependencies leaking in 'sdks/java/core'. For

Re: [docs] Python State & Timers

2019-04-12 Thread Maximilian Michels
It would probably be pretty easy to add the corresponding code snippets to the docs as well. It's probably a bit more work because there is no section dedicated to state/timer yet in the documentation. Tracked here: https://jira.apache.org/jira/browse/BEAM-2472 I've been going over this