Re: Beam 2.4.0

2018-03-02 Thread Jean-Baptiste Onofré
Hi Ismaël, that's a good idea to show history. For me, the vote duration doesn't matter as we are in the release process already. The gap between two releases is more significant. And clearly with an average of 80 days (~ 3 months) it's too long. The idea is to reduce this clearly. I propose tw

Re: Beam 2.4.0

2018-03-02 Thread Robert Bradshaw
On Fri, Mar 2, 2018 at 12:01 AM Jean-Baptiste Onofré wrote: > Hi Ismaël, > > that's a good idea to show history. > > For me, the vote duration doesn't matter as we are in the release process > already. > A more relevant duration to track would probably be cut to final release. This both measures

Build failed in Jenkins: beam_Release_NightlySnapshot #702

2018-03-02 Thread Apache Jenkins Server
See Changes: [robertwb] [maven-release-plugin] prepare branch release-2.4.0 [robertwb] [maven-release-plugin] prepare for next development iteration [robertwb] Bump Python dev version. -

Releases and user support

2018-03-02 Thread Romain Manni-Bucau
Hi guys, I didn't find a page about Beam release support. With the fast minor release rhythm that Beam is targeting (see other threads on that), I wonder what - as an end user - you should expect as breakage between versions (minor releases can add APIs but typically shouldn't break them) and how long a

Re: Releases and user support

2018-03-02 Thread Robert Bradshaw
On Fri, Mar 2, 2018 at 8:45 AM Romain Manni-Bucau wrote: > Hi guys, > I didn't find a page about beam release support. With the fast minor release rhythm which is targeted by beam (see other threads on that), I wonder what - as an end user - you should expect as breakage between versions (minor

GCS Issues running java tests

2018-03-02 Thread Robert Bradshaw
When trying to run the Java tests, I keep getting Expected: (an instance of java.lang.IllegalArgumentException and exception with message a string containing "Error constructing default value for gcpTempLocation: tempLocation is not a valid GCS path" and exception with cause exception with message

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Holden Karau
I agree, however I'm under the impression it's blocked on infra? (e.g. it's important but out of my hands). On Mar 1, 2018 11:05 PM, "Ahmet Altay" wrote: > I think we should prioritize the issue of installing Python 3 on the > workers (https://issues.apache.org/jira/browse/BEAM-3671). I would > app

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Ahmet Altay
That is my understanding as well; it requires attention from infra. Could anyone help with this? I know we worked with infra before; what is the best way to approach this? On Fri, Mar 2, 2018 at 9:50 AM, Holden Karau wrote: > I agree, however I'm under the impression it's blocked on infra? (e.g.

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Alan Myrvold
I ran "python3 --version" on each worker and all showed Python 3.4.3. Is that too old? On Fri, Mar 2, 2018 at 10:04 AM Ahmet Altay wrote: > That is my understanding as well; it requires attention from infra. > Could anyone help with this? I know we worked with infra before; what is > the bes

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Holden Karau
3.4.3 is from Feb 2015, and for what it’s worth the minimum version of Python in Spark is 3.4. We could enable lint tests in Jenkins and see how they go? On Fri, Mar 2, 2018 at 10:06 AM Alan Myrvold wrote: > I ran "python3 --version" on each worker and all showed python 3.4.3. Is > that too old

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Ahmet Altay
This is great. Let's enable py3 lint tests in Jenkins. Side question: which Python 3 version should we target as the minimum supported version in Beam? On Fri, Mar 2, 2018 at 10:31 AM, Holden Karau wrote: > 3.4.3 is from Feb 2015, and for what it’s worth the minimum version of > Python in Spark

Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Robert Bradshaw
To address the first point, 3.4 is almost certainly sufficient for our needs (running lint_py3 to prevent regressions). Also, +1 that automating this is going to be much more effective than asking users to manually do extra steps. Long-term, we should definitely support 3.5+, definitely not suppor

to a modular embedded java runner to replace the direct runner?

2018-03-02 Thread Romain Manni-Bucau
Hi guys, I wonder if you have discussed or thought about breaking down what is today called the direct runner into an embedded runner which would be modular and extensible. What I have in mind is the following: 1. have a strong embedded runner implementing the whole Beam API but limited to a single JVM 2. keep

is PushbackSideInputDoFnRunner useful?

2018-03-02 Thread Romain Manni-Bucau
Hi guys, what's the rationale behind PushbackSideInputDoFnRunner? Why not use a DoFnRunner, OutputT>? It is the same thing I think, better represents what it does (most is delegated in general) and avoids yet another API which is not even implemented completely in 1 of the 2 implementations caus

Re: is PushbackSideInputDoFnRunner useful?

2018-03-02 Thread Reuven Lax
The point of PushbackSideInputDoFnRunner is to buffer the main input until the side input is ready (for a sometimes complicated definition of ready). One possibility is instead to add a new prior step in the graph that is responsible for buffering these inputs. That way there's no need for a spec
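
The buffering behaviour described in this message can be illustrated with a minimal Python sketch. It is not Beam's implementation or API: the class `PushbackBuffer`, its method names, and the rule that a side input is "ready" as soon as a value has arrived for the window are all assumptions made for illustration (the message notes the real definition of ready is more complicated).

```python
from collections import defaultdict


class PushbackBuffer:
    """Hypothetical helper: holds main-input elements until the side
    input for their window is ready, then replays them."""

    def __init__(self, process):
        self._process = process              # callable(element, side_value) -> result
        self._side_inputs = {}               # window -> side-input value
        self._pending = defaultdict(list)    # window -> buffered main-input elements

    def on_element(self, window, element):
        """Process immediately if the side input is ready, otherwise buffer."""
        if window in self._side_inputs:
            return [self._process(element, self._side_inputs[window])]
        self._pending[window].append(element)
        return []

    def on_side_input_ready(self, window, value):
        """Record the side input and replay everything buffered for that window."""
        self._side_inputs[window] = value
        buffered = self._pending.pop(window, [])
        return [self._process(e, value) for e in buffered]


# Usage: an element arriving before the side input is pushed back,
# then processed once the side input for its window shows up.
runner = PushbackBuffer(lambda element, side: element * side)
early = runner.on_element("w1", 3)               # buffered, nothing emitted yet
replayed = runner.on_side_input_ready("w1", 10)  # buffered element is replayed
late = runner.on_element("w1", 4)                # side input ready, processed directly
```

Moving this buffering into a separate prior step, as the message suggests, would amount to placing such a buffer in front of an ordinary runner rather than inside a specialised one.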

Re: to a modular embedded java runner to replace the direct runner?

2018-03-02 Thread Lukasz Cwik
To my knowledge, no one has discussed an extension mechanism for the direct runner but the difficulty is in how to get extensions to interact with the internals of the direct runner cleanly. Note that the direct runner currently accepts a set of flags which enable/disable validation and control how

Re: is PushbackSideInputDoFnRunner useful?

2018-03-02 Thread Lukasz Cwik
For portability reasons, the PushbackSideInputDoFnRunner will go away in the long term since the Runner will have to filter elements before sending them to the SDK for processing. Performing this filtering by a prior step within the Runner is a reasonable solution and what Dataflow has adopted inte

Re: to a modular embedded java runner to replace the direct runner?

2018-03-02 Thread Romain Manni-Bucau
On Mar 2, 2018 22:22, "Lukasz Cwik" wrote: To my knowledge, no one has discussed an extension mechanism for the direct runner but the difficulty is in how to get extensions to interact with the internals of the direct runner cleanly. Note that the direct runner currently accepts a set of flags

Re: is PushbackSideInputDoFnRunner useful?

2018-03-02 Thread Romain Manni-Bucau
I read both answers as either "deprecate it and plan to remove it" or "yes, unify both", right? If I understand correctly, a deletion is what is planned. Sounds good to me. On Mar 2, 2018 22:27, "Lukasz Cwik" wrote: > For portability reasons, the PushbackSideInputDoFnRunner will go away in > the long te