Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-18 Thread Eugene Kirpichov
Thanks all! Sent https://github.com/apache/beam/pull/12619 to cherrypick into 2.24.

On Mon, Aug 17, 2020 at 3:37 PM Robert Bradshaw wrote:
> I checked Java, it looks like the way things are structured we do not
> have that bug there.
>
> On Mon, Aug 17, 2020 at 3:31 PM Robert Bradshaw wrote:

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-17 Thread Robert Bradshaw
I checked Java, it looks like the way things are structured we do not have that bug there.

On Mon, Aug 17, 2020 at 3:31 PM Robert Bradshaw wrote:
>
> +1
>
> Thanks, Eugene, for finding and fixing this!
>
> FWIW, most use of Python from the Python Portable Runner used the
> embedded environment

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-17 Thread Robert Bradshaw
+1

Thanks, Eugene, for finding and fixing this!

FWIW, most use of Python from the Python Portable Runner used the embedded environment (this is the default direct runner), so dependencies are already present.

On Mon, Aug 17, 2020 at 3:19 PM Daniel Oliveira wrote:
>
> Normally I'd say not to

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-17 Thread Daniel Oliveira
Normally I'd say not to cherry-pick this since the issue is only affecting one runner and isn't really a regression, but given that it's the last Py2 release and there won't be a follow-up release that will be able to include this fix, I think it's worth making an exception this time. There should

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-17 Thread Valentyn Tymofieiev
Will defer to the release manager; one reason to cherry-pick is that 2.24.0 will be the last release with Python 2 support, so Py2 users of Portable Python Local Runner might appreciate the fix, since they won't be able to use the next release.

On Thu, Aug 13, 2020 at 6:28 PM Eugene Kirpichov

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-13 Thread Eugene Kirpichov
+Daniel as in charge of 2.24 per dev@ thread.

On Thu, Aug 13, 2020 at 6:24 PM Eugene Kirpichov wrote:
> The PR is merged.
>
> Do folks think this warrants being cherrypicked into v2.24? My hunch is
> yes, cause basically one of the runners (local portable python runner) is
> broken for any

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-13 Thread Eugene Kirpichov
The PR is merged.

Do folks think this warrants being cherrypicked into v2.24? My hunch is yes, cause basically one of the runners (local portable python runner) is broken for any production workload (works only if your pipeline has no dependencies).

On Thu, Aug 13, 2020 at 12:56 PM Eugene

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-13 Thread Eugene Kirpichov
FWIW I sent a PR to fix this: https://github.com/apache/beam/pull/12571

However, I'm not up to date on the portable test infrastructure and would appreciate guidance on what tests I can add for this.

On Tue, Aug 11, 2020 at 5:28 PM Eugene Kirpichov wrote:
> (FYI Sam +sbrot...@gmail.com)
>
> On

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-11 Thread Eugene Kirpichov
(FYI Sam +sbrot...@gmail.com)

On Tue, Aug 11, 2020 at 5:00 PM Eugene Kirpichov wrote:
> Ok I found the bug, and now I don't understand how it could have possibly
> ever worked. And if this was never tested, then I don't understand why it
> works after fixing this one bug :)
>
> Basically the

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-11 Thread Eugene Kirpichov
Ok I found the bug, and now I don't understand how it could have possibly ever worked. And if this was never tested, then I don't understand why it works after fixing this one bug :)

Basically the Python ArtifactStaging/RetrievalService uses FileSystems.open() to read the artifacts to be staged,
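
[Editor's sketch. The message above is cut off before it states the cause, so what follows is an assumption, not the author's explanation. One mechanism that would produce the symptom in the subject line is a reader that transparently decompresses files based on their extension. Beam's `FileSystems.open()` does accept a `compression_type` argument, but whether its default is the culprit here is not confirmed by the visible text. The sketch uses only the Python standard library; `open_auto`, `open_raw`, `stage`, and `demo` are hypothetical names, not Beam APIs.]

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_auto(path):
    """Hypothetical reader that silently gunzips anything named *.gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    return open(path, "rb")


def open_raw(path):
    """Reader that returns the bytes exactly as stored on disk."""
    return open(path, "rb")


def stage(src, dst, opener):
    """Copy src to dst, reading the source through the given opener."""
    with opener(src) as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


def is_gzip(path):
    """Check whether the file on disk still starts with the gzip magic."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def demo():
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "pkg-1.0.tar.gz")
    with gzip.open(src, "wb") as f:
        f.write(b"pretend this is a tar archive")

    staged_auto = os.path.join(workdir, "staged_auto.tar.gz")
    staged_raw = os.path.join(workdir, "staged_raw.tar.gz")
    stage(src, staged_auto, open_auto)  # name says .gz, content is not
    stage(src, staged_raw, open_raw)    # byte-for-byte copy

    return is_gzip(src), is_gzip(staged_auto), is_gzip(staged_raw)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, the copy made through the auto-decompressing reader keeps its `.tar.gz` name but no longer contains gzip data, which is the kind of file an installer would reject; the raw copy stays intact.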

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-11 Thread Eugene Kirpichov
Hi Maximilian,

Thank you - it works fine with the embedded Flink runner (per below, seems like it's not using Docker for running Python code? What is it using then?).

However, the original bug appears to be wider than I thought - it is also present if I run --runner=FlinkRunner

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-11 Thread Maximilian Michels
Looks like you ran into a bug. You could just run your program without specifying any arguments, since running with Python's FnApiRunner should be enough. Alternatively, how about trying to run the same pipeline with the FlinkRunner? Use: --runner=FlinkRunner and do not specify an endpoint.

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-10 Thread Valentyn Tymofieiev
Hi Eugene,

Good to hear from you. The experience you are describing on Portable Runner + Docker container in local execution mode is most certainly a bug. If you have not opened an issue on it, please do so and feel free to cc me. I can also reproduce the bug and likewise didn't see anything

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-10 Thread Eugene Kirpichov
(cc'ing Sam with whom I'm working on this atm) FWIW I'm still stumped. I've looked through Python, Go and Java code in the Beam repo having anything to do with gzipping/unzipping, and none of it appears to be used in the artifact staging/retrieval codepaths. I also can't find any mention of

Re: Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-07 Thread Eugene Kirpichov
Thanks Austin! Good stuff - though note that I am *not* using custom containers, I'm just trying to get the basic stuff to work, a Python pipeline with a simple requirements.txt file. Feels like this should work out-of-the-box, I must be doing something wrong.

On Fri, Aug 7, 2020 at 6:38 PM

Staged PIP package mysteriously ungzipped, non-installable inside the worker

2020-08-07 Thread Eugene Kirpichov
Hi old Beam friends,

I left Google to work on climate change and am now doing a short engagement with Pachama. Right now I'm trying to get a Beam Python