On Thu, Aug 13, 2020 at 4:31 PM Alex Amato <ajam...@google.com> wrote:

> I changed the .whl I was passing in to:
> --sdk_location=
> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl
>
Note that this is a macOS wheel, so it won't run on Dataflow. Dataflow
requires a Linux wheel, such as cp36-cp36m-manylinux1_x86_64.whl.

>
>
> and also tried
>
> --sdk_location=
> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip
>
>
> python --version
>
> Python 3.6.8
>
> In both cases the same TypeError occurs.
> https://paste.googleplex.com/6275630654029824
>

Looking closer, I see that you hit a Python 3 bug [1] in a codepath that is
not exercised frequently, and a quick fix [2] shows that this codepath does
not work for passing wheels [3].

A workaround that should work is to download the file first, and then pass
the local path in --sdk_location.
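
The workaround can be sketched roughly as below. This is only an
illustration, not code from Beam: download_sdk is a hypothetical helper
name, and the point it shows is that the file is written in binary mode
("wb"). Writing downloaded bytes to a text-mode file is what produces
"TypeError: write() argument must be str, not bytes" on Python 3.

```python
import os
import posixpath
import shutil
import tempfile
import urllib.parse
import urllib.request


def download_sdk(url, dest_dir=None):
    """Download `url` to a local file and return the local path.

    Hypothetical helper for illustration; not part of apache_beam.
    """
    dest_dir = dest_dir or tempfile.mkdtemp()
    # Keep the original file name so the wheel tags / extension survive.
    file_name = posixpath.basename(urllib.parse.urlsplit(url).path)
    local_path = os.path.join(dest_dir, file_name)
    # "wb" matters: wheels and zips are binary data, and the response
    # body is bytes on Python 3.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as f:
        shutil.copyfileobj(response, f)
    return local_path
```

The returned path can then be passed as --sdk_location=<local path> when
launching the pipeline.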

Btw, the cost of passing a source distribution is 1-2 minutes of SDK
installation time. To pass a wheel file instead, you need to pick the
correct wheel for the Python version and the target platform.
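
As a rough way to check a wheel before passing it, the tags can be read
straight from the file name, which follows the PEP 427 layout
name-version[-build]-python-abi-platform.whl. The sketch below is an
illustration only: parse_wheel_tags and wheel_fits_dataflow are
hypothetical names, and treating any "manylinux" platform tag as
acceptable is an assumption based on Dataflow needing a Linux wheel.

```python
def parse_wheel_tags(filename):
    """Return the (python, abi, platform) tags of a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    # PEP 427: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: %s" % filename)
    return tuple(parts[-3:])


def wheel_fits_dataflow(filename, py_major, py_minor):
    """True if the wheel targets CPython py_major.py_minor on Linux.

    Assumption: a platform tag starting with "manylinux" is suitable.
    """
    python_tag, _abi_tag, platform_tag = parse_wheel_tags(filename)
    wanted_python = "cp%d%d" % (py_major, py_minor)
    return python_tag == wanted_python and platform_tag.startswith("manylinux")
```

With the file names from this thread, the cp27 macOS wheel and the cp36
macOS wheel are both rejected for a Python 3.6 job, while
apache_beam-2.25.0.dev0-cp36-cp36m-manylinux1_x86_64.whl is accepted.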

[1] https://issues.apache.org/jira/browse/BEAM-10704
[2] https://github.com/apache/beam/pull/12579
[3] https://issues.apache.org/jira/browse/BEAM-10705

>
>
>
> On Thu, Aug 13, 2020 at 3:52 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> You are passing a python 2.7 wheel to a job that was launched on python
>> 3.6.
>>
>> You need to select a correct wheel for the platform or pass a source
>> distribution (zip/tar.gz).
>>
>> On Thu, Aug 13, 2020, 15:20 Alex Amato <ajam...@google.com> wrote:
>>
>>> I was trying to use the --sdk_location parameter in a Python pipeline,
>>> to allow users to run a snapshot SDK. However, it looks like it hit a
>>> type error after downloading the .whl file.
>>>
>>> Perhaps this code is assuming that remote files downloaded are text
>>> type, not bytes type? Have I done something wrong? Or is this a bug? Any
>>> ideas?
>>>
>>> Thanks for taking a look,
>>> Alex
>>>
>>> Using the --sdk_location parameter (Full command line
>>> <https://paste.googleplex.com/5792777008840704>)
>>> --sdk_location=
>>> https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
>>>
>>> INFO:apache_beam.runners.portability.stager:Failed to download Artifact
>>> from
>>> https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
>>> Traceback (most recent call last):
>>>   File
>>> "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py",
>>> line 193, in _run_module_as_main
>>>     "__main__", mod_spec)
>>>   File
>>> "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py",
>>> line 85, in _run_code
>>>     exec(code, run_globals)
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py",
>>> line 142, in <module>
>>>     run()
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py",
>>> line 121, in run
>>>     result = p.run()
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py",
>>> line 521, in run
>>>     allow_proto_holders=True).run(False)
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py",
>>> line 534, in run
>>>     return self.runner.run_pipeline(self, self._options)
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
>>> line 479, in run_pipeline
>>>     artifacts=environments.python_sdk_dependencies(options)))
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/transforms/environments.py",
>>> line 611, in python_sdk_dependencies
>>>     staged_name in stager.Stager.create_job_resources(options, tmp_dir))
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py",
>>> line 235, in create_job_resources
>>>     resources.extend(Stager._create_beam_sdk(sdk_remote_location,
>>> temp_dir))
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py",
>>> line 657, in _create_beam_sdk
>>>     Stager._download_file(sdk_remote_location, local_download_file)
>>>   File
>>> "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py",
>>> line 375, in _download_file
>>>     f.write(content)
>>> TypeError: write() argument must be str, not bytes
>>>
>>>
>>>
