[ 
https://issues.apache.org/jira/browse/BEAM-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192367#comment-17192367
 ] 

Valentyn Tymofieiev commented on BEAM-10705:
--------------------------------------------

[~aennassiri] Thanks for volunteering - feel free to to take it. Please ask 
[email protected] to give you Jira contributor permissions, then I would be 
able to assign the Jira to you. 

One aspect of this issue is that we need to preserve the name of the wheel when 
we stage it. Tarball sources are renamed to a generic name after download: 
https://github.com/apache/beam/blob/02bf081d0e86f16395af415cebee2812620aff4b/sdks/python/apache_beam/runners/portability/stager.py#L246.
  

> Passing whl files in --sdk_location does not work  for https locations.
> -----------------------------------------------------------------------
>
>                 Key: BEAM-10705
>                 URL: https://issues.apache.org/jira/browse/BEAM-10705
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Ayoub Ennassiri
>            Priority: P3
>              Labels: starter
>
> Sample repro:
> python -m apache_beam.examples.wordcount 
> --input=gs://dataflow-samples/shakespeare/kinglear.txt --output 
> /tmp/wordcount  --runner=DataflowRunner --project=google.com:clo
> uddfe --temp_location gs://clouddfe-valentyn/tmp/ --region=us-central1 
> --sdk_location=https://
> storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198
> 203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
> {noformat}
>   File "/home/valentyn/.pyenv/versions/3.7.3/lib/python3.7/runpy.py", line 
> 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/home/valentyn/.pyenv/versions/3.7.3/lib/python3.7/runpy.py", line 
> 85, in _run_code
>     exec(code, run_globals)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 99, in <module>
>     run()
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 94, in run
>     output | 'Write' >> WriteToText(known_args.output)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/pipeline.py",
>  line 555, in __exit__
>     self.result = self.run()
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/pipeline.py",
>  line 521, in run
>     allow_proto_holders=True).run(False)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/pipeline.py",
>  line 534, in run
>     return self.runner.run_pipeline(self, self._options)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 479, in run_pipeline
>     artifacts=environments.python_sdk_dependencies(options)))
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/transforms/environments.py",
>  line 613, in python_sdk_dependencies
>     staged_name in stager.Stager.create_job_resources(options, tmp_dir))
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/runners/portability/stager.py",
>  line 235, in create_job_resources
>     resources.extend(Stager._create_beam_sdk(sdk_remote_location, temp_dir))
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/runners/portability/stager.py",
>  line 659, in _create_beam_sdk
>     sdk_remote_location)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/runners/portability/stager.py",
>  line 596, in _desired_sdk_filename_in_staging_location
>     _, wheel_filename = FileSystems.split(sdk_location)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/io/filesystems.py",
>  line 151, in split
>     filesystem = FileSystems.get_filesystem(path)
>   File 
> "/home/valentyn/projects/beam/beam2/beam/sdks/python/apache_beam/io/filesystems.py",
>  line 106, in get_filesystem
>     'e.g., pip install apache-beam[gcp]. Path specified: %s' % path)
> ValueError: Unable to get filesystem from specified path, please use the 
> correct path or ensure the required dependency is installed, e.g., pip 
> install apache-beam[gcp]. Path specified: 
> https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to