Xiaoge created BEAM-9924:
----------------------------
Summary: Deploying a beam program as Google cloud function fails
Key: BEAM-9924
URL: https://issues.apache.org/jira/browse/BEAM-9924
Project: Beam
Issue Type: Bug
Components: dependencies
Affects Versions: 2.20.0
Environment: 2.20.0
Reporter: Xiaoge
I wrote a Beam program that processes data through a simple pipeline: read --> extract
values --> write to Google BigQuery. The program needs a custom
dependency, so I followed the guideline here:
[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset]
to configure the custom package. However, when I deploy the program as a
Google Cloud Function, I always encounter the following error.
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 383, in run_background_function
    _function_handler.invoke_user_function(event_object)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
    return call_user_function(request_or_event)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 214, in call_user_function
    event_context.Context(**request_or_event.context))
  File "/user_code/main.py", line 84, in main
    create_disposition = beam.io.BigQueryDisposition.CREATE_NEVER)
  File "/env/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 503, in __exit__
    self.run().wait_until_finish()
  File "/env/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 496, in run
    return self.runner.run_pipeline(self, self._options)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 548, in run_pipeline
    self.dataflow_client.create_job(self.job), self)
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/retry.py", line 234, in wrapper
    return fun(*args, **kwargs)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 624, in create_job
    self.create_job_description(job)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 680, in create_job_description
    resources = self._stage_resources(job.options)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 577, in _stage_resources
    staging_location=google_cloud_options.staging_location)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 199, in stage_job_resources
    setup_options.setup_file, temp_dir, build_setup_args)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 524, in _build_setup_package
    processes.check_output(build_setup_args)
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 97, in check_output
    .format(traceback.format_exc(), error.output))
RuntimeError: Full trace: Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 85, in check_output
    out = subprocess.check_output(*args, **kwargs)
  File "/opt/python3.7/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/opt/python3.7/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/env/bin/python3.7', 'setup.py', 'sdist', '--dist-dir', '/tmp/tmpi7hv8a6i']' returned non-zero exit status 1.
, output of the failed child process b'running sdist\nrunning egg_info\nwriting localpackage.egg-info/PKG-INFO\n'
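The CalledProcessError above cuts the child-process output off right after egg_info, so the actual sdist failure is hidden. One way to surface it (a sketch; the build_sdist helper is my own illustration, not part of the Beam API) is to run the same command the Beam stager runs against the package pointed to by --setup_file, and inspect its stderr:

```python
# Sketch: reproduce Beam's staging step locally. The stager effectively
# runs "python setup.py sdist --dist-dir <tmpdir>" on the --setup_file
# package; running the same command by hand shows the full error that
# the truncated Cloud Functions log hides.
import subprocess
import sys
import tempfile

def build_sdist(setup_dir: str) -> str:
    """Run 'setup.py sdist' the way the stager does; return its stdout."""
    with tempfile.TemporaryDirectory() as dist_dir:
        result = subprocess.run(
            [sys.executable, "setup.py", "sdist", "--dist-dir", dist_dir],
            cwd=setup_dir,
            capture_output=True,
            text=True,
        )
    if result.returncode != 0:
        # stderr carries the part of the failure the RuntimeError drops.
        raise RuntimeError(result.stderr)
    return result.stdout
```

If the command fails the same way outside Cloud Functions, the problem is in the package's setup.py rather than in the deployment itself.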
--
This message was sent by Atlassian Jira
(v8.3.4#803005)