Xiaoge created BEAM-9924:
----------------------------

             Summary: Deploying a beam program as Google cloud function fails
                 Key: BEAM-9924
                 URL: https://issues.apache.org/jira/browse/BEAM-9924
             Project: Beam
          Issue Type: Bug
          Components: dependencies
    Affects Versions: 2.20.0
         Environment: 2.20.0
            Reporter: Xiaoge


I wrote a Beam program that runs the following pipeline: read --> extract 
values --> write to Google BigQuery. The program needs a custom dependency, so 
I followed the juliaset example to package it: 
[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset]
However, whenever I deploy the program as a Google Cloud Function, it fails 
with the following error.

 

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 383, in run_background_function
    _function_handler.invoke_user_function(event_object)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
    return call_user_function(request_or_event)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 214, in call_user_function
    event_context.Context(**request_or_event.context))
  File "/user_code/main.py", line 84, in main
    create_disposition = beam.io.BigQueryDisposition.CREATE_NEVER)
  File "/env/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 503, in __exit__
    self.run().wait_until_finish()
  File "/env/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 496, in run
    return self.runner.run_pipeline(self, self._options)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 548, in run_pipeline
    self.dataflow_client.create_job(self.job), self)
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/retry.py", line 234, in wrapper
    return fun(*args, **kwargs)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 624, in create_job
    self.create_job_description(job)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 680, in create_job_description
    resources = self._stage_resources(job.options)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 577, in _stage_resources
    staging_location=google_cloud_options.staging_location)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 199, in stage_job_resources
    setup_options.setup_file, temp_dir, build_setup_args)
  File "/env/local/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py", line 524, in _build_setup_package
    processes.check_output(build_setup_args)
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 97, in check_output
    .format(traceback.format_exc(), error.output))
RuntimeError: Full trace: Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/apache_beam/utils/processes.py", line 85, in check_output
    out = subprocess.check_output(*args, **kwargs)
  File "/opt/python3.7/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/opt/python3.7/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/env/bin/python3.7', 'setup.py', 'sdist', '--dist-dir', '/tmp/tmpi7hv8a6i']' returned non-zero exit status 1.
, output of the failed child process b'running sdist\nrunning egg_info\nwriting localpackage.egg-info/PKG-INFO\n'
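
The child-process output in the error appears to contain only stdout, so the actual reason `setup.py sdist` exited with status 1 is not visible. The following is a minimal sketch for rerunning the same command with stderr captured, which may help narrow the failure down. The function name `run_sdist` and its parameters are hypothetical and not part of Beam; the sketch assumes `setup.py` sits directly in `setup_dir`.

```python
import subprocess
import sys
import tempfile


def run_sdist(setup_dir, python=sys.executable):
    """Run `setup.py sdist` inside setup_dir.

    Returns (exit_code, stdout, stderr) so that the error text on stderr
    is available to the caller instead of being lost.
    """
    # Build into a throwaway directory, as the staging step in the trace does.
    with tempfile.TemporaryDirectory() as dist_dir:
        proc = subprocess.run(
            [python, "setup.py", "sdist", "--dist-dir", dist_dir],
            cwd=setup_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text (Python 3.7 compatible)
        )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_sdist(".")` from inside the function's source directory and logging the returned stderr should surface whatever message the build step printed before it failed.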


