I have a simple Apache Beam project using Python 3 to transform some data 
and write it to BigQuery. It uses a package called textstat. If I run it locally 
everything works, but when I run it on Dataflow I get the following error:


NameError: name 'textstat' is not defined [while running 'generatedPtransform-441']



This is my current setup.py file:


import setuptools

REQUIRED_PACKAGES = ['textstat==0.5.6']
PACKAGE_NAME = 'my_package'
PACKAGE_VERSION = '0.0.1'

setuptools.setup(
    name=PACKAGE_NAME,
    version=PACKAGE_VERSION,
    description='Example project',
    install_requires=REQUIRED_PACKAGES,
    packages=setuptools.find_packages(),
)



and these are my pipeline args:


pipeline_args = [
    '--project={}'.format('etl-example'),
    '--runner={}'.format('Dataflow'),
    '--temp_location=gs://dataflowtemporal/',
    '--setup_file=./setup.py',
]



and I run it like this:


pipeline_options = PipelineOptions(pipeline_args)
pipeline_options.view_as(StandardOptions).streaming = True
pipeline = beam.Pipeline(options=pipeline_options)
...
pipeline.run()



I also tried running this in the terminal before running the job:


python setup.py sdist --formats=gztar



but I get the same result: textstat is not found. Another thing I 
tried was running without setup.py and only with the argument


--requirements_file=./requirements.txt



But again, textstat is not found.

At this point I don't know what else to try.
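
One thing that may be worth checking: the error is a NameError rather than an ImportError, which can happen when the package is installed on the workers but a module-level `import textstat` in the main script is not visible to the pickled functions. A minimal sketch of the pipeline args from above with the Beam `--save_main_session` flag added follows. The helper name `build_pipeline_args` is hypothetical, and whether this flag resolves the error in this particular job is an assumption, not something confirmed here.

```python
# Sketch only: the pipeline args from the post plus --save_main_session.
# build_pipeline_args is a hypothetical helper, not part of Apache Beam.


def build_pipeline_args(project, temp_location, setup_file='./setup.py'):
    """Return a list of pipeline args for a Dataflow job."""
    return [
        '--project={}'.format(project),
        '--runner={}'.format('Dataflow'),
        '--temp_location={}'.format(temp_location),
        '--setup_file={}'.format(setup_file),
        # Ask Beam to pickle the state of the __main__ module so that
        # module-level imports are available to functions on the workers.
        '--save_main_session',
    ]


pipeline_args = build_pipeline_args('etl-example', 'gs://dataflowtemporal/')
```

Another option that is often suggested is to move `import textstat` inside the function or DoFn method that uses it, so the import runs on the worker itself instead of depending on the main session.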
