Hi, we are building a data pipeline with the Beam Python SDK and trying to run it on Dataflow, but we get the following error:
    A setup error was detected in beamapp-xxxxyyyy-0322102737-03220329-8a74-harness-lm6v. Please refer to the worker-startup log for detailed information.

However, we could not find any detailed worker-startup logs. We tried increasing the memory size, the worker count, etc., but we still get the same error.

Here is the command we use:

    python run.py \
      --project=xyz \
      --runner=DataflowRunner \
      --staging_location=gs://xyz/staging \
      --temp_location=gs://xyz/temp \
      --requirements_file=requirements.txt \
      --worker_machine_type n1-standard-8 \
      --num_workers 2

Pipeline snippet:

    data = pipeline | "load data" >> beam.io.Read(
        beam.io.BigQuerySource(query="SELECT * FROM abc_table LIMIT 100")
    )
    data | "filter data" >> beam.Filter(lambda x: x.get('column_name') == value)

The pipeline above just loads data from BigQuery and filters it on a column value. It works like a charm with the DirectRunner but fails on Dataflow.

Are we making an obvious setup mistake? Is anyone else getting the same error? We could use some help resolving the issue.

--
Rajesh Hegde | Lead Product Developer | Datalicious