Hi Valentyn,

Here is my pip freeze from my machine (note that the error occurs in Dataflow; the job runs fine on my machine):
ansiwrap==0.8.4 apache-beam==2.19.0 arrow==0.15.5 asn1crypto==1.3.0 astroid==2.3.3 astropy==3.2.3 attrs==19.3.0 avro-python3==1.9.1 azure-common==1.1.24 azure-storage-blob==2.1.0 azure-storage-common==2.1.0 backcall==0.1.0 bcolz==1.2.1 binaryornot==0.4.4 bleach==3.1.0 boto3==1.11.9 botocore==1.14.9 cachetools==3.1.1 certifi==2019.11.28 cffi==1.13.2 chardet==3.0.4 Click==7.0 cloudpickle==1.2.2 colorama==0.4.3 configparser==4.0.2 confuse==1.0.0 cookiecutter==1.7.0 crcmod==1.7 cryptography==2.8 cycler==0.10.0 daal==2019.0 datalab==1.1.5 decorator==4.4.1 defusedxml==0.6.0 dill==0.3.1.1 distro==1.0.1 docker==4.1.0 docopt==0.6.2 docutils==0.15.2 entrypoints==0.3 enum34==1.1.6 fairing==0.5.3 fastavro==0.21.24 fasteners==0.15 fsspec==0.6.2 future==0.18.2 gcsfs==0.6.0 gitdb2==2.0.6 GitPython==3.0.5 google-api-core==1.16.0 google-api-python-client==1.7.11 google-apitools==0.5.28 google-auth==1.11.0 google-auth-httplib2==0.0.3 google-auth-oauthlib==0.4.1 google-cloud-bigquery==1.17.1 google-cloud-bigtable==1.0.0 google-cloud-core==1.2.0 google-cloud-dataproc==0.6.1 google-cloud-datastore==1.7.4 google-cloud-language==1.3.0 google-cloud-logging==1.14.0 google-cloud-monitoring==0.31.1 google-cloud-pubsub==1.0.2 google-cloud-secret-manager==0.1.1 google-cloud-spanner==1.13.0 google-cloud-storage==1.25.0 google-cloud-translate==2.0.0 google-compute-engine==20191210.0 google-resumable-media==0.4.1 googleapis-common-protos==1.51.0 grpc-google-iam-v1==0.12.3 grpcio==1.26.0 h5py==2.10.0 hdfs==2.5.8 html5lib==1.0.1 htmlmin==0.1.12 httplib2==0.12.0 icc-rt==2020.0.133 idna==2.8 ijson==2.6.1 imageio==2.6.1 importlib-metadata==1.4.0 intel-numpy==1.15.1 intel-openmp==2020.0.133 intel-scikit-learn==0.19.2 intel-scipy==1.1.0 ipykernel==5.1.4 ipython==7.9.0 ipython-genutils==0.2.0 ipython-sql==0.3.9 ipywidgets==7.5.1 isort==4.3.21 jedi==0.16.0 Jinja2==2.11.0 jinja2-time==0.2.0 jmespath==0.9.4 joblib==0.14.1 json5==0.8.5 jsonschema==3.2.0 jupyter==1.0.0 jupyter-aihub-deploy-extension==0.1 
jupyter-client==5.3.4 jupyter-console==6.1.0 jupyter-contrib-core==0.3.3 jupyter-contrib-nbextensions==0.5.1 jupyter-core==4.6.1 jupyter-highlight-selected-word==0.2.0 jupyter-http-over-ws==0.0.7 jupyter-latex-envs==1.4.6 jupyter-nbextensions-configurator==0.4.1 jupyterlab==1.2.6 jupyterlab-git==0.9.0 jupyterlab-server==1.0.6 keyring==10.1 keyrings.alt==1.3 kiwisolver==1.1.0 kubernetes==10.0.1 lazy-object-proxy==1.4.3 llvmlite==0.31.0 lxml==4.4.2 Markdown==3.1.1 MarkupSafe==1.1.1 matplotlib==3.0.3 mccabe==0.6.1 missingno==0.4.2 mistune==0.8.4 mkl==2019.0 mkl-fft==1.0.6 mkl-random==1.0.1.1 mock==2.0.0 monotonic==1.5 more-itertools==8.1.0 nbconvert==5.6.1 nbdime==1.1.0 nbformat==5.0.4 networkx==2.4 nltk==3.4.5 notebook==6.0.3 numba==0.47.0 numpy==1.15.1 oauth2client==3.0.0 oauthlib==3.1.0 opencv-python==4.1.2.30 oscrypto==1.2.0 packaging==20.1 pandas==0.25.3 pandas-profiling==1.4.0 pandocfilters==1.4.2 papermill==1.2.1 parso==0.6.0 pathlib2==2.3.5 pbr==5.4.4 pexpect==4.8.0 phik==0.9.8 pickleshare==0.7.5 Pillow-SIMD==6.2.2.post1 pipdeptree==0.13.2 plotly==4.5.0 pluggy==0.13.1 poyo==0.5.0 prettytable==0.7.2 prometheus-client==0.7.1 prompt-toolkit==2.0.10 protobuf==3.11.2 psutil==5.6.7 ptyprocess==0.6.0 py==1.8.1 pyarrow==0.15.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycparser==2.19 pycrypto==2.6.1 pycryptodomex==3.9.6 pycurl==7.43.0 pydaal==2019.0.0.20180713 pydot==1.4.1 Pygments==2.5.2 pygobject==3.22.0 PyJWT==1.7.1 pylint==2.4.4 pymongo==3.10.1 pyOpenSSL==19.1.0 pyparsing==2.4.6 pyrsistent==0.15.7 pytest==5.3.4 pytest-pylint==0.14.1 python-apt==1.4.1 python-dateutil==2.8.1 pytz==2019.3 PyWavelets==1.1.1 pyxdg==0.25 PyYAML==5.3 pyzmq==18.1.1 qtconsole==4.6.0 requests==2.22.0 requests-oauthlib==1.3.0 retrying==1.3.3 rsa==4.0 s3transfer==0.3.2 scikit-image==0.15.0 scikit-learn==0.19.2 scipy==1.1.0 seaborn==0.9.1 SecretStorage==2.3.1 Send2Trash==1.5.0 simplegeneric==0.8.1 six==1.14.0 smmap2==2.0.5 snowflake-connector-python==2.2.0 SQLAlchemy==1.3.13 sqlparse==0.3.0 
tbb==2019.0 tbb4py==2019.0 tenacity==6.0.0 terminado==0.8.3 testpath==0.4.4 textwrap3==0.9.2 tornado==5.1.1 tqdm==4.42.0 traitlets==4.3.3 typed-ast==1.4.1 typing==3.7.4.1 typing-extensions==3.7.4.1 unattended-upgrades==0.1 uritemplate==3.0.1 urllib3==1.24.2 virtualenv==16.7.9 wcwidth==0.1.8 webencodings==0.5.1 websocket-client==0.57.0 Werkzeug==0.16.1 whichcraft==0.6.1 widgetsnbextension==3.5.1 wrapt==1.11.2 zipp==1.1.0

On Tue, Feb 4, 2020 at 11:33 AM Valentyn Tymofieiev <valen...@google.com> wrote:

> I don't think there is a mismatch between dill versions here, but
> https://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype
> mentions a similar error and may be related. What is the output of pip freeze on
> your machine (or better: pip install pipdeptree; pipdeptree)?
>
>
> On Tue, Feb 4, 2020 at 11:22 AM Alan Krumholz <alan.krumh...@betterup.co>
> wrote:
>
>> Here is a test job that sometimes fails and sometimes doesn't (but most
>> times it does)...
>> There seems to be something stochastic going on, as after several
>> tests a couple of them did succeed...
>>
>> def test_error(
>>         bq_table: str) -> bool:
>>
>>     import apache_beam as beam
>>     from apache_beam.options.pipeline_options import PipelineOptions
>>
>>     class GenData(beam.DoFn):
>>         def process(self, _):
>>             for _ in range(20000):
>>                 yield {'a': 1, 'b': 2}
>>
>>     def get_bigquery_schema():
>>         from apache_beam.io.gcp.internal.clients import bigquery
>>
>>         table_schema = bigquery.TableSchema()
>>         columns = [
>>             ["a", "integer", "nullable"],
>>             ["b", "integer", "nullable"]
>>         ]
>>
>>         for column in columns:
>>             column_schema = bigquery.TableFieldSchema()
>>             column_schema.name = column[0]
>>             column_schema.type = column[1]
>>             column_schema.mode = column[2]
>>             table_schema.fields.append(column_schema)
>>
>>         return table_schema
>>
>>     pipeline = beam.Pipeline(options=PipelineOptions(
>>         project='my-project',
>>         temp_location='gs://my-bucket/temp',
>>         staging_location='gs://my-bucket/staging',
>>         runner='DataflowRunner'
>>     ))
>>     # pipeline = beam.Pipeline()
>>
>>     (
>>         pipeline
>>         | 'Empty start' >> beam.Create([''])
>>         | 'Generate Data' >> beam.ParDo(GenData())
>>         # | 'print' >> beam.Map(print)
>>         | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
>>             project=bq_table.split(':')[0],
>>             dataset=bq_table.split(':')[1].split('.')[0],
>>             table=bq_table.split(':')[1].split('.')[1],
>>             schema=get_bigquery_schema(),
>>             create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
>>             write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE)
>>     )
>>
>>     result = pipeline.run()
>>     result.wait_until_finish()
>>
>>     return True
>>
>> test_error(
>>     bq_table='my-project:my_dataset.my_table'
>> )
>>
>> On Tue, Feb 4, 2020 at 10:04 AM Alan Krumholz <alan.krumh...@betterup.co>
>> wrote:
>>
>>> I tried breaking apart my pipeline. It seems the step that breaks it is:
>>> beam.io.WriteToBigQuery
>>>
>>> Let me see if I can create a self-contained example that breaks, to share
>>> with you.
>>>
>>> Thanks!
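[Editor's note: as background for the pickling errors discussed in this thread, Beam serializes DoFns with dill rather than the stdlib pickler precisely because classes defined inside a function body, like the GenData DoFn in the job above, cannot be located by a "module.QualifiedName" lookup. A minimal stdlib-only sketch of that limitation; the names `make_local_class` and `Local` are illustrative, not from the thread:]

```python
import pickle

def make_local_class():
    # A class defined inside a function body, like the GenData DoFn in the
    # pipeline above, is not importable by its qualified name.
    class Local:
        pass
    return Local

# The stdlib pickler serializes classes by reference (module + name), so it
# rejects the nested class; dill instead ships the class definition itself.
try:
    pickle.dumps(make_local_class())
    pickling_failed = False
except (pickle.PicklingError, AttributeError):
    pickling_failed = True

print(pickling_failed)  # -> True: stdlib pickle cannot serialize the nested class
```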
>>>
>>> On Tue, Feb 4, 2020 at 9:53 AM Pablo Estrada <pabl...@google.com> wrote:
>>>
>>>> Hm, that's odd. No changes to the pipeline? Are you able to share some
>>>> of the code?
>>>>
>>>> +Udi Meiri <eh...@google.com> do you have any idea what could be going
>>>> on here?
>>>>
>>>> On Tue, Feb 4, 2020 at 9:25 AM Alan Krumholz <alan.krumh...@betterup.co>
>>>> wrote:
>>>>
>>>>> Hi Pablo,
>>>>> This is strange... it doesn't seem to be the latest Beam release, as last
>>>>> night it was already using 2.19.0. I wonder if it was some release from the
>>>>> Dataflow team (not Beam related):
>>>>>
>>>>> Job type: Batch
>>>>> Job status: Succeeded
>>>>> SDK version: Apache Beam Python 3.5 SDK 2.19.0
>>>>> Region: us-central1
>>>>> Start time: February 3, 2020 at 9:28:35 PM GMT-8
>>>>> Elapsed time: 5 min 11 sec
>>>>>
>>>>> On Tue, Feb 4, 2020 at 9:15 AM Pablo Estrada <pabl...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Alan,
>>>>>> Could it be that you're picking up the new Apache Beam 2.19.0
>>>>>> release? Could you try depending on Beam 2.18.0 to see if the issue
>>>>>> surfaces when using the new release?
>>>>>>
>>>>>> If something was working and no longer works, it sounds like a bug.
>>>>>> This may have to do with how we pickle (dill / cloudpickle) - see this
>>>>>> question:
>>>>>> https://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype
>>>>>> Best
>>>>>> -P.
>>>>>>
>>>>>> On Tue, Feb 4, 2020 at 6:22 AM Alan Krumholz <
>>>>>> alan.krumh...@betterup.co> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was running a Dataflow job in GCP last night and it was running
>>>>>>> fine.
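[Editor's note: one way to try Pablo's suggestion of pinning to Beam 2.18.0 before resubmitting the job, assuming a pip/requirements.txt workflow; the `gcp` extra pulls in the Dataflow and BigQuery dependencies:]

```
# requirements.txt fragment (a sketch, not from the thread)
apache-beam[gcp]==2.18.0
```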
>>>>>>> This morning this same exact job is failing with the following error:
>>>>>>>
>>>>>>> Error message from worker: Traceback (most recent call last):
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 286, in loads
>>>>>>>     return dill.loads(s)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 275, in loads
>>>>>>>     return load(file, ignore, **kwds)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 270, in load
>>>>>>>     return Unpickler(file, ignore=ignore, **kwds).load()
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 472, in load
>>>>>>>     obj = StockUnpickler.load(self)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 577, in _load_type
>>>>>>>     return _reverse_typemap[name]
>>>>>>> KeyError: 'ClassType'
>>>>>>>
>>>>>>> During handling of the above exception, another exception occurred:
>>>>>>>
>>>>>>> Traceback (most recent call last):
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 648, in do_work
>>>>>>>     work_executor.execute()
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dataflow_worker/executor.py", line 176, in execute
>>>>>>>     op.start()
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 649, in apache_beam.runners.worker.operations.DoOperation.start
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 651, in apache_beam.runners.worker.operations.DoOperation.start
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 652, in apache_beam.runners.worker.operations.DoOperation.start
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 261, in apache_beam.runners.worker.operations.Operation.start
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.Operation.start
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 597, in apache_beam.runners.worker.operations.DoOperation.setup
>>>>>>>   File "apache_beam/runners/worker/operations.py", line 602, in apache_beam.runners.worker.operations.DoOperation.setup
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 290, in loads
>>>>>>>     return dill.loads(s)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 275, in loads
>>>>>>>     return load(file, ignore, **kwds)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 270, in load
>>>>>>>     return Unpickler(file, ignore=ignore, **kwds).load()
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 472, in load
>>>>>>>     obj = StockUnpickler.load(self)
>>>>>>>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 577, in _load_type
>>>>>>>     return _reverse_typemap[name]
>>>>>>> KeyError: 'ClassType'
>>>>>>>
>>>>>>> If I use a local runner it still runs fine.
>>>>>>> Is anyone else experiencing something similar today? (Or does anyone
>>>>>>> know how to fix this?)
>>>>>>>
>>>>>>> Thanks!
>>>>>>
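[Editor's note: for readers hitting the same trace, 'ClassType' refers to `types.ClassType`, the Python 2 old-style class type, which Python 3 removed. dill resolves pickled type names through a name-to-type map built from the running interpreter, so a payload that recorded 'ClassType' cannot be resolved on a worker whose dill/Python combination never registered that name. Below is a simplified stdlib-only sketch of the failing lookup; the dict is an illustration, not dill's actual `_reverse_typemap`:]

```python
import types

# Build a name -> type map from what this interpreter actually exposes,
# loosely mimicking how dill maps pickled type names back to type objects.
reverse_typemap = {
    name: getattr(types, name)
    for name in dir(types)
    if isinstance(getattr(types, name), type)
}

# Python 3 removed old-style classes, so the Python 2 name is absent and
# the lookup fails the same way the Dataflow worker's traceback does.
try:
    reverse_typemap['ClassType']
    missing = False
except KeyError:
    missing = True

print(missing)  # -> True on Python 3
```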