Akash Patel created BEAM-3403:
---------------------------------
Summary: Ingesting json file ValidationError: Expected type <type 'unicode'>
Key: BEAM-3403
URL: https://issues.apache.org/jira/browse/BEAM-3403
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.2.0
Reporter: Akash Patel
Assignee: Ahmet Altay
Reading a JSON file matched by a GCS file pattern with the Beam Python SDK
2.2.0 on Dataflow produces the following warning:
{code:bash}
Retry with exponential backoff: waiting for 4.21317187833 seconds before retrying report_completion_status because we caught exception: ValidationError: Expected type <type 'unicode'> for field name, found s05-s34-reify20-process-msecs (type <class 'apache_beam.utils.counters.CounterName'>)
Traceback for above exception (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 175, in wrapper
    return fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 491, in report_completion_status
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 299, in report_status
    work_executor=self._work_executor)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 316, in report_status
    append_counter(work_item_status, counter, tentative=not completed)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 43, in append_counter
    status_object, counter.name, kind, counter.accumulator, setter)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 95, in append_counter_update
    add_unstructured_name_and_kind(metric_update, metric_name, kind)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 63, in add_unstructured_name_and_kind
    metric_update.nameAndKind.name = metric_name
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 973, in __setattr__
    object.__setattr__(self, name, value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1299, in __set__
    value = self.validate(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1406, in validate
    return self.__validate(value, self.validate_element)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1364, in __validate
    return validate_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1549, in validate_element
    return super(StringField, self).validate_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1346, in validate_element
    (self.type, name, value, type(value)))
{code}
The job does not fail; instead it gets stuck trying to read the file, and the
above warning is logged on every retry.
Running the same job with Beam Python SDK 2.1.1 works without problems.
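
The traceback suggests that a CounterName object, rather than a plain string, reaches a string-only message field. Below is a minimal sketch of that failure mode. It uses stand-in classes, not the real apitools or Dataflow worker code: every class, field, and value in it is hypothetical and chosen only to mirror the error text above.

```python
from collections import namedtuple


class CounterName(namedtuple("CounterName", ["step", "name"])):
    """Hypothetical stand-in for a structured counter name.

    The real apache_beam.utils.counters.CounterName may have different
    fields; these two are invented for illustration.
    """

    def __str__(self):
        return "%s-%s" % (self.step, self.name)


class ValidationError(Exception):
    """Stand-in for the validation error raised by the message library."""


class NameAndKind(object):
    """Toy message with a single string-only field called `name`."""

    def __init__(self):
        self._name = None

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Reject anything that is not a plain string, as a strict
        # string field would.
        if not isinstance(value, str):
            raise ValidationError(
                "Expected type %r for field name, found %s (type %r)"
                % (str, value, type(value)))
        self._name = value


message = NameAndKind()
counter = CounterName("s05-s34-reify20", "process-msecs")

try:
    message.name = counter  # structured object: rejected by the field
except ValidationError as exc:
    print("rejected:", exc)

message.name = str(counter)  # plain string: accepted
print("accepted:", message.name)
```

Whether the worker should convert the counter name to a string at this point is an assumption made for the sketch, not a confirmed fix.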