Hello,
When I run the coders.py cookbook example [1], it raises the following
exception:
```
PicklingError : _pickle.PicklingError: Can't pickle <class '__main__.JsonCoder'>: it's not the same object as __main__.JsonCoder [while running 'read/Read/Map(<lambda at iobase.py:899>)']
```
The example pipeline works when it is run through coders_test.py [2], but
in that test the DAG creates the input collection itself instead of
reading it from a file.
I am using Python 3.7.9, the Beam Python SDK 2.29.0, and the DirectRunner,
on macOS (Darwin).
To reproduce the issue, run:
```
python coders.py --input input.ndjson --output output.txt
```
The input file (input.ndjson) contains the same values as coders_test.py:
```
{"host": ["Germany", 1], "guest": ["Italy", 0]}
{"host": ["Germany", 1], "guest": ["Brasil", 3]}
{"host": ["Brasil", 1], "guest": ["Italy", 0]}
```
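In case it helps with reproducing, the input file can also be generated
with a few lines of plain Python. This is only a convenience sketch: the
helper names (to_ndjson, write_ndjson) are my own and are not part of the
Beam example.
```python
import json

# Records copied from the input shown above.
RECORDS = [
    {"host": ["Germany", 1], "guest": ["Italy", 0]},
    {"host": ["Germany", 1], "guest": ["Brasil", 3]},
    {"host": ["Brasil", 1], "guest": ["Italy", 0]},
]


def to_ndjson(records):
    """Serialize records as NDJSON: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def write_ndjson(path, records):
    """Write the NDJSON text to the given path (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_ndjson(records))


# To create the file used in the command above:
# write_ndjson("input.ndjson", RECORDS)
```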
The full stack trace is at the bottom of the email.
Questions:
1. Am I doing something wrong?
2. Is a custom coder the recommended way to decode NDJSON files, or is it
better to just use ParDos / Maps to deserialize them?
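To make question 2 concrete, the Map-based alternative I have in mind
would look roughly like the sketch below. The function name parse_line is
hypothetical, and the Beam wiring in the comments is untested; I am
assuming the file would be read as plain text lines and decoded in a
separate step.
```python
import json


def parse_line(line):
    """Decode a single NDJSON line into a Python dict."""
    return json.loads(line)


# In the pipeline this would presumably be wired as (untested assumption):
#   lines = p | beam.io.ReadFromText(known_args.input)
#   records = lines | beam.Map(parse_line)

# Plain-Python check using the same lines as input.ndjson:
lines = [
    '{"host": ["Germany", 1], "guest": ["Italy", 0]}',
    '{"host": ["Germany", 1], "guest": ["Brasil", 3]}',
    '{"host": ["Brasil", 1], "guest": ["Italy", 0]}',
]
records = [parse_line(line) for line in lines]
```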
I really appreciate any help you can provide.
Kind regards,
Tatiana
[1]
https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/coders.py
[2]
https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/coders_test.py
```
Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1233, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 581, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1395, in apache_beam.runners.common._OutputProcessor.process_outputs
  File "apache_beam/runners/worker/operations.py", line 219, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 183, in apache_beam.runners.worker.operations.ConsumerSet.update_counters_start
  File "apache_beam/runners/worker/opcounters.py", line 217, in apache_beam.runners.worker.opcounters.OperationCounters.update_from
  File "apache_beam/runners/worker/opcounters.py", line 255, in apache_beam.runners.worker.opcounters.OperationCounters.do_sample
  File "apache_beam/coders/coder_impl.py", line 1311, in apache_beam.coders.coder_impl.WindowedValueCoderImpl.get_estimated_size_and_observables
  File "apache_beam/coders/coder_impl.py", line 1322, in apache_beam.coders.coder_impl.WindowedValueCoderImpl.get_estimated_size_and_observables
  File "apache_beam/coders/coder_impl.py", line 354, in apache_beam.coders.coder_impl.FastPrimitivesCoderImpl.get_estimated_size_and_observables
  File "apache_beam/coders/coder_impl.py", line 418, in apache_beam.coders.coder_impl.FastPrimitivesCoderImpl.encode_to_stream
  File "apache_beam/coders/coder_impl.py", line 261, in apache_beam.coders.coder_impl.CallbackCoderImpl.encode_to_stream
  File "/private/tmp/dataflow_sandbox/lib/python3.7/site-packages/apache_beam/coders/coders.py", line 749, in <lambda>
    lambda x: dumps(x, protocol), pickle.loads)
_pickle.PicklingError: Can't pickle <class '__main__.JsonCoder'>: it's not the same object as __main__.JsonCoder
```