[
https://issues.apache.org/jira/browse/BEAM-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293759#comment-17293759
]
Pete Boothroyd commented on BEAM-11826:
---------------------------------------
This is caused by a circular import. Before coders.py is fully evaluated, the
call to `from future.moves import pickle` causes the sibling file test.py
(/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py) to be
found and loaded by this line
https://github.com/PythonCharmers/python-future/blob/4657ad23de79a541ebcc7a06f1b9ad60172ad3c4/src/future/standard_library/__init__.py#L781
(because the filename collides with the builtin test module defined in the
TOP_LEVEL_MODULES list). The second file causes apache_beam to be imported
again and this time transitively imports `typecoders.py`. This file then tries
to import coders.py and load symbols from it which have not yet been evaluated.
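A minimal sketch of that failure mode, independent of Beam and future: the two module names below (demo_coders, demo_typecoders) are hypothetical stand-ins written to a temporary directory, not the real files. The first module imports the second before defining its class, so the second sees a partially initialised module.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def demo_partial_import():
    """Return (attribute seen during the cycle, attribute present afterwards)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical stand-in for coders.py: it imports something else
        # before its own symbols have been defined.
        (root / "demo_coders.py").write_text(textwrap.dedent("""
            import demo_typecoders  # runs before VarIntCoder exists

            class VarIntCoder:
                pass
        """))
        # Hypothetical stand-in for typecoders.py: it imports the first
        # module back and looks for a symbol that is not there yet.
        (root / "demo_typecoders.py").write_text(textwrap.dedent("""
            import demo_coders  # returns the partially initialised module

            SAW_VARINTCODER = hasattr(demo_coders, "VarIntCoder")
        """))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            coders = importlib.import_module("demo_coders")
            typecoders = importlib.import_module("demo_typecoders")
            return typecoders.SAW_VARINTCODER, hasattr(coders, "VarIntCoder")
        finally:
            # Leave the interpreter state as it was found.
            sys.path.remove(tmp)
            sys.modules.pop("demo_coders", None)
            sys.modules.pop("demo_typecoders", None)


saw_during, has_after = demo_partial_import()
print(saw_during, has_after)
```

In the real traceback the lookup is an attribute access (coders.VarIntCoder) rather than hasattr, which is why it surfaces as an AttributeError instead of a quiet False.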
I believe you could solve this problem by renaming that problematic file from
test.py, or by invoking your script in such a way that future will resolve the
test module to the builtin rather than your version (e.g. by running it as a
module with python -m ...).
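To check a directory for this kind of collision before renaming anything, something like the following may help. It is only a sketch: find_colliding_files is a hypothetical helper, and the default name list contains only "test" because that is the one collision confirmed here; the other entries of TOP_LEVEL_MODULES would have to be passed in by the caller. The directory to check is the one holding the script, since Python puts that directory first on sys.path when a script is run by path.

```python
import tempfile
from pathlib import Path


def find_colliding_files(directory, module_names=("test",)):
    """List entries in `directory` that would be importable under one of
    `module_names` and so could shadow a module of the same name."""
    wanted = set(module_names)
    hits = []
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file() and entry.suffix == ".py" and entry.stem in wanted:
            hits.append(entry.name)
        elif entry.is_dir() and entry.name in wanted \
                and (entry / "__init__.py").exists():
            # A package directory shadows in the same way as a single file.
            hits.append(entry.name + "/")
    return hits


# Illustration on a throwaway directory laid out like the one in the report.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dummy.py").write_text("import sys\n")
    (Path(tmp) / "test.py").write_text("import sys\n")
    hits = find_colliding_files(tmp)

print(hits)
```

An empty result for the script's directory would suggest the collision comes from somewhere else on sys.path.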
Given this is caused by the future module which should no longer be needed (as
beam no longer supports py2.7), I wonder if the inbuilt pickle module could be
used instead, which hopefully doesn't suffer from this confusing behaviour.
> AttributeError: module 'apache_beam.coders.coders' has no attribute
> 'VarIntCoder'
> ---------------------------------------------------------------------------------
>
> Key: BEAM-11826
> URL: https://issues.apache.org/jira/browse/BEAM-11826
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.27.0
> Environment: OS: Ubuntu 18.04.5 LTS
> Python: 3.6.10
> GCC: 7.5.0
> Reporter: Lautaro Quiroz
> Priority: P2
>
> Hi, I'm getting the following error when using apache-beam 2.27.0 in a
> Python3 virtualenv:
> {code:bash}
> airflow@airflow-worker-7fb797d459-nf8gh:~$
> /tmp/dataflow-venvtflya9ij/bin/python
> /home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py
> Traceback (most recent call last):
> File
> "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/dummy.py", line
> 1, in <module>
> import apache_beam
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/__init__.py",
> line 95, in <module>
> from apache_beam import coders
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/__init__.py",
> line 19, in <module>
> from apache_beam.coders.coders import *
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/coders.py",
> line 43, in <module>
> from future.moves import pickle
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/moves/__init__.py",
> line 8, in <module>
> import_top_level_modules()
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py",
> line 810, in import_top_level_modules
> with exclude_local_folder_imports(*TOP_LEVEL_MODULES):
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/future/standard_library/__init__.py",
> line 781, in __enter__
> module = __import__(m, level=0)
> File
> "/home/airflow/gcs/dags/advisor/create_dataset/beam_pipelines/test.py", line
> 3, in <module>
> from apache_beam.options.pipeline_options import PipelineOptions
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/options/pipeline_options.py",
> line 41, in <module>
> from apache_beam.transforms.display import HasDisplayData
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/__init__.py",
> line 23, in <module>
> from apache_beam.transforms import combiners
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/combiners.py",
> line 45, in <module>
> from apache_beam.transforms import core
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/transforms/core.py",
> line 40, in <module>
> from apache_beam.coders import typecoders
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py",
> line 198, in <module>
> registry = CoderRegistry()
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py",
> line 91, in __init__
> self.register_standard_coders(fallback_coder)
> File
> "/tmp/dataflow-venvtflya9ij/lib/python3.6/site-packages/apache_beam/coders/typecoders.py",
> line 95, in register_standard_coders
> self._register_coder_internal(int, coders.VarIntCoder)
> AttributeError: module 'apache_beam.coders.coders' has no attribute
> 'VarIntCoder'
> {code}
> My `dummy.py` file consists of only:
> {code:python}
> import apache_beam
> if __name__ == '__main__': print('MAIN')
> {code}
> Strangely to me, I do not get the error when running the venv python
> interactively and executing the `import apache_beam` statement inside:
> {code:bash}
> airflow@airflow-worker-7fb797d459-nf8gh:~$
> /tmp/dataflow-venvtflya9ij/bin/python
> Python 3.6.10 (default, Feb 1 2021, 12:07:35)
> [GCC 7.5.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import apache_beam
> >>>
> {code}
> Note: In both cases the `sys.path` packages dirs are exactly the same.
> Equally strangely to me, the script runs in both scenarios (script &
> interactive) when I use the previous version `apache-beam==2.26.0`.
> ---
> A bit of context.
> I ran into this error because I'm using a Google Cloud Composer (Airflow) operator
> ({{airflow.providers.google.cloud.operators.dataflow.DataflowCreatePythonJobOperator}})
> that creates a virtualenv, installs apache-beam, and
> executes the Beam script in order to run the Beam pipeline on Google
> Cloud Dataflow.
> I am able to reproduce this issue by connecting to the Kubernetes pod
> running this operator and manually executing the steps.
> ---
> In order to reproduce this issue, you can:
> 1. create a python3 virtualenv: `virtualenv /tmp/venv --python=python3`.
> 2. create the dummy.py file with the following code inside:
> {code}
> import apache_beam
> {code}
> 3. install apache-beam 2.27.0: `/tmp/venv/bin/pip install
> apache-beam==2.27.0`.
> 4. run the script: `/tmp/venv/bin/python dummy.py`.
> 5. check it does not happen with `apache-beam==2.26.0`.
> ---
> Also reported in:
> [https://stackoverflow.com/questions/66243327/inconsistent-behaviour-when-importing-a-package-interactively-vs-running-as-scri]