Rishi created BEAM-12512:
----------------------------
Summary: Parquetio.py throws "ValueError: invalid literal" when
ARROW_MAJOR_VERSION contains alpha-numeric
Key: BEAM-12512
URL: https://issues.apache.org/jira/browse/BEAM-12512
Project: Beam
Issue Type: Bug
Components: beam-community
Environment: Ubuntu 18.04
Reporter: Rishi
When Apache Arrow is built from a Git branch, the resulting version string looks like this:
/==================/
/home/arrow/python# python3 setup.py --version
*2.0.0.dev0+g478286658.d20210618*
/==================/
This causes an exception in the apache_beam code at the following
[line|https://github.com/apache/beam/blob/9af555d9ccdb0d7a378dbea456cdeefe2e781d6d/sdks/python/apache_beam/io/parquetio.py#L53]
because the generated version string contains non-numeric components:
/==================/
# python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> *import apache_beam as beam*
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/__init__.py", line 96, in <module>
    from apache_beam import io
  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/__init__.py", line 28, in <module>
    from apache_beam.io.parquetio import *
  File "/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/parquetio.py", line 53, in <module>
    ARROW_MAJOR_VERSION, _, _ = map(int, pa.__version__.split('.'))
*ValueError: invalid literal for int() with base 10: 'dev0+g478286658'*
/==================/
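The failure can be reproduced without Arrow or Beam installed. The following is a minimal sketch in which the variable `version` is an assumed stand-in for `pa.__version__`, set to the development version string reported above:

```python
# Assumed stand-in for pa.__version__ on a development build of Arrow.
version = "2.0.0.dev0+g478286658.d20210618"

# Splitting on '.' yields five components, two of them non-numeric.
parts = version.split(".")

try:
    # Same expression as parquetio.py line 53, with `version` substituted.
    # The first three components convert cleanly; the fourth one fails.
    major, _, _ = map(int, parts)
except ValueError as exc:
    error = str(exc)
    print(error)
```

A release version string such as "2.0.0" has exactly three numeric components, so the same expression succeeds there.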
Perhaps the determination of ARROW_MAJOR_VERSION could be modified to
account for such version strings.
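One possible approach, offered only as a sketch and not as the actual Beam fix, is to read just the leading digits of the version string instead of converting every component to an integer. The helper name `arrow_major_version` is hypothetical and does not exist in Beam or Arrow:

```python
import re


def arrow_major_version(version):
    """Return the leading integer component of a version string.

    Hypothetical helper: only the digits before the first non-digit
    character are parsed, so development suffixes are ignored.
    """
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(
            "cannot determine major version from {!r}".format(version))
    return int(match.group())


# Works for both release and development version strings.
print(arrow_major_version("2.0.0"))
print(arrow_major_version("2.0.0.dev0+g478286658.d20210618"))
```

In parquetio.py this would be called with `pa.__version__`. Only the major component is parsed, which matches the fact that the existing line discards the other two.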
Thanks.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)