Rishi created BEAM-12512:
----------------------------

             Summary: Parquetio.py throws "ValueError: invalid literal" when 
ARROW_MAJOR_VERSION contains alpha-numeric
                 Key: BEAM-12512
                 URL: https://issues.apache.org/jira/browse/BEAM-12512
             Project: Beam
          Issue Type: Bug
          Components: beam-community
         Environment: Ubuntu 18.04
            Reporter: Rishi


When Apache Arrow is built from a Git branch, the resulting version string looks like this:

/==================/

/home/arrow/python# python3 setup.py --version
 *2.0.0.dev0+g478286658.d20210618*

/==================/

 This causes an exception in the apache_beam code at the following 
[line|https://github.com/apache/beam/blob/9af555d9ccdb0d7a378dbea456cdeefe2e781d6d/sdks/python/apache_beam/io/parquetio.py#L53]
 because the version string contains non-numeric components:

/==================/
 # python3
 Python 3.6.9 (default, Jan 26 2021, 15:33:00)
 [GCC 8.4.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> *import apache_beam as beam*
 Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File 
"/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/__init__.py",
 line 96, in <module>
 from apache_beam import io
 File 
"/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/__init__.py",
 line 28, in <module>
 from apache_beam.io.parquetio import *
 File 
"/usr/local/lib/python3.6/dist-packages/apache_beam-2.30.0-py3.6-linux-x86_64.egg/apache_beam/io/parquetio.py",
 line 53, in <module>
 ARROW_MAJOR_VERSION, _, _ = map(int, pa.__version__.split('.'))
 *ValueError: invalid literal for int() with base 10: 'dev0+g478286658'*

/==================/

Perhaps the way ARROW_MAJOR_VERSION is determined could be modified to handle 
version strings like this one.
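
One possible approach, shown only as a sketch and not as the fix adopted by the Beam project, is to read just the leading integer of the version string instead of converting every dot-separated component. The helper name parse_major_version is hypothetical; in parquetio.py the input would be pa.__version__.

```python
import re


def parse_major_version(version):
    """Return the leading integer of a version string (hypothetical helper)."""
    # Match only the digits at the very start, so suffixes such as
    # 'dev0+g478286658' in later components are never passed to int().
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError("no leading major version in %r" % version)
    return int(match.group())


# The version string reported above for an Arrow build from a Git branch.
ARROW_MAJOR_VERSION = parse_major_version("2.0.0.dev0+g478286658.d20210618")
```

This also avoids unpacking into exactly three names, which the current line relies on and which a development version with extra components does not satisfy.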

 

Thanks.



