[ https://issues.apache.org/jira/browse/ARROW-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-1445:
--------------------------------
    Summary: [Python] Segfault when using libhdfs3 in pyarrow using latest API  (was: Python: Segfault when using libhdfs3 in pyarrow using latest API)

> [Python] Segfault when using libhdfs3 in pyarrow using latest API
> -----------------------------------------------------------------
>
>                 Key: ARROW-1445
>                 URL: https://issues.apache.org/jira/browse/ARROW-1445
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.6.0
>            Reporter: James Porritt
>            Priority: Major
>
> I'm encountering a segfault when using libhdfs3 with pyarrow.
> My script is:
> {code}
> import pyarrow
> def main():
>     hdfs = pyarrow.hdfs.connect("<host>", <port>, "<username>", driver='libhdfs')
>     print hdfs.ls('<my path>')
>     hdfs3a = pyarrow.HdfsClient("<host>", <port>, "<username>", driver='libhdfs3')
>     print hdfs3a.ls('<my path>')
>     hdfs3b = pyarrow.hdfs.connect("<host>", <port>, "<username>", driver='libhdfs3')
>     print hdfs3b.ls('<my path>')
> main()
> {code}
> The first two hdfs connections yield the correct list. The third yields:
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00007f69c0c8b57f, pid=88070, tid=140092200666880
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C [libc.so.6+0x13357f] __strlen_sse42+0xf
> {noformat}
> It dumps an error report file too.
> I created my conda environment with:
> {noformat}
> conda create -n parquet
> source activate parquet
> conda install pyarrow libhdfs3 -c conda-forge
> {noformat}
> The packages used are:
> {noformat}
> arrow-cpp        0.6.0        np113py27_1  conda-forge
> boost-cpp        1.64.0       1            conda-forge
> bzip2            1.0.6        1            conda-forge
> ca-certificates  2017.7.27.1  0            conda-forge
> certifi          2017.7.27.1  py27_0       conda-forge
> curl             7.54.1       0            conda-forge
> icu              58.1         1            conda-forge
> krb5             1.14.2       0            conda-forge
> libgcrypt        1.8.0        0            conda-forge
> libgpg-error     1.27         0            conda-forge
> libgsasl         1.8.0        1            conda-forge
> libhdfs3         2.3          0            conda-forge
> libiconv         1.14         4            conda-forge
> libntlm          1.4          0            conda-forge
> libssh2          1.8.0        1            conda-forge
> libuuid          1.0.3        1            conda-forge
> libxml2          2.9.4        4            conda-forge
> mkl              2017.0.3     0
> ncurses          5.9          10           conda-forge
> numpy            1.13.1       py27_0
> openssl          1.0.2l       0            conda-forge
> pandas           0.20.3       py27_1       conda-forge
> parquet-cpp      1.3.0.pre    1            conda-forge
> pip              9.0.1        py27_0       conda-forge
> protobuf         3.3.2        py27_0       conda-forge
> pyarrow          0.6.0        np113py27_1  conda-forge
> python           2.7.13       1            conda-forge
> python-dateutil  2.6.1        py27_0       conda-forge
> pytz             2017.2       py27_0       conda-forge
> readline         6.2          0            conda-forge
> setuptools       36.2.2       py27_0       conda-forge
> six              1.10.0       py27_1       conda-forge
> sqlite           3.13.0       1            conda-forge
> tk               8.5.19       2            conda-forge
> wheel            0.29.0       py27_0       conda-forge
> xz               5.2.3        0            conda-forge
> zlib             1.2.11       0            conda-forge
> {noformat}
> I've set my ARROW_LIBHDFS_DIR to point at the location of the libhdfs3.so file.
> I've populated my CLASSPATH as per the documentation.
> Please advise.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)