[ https://issues.apache.org/jira/browse/ARROW-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Micah Kornfield updated ARROW-4802:
-----------------------------------
    Summary: [Python] Hadoop classpath discovery broken HADOOP_HOME is a symlink  (was: [Python] Hadoop classpath discovery broken in 0.12 when HADOOP_HOME is a symlin)

> [Python] Hadoop classpath discovery broken HADOOP_HOME is a symlink
> -------------------------------------------------------------------
>
>                 Key: ARROW-4802
>                 URL: https://issues.apache.org/jira/browse/ARROW-4802
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Micah Kornfield
>            Priority: Major
>
> From [https://github.com/apache/arrow/issues/3748]:
> CLASSPATH discovery was recently changed in
> [{{d911850}}|https://github.com/apache/arrow/commit/d91185000945cec96abad41a230d05d3cdefff93]
> to resolve ARROW-2113 and ARROW-3768.
> Specifically, the logic used to find all jars under HADOOP_HOME uses the find
> command directly
> [arrow/python/pyarrow/hdfs.py|https://github.com/apache/arrow/blob/d91185000945cec96abad41a230d05d3cdefff93/python/pyarrow/hdfs.py#L144]
> Line 144 in
> [d911850|https://github.com/apache/arrow/commit/d91185000945cec96abad41a230d05d3cdefff93]
> | |find_args = ('find', os.environ['HADOOP_HOME'], '-name', '*.jar')|
> This will not work when HADOOP_HOME is a symlink, in which case '-L' needs to
> be passed to the find command.
> CLASSPATH can still be set explicitly, but this is a change in behavior as
> HADOOP_HOME symlinks worked without issue before.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
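
For illustration, a minimal sketch of the change the description suggests: passing '-L' to find so that a symlinked HADOOP_HOME is followed. The helper names hadoop_jar_find_args and discover_hadoop_jars are hypothetical and are not part of pyarrow; this is not necessarily the fix that was committed.

```python
import subprocess


def hadoop_jar_find_args(hadoop_home):
    """Build a find command line that follows symlinks while looking for jars.

    Hypothetical helper, not part of pyarrow.
    """
    # '-L' has to come before the starting path. It tells find to follow
    # symbolic links, including the case where hadoop_home is itself one.
    return ('find', '-L', hadoop_home, '-name', '*.jar')


def discover_hadoop_jars(hadoop_home):
    """Return the sorted jar paths that find reports under hadoop_home."""
    output = subprocess.check_output(hadoop_jar_find_args(hadoop_home))
    return sorted(output.decode('utf-8').splitlines())
```

Assuming the jars are then joined into a colon-separated CLASSPATH, a caller could use ':'.join(discover_hadoop_jars(os.environ['HADOOP_HOME'])). Note that '-L' also follows symlinks nested below HADOOP_HOME, which may or may not be desired.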