[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-5049:
----------------------------------
    Labels: pull-request-available  (was: )

> [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
> ----------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5049
>                 URL: https://issues.apache.org/jira/browse/ARROW-5049
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.12.0, 0.13.0, 0.12.1
>            Reporter: Tiger068
>            Assignee: Tiger068
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>
> When I initialize a pyarrow FileSystem to connect to an HDFS cluster in Spark, libhdfs throws this error:
> {code:java}
> org/apache/hadoop/fs/FileSystem class not found
> {code}
> I printed the CLASSPATH; its value is in wildcard mode:
> {code:java}
> ../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
> {code}
> The value is set by Spark, but libhdfs must load classes from jar files.
>
> Root cause:
> In hdfs.py we only check whether the string 'hadoop' appears in the CLASSPATH, not whether it contains jar files:
> {code:java}
> def _maybe_set_hadoop_classpath():
>     if 'hadoop' in os.environ.get('CLASSPATH', ''):
>         return
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
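The root cause described in the report suggests a stricter check: treat CLASSPATH as usable only when it names an actual Hadoop jar, and otherwise ask the hadoop launcher for an expanded classpath. The following is a minimal sketch of that idea, not the patch attached to this issue. The helper names (classpath_has_hadoop_jar, maybe_set_hadoop_classpath), the hadoop-common jar pattern, and the injectable resolve parameter are assumptions made for illustration. The sketch also assumes the hadoop executable is on PATH and supports "hadoop classpath --glob", which prints the classpath with wildcards expanded.

```python
import os
import re
import subprocess

# Assumption: a usable classpath names a hadoop-common jar explicitly,
# rather than merely containing the substring "hadoop".
_HADOOP_COMMON_JAR = re.compile(r"hadoop-common[^/\\:;]*\.jar")


def classpath_has_hadoop_jar(classpath):
    """Return True only if an explicit hadoop-common jar is listed."""
    return _HADOOP_COMMON_JAR.search(classpath) is not None


def _hadoop_classpath_glob():
    """Ask the hadoop launcher for a classpath with wildcards expanded."""
    out = subprocess.check_output(["hadoop", "classpath", "--glob"])
    return out.decode("utf-8").strip()


def maybe_set_hadoop_classpath(environ=None, resolve=_hadoop_classpath_glob):
    """Replace CLASSPATH unless it already lists Hadoop jars.

    `resolve` is a hypothetical hook that returns the expanded classpath;
    it is injectable so the logic can be exercised without Hadoop installed.
    """
    if environ is None:
        environ = os.environ
    if classpath_has_hadoop_jar(environ.get("CLASSPATH", "")):
        return  # jars are listed explicitly; leave CLASSPATH alone
    environ["CLASSPATH"] = resolve()
```

With the CLASSPATH value quoted in the report, a plain substring test for 'hadoop' succeeds and returns early, whereas the jar-based test in this sketch fails and falls through to resolving an expanded classpath.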
[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tiger068 updated ARROW-5049:
---------------------------
    Affects Version/s: 0.13.0
[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kouhei Sutou updated ARROW-5049:
-------------------------------
    Fix Version/s:     (was: 0.13.0)
                   0.14.0
[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tiger068 updated ARROW-5049:
---------------------------
    Priority: Major  (was: Minor)
[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tiger068 updated ARROW-5049:
---------------------------
    Description: 
When I initialize a pyarrow FileSystem to connect to an HDFS cluster in Spark, libhdfs throws this error:
{code:java}
org/apache/hadoop/fs/FileSystem class not found
{code}
I printed the CLASSPATH; its value is in wildcard mode:
{code:java}
../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
{code}
The value is set by Spark, but libhdfs must load classes from jar files.

Root cause:
We only check whether the string 'hadoop' appears in the CLASSPATH, not whether it contains jar files:
{code:java}
def _maybe_set_hadoop_classpath():
    if 'hadoop' in os.environ.get('CLASSPATH', ''):
        return
{code}

  was:
When I initialize a pyarrow FileSystem to connect to an HDFS cluster in Spark, libhdfs throws this error:
{code:java}
org/apache/hadoop/fs/FileSystem class not found
{code}
I printed the CLASSPATH; its value is in wildcard mode:
{code:java}
../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
{code}
Than value is set by Spark, but libhdfs must load classes from jar files.

Root cause:
We only check whether the string 'hadoop' appears in the CLASSPATH, not whether it contains jar files:
{code:java}
def _maybe_set_hadoop_classpath():
    if 'hadoop' in os.environ.get('CLASSPATH', ''):
        return
{code}
[jira] [Updated] (ARROW-5049) [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
[ https://issues.apache.org/jira/browse/ARROW-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tiger068 updated ARROW-5049:
---------------------------
    Description: 
When I initialize a pyarrow FileSystem to connect to an HDFS cluster in Spark, libhdfs throws this error:
{code:java}
org/apache/hadoop/fs/FileSystem class not found
{code}
I printed the CLASSPATH; its value is in wildcard mode:
{code:java}
../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
{code}
The value is set by Spark, but libhdfs must load classes from jar files.

Root cause:
In hdfs.py we only check whether the string 'hadoop' appears in the CLASSPATH, not whether it contains jar files:
{code:java}
def _maybe_set_hadoop_classpath():
    if 'hadoop' in os.environ.get('CLASSPATH', ''):
        return
{code}

  was:
When I initialize a pyarrow FileSystem to connect to an HDFS cluster in Spark, libhdfs throws this error:
{code:java}
org/apache/hadoop/fs/FileSystem class not found
{code}
I printed the CLASSPATH; its value is in wildcard mode:
{code:java}
../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
{code}
The value is set by Spark, but libhdfs must load classes from jar files.

Root cause:
We only check whether the string 'hadoop' appears in the CLASSPATH, not whether it contains jar files:
{code:java}
def _maybe_set_hadoop_classpath():
    if 'hadoop' in os.environ.get('CLASSPATH', ''):
        return
{code}