[jira] [Comment Edited] (ARROW-5236) [Python] hdfs.connect() is trying to load libjvm in windows

2019-06-04 Thread Urmila (JIRA)


[ https://issues.apache.org/jira/browse/ARROW-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855674#comment-16855674 ]

Urmila edited comment on ARROW-5236 at 6/4/19 1:07 PM:
---

Hi, I am also facing the same issue. I have conda and Spark installed on my local
Windows machine and am trying to connect to HDFS (on Unix) as shown below:

{code}
import pyarrow as pa
fs = pa.hdfs.connect('hostname.xx.xx.com', port_number, user='a...@xyx.com',
                     kerb_ticket='local machine path')
{code}
{noformat}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "C:\Users\vishurm\opt\miniconda3\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libjvm
{noformat}
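Not part of the original report, but a rough diagnostic sketch: the "Unable to load libjvm" failure usually means no JVM shared library was found under JAVA_HOME, and Windows JDKs ship jvm.dll (typically under bin\server or jre\bin\server) rather than libjvm.so. Here `find_jvm_library` is a hypothetical helper, not part of pyarrow:

```python
import os

def find_jvm_library(java_home):
    """Search a JDK install tree for the JVM shared library.

    Windows JDKs ship jvm.dll; Unix JDKs ship libjvm.so.
    Returns the first match found, or None if there is none."""
    names = {"jvm.dll", "libjvm.so", "libjvm.dylib"}
    for root, _dirs, files in os.walk(java_home):
        for fname in files:
            if fname in names:
                return os.path.join(root, fname)
    return None

# Example: check JAVA_HOME up front instead of hitting the opaque
# error deep inside the native layer. jvm is None when no JVM
# library exists under the given directory.
jvm = find_jvm_library(os.environ.get("JAVA_HOME", "."))
```

Running a check like this before `pa.hdfs.connect()` turns the native-layer error into an actionable question about the JAVA_HOME setting.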



> [Python] hdfs.connect() is trying to load libjvm in windows
> ---
>
> Key: ARROW-5236
> URL: https://issues.apache.org/jira/browse/ARROW-5236
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
> Environment: Windows 7 Enterprise, pyarrow 0.13.0
>Reporter: Kamaraju
>Priority: Major
>  Labels: hdfs
>
> This issue was originally reported at 
> [https://github.com/apache/arrow/issues/4215] . Raising a Jira as per Wes 
> McKinney's request.
> Summary:
>  The following script
> {code}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {code}
> tries to load libjvm on Windows 7, which is not expected.
> {noformat}
> $ python ./expt2.py
> Traceback (most recent call last):
>   File "./expt2.py", line 3, in <module>
>     fs = pa.hdfs.connect()
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
>     extra_conf=extra_conf)
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
>     self._connect(host, port, user, kerb_ticket, driver, extra_conf)
>   File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
>   File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Unable to load libjvm
> {noformat}
> There is no libjvm file in a Windows Java installation:
> {noformat}
> $ echo $JAVA_HOME
> C:\Progra~1\Java\jdk1.8.0_141
> $ find $JAVA_HOME -iname '*libjvm*'
> 
> {noformat}
> I see the libjvm error with both 0.11.1 and 0.13.0 versions of pyarrow.
> Steps to reproduce the issue (with more details):
> Create the environment
> {noformat}
> $ cat scratch_py36_pyarrow.yml
> name: scratch_py36_pyarrow
> channels:
>   - defaults
> dependencies:
>   - python=3.6.8
>   - pyarrow
> {noformat}
> {noformat}
> $ conda env create -f scratch_py36_pyarrow.yml
> {noformat}
> Apply the following patch to lib/site-packages/pyarrow/hdfs.py. I had to do
> this because the Hadoop installation that comes with the MapR ([https://mapr.com/])
> Windows client only ships $HADOOP_HOME/bin/hadoop.cmd. There is no file named
> $HADOOP_HOME/bin/hadoop, so the subsequent subprocess.check_output call
> fails with FileNotFoundError if this patch is not applied.
> {noformat}
> $ cat ~/x/patch.txt
> 131c131
> < hadoop_bin = '{0}/bin/hadoop'.format(os.environ['HADOOP_HOME'])
> ---
> > hadoop_bin = '{0}/bin/hadoop.cmd'.format(os.environ['HADOOP_HOME'])
> $ patch /c/ProgramData/Continuum/Anaconda/envs/scratch_py36_pyarrow/lib/site-packages/pyarrow/hdfs.py ~/x/patch.txt
> patching file /c/ProgramData/Continuum/Anaconda/envs/scratch_py36_pyarrow/lib/site-packages/pyarrow/hdfs.py
> {noformat}
> Activate the environment
> {noformat}
> $ source activate scratch_py36_pyarrow
> {noformat}
> Sample script
> {noformat}
> $ cat expt2.py
> import pyarrow as pa
> fs = pa.hdfs.connect()
> {noformat}
> Execute the script
> {noformat}
> $ python ./expt2.py
> Traceback (most recent call last):
>   File "./expt2.py", line 3, in <module>
>     fs = pa.hdfs.connect()
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 183, in connect
>     extra_conf=extra_conf)
>   File "C:\ProgramData\Continuum\Anaconda\envs\scratch_py36_pyarrow\lib\site-packages\pyarrow\hdfs.py", line 37, in __init__
>     self._connect(host, port, user, kerb_ticket, driver, extra_conf)
>   File "pyarrow\io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
>   File "pyarrow\error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Unable to load libjvm
> {noformat}
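The patch in the quoted report hard-codes hadoop.cmd, which would break the Unix case. As an illustration only (`hadoop_executable` is a hypothetical helper, not the code in pyarrow's hdfs.py), the lookup could try both names and keep the Unix behaviour intact:

```python
import os

def hadoop_executable(hadoop_home):
    """Locate the hadoop launcher under $HADOOP_HOME/bin.

    Prefers the Unix 'hadoop' script, falling back to 'hadoop.cmd',
    which is all the MapR Windows client ships."""
    for name in ("hadoop", "hadoop.cmd"):
        candidate = os.path.join(hadoop_home, "bin", name)
        if os.path.exists(candidate):
            return candidate
    # Neither exists: return the conventional Unix path so the
    # caller's error message still points at the expected location.
    return os.path.join(hadoop_home, "bin", "hadoop")
```

A lookup like this would let the same code path serve both platforms instead of requiring a local patch.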




--
This message was sent by Atlassian JIRA