You said you are on "Windows 10", but then you mention "libhdfs.so".
Might that be the cause? I think that on Windows Arrow looks for "hdfs.dll",
not "libhdfs.so".
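
If it helps, here is a rough sketch for checking which file is actually
present. It is not Arrow's own lookup code: the helper names are made up for
illustration, the "hdfs.dll" name is only my guess from above, and I am
assuming only the two locations mentioned in this thread (ARROW_LIBHDFS_DIR
and $HADOOP_HOME/lib/native) matter.

```python
import os
import platform
from pathlib import Path


def expected_libhdfs_name(system=None):
    """Library file name we assume Arrow looks for on the given OS.

    Assumption from this thread: "hdfs.dll" on Windows, "libhdfs.so" otherwise.
    """
    system = system or platform.system()
    return "hdfs.dll" if system == "Windows" else "libhdfs.so"


def candidate_dirs(env=None):
    """Directories to inspect, built from the environment variables in this thread."""
    env = os.environ if env is None else env
    dirs = []
    if env.get("ARROW_LIBHDFS_DIR"):
        dirs.append(Path(env["ARROW_LIBHDFS_DIR"]))
    if env.get("HADOOP_HOME"):
        dirs.append(Path(env["HADOOP_HOME"]) / "lib" / "native")
    return dirs


def find_libhdfs(env=None, system=None):
    """Return the first matching library path, or None if nothing matches."""
    name = expected_libhdfs_name(system)
    for directory in candidate_dirs(env):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print("Looking for:", expected_libhdfs_name())
    for directory in candidate_dirs():
        print("Checked directory:", directory)
    print("Found:", find_libhdfs())
```

If this prints "Found: None" on Windows while the directory holds only
"libhdfs.so", the Hadoop native libraries there were most likely built for
Linux rather than Windows.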

On Thu, May 6, 2021 at 3:51 PM Abdulrahman M. Selim <
[email protected]> wrote:

> Hello,
> I wanted to ask a question on GitHub but was directed to ask via this email
> address instead.
> I want to use PyArrow with HDFS to read and write files, but I have an
> issue with using it.
> I am currently using PyArrow on Windows 10. Hadoop itself is installed
> correctly and Java is also installed, but I keep getting "OSError: Unable to
> load libhdfs: The specified module could not be found." when using
> fs.HadoopFileSystem().
> I did what all the threads on GitHub and Stack Overflow suggested, namely
> explicitly setting the location
> {ARROW_LIBHDFS_DIR=$HADOOP_HOME/lib/native}, and that location has both
> libhdfs.so and libhdfs.so.0.0.0 inside it. No matter how many times I set it
> explicitly, it doesn't work and I keep getting the same error.
> Thank you in advance.
> Yours sincerely,
> Abdulrahman
>
>
> --
>
> Abdulrahman Mohamed Selim
>
> Computer Science Master's student at Saarland University.
> Biomedical Engineering Bachelor's, Class of 2019.
> LinkedIn Account: https://www.linkedin.com/in/abdulrahmanselim/
> <https://www.linkedin.com/in/abdelrahmanramzy/>
>