You said you are on Windows 10, but then you mention "libhdfs.so". Might that be the cause? I think that on Windows, Arrow looks for "hdfs.dll", not for "libhdfs.so".
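
As a rough sketch of what I mean (the paths, host, and port below are placeholders for your own setup, not taken from your message), on Windows I would point ARROW_LIBHDFS_DIR at the directory that contains hdfs.dll (often %HADOOP_HOME%\bin) rather than at lib\native, and then connect like this:

    import os
    import pyarrow.fs as fs

    # Placeholder paths -- adjust to your own Hadoop and Java installs.
    # On Windows, Arrow tries to load "hdfs.dll", so ARROW_LIBHDFS_DIR
    # must point at the directory containing hdfs.dll.
    os.environ["HADOOP_HOME"] = r"C:\hadoop"
    os.environ["ARROW_LIBHDFS_DIR"] = r"C:\hadoop\bin"
    os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_281"

    # Connect to the NameNode; host and port are placeholders.
    hdfs = fs.HadoopFileSystem(host="localhost", port=9000)

    # Simple round trip: write a file, then read it back.
    with hdfs.open_output_stream("/tmp/hello.txt") as f:
        f.write(b"hello from pyarrow\n")
    with hdfs.open_input_stream("/tmp/hello.txt") as f:
        print(f.read())

If hdfs.dll is not present in your Hadoop distribution, that would explain the "module could not be found" error regardless of how ARROW_LIBHDFS_DIR is set.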
On Thu, May 6, 2021 at 3:51 PM Abdulrahman M. Selim <[email protected]> wrote:

> Hello,
> I wanted to ask a question on GitHub but was directed to ask via this email instead.
> I want to use PyArrow with HDFS to read and write files, but I have an issue using it.
> I am currently using PyArrow on Windows 10. Hadoop itself is installed correctly and Java is also installed, but I keep getting "OSError: Unable to load libhdfs: The specified module could not be found." when using fs.HadoopFileSystem().
> I did what all the threads on GitHub & Stack Overflow suggest, namely explicitly setting {ARROW_LIBHDFS_DIR=$HADOOP_HOME/lib/native}, and that location contains both libhdfs.so and libhdfs.so.0.0.0. However many times I tried and explicitly set it, it doesn't work and I keep getting the same error.
> Thank you in advance.
> Yours sincerely,
> Abdulrahman
>
> --
>
> Abdulrahman Mohamed Selim
>
> Computer Science Master's student at Saarland University.
> Biomedical Engineering Bachelor's, Class of 2019.
> LinkedIn Account: https://www.linkedin.com/in/abdulrahmanselim/
> <https://www.linkedin.com/in/abdelrahmanramzy/>
