Hello, you don't have to go through HDFS or Java to access ADLS Gen2. It is essentially an improved API on top of Azure Blob Storage, so you can use the blob APIs from https://azure-storage.readthedocs.io/ to access the relevant containers. I've previously used https://github.com/blue-yonder/storefact together with `pyarrow.parquet` to read Parquet files reliably from there.
Cheers
Uwe

On Wed, Jul 17, 2019, at 11:10 AM, Игорь Кравченко wrote:
> Hello,
> Yesterday I opened an issue on GitHub and received advice to ask this
> question here. Link to the issue:
> https://github.com/apache/arrow/issues/4888
> Generally, I have a Storage Account in Azure and a virtual machine from
> which I want to connect to Data Lake, and I am trying to do that with
> PyArrow. As I wrote, I tried to access it using different drivers,
> "libhdfs" and "libhdfs3", and constantly got the same error: Timeout.
> One of the authorization options for Storage is a Shared Key, and with
> it the "hdfs" terminal commands worked just fine. The command I used:
> *hdfs dfs -get abfss://[email protected]/new-direc
> /home/adminello/files/*
> *ABFSS is a special driver created to access ADLS Gen2, which I
> currently have*
> *(*https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-abfs-driver),
> and hdfs works with this driver only from version 3.2.0. BUT the folder
> hadoop-3.2.0/lib/native was missing the "libhdfs" file, which really
> surprised me, because it was present in hadoop-3.1.2, although that
> version doesn't work with *abfss*. Just details, in case You need them.
>
> But my company wants to access it through Python for further data
> analysis, which is relatively easy to do using Pandas, so at that point
> I started trying to connect via Python.
> I didn't find any example on Google, which is why I am not sure my
> script is correct. I tried lots of combinations, changing ports,
> replacing the file system name, and adding "https" and "abfss" at the
> beginning of the hostname, and at this point I am stuck. Maybe somebody
> can help me, please.
> Thanks
