Tom-Newton commented on issue #29847: URL: https://github.com/apache/arrow/issues/29847#issuecomment-1646564153
It looks like the Azure file system skeleton has been merged (https://github.com/apache/arrow/issues/35903), so I think this is probably the next step.

https://github.com/apache/arrow/pull/12914 did a lot of the work, but when I came to use it I had to change a few things:

1. https://github.com/apache/arrow/pull/12914 created an external project for each relevant component of the Azure C++ SDK. However, each release of the Azure SDK actually contains the entire Azure C++ SDK, despite the confusing naming convention. For example, [azure-identity_1.5.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-identity_1.5.1) is actually just a newer release of [azure-core_1.10.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-core_1.10.1). As a result, which version of the Azure SDK the original PR ended up using was effectively a race condition.
2. It's valuable to use a very recent version of the Azure SDK. [azure-identity_1.5.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-identity_1.5.1) from 2023-07-06 fixed a bug in managed identity authentication. Managed identity is supposed to be the best auth method for Azure, so I think it's important that it works. Additionally, the build seemed broken for the release used by https://github.com/apache/arrow/pull/12914, because a new release of zlib (https://www.zlib.net/) meant that https://zlib.net/zlib-1.2.12.tar.gz no longer exists.
3. There is a new dependency on `xml2`. This can be installed straightforwardly from `apt` in the Ubuntu Docker builds, but I did not find a simple solution for the manylinux build. I'm not sure it's a good solution, but I was able to get this working by adding a bundled CMake build for `xml2`. I mostly copied how this was done with `nlohmann_json` and `crc32c` for the Google Cloud SDK.
4. There are difficulties around OpenSSL versions. The Azure C++ SDK seems to build for OpenSSL 3 by default, and the way to configure it is quite painful (see https://github.com/Azure/azure-sdk-for-cpp/blob/main/README.md#openssl-version). I feel like there must be a better way to do this, but I have not found it. Builds that use vcpkg to install OpenSSL get version 3, so those work, but I had problems with the more pure CMake build, since there we only have OpenSSL>=1.0.2.

I will rebase the changes described above onto main and open a draft PR. I have basically no experience with C++, but I have built a Python wheel using these changes that we've been using in production for a while now. If someone who knows what they are talking about can point me in the right direction, maybe I can get something ready for review.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
