Tom-Newton commented on issue #29847:
URL: https://github.com/apache/arrow/issues/29847#issuecomment-1646564153

   It looks like the Azure filesystem skeleton has been merged 
(https://github.com/apache/arrow/issues/35903), so I think this is probably 
the next step. 
   
   https://github.com/apache/arrow/pull/12914 did a lot of the work, but when I 
came to use it I had to change a few things:
   1. https://github.com/apache/arrow/pull/12914 created an external project 
for each relevant component of the Azure C++ SDK. However, each release of the 
Azure SDK actually contains the entire Azure C++ SDK, despite the confusing 
naming convention. For example, 
[azure-identity_1.5.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-identity_1.5.1)
 is actually just a newer release of 
[azure-core_1.10.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-core_1.10.1).
 In the original PR, which version of the Azure SDK actually got used was 
effectively decided by a race condition. 
   2. It's valuable to use an extremely recent version of the Azure SDK. 
[azure-identity_1.5.1](https://github.com/Azure/azure-sdk-for-cpp/releases/tag/azure-identity_1.5.1)
 from 2023-07-06 fixed a bug in managed identity authentication. Managed 
identity is supposed to be the best auth method for Azure, so I think it's 
important that it works. Additionally, the build seemed broken for the release 
used by https://github.com/apache/arrow/pull/12914, because a new release of 
https://www.zlib.net/ meant that https://zlib.net/zlib-1.2.12.tar.gz no longer 
exists.
   3. There is a new dependency on `xml2`. This can be installed 
straightforwardly from `apt` in the Ubuntu Docker builds, but I did not find a 
simple solution for the manylinux build. I'm not sure it's a good solution, but 
I was able to get this working by adding a bundled CMake build for `xml2`. I 
mostly copied how this was done with `nlohmann_json` and `crc32c` for the 
Google Cloud SDK.
   4. There are difficulties around OpenSSL versions. The Azure C++ SDK seems 
to build for OpenSSL 3 by default, and the way to configure it is quite painful: 
https://github.com/Azure/azure-sdk-for-cpp/blob/main/README.md#openssl-version. 
I feel like there must be a better way to do this, but I have not found it. For 
builds using vcpkg to install OpenSSL we get version 3, so that works, but I had 
problems with the more pure CMake build, since there we just have 
OpenSSL>=1.0.2.
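
   As a small illustration of point 1 (not code from either PR), the Python 
sketch below splits the two release tags mentioned above into a component and a 
version, and builds a source archive URL for each. The helper names 
`parse_release_tag` and `source_archive_url` are made up for this example, and 
the URL layout assumes GitHub's usual `archive/refs/tags/<tag>.tar.gz` 
convention. Both URLs point into the same repository, which is why pinning a 
single tag for the whole SDK should be enough.

```python
import re

# Repository that publishes every Azure C++ SDK release tag.
REPO_URL = "https://github.com/Azure/azure-sdk-for-cpp"

# Assumed tag shape: "<component>_<major>.<minor>.<patch>", with an optional
# pre-release suffix such as "-beta.1".
TAG_PATTERN = re.compile(
    r"^(?P<component>[a-z0-9-]+)_(?P<version>\d+\.\d+\.\d+(?:-[A-Za-z0-9.]+)?)$"
)


def parse_release_tag(tag):
    """Split a release tag into (component, version). Hypothetical helper."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected release tag format: {tag!r}")
    return match["component"], match["version"]


def source_archive_url(tag):
    """Build the source tarball URL for a tag (assumes GitHub's URL layout)."""
    parse_release_tag(tag)  # validate the tag before building a URL from it
    return f"{REPO_URL}/archive/refs/tags/{tag}.tar.gz"


if __name__ == "__main__":
    for tag in ("azure-identity_1.5.1", "azure-core_1.10.1"):
        component, version = parse_release_tag(tag)
        # Different component names, but the archive comes from the same repo.
        print(component, version, source_archive_url(tag))
```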
   
   I will rebase the changes described above onto main and open a draft PR. I 
have basically no experience with C++, but I have built a Python wheel using 
these changes that we've been using in production for a while now. If someone 
who knows what they are talking about can point me in the right direction, maybe 
I can get something ready for review. 


