raulcd commented on issue #48894:
URL: https://github.com/apache/arrow/issues/48894#issuecomment-3768682285

   OK, I think I've found the problem. From the gdb backtrace:
   ```c++
   (gdb) backtrace
   #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140426842515264) at ./nptl/pthread_kill.c:44
   #1  __pthread_kill_internal (signo=6, threadid=140426842515264) at ./nptl/pthread_kill.c:78
   #2  __GI___pthread_kill (threadid=140426842515264, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
   #3  0x00007fb7ac151476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
   #4  0x00007fb7ac1377f3 in __GI_abort () at ./stdlib/abort.c:79
   #5  0x00007fb7a6db6aa4 in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
   #6  0x00007fb7a6dc8ffa in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
   #7  0x00007fb7a6db662e in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:58
   #8  0x00007fb7a6db6645 in __cxxabiv1::__cxa_rethrow () at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:136
   #9  0x00007fb7a51e9b5f in Azure::Core::Http::Policies::_internal::RetryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const [clone .cold] ()
      from /opt/conda/envs/arrow/lib/libazure-core.so.1.16.1
   #10 0x00007fb7a521abeb in Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) () from /opt/conda/envs/arrow/lib/libazure-core.so.1.16.1
   #11 0x00007fb7a5224d3c in Azure::Core::Http::Policies::_internal::TelemetryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const ()
      from /opt/conda/envs/arrow/lib/libazure-core.so.1.16.1
   #12 0x00007fb7a521abeb in Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) () from /opt/conda/envs/arrow/lib/libazure-core.so.1.16.1
   #13 0x00007fb7a54c37d7 in Azure::Core::Http::Policies::_internal::RequestIdPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const ()
      from /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so.12.14.0
   #14 0x00007fb7a521abeb in Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) () from /opt/conda/envs/arrow/lib/libazure-core.so.1.16.1
   #15 0x00007fb7a54c2870 in Azure::Storage::_internal::StorageServiceVersionPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const ()
      from /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so.12.14.0
   #16 0x00007fb7a53f8756 in Azure::Storage::Blobs::_detail::BlobContainerClient::Create(Azure::Core::Http::_internal::HttpPipeline&, Azure::Core::Url const&, Azure::Storage::Blobs::_detail::BlobContainerClient::CreateBlobContainerOptions const&, Azure::Core::Context const&) () from /opt/conda/envs/arrow/lib/libazure-storage-blobs.so.12.16.0
   #17 0x00007fb7a536425f in Azure::Storage::Blobs::BlobContainerClient::Create(Azure::Storage::Blobs::CreateBlobContainerOptions const&, Azure::Core::Context const&) const ()
      from /opt/conda/envs/arrow/lib/libazure-storage-blobs.so.12.16.0
   #18 0x00007fb7a5368b10 in Azure::Storage::Blobs::BlobContainerClient::CreateIfNotExists(Azure::Storage::Blobs::CreateBlobContainerOptions const&, Azure::Core::Context const&) const ()
      from /opt/conda/envs/arrow/lib/libazure-storage-blobs.so.12.16.0
   #19 0x00007fb7a86cad4e in arrow::fs::(anonymous namespace)::CreateContainerIfNotExists<Azure::Storage::Blobs::BlobContainerClient> (container_name=..., container_client=...)
       at /arrow/cpp/src/arrow/filesystem/azurefs.cc:1453
   #20 0x00007fb7a86bbe04 in arrow::fs::AzureFileSystem::CreateDir (this=0x55a60261ad00, path=..., recursive=true) at /arrow/cpp/src/arrow/filesystem/azurefs.cc:3283
   ```
   We seem to be catching only `Storage::StorageException` in `CreateContainerIfNotExists`:
   
   
https://github.com/apache/arrow/blob/e4c9ed298909502993153d5ba858ddd89a02a2ea/cpp/src/arrow/filesystem/azurefs.cc#L1452-L1458
   
   but the Azure SDK seems to throw `Azure::Core::Http::TransportException` from its retry policy, which that handler does not catch:
   
https://github.com/Azure/azure-sdk-for-cpp/blob/3302b5b2ddc4829ba0efb7ff7ab110f748116e72/sdk/core/azure-core/src/http/retry_policy.cpp#L152-L163
   
   If I apply the following diff:
   ```diff
   diff --git a/cpp/src/arrow/filesystem/azurefs.cc b/cpp/src/arrow/filesystem/azurefs.cc
   index a3a162616e..daa0d5077a 100644
   --- a/cpp/src/arrow/filesystem/azurefs.cc
   +++ b/cpp/src/arrow/filesystem/azurefs.cc
   @@ -1455,6 +1455,9 @@ Status CreateContainerIfNotExists(const std::string& container_name,
      } catch (const Storage::StorageException& exception) {
        return ExceptionToStatus(exception, "Failed to create a container: ", container_name,
                                 ": ", container_client.GetUrl());
   +  } catch (const Azure::Core::Http::TransportException& exception) {
   +    return ExceptionToStatus(exception, "Failed to create a container: ", container_name,
   +                             ": ", container_client.GetUrl());
      }
    }
   ```
   I get a test error (as expected) but no longer a crash:
   ```
   opt/conda/envs/arrow/lib/python3.13/site-packages/pyarrow/tests/test_fs.py:329: in azurefs
       fs.create_dir(container)
   pyarrow/_fs.pyx:638: in pyarrow._fs.FileSystem.create_dir
       ???
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
   
   >   ???
   E   OSError: Failed to create a container: pyarrow-filesystem: http://127.0.0.1:54861/devstoreaccount1/pyarrow-filesystem Azure Error: [] Fail to get a new connection for: http://127.0.0.1:54861. Could not connect to server
   
   pyarrow/error.pxi:92: OSError
   ```
   
   I am wondering whether we should investigate this further: we have 40 places where we catch `Storage::StorageException` but only a single place where we catch `Azure::Core::Http::TransportException`, and that one was also added in response to a reported bug:
   - https://github.com/apache/arrow/issues/44269
   
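   @pitrou what do you think? Should I change the handlers that catch `Storage::StorageException` to catch `Azure::Core::RequestFailedException` instead, since both `Storage::StorageException` and `Azure::Core::Http::TransportException` inherit from it?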
   
https://github.com/Azure/azure-sdk-for-cpp/blob/3302b5b2ddc4829ba0efb7ff7ab110f748116e72/sdk/core/azure-core/inc/azure/core/http/http.hpp#L57
   
   
https://github.com/Azure/azure-sdk-for-cpp/blob/3302b5b2ddc4829ba0efb7ff7ab110f748116e72/sdk/storage/azure-storage-common/inc/azure/storage/common/storage_exception.hpp#L19
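
To illustrate the idea, here is a minimal, self-contained sketch of why a handler for one derived exception type lets a sibling type escape, while a handler for the shared base class catches both. Everything in it is a hypothetical stand-in: the three exception structs only mirror the inheritance relationship from the links above and are not the Azure SDK types, `Status` is a toy replacement for `arrow::Status`, and `FakeCreateContainer`, `CreateNarrow`, `CreateBroad` and `NarrowHandlerLeaks` are names invented for this example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins that only mirror the inheritance relationship;
// these are NOT the real Azure SDK classes.
struct RequestFailedException : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct StorageException final : RequestFailedException {
  using RequestFailedException::RequestFailedException;
};
struct TransportException final : RequestFailedException {
  using RequestFailedException::RequestFailedException;
};

// Hypothetical toy status type standing in for arrow::Status.
struct Status {
  bool ok;
  std::string message;
};

enum class Failure { kNone, kStorage, kTransport };

// Simulates an SDK call that can fail in two different ways.
void FakeCreateContainer(Failure failure) {
  switch (failure) {
    case Failure::kStorage:
      throw StorageException("service rejected the request");
    case Failure::kTransport:
      throw TransportException("could not connect to server");
    case Failure::kNone:
      break;
  }
}

// Narrow handler: only StorageException is converted to a Status.
// A TransportException is a sibling type, so it propagates to the caller.
Status CreateNarrow(Failure failure) {
  try {
    FakeCreateContainer(failure);
  } catch (const StorageException& e) {
    return {false, std::string("Failed to create a container: ") + e.what()};
  }
  return {true, ""};
}

// Broad handler: catching the shared base class converts both failures.
Status CreateBroad(Failure failure) {
  try {
    FakeCreateContainer(failure);
  } catch (const RequestFailedException& e) {
    return {false, std::string("Failed to create a container: ") + e.what()};
  }
  return {true, ""};
}

// Reports whether a TransportException escapes the narrow handler.
bool NarrowHandlerLeaks(Failure failure) {
  try {
    CreateNarrow(failure);
  } catch (const TransportException&) {
    return true;
  }
  return false;
}
```

In this sketch `NarrowHandlerLeaks(Failure::kTransport)` is true, whereas `CreateBroad(Failure::kTransport)` returns a failed `Status`. If nothing further up the stack handles the escaped exception, the runtime ends up in `std::terminate`, which would be consistent with the backtrace above.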
   

