We run our tests on Linux with flags that help discover illegal memory access (e.g. MALLOC_PERTURB_=90). Any test that imports pyarrow segfaults because of an issue that was discovered and fixed in the AWS SDK. It is fixed as of 1.9.214, but arrow is currently built with version 1.8.133 (as per https://github.com/apache/arrow/blob/ea3480033e57947ae59e7862ed35b8bf3335ea9f/cpp/thirdparty/versions.txt ).
My question is more procedural than technical. Is there a way to request that pyarrow/arrow be built with the fixed version of the AWS SDK for the builds published to PyPI in the next week or two, or is this unrealistic? If that is unrealistic, is the next best option for us to git clone the source and build with the fixed version of the AWS SDK?

Thanks in advance for your advice,
-Joe

*Technical details:*
- This occurs because of incorrect resource handling in the AWS SDK that segfaults on shutdown
- pyarrow's fs module calls arrow::fs::InitializeS3() on import, which sets up the conditions for the segfault on exit (depending on how the memory allocations work out, which is why this is easier to reproduce with MALLOC_PERTURB_)
- The fix for this is https://github.com/aws/aws-sdk-cpp/commit/a2512bd02addd77515430ac74d7ee5f37343ec99
- The first release tag I see containing this commit in aws-sdk-cpp is 1.9.214
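In case it helps anyone else weighing the build-from-source route, here is a rough sketch of how that might look. This is a build-recipe fragment, not a tested recipe: the ARROW_AWSSDK_URL override, the cmake flag names, and the tarball path are my assumptions about the bundled-dependency toolchain and should be checked against the Arrow C++ build documentation.

```shell
# Sketch only: build Arrow C++ (and then pyarrow) against a patched AWS SDK.
# Flag and variable names below are assumptions -- verify against the
# Arrow build docs before relying on this.
git clone https://github.com/apache/arrow.git
cd arrow/cpp && mkdir build && cd build

# Point the bundled-dependency toolchain at the fixed SDK release
# (hypothetical local tarball path; 1.9.214 is the first release with the fix):
export ARROW_AWSSDK_URL=/path/to/aws-sdk-cpp-1.9.214.tar.gz

cmake .. -DARROW_S3=ON -DARROW_PYTHON=ON -DAWSSDK_SOURCE=BUNDLED
make -j"$(nproc)" && make install

# Then build the Python bindings against that install:
cd ../../python && pip install -e .
```

If the override works, rerunning the test suite with MALLOC_PERTURB_=90 set should confirm whether the shutdown segfault is gone.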
