Procedurally, you should create a JIRA ticket with your request, and once the fix is complete and merged into master, you can request a release on the mailing list. For an example, see [1].
> Is there a way to request that pyarrow/arrow gets built with the fixed
> version of the AWS SDK for the builds published to pypi in the next week or
> two or is this unrealistic?

It is possible that the JIRA is addressed in the next week or two, especially
if you are able to provide a pull request, help address feedback on the PR,
and work to get it merged. Note that we actually target version 1.8.133
specifically to avoid a crash-on-shutdown bug in the AWS SDK; more details in
[2]. Are you certain that you are installing from pip?

It is probably unrealistic to expect that a release with this fix will happen
in the next week or two. Releasing the python libraries requires a
considerable amount of manual work, and we haven't generally done this to rush
fixes (typically patch releases have addressed significant regressions or
blocking issues). The next major release of the Python/C++/R/Java
implementations will be sometime around April, and we are close enough to that
date that I don't think there would be much enthusiasm for a patch release.

> If so, is the next best option for us to git clone the source and build with
> the fixed version of the AWS SDK?

That is one option. If your fix is merged, another option is to use the
nightly wheels. This process is described in more detail at [3].

[1] https://lists.apache.org/thread/4dyhqjcdf066bz6fw2w2l86yfbh4h452
[2] https://issues.apache.org/jira/browse/ARROW-15141
[3] https://arrow.apache.org/docs/python/install.html#installing-nightly-packages

On Tue, Mar 15, 2022 at 7:07 AM Joseph Smith <[email protected]> wrote:
>
> We run our tests with some flags on linux that help discover illegal memory
> access (e.g. MALLOC_PERTURB_=90). Any test that imports pyarrow is
> segfaulting because of an issue discovered and fixed in the AWS SDK. It is
> fixed as of 1.9.214, but arrow is currently built with version 1.8.133 (as per
> https://github.com/apache/arrow/blob/ea3480033e57947ae59e7862ed35b8bf3335ea9f/cpp/thirdparty/versions.txt).
>
> My question is more procedural than technical. Is there a way to request that
> pyarrow/arrow gets built with the fixed version of the AWS SDK for the builds
> published to pypi in the next week or two, or is this unrealistic? If so, is
> the next best option for us to git clone the source and build with the fixed
> version of the AWS SDK?
>
> Thanks in advance for your advice,
> -Joe
>
> Technical details:
> - This occurs because of incorrect resource handling that segfaults on
>   shutdown in the AWS SDK
> - pyarrow's fs module calls arrow::fs::InitializeS3() on import, which sets
>   up the conditions to segfault on exit (depending on how the memory
>   allocations work out, which is why this is easier to reproduce with
>   MALLOC_PERTURB_)
> - The fix for this is
>   https://github.com/aws/aws-sdk-cpp/commit/a2512bd02addd77515430ac74d7ee5f37343ec99
> - The first release tag I see for this commit in aws-sdk-cpp is version
>   1.9.214
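For reference, a minimal reproduction sketch of the crash described in the
technical details above. It assumes only what the report states: importing
pyarrow.fs triggers arrow::fs::InitializeS3(), and the crash surfaces at
interpreter shutdown when the allocator is perturbed (e.g. MALLOC_PERTURB_=90).
The file name repro_s3_shutdown.py is hypothetical.

    # repro_s3_shutdown.py -- sketch of the shutdown segfault described above.
    # Run with the allocator perturbed so the bad free becomes visible, e.g.:
    #
    #     MALLOC_PERTURB_=90 python repro_s3_shutdown.py
    #
    # No S3 access is required. Per the report, importing pyarrow.fs calls
    # arrow::fs::InitializeS3(), and with an affected AWS SDK build the process
    # can segfault while the SDK tears down its global state at interpreter exit.
    import pyarrow.fs  # noqa: F401  -- the import alone sets up the crash conditions

    print("imported pyarrow.fs; if the SDK bug is present, the crash happens on exit")

Per the report, a build against aws-sdk-cpp 1.9.214 or later (which contains
the linked fix) should let the same script exit cleanly.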
