Procedurally, you should create a JIRA ticket with your request and,
once the fix is complete and merged into master, you can request a
release on the mailing list.  For an example, see [1].

> Is there a way to request that pyarrow/arrow gets built with the fixed 
> version of the AWS SDK for the builds published to pypi in the next week or 
> two or is this unrealistic?

It is possible that the JIRA is addressed in the next week or two,
especially if you are able to provide a pull request and help address
feedback on the PR and work to get it merged.  Note that we actually
target version 1.8.133 specifically to avoid a crash-on-shutdown bug
in the AWS SDK.  More details are in [2].  Are you certain that you
are installing from pip?
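
If it helps to double-check, a quick way to confirm which pyarrow
your tests are actually picking up (just a sketch; run it with
whatever interpreter/virtualenv the tests use):

  pip show pyarrow    # shows the installed version and its location
  python -c "import pyarrow; print(pyarrow.__version__)"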

It is probably unrealistic to expect that a release with this fix will
happen in the next week or two.  Releasing the Python libraries
requires a considerable amount of manual work, and we haven't
generally done patch releases just to rush out fixes (typically they
have been to address significant regressions or blocking issues).
The next major release of the Python/C++/R/Java implementations will
be sometime around April, and we are close enough to that date that I
don't think there would be much enthusiasm for undertaking a patch
release.

> If so, is the next best option for us to git clone the source and build with 
> the fixed version of the AWS SDK?

That is one option.  If your fix is merged then another option could
be to use the nightly wheels.  This process is described in more
detail at [3].
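
If the fix does land in master, installing a nightly wheel is
typically just a pip one-liner along these lines (see [3] for the
authoritative command; the index URL here is from the docs, so
double-check it there):

  pip install --extra-index-url https://pypi.fury.io/arrow-nightlies/ \
      --prefer-binary --pre pyarrow

If you do end up building from source instead, the bundled AWS SDK
version is pinned in cpp/thirdparty/versions.txt (an
ARROW_AWSSDK_BUILD_VERSION entry plus its checksum, if I'm
remembering the variable name correctly), so that is the place to
bump it.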

[1] https://lists.apache.org/thread/4dyhqjcdf066bz6fw2w2l86yfbh4h452
[2] https://issues.apache.org/jira/browse/ARROW-15141
[3] https://arrow.apache.org/docs/python/install.html#installing-nightly-packages

On Tue, Mar 15, 2022 at 7:07 AM Joseph Smith <[email protected]> wrote:
>
> We run our tests with some flags on Linux that help discover illegal memory 
> access (e.g. MALLOC_PERTURB_=90). Any test that imports pyarrow is 
> segfaulting because of an issue discovered and fixed in the AWS SDK. It is 
> fixed as of 1.9.214 but arrow is currently built with version 1.8.133 (as per 
> https://github.com/apache/arrow/blob/ea3480033e57947ae59e7862ed35b8bf3335ea9f/cpp/thirdparty/versions.txt).
>
> My question is more procedural than technical. Is there a way to request that 
> pyarrow/arrow gets built with the fixed version of the AWS SDK for the builds 
> published to pypi in the next week or two or is this unrealistic? If so, is 
> the next best option for us to git clone the source and build with the fixed 
> version of the AWS SDK?
>
> Thanks in advance for your advice,
> -Joe
>
> Technical details:
> - This occurs because of incorrect resource handling in the AWS SDK that 
> segfaults on shutdown
> - pyarrow's fs module calls arrow::fs::InitializeS3() on import which sets up 
> the conditions to segfault on exit (depending on how the memory allocations 
> all work out, which is why this is easier to reproduce with MALLOC_PERTURB_)
> - The fix for this is 
> (https://github.com/aws/aws-sdk-cpp/commit/a2512bd02addd77515430ac74d7ee5f37343ec99)
> - The first release tag I see for this commit in aws-sdk-cpp is in version 
> 1.9.214
