[GitHub] [arrow] rtpsw commented on pull request #36499: GH-36482: [C++][CI] Fix sporadic test failures in AsofJoinBasicTest

via GitHub Fri, 07 Jul 2023 12:40:33 -0700


rtpsw commented on PR #36499:
URL: https://github.com/apache/arrow/pull/36499#issuecomment-1625982410

> I am trying understand the issue pre-PR, but you seems to explaining the
behavior after the code change here.

Correct, sorry.

> Can you explain your comment above w.r.t to my question below?

Sure. See below.

> I would expect when the processing thread invokes AdvanceAndMemoize, it
would find the batch_ is NULLPTR and then compute the hash for the first batch.

In the pre-PR code, this description is correct for the first batch. For the
second batch, `AdvanceAndMemoize` [invokes
`GetLatestKey`](https://github.com/apache/arrow/blob/fd949f8fb416025dc9d77c035c1cac5f45ed1b94/cpp/src/arrow/acero/asof_join_node.cc#L865)
before it [invokes
`Advance`](https://github.com/apache/arrow/blob/fd949f8fb416025dc9d77c035c1cac5f45ed1b94/cpp/src/arrow/acero/asof_join_node.cc#L869),
which means `HashesFor` (invoked under `GetLatestKey`) finds
`KeyHasher::batch_` referring to the first batch instead of to the second. This
is what I meant by "finds hashes that are incorrect for the above batch".

Now, taking this a step further, if the two batches have a different
address, then things seem OK because the hashes would then be recomputed. I
suspect the bug is triggered when the first and second batches happen to have
the same address, which we know is a race condition, leading to the use of
incorrect hashes. Granted, I haven't yet nailed down the exact triggering
condition in the pre-PR code. It would take some effort - let me know if you'd
like this investigated. For the time being, in the current version of the PR, I
preferred to find the minimal change that works and I can explain.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow] rtpsw commented on pull request #36499: GH-36482: [C++][CI] Fix sporadic test failures in AsofJoinBasicTest

Reply via email to