rtpsw commented on PR #34392:
URL: https://github.com/apache/arrow/pull/34392#issuecomment-1529785610

   > Can you please summarize a description for PR? i.e., what is the issue and 
what is the fix?
   
   We started with the goal of enabling future as-of-join and the first problem 
was the hang that started this issue. As the issue progressed with proposed 
fixes, we were dealing with a follow-up problem - what we called race 
condition. Perhaps a good title for this issue is something like "Fix future 
as-of-join-node hang and race condition". The resolution was [fixing 
handling](https://github.com/apache/arrow/pull/34392/commits/000219736c0abb2e0c9525963f083c0ab04d16cc)
 of 
[time](https://github.com/apache/arrow/pull/34392/commits/0dc7fa2339ddfa73be30f4780a35f3ca057f8957)
 and [key hasher 
invalidation](https://github.com/apache/arrow/pull/34392/commits/34134f0b40ea241fa3c686df06c487d0539b9c99).
   
   > Which changes to intended to keep? (since there are lots of debug 
statement)
   
   All the code related to debug-messages here is enabled only in debug mode 
(via `#ifndef NDEBUG`), so it doesn't hurt production runs. In addition, the 
code only collects debug-messages inside testers where `PrepareTest` is used, 
so its full effect is limited to these cases. OTOH, even in other testers, it 
always computes debug-messages (in debug mode), so I could fix the code to 
avoid this (using if-statements). Also, if a non-deterministic as-of-join 
problem pops up in the future in a CI job, which runs in debug mode, the debug 
messages collected for it should be helpful. Finally, the mechanism for 
collecting detailed debug messages only for a failed iteration implemented here 
is reusable in future Arrow non-deterministic failure investigations. My vote 
is to keep this code, though I wouldn't insist if anyone feels strongly against 
that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to