rtpsw commented on PR #34392: URL: https://github.com/apache/arrow/pull/34392#issuecomment-1529785610
> Can you please summarize a description for PR? i.e., what is the issue and what is the fix?

We started with the goal of enabling future as-of-join, and the first problem was the hang that started this issue. As the issue progressed through proposed fixes, we dealt with a follow-up problem, which we called a race condition. Perhaps a good title for this issue is something like "Fix future as-of-join-node hang and race condition". The resolution was [fixing the handling](https://github.com/apache/arrow/pull/34392/commits/000219736c0abb2e0c9525963f083c0ab04d16cc) of [time](https://github.com/apache/arrow/pull/34392/commits/0dc7fa2339ddfa73be30f4780a35f3ca057f8957) and [key hasher invalidation](https://github.com/apache/arrow/pull/34392/commits/34134f0b40ea241fa3c686df06c487d0539b9c99).

> Which changes are intended to be kept? (since there are lots of debug statements)

All the debug-message code here is enabled only in debug mode (via `#ifndef NDEBUG`), so it doesn't hurt production runs. In addition, the code only collects debug messages inside testers where `PrepareTest` is used, so its full effect is limited to those cases. On the other hand, even in other testers it always computes debug messages (in debug mode), so I could fix the code to avoid this (using if-statements). Also, if a non-deterministic as-of-join problem pops up in the future in a CI job, which runs in debug mode, the debug messages collected for it should be helpful. Finally, the mechanism implemented here for collecting detailed debug messages only for a failed iteration is reusable in future investigations of non-deterministic Arrow failures. My vote is to keep this code, though I wouldn't insist if anyone feels strongly against it.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
