alamb commented on pull request #9926: URL: https://github.com/apache/arrow/pull/9926#issuecomment-814989652
> If this is occurring, there may be a more fundamental issue at play here that also needs fixing

@tustvold my initial reading of `LimitStream` also suggested to me that `input.poll_next` would only be called once after the limit had been hit. The [`limit_early_shutdown`](https://github.com/apache/arrow/pull/9926/files#diff-d2c5066b508f7b8827fbe30c23dc2cdc367b4baa441eb5e74d7aee6f98e20880R310) test does in fact show this bug: without the changes in this PR, it fails after consuming one more input than it should.

I had a test (https://github.com/apache/arrow/pull/9926/commits/99505b4a17fc176bea3304612b89408362296e27#diff-34dec6459ccea51c881a6ea392be9ad35f112395e6b8742df32a1742ac651e31L1799) that ran an entire query, and I interpreted some `println!` output as showing that all the inputs were consumed (i.e. that `poll_next()` was called repeatedly). I did not debug it further.

I am confident that this PR is an improvement to `LimitStream`. There may well be a more fundamental issue that can also be fixed or improved in subsequent PRs.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
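
For illustration, the behavior discussed in the comment (stop pulling from the input as soon as the limit is reached) can be sketched without any async machinery. This is a minimal synchronous analog that uses `Iterator` instead of the real `Stream`-based `LimitStream`. The names `CountingInput`, `LimitIter`, and `run_limit` are hypothetical and are not part of this PR or of DataFusion, and batches are modeled as plain `Vec<i32>` values as an assumption.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical input that records how many times it has been pulled.
struct CountingInput {
    batches: std::vec::IntoIter<Vec<i32>>,
    pulls: Rc<Cell<usize>>,
}

impl Iterator for CountingInput {
    type Item = Vec<i32>;

    fn next(&mut self) -> Option<Vec<i32>> {
        self.pulls.set(self.pulls.get() + 1);
        self.batches.next()
    }
}

/// Synchronous stand-in for a limit operator (an assumption, not the
/// actual `LimitStream`). The input is held in an `Option` so it can be
/// dropped as soon as enough rows have been produced.
struct LimitIter<I> {
    input: Option<I>,
    remaining: usize,
}

impl<I: Iterator<Item = Vec<i32>>> LimitIter<I> {
    fn new(input: I, limit: usize) -> Self {
        Self { input: Some(input), remaining: limit }
    }
}

impl<I: Iterator<Item = Vec<i32>>> Iterator for LimitIter<I> {
    type Item = Vec<i32>;

    fn next(&mut self) -> Option<Vec<i32>> {
        if self.remaining == 0 {
            // Limit already reached: make sure the input is gone and stop.
            self.input = None;
            return None;
        }
        let input = self.input.as_mut()?;
        match input.next() {
            None => {
                // Input exhausted before the limit was reached.
                self.input = None;
                None
            }
            Some(mut batch) => {
                // Keep only as many rows as the limit still allows.
                batch.truncate(self.remaining);
                self.remaining -= batch.len();
                if self.remaining == 0 {
                    // Drop the input eagerly so it is never pulled again.
                    self.input = None;
                }
                Some(batch)
            }
        }
    }
}

/// Runs the limit over three two-row batches and returns the rows
/// produced together with the number of pulls made on the input.
fn run_limit(limit: usize) -> (Vec<i32>, usize) {
    let pulls = Rc::new(Cell::new(0));
    let input = CountingInput {
        batches: vec![vec![1, 2], vec![3, 4], vec![5, 6]].into_iter(),
        pulls: Rc::clone(&pulls),
    };
    let rows: Vec<i32> = LimitIter::new(input, limit).flatten().collect();
    (rows, pulls.get())
}
```

Calling `run_limit(3)` in this sketch pulls only two of the three batches, because the input is dropped in the same call that satisfies the limit rather than on the following one.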
