alamb opened a new pull request #9926: URL: https://github.com/apache/arrow/pull/9926
(Note: this builds on the code in https://github.com/apache/arrow/pull/9924, so I am marking it as a draft until that is merged.)

# Rationale

Once the number of rows needed for a limit query has been produced, any further work done to read values from its input is wasted. The current implementation of `LimitStream` keeps polling its input for the next value and returning `Poll::Ready(None)`, even once the limit has been reached.

For queries like `select * from foo limit 10`, used for initial data exploration, this is very wasteful.

# Changes

This PR changes `LimitStream` so that it drops its input once the limit has been reached. This both potentially frees resources (memory, file handles, etc.) and avoids unnecessary computation.
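
To illustrate the idea of dropping the input once the limit is reached, here is a minimal std-only sketch. It is not the code in this PR: it uses a plain `Iterator` as an analog of an async stream, and `TrackedInput`, `LimitIter`, and `run_example` are hypothetical names invented for this example. Each item stands for the row count of one batch.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for an input stream: each item is the number of
/// rows in one batch. Counts how often it is polled and records its drop.
struct TrackedInput {
    batches: std::vec::IntoIter<usize>,
    polls: Rc<Cell<usize>>,
    dropped: Rc<Cell<bool>>,
}

impl Iterator for TrackedInput {
    type Item = usize;
    fn next(&mut self) -> Option<usize> {
        self.polls.set(self.polls.get() + 1);
        self.batches.next()
    }
}

impl Drop for TrackedInput {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Simplified limit operator (an assumption, not the real `LimitStream`).
/// The input lives in an `Option` so it can be released early.
struct LimitIter<I: Iterator<Item = usize>> {
    input: Option<I>,
    remaining: usize,
}

impl<I: Iterator<Item = usize>> LimitIter<I> {
    fn new(input: I, limit: usize) -> Self {
        Self { input: Some(input), remaining: limit }
    }
}

impl<I: Iterator<Item = usize>> Iterator for LimitIter<I> {
    type Item = usize;
    fn next(&mut self) -> Option<usize> {
        if self.remaining == 0 {
            // Limit already satisfied: release the input, never poll it again.
            self.input = None;
            return None;
        }
        // Returns None right away if the input was already released.
        let rows = self.input.as_mut()?.next();
        match rows {
            Some(n) => {
                // Truncate the last batch so we never exceed the limit.
                let take = n.min(self.remaining);
                self.remaining -= take;
                if self.remaining == 0 {
                    // Drop the input as soon as the limit is reached.
                    self.input = None;
                }
                Some(take)
            }
            None => {
                // Input exhausted before the limit: release it as well.
                self.input = None;
                None
            }
        }
    }
}

/// Runs a limit of 10 rows over four batches of 4 rows each and returns
/// (rows per output batch, input poll count, input dropped before the
/// limit operator itself went out of scope).
fn run_example() -> (Vec<usize>, usize, bool) {
    let polls = Rc::new(Cell::new(0));
    let dropped = Rc::new(Cell::new(false));
    let input = TrackedInput {
        batches: vec![4, 4, 4, 4].into_iter(),
        polls: Rc::clone(&polls),
        dropped: Rc::clone(&dropped),
    };
    let mut limit = LimitIter::new(input, 10);
    let mut out = Vec::new();
    while let Some(rows) = limit.next() {
        out.push(rows);
    }
    // Polling again after completion must not touch the input.
    assert!(limit.next().is_none());
    let dropped_early = dropped.get(); // `limit` is still alive here
    (out, polls.get(), dropped_early)
}
```

In this sketch the fourth input batch is never read, and the input is released while the limit operator is still alive. Holding the input in an `Option` is one simple way to make that early release possible; the actual PR may structure this differently.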
