ozankabak commented on code in PR #11875:
URL: https://github.com/apache/datafusion/pull/11875#discussion_r1712004740
##########
datafusion/physical-plan/src/limit.rs:
##########
@@ -821,7 +816,7 @@ mod tests {
#[tokio::test]
async fn test_row_number_statistics_for_local_limit() -> Result<()> {
let row_count = row_number_statistics_for_local_limit(4, 10).await?;
- assert_eq!(row_count, Precision::Exact(10));
+ assert_eq!(row_count, Precision::Exact(40));
Review Comment:
I have been thinking about this, and I think it is supposed to be
per-partition statistics. If we have a `with_fetch` method that takes in
per-partition stats and a partition count and outputs global stats, we end
up with a problem: we can no longer compose it, because its output is not a
valid input for the next call.
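To make the composition problem concrete, here is a minimal, self-contained sketch. It does not use the actual DataFusion types: `RowCount`, `with_fetch_per_partition` and `with_fetch_to_global` are hypothetical names made up for illustration, and the numbers (4 partitions, 100 rows each) are assumptions, not taken from the test above.

```rust
/// Hypothetical stand-in for a row-count estimate (not DataFusion's `Precision`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RowCount {
    Exact(usize),
    Absent,
}

/// Composable variant: per-partition stats in, per-partition stats out.
fn with_fetch_per_partition(input: RowCount, fetch: usize) -> RowCount {
    match input {
        RowCount::Exact(n) => RowCount::Exact(n.min(fetch)),
        RowCount::Absent => RowCount::Absent,
    }
}

/// Non-composable variant: per-partition stats in, global stats out.
fn with_fetch_to_global(input: RowCount, fetch: usize, partitions: usize) -> RowCount {
    match with_fetch_per_partition(input, fetch) {
        RowCount::Exact(n) => RowCount::Exact(n * partitions),
        RowCount::Absent => RowCount::Absent,
    }
}

fn main() {
    let partitions = 4;
    let input = RowCount::Exact(100); // assumed: 100 rows in each partition

    // Stacking two local limits (fetch = 10, then fetch = 20) with the
    // per-partition variant: each partition still holds at most 10 rows.
    let stacked = with_fetch_per_partition(with_fetch_per_partition(input, 10), 20);
    assert_eq!(stacked, RowCount::Exact(10));

    // The same stack with the global-output variant: the second call reads
    // the global figure (40) as if it were per-partition and scales it again.
    let first = with_fetch_to_global(input, 10, partitions);
    let second = with_fetch_to_global(first, 20, partitions);
    assert_eq!(first, RowCount::Exact(40));
    assert_eq!(second, RowCount::Exact(80)); // wrong: at most 40 rows exist globally

    println!("per-partition: {:?}, mixed: {:?}", stacked, second);
}
```

In this sketch, keeping the input and output in the same unit is what lets the calls chain; any conversion to a global figure would then happen once, at whatever point actually merges partitions.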
I did a quick audit of the code, and it seems to be used as per-partition
stats everywhere except the old `LocalLimitExec` `statistics` impl, which was
IMHO erroneously converting to global stats -- which doesn't even make sense, as
it is a local operator.
I will send a commit with my proposed cleanup of this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]