ozankabak commented on code in PR #11875:
URL: https://github.com/apache/datafusion/pull/11875#discussion_r1712004740


##########
datafusion/physical-plan/src/limit.rs:
##########
@@ -821,7 +816,7 @@ mod tests {
     #[tokio::test]
     async fn test_row_number_statistics_for_local_limit() -> Result<()> {
         let row_count = row_number_statistics_for_local_limit(4, 10).await?;
-        assert_eq!(row_count, Precision::Exact(10));
+        assert_eq!(row_count, Precision::Exact(40));

Review Comment:
   I have been thinking about this, and I think these are supposed to be 
per-partition statistics. If we have a `with_fetch` method that takes in 
per-partition stats and a partition count and outputs global stats, we end 
up with a problem: the method no longer composes, because its output (global) 
is not in the same units as its input (per-partition).
   
   I did a quick audit of the code, and it seems to be used as per-partition 
stats everywhere except the old `LocalLimitExec` `statistics` impl, which was 
IMHO erroneously converting to global stats. That doesn't even make sense, as 
it is a local operator.
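
   To illustrate the composition concern, here is a minimal, self-contained 
sketch. It is not the DataFusion code: the `Precision` enum below is a 
simplified stand-in for the real type, and `with_fetch_per_partition` and 
`with_fetch_mixed` are hypothetical helpers invented for this example. The 
mixed variant also assumes every partition holds the same number of rows.

```rust
/// Simplified stand-in for DataFusion's `Precision<usize>` (assumption, not the real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Hypothetical helper: per-partition row count in, per-partition row count out.
/// Input and output share the same unit, so calls can be chained.
fn with_fetch_per_partition(rows: Precision, fetch: usize) -> Precision {
    match rows {
        Precision::Exact(n) => Precision::Exact(n.min(fetch)),
        Precision::Inexact(n) => Precision::Inexact(n.min(fetch)),
        // With no input estimate, `fetch` is only an upper bound.
        Precision::Absent => Precision::Inexact(fetch),
    }
}

/// Hypothetical helper: per-partition row count in, GLOBAL row count out.
/// Assumes all partitions have equal row counts (illustration only).
fn with_fetch_mixed(rows: Precision, fetch: usize, partitions: usize) -> Precision {
    match with_fetch_per_partition(rows, fetch) {
        Precision::Exact(n) => Precision::Exact(n * partitions),
        Precision::Inexact(n) => Precision::Inexact(n * partitions),
        Precision::Absent => Precision::Absent,
    }
}
```

   With 100 rows per partition and 4 partitions, chaining the per-partition 
helper with `fetch = 10` and then `fetch = 20` gives `Exact(10)` per partition. 
Chaining the mixed helper the same way gives `Exact(80)`, because the second 
call misreads the global `40` from the first call as a per-partition count; 
the consistent global answer would be 40.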
   
   I will send a commit with my proposed cleanup of this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

