ahshahid commented on issue #6424:
URL: https://github.com/apache/iceberg/issues/6424#issuecomment-1351924900

   Right. I was also thinking that this is where I have either a 
misunderstanding or a bug.
   The question is whether recordCount represents the row count of the 
scanned fraction, or the total row count of the split/file.
   
   I will look into the code, but in my view recordCount has to be the 
partially scanned count.
   This is for 2 reasons:
   1) The estimated row count function needs to return the total row count of 
the split/file. If recordCount already held that total, no calculation would 
be needed.
   2) For a split that covers part of a single file, or that spans multiple 
files, the estimated row count of the whole split has to be calculated, and 
for that recordCount should refer to the partially scanned row count.
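
   To make the two readings of recordCount concrete, here is a minimal Python 
sketch. It is an illustration only and not Iceberg's code: `FileSlice`, its 
fields, and both functions are hypothetical names, and scaling by the byte 
fraction is an assumed way to estimate rows for a partially covered file.

```python
from dataclasses import dataclass


@dataclass
class FileSlice:
    """Hypothetical stand-in for one file's portion of a split."""
    record_count: int        # its meaning is the open question
    file_size_bytes: int     # total size of the underlying file
    slice_length_bytes: int  # bytes of that file covered by the split


def rows_if_count_is_per_file(slices):
    # Reading A: record_count is the whole file's row count, so scale it
    # by the fraction of the file's bytes that the split covers.
    return sum(
        int(s.record_count * s.slice_length_bytes / s.file_size_bytes)
        for s in slices
    )


def rows_if_count_is_per_slice(slices):
    # Reading B: record_count already counts only the scanned rows,
    # so the split estimate is a plain sum with no scaling.
    return sum(s.record_count for s in slices)


# A split covering a quarter of one file and all of another (made-up numbers).
split = [
    FileSlice(record_count=1000, file_size_bytes=4096, slice_length_bytes=1024),
    FileSlice(record_count=500, file_size_bytes=2048, slice_length_bytes=2048),
]
```

   The two readings give different estimates as soon as a split covers only 
part of a file, which is why the meaning of recordCount matters here.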


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

