flyrain commented on PR #4683:
URL: https://github.com/apache/iceberg/pull/4683#issuecomment-1129573060

   Here are benchmarks of equality deletes and position deletes for a 10M-row 
data file with different percentages (from 0 to 100%) of deleted rows.
   The performance of a read without the is_deleted column is the same as that of a 
read with "_deleted = false", which is expected. The grey line is for the read of 
deleted rows. It outputs more and more rows as the percentage of deletes increases. 
The perf chart also makes sense, even though there is room to optimize when only 
a few rows are deleted.
   
   <img width="1110" alt="Screen Shot 2022-05-17 at 9 58 58 PM" 
src="https://user-images.githubusercontent.com/1322359/168960450-63ab05bf-3fad-4c40-bc39-06967c35ac50.png">
   
   <img width="1011" alt="Screen Shot 2022-05-17 at 10 01 04 PM" 
src="https://user-images.githubusercontent.com/1322359/168960699-c1de06f9-b020-4d43-840a-1715a4b7f891.png">
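
   To illustrate why the deleted-rows read outputs more rows as the delete 
percentage grows, here is a minimal toy model in Python. It is not the Iceberg 
reader: `read_rows`, `count_output`, and the mode names are hypothetical, and the 
deleted positions are simply assumed to be the first N% of the file.

```python
def read_rows(num_rows, deleted_positions, mode="live"):
    """Yield (position, is_deleted) pairs for a toy data file.

    mode="live":    only rows whose flag is False (like `_deleted = false`)
    mode="deleted": only rows whose flag is True
    mode="all":     every row, tagged with its flag
    """
    deleted = set(deleted_positions)  # hypothetical stand-in for a delete file
    for pos in range(num_rows):
        is_deleted = pos in deleted
        if mode == "live" and is_deleted:
            continue
        if mode == "deleted" and not is_deleted:
            continue
        yield pos, is_deleted


def count_output(num_rows, delete_percent, mode):
    """Count rows emitted when the first `delete_percent`% of rows are deleted."""
    num_deleted = num_rows * delete_percent // 100
    return sum(1 for _ in read_rows(num_rows, range(num_deleted), mode))


if __name__ == "__main__":
    for pct in (0, 25, 50, 100):
        live = count_output(1000, pct, "live")
        gone = count_output(1000, pct, "deleted")
        print(f"{pct:3d}% deleted: live rows={live}, deleted rows={gone}")
```

   In this model the "live" read emits fewer rows and the "deleted" read emits more 
rows as the percentage rises, so their output sizes move in opposite directions.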
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

