shidayang opened a new issue, #5245:
URL: https://github.com/apache/iceberg/issues/5245

   I ran a CH-benCHmark of Iceberg on Trino and found that merge-on-read (MOR) 
performance is very poor when there are many delete files. The data scale is 10 
warehouses. With no delete files, the average query duration is under 10 
seconds, but after I added some delete files to every table, some queries took 
over an hour.
   
   
   1. #5195 Trino calls DeleteFilter#filter for every page, and every call to 
DeleteFilter#filter initializes the delete files again.
   2. #5244 #5242 We found that the cost of creating StructLikeWrapper and 
InternalRecordWrapper objects is high.
   This is the flame graph:
   <img width="1410" alt="image" 
src="https://user-images.githubusercontent.com/26699250/178226456-9e953b2b-5154-4693-9b74-2ec9f277fd97.png">
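
As a rough illustration of the first point, the sketch below loads the delete set once and reuses it for every page, instead of rebuilding it on each filter call. It is not the Iceberg or Trino code: `LazyDeleteFilter`, the `Supplier`-based loader, and the use of plain row positions are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a per-file delete filter; not the Iceberg class.
class LazyDeleteFilter {
  // Assumption: reading and parsing the delete files is the expensive step.
  private final Supplier<Set<Long>> deleteLoader;
  private Set<Long> deletedPositions; // null until the first page is filtered
  private int loadCount = 0;

  LazyDeleteFilter(Supplier<Set<Long>> deleteLoader) {
    this.deleteLoader = deleteLoader;
  }

  // Loads the delete set on first use only; later calls return the cached set.
  private Set<Long> deletedPositions() {
    if (deletedPositions == null) {
      deletedPositions = deleteLoader.get();
      loadCount++;
    }
    return deletedPositions;
  }

  // Called once per page; keeps only the positions that were not deleted.
  List<Long> filter(List<Long> pagePositions) {
    Set<Long> deleted = deletedPositions();
    List<Long> kept = new ArrayList<>();
    for (Long position : pagePositions) {
      if (!deleted.contains(position)) {
        kept.add(position);
      }
    }
    return kept;
  }

  // Exposed only so the example can show how often the loader ran.
  int loadCount() {
    return loadCount;
  }
}
```

With this shape, filtering any number of pages through the same `LazyDeleteFilter` instance invokes the loader a single time, so the per-page cost is reduced to the set lookups.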
   
   
   
   Query performance improved once we made these optimizations. For example, 
the query "select count(*) from stock" took 8 minutes before the optimizations 
and only 20 seconds after.
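
The wrapper cost from the second point can be sketched in the same spirit. In the example below, one mutable key object is re-pointed at each data row for the set lookup, rather than allocating a new wrapper per row. `ReusableRowKey` and `EqualityDeleteCounter` are hypothetical names for illustration, and rows are modeled as plain `Object[]` arrays; the real StructLikeWrapper and InternalRecordWrapper classes are not used here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable key that gives a row value-based equals/hashCode.
class ReusableRowKey {
  private Object[] row;

  // Re-points this key at another row and returns itself, so no allocation is needed.
  ReusableRowKey set(Object[] newRow) {
    this.row = newRow;
    return this;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ReusableRowKey
        && Arrays.equals(row, ((ReusableRowKey) other).row);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(row);
  }
}

class EqualityDeleteCounter {
  // Counts the data rows that do not match any delete row.
  static long countLive(Object[][] dataRows, Object[][] deleteRows) {
    Set<ReusableRowKey> deleted = new HashSet<>();
    for (Object[] deleteRow : deleteRows) {
      // Keys stored in the set must each be their own instance.
      deleted.add(new ReusableRowKey().set(deleteRow));
    }

    // A single probe key is reused for every lookup on the hot path.
    ReusableRowKey probe = new ReusableRowKey();
    long live = 0;
    for (Object[] dataRow : dataRows) {
      if (!deleted.contains(probe.set(dataRow))) {
        live++;
      }
    }
    return live;
  }
}
```

The reused probe is only safe because it is never stored in the set and is used by one thread at a time; keys that are retained still need their own instances.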
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

