flyrain commented on PR #4888:
URL: https://github.com/apache/iceberg/pull/4888#issuecomment-1163769266

   Hi @aokolnychyi and @RussellSpitzer, vectorized read was enabled by default 
several months ago, but the benchmark still assumed it was false by default. I 
have set it to false explicitly and rerun the benchmark. Now we can see a large 
performance gain of vectorized over non-vectorized read, as the following 
diagram shows.
   <img width="996" alt="Screen Shot 2022-06-22 at 4 22 09 PM" 
src="https://user-images.githubusercontent.com/1322359/175173362-e2e6d636-4e5c-4ed4-bcc5-a8888c6b1e1c.png">
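
   For reference, a minimal sketch of pinning the setting explicitly rather than 
relying on the default. It assumes the switch in question is the Iceberg table 
property `read.parquet.vectorization.enabled`. The class and method names are 
hypothetical and this is not the benchmark's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the table properties a benchmark could apply
// so that each run states its vectorization mode explicitly.
public class VectorizationConfig {
    // Assumed to be the Iceberg table property that toggles vectorized Parquet reads.
    static final String PARQUET_VECTORIZATION_ENABLED = "read.parquet.vectorization.enabled";

    // Returns a property map with the flag always set, never left to the default.
    static Map<String, String> tableProperties(boolean vectorized) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(PARQUET_VECTORIZATION_ENABLED, Boolean.toString(vectorized));
        return props;
    }

    public static void main(String[] args) {
        // Non-vectorized baseline and vectorized run, both stated explicitly.
        System.out.println(tableProperties(false));
        System.out.println(tableProperties(true));
    }
}
```

   Setting the flag in both cases keeps the comparison valid even if the default 
changes again.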
   I also profiled the benchmarks. Here is the flame graph for the 25% vectorized 
read. It looks normal to me. The program spent the majority of its time reading 
the position delete file and the data file. The read of the position delete file 
is still non-vectorized and takes a large portion of the time. I would suggest 
enabling vectorized read on delete files to improve overall performance. That's 
probably the next step we can take.
   <img width="1543" alt="Screen Shot 2022-06-22 at 4 26 57 PM" 
src="https://user-images.githubusercontent.com/1322359/175176096-e52048aa-5524-49b8-9645-7cdfdc9463e7.png">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]