bryanck opened a new pull request, #5137:
URL: https://github.com/apache/iceberg/pull/5137

   This PR changes the dictionary value accessors in the vectorized Parquet 
reader to read values directly from the underlying dictionary instead of 
copying them into a new buffer first. The dictionary decimal accessor classes 
already work this way. The underlying Parquet dictionary classes already load 
the values into a buffer, so copying them into a second buffer appears 
redundant.
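
To illustrate the difference between the two approaches, here is a minimal, self-contained sketch. It does not reproduce the accessor classes touched by this PR: `LongDictionary`, `copyingAccessor`, `directAccessor`, and `resolve` are hypothetical names invented for the example, and `LongDictionary` is only a stand-in for a Parquet dictionary whose values are already decoded into memory.

```java
import java.util.Arrays;
import java.util.function.IntToLongFunction;

final class DictionaryAccessSketch {

  /** Hypothetical stand-in for a decoded dictionary page of long values. */
  interface LongDictionary {
    int maxId();

    long decodeToLong(int id);
  }

  /** Array-backed dictionary: the values are already materialized in memory. */
  static LongDictionary of(long... values) {
    return new LongDictionary() {
      @Override
      public int maxId() {
        return values.length - 1;
      }

      @Override
      public long decodeToLong(int id) {
        return values[id];
      }
    };
  }

  /** Copying approach: duplicate every dictionary value into a second buffer up front. */
  static IntToLongFunction copyingAccessor(LongDictionary dict) {
    long[] copy = new long[dict.maxId() + 1];
    for (int id = 0; id <= dict.maxId(); id++) {
      copy[id] = dict.decodeToLong(id);
    }
    return id -> copy[id];
  }

  /** Direct approach: delegate each lookup to the dictionary, with no extra buffer. */
  static IntToLongFunction directAccessor(LongDictionary dict) {
    return dict::decodeToLong;
  }

  /** Resolves a column of dictionary ids to values through the given accessor. */
  static long[] resolve(int[] ids, IntToLongFunction accessor) {
    long[] out = new long[ids.length];
    for (int i = 0; i < ids.length; i++) {
      out[i] = accessor.applyAsLong(ids[i]);
    }
    return out;
  }

  public static void main(String[] args) {
    LongDictionary dict = of(100L, 200L, 300L);
    int[] ids = {2, 0, 1, 2};
    // Both accessors resolve the same ids to the same values.
    System.out.println(Arrays.toString(resolve(ids, directAccessor(dict))));
    System.out.println(Arrays.toString(resolve(ids, copyingAccessor(dict))));
  }
}
```

Both accessors return identical values. The direct one skips the up-front allocation and copy loop, which is the kind of work this change aims to avoid.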
   
   In very limited testing, this improved vectorized read performance by more 
than 20% in some scenarios, though more testing is needed to get more accurate 
metrics.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

