noahtaite commented on issue #10239:
URL: https://github.com/apache/hudi/issues/10239#issuecomment-1841683174

   @ad1happy2go 
   
   Thank you for the response. I confirmed my readers **were not** setting this 
option on read; I had assumed it was enabled by default. After enabling it, the 
large gap has shrunk significantly. For example, this is a very large application 
that queries 5 of my largest Hudi tables. Its gap was 3 hours before and is down 
to 20 minutes now:
   
   <img width="1709" alt="image" 
src="https://github.com/apache/hudi/assets/24283126/e66aec07-5b75-4c04-acb3-e740d66fb021">
   
   We have seen a fairly significant performance change from enabling this, 
mostly for the better. A couple of sessions on shared clusters started to hang 
where they previously had not, but dedicated clusters load large tables more 
quickly using metadata.
   
   My only suggestion would be to make it explicit on the configurations page 
that this is not enabled on the reader side by default. It is, however, 
documented on the "Metadata Indexing" page.
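   
   For anyone else who runs into this, here is a minimal sketch of setting the 
option explicitly on the reader. It assumes the option in question is 
`hoodie.metadata.enable` (the name is not quoted in this comment), and the 
helper `hudi_read_options` and the `base_path` variable are hypothetical names 
used only for illustration:

```python
# Sketch only: the option name below is an assumption, not quoted from this thread.
METADATA_ENABLE = "hoodie.metadata.enable"


def hudi_read_options(enable_metadata=True, extra=None):
    """Build reader options with the metadata table explicitly switched on or off."""
    options = dict(extra or {})
    # Set the flag explicitly instead of relying on a default that may
    # differ between the writer side and the reader side.
    options[METADATA_ENABLE] = "true" if enable_metadata else "false"
    return options


opts = hudi_read_options()

# With an existing SparkSession `spark` and a hypothetical table path `base_path`:
# df = spark.read.format("hudi").options(**opts).load(base_path)
```

Passing the flag on every read keeps the behaviour visible in the job code, 
which makes it easy to turn off again for sessions that hang on shared clusters.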
   
   Thanks again Aditya.

