[GitHub] [hudi] umehrot2 commented on issue #1981: [SUPPORT] Huge performance Difference Between Hudi and Regular Parquet in Athena

GitBox Wed, 19 Aug 2020 19:18:19 -0700


umehrot2 commented on issue #1981:
URL: https://github.com/apache/hudi/issues/1981#issuecomment-676855516



   @vinothchandar @rubenssoto I think this could simply be the difference 
between Presto's performance on regular Parquet, where it relies entirely on 
its native Parquet readers, and Presto's performance on Hudi, where it needs 
to at least use the splits/listing logic from Hoodie's input format. Could you 
try the same queries on an EMR cluster and compare the performance through 
Presto?
   
   cc @bhasudha as well
   
   @rubenssoto have you tried opening a ticket with AWS support about this? 
They should at least be able to help determine whether this is something 
specific to Athena or a general performance bottleneck with Hudi.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
