[GitHub] [hudi] zuyanton commented on issue #1829: [SUPPORT] S3 slow file listing causes Hudi read performance.

2020-07-17 Thread GitBox
zuyanton commented on issue #1829: URL: https://github.com/apache/hudi/issues/1829#issuecomment-660413464 @umehrot2 you are right: with ```convertMetastoreParquet``` set to ```false```, when querying a regular parquet table with 20k partitions I can see similar behavior of Spark not

[GitHub] [hudi] zuyanton commented on issue #1829: [SUPPORT] S3 slow file listing causes Hudi read performance.

2020-07-16 Thread GitBox
zuyanton commented on issue #1829: URL: https://github.com/apache/hudi/issues/1829#issuecomment-659836676 @bvaradar we don't see a similar issue with regular non-Hudi tables saved to S3 in parquet format. For regular tables the "overhead" is the same and under one minute despite the number of

[GitHub] [hudi] zuyanton commented on issue #1829: [SUPPORT] S3 slow file listing causes Hudi read performance.

2020-07-15 Thread GitBox
zuyanton commented on issue #1829: URL: https://github.com/apache/hudi/issues/1829#issuecomment-658984768 @vinothchandar it didn't have any effect, and I believe it shouldn't, since from what it looks like that parameter only gives an improvement if you are trying to list statuses of multiple