vinothchandar commented on issue #1694: URL: https://github.com/apache/hudi/issues/1694#issuecomment-647138192
> 2 - Yes Indexing is dominating, not sure why exactly it is, but it is after setting parameter hoodie.parquet.small.file.limit = 0

If you turn off small file handling, you end up writing more files, which means indexing has to compare key ranges and bloom filters across many more files. For the same reason, you should reconsider this setting for the query side too: small files will hurt query performance a lot as well.

Let's do a reset here and try to design for your use case. Happy to work through this if you can share more about your goals.
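To make the indexing cost concrete, here is a rough toy model in Python. It is not Hudi code: the function name `candidate_checks`, the uniform random keys, and the file counts are all assumptions made up for illustration. It only counts how many (incoming key, file) pairs survive a min/max range check and would therefore need a further bloom filter lookup.

```python
import random


def candidate_checks(num_files, keys_per_file, incoming, seed=0):
    """Count (key, file) pairs where the key falls inside the file's [min, max] key range.

    Hypothetical model: keys are uniform random floats, so the key ranges of
    different files overlap heavily (similar to random, UUID-like record keys).
    """
    rng = random.Random(seed)
    ranges = []
    for _ in range(num_files):
        keys = [rng.random() for _ in range(keys_per_file)]
        ranges.append((min(keys), max(keys)))
    probes = [rng.random() for _ in range(incoming)]
    # Every pair that passes the range check is a candidate for a bloom filter lookup.
    return sum(1 for k in probes for lo, hi in ranges if lo <= k <= hi)


# Same total number of stored keys, split into few large files vs. many small files.
few_large = candidate_checks(num_files=10, keys_per_file=1000, incoming=100)
many_small = candidate_checks(num_files=1000, keys_per_file=10, incoming=100)
print(few_large, many_small)
```

With the same amount of data, the many-small-files layout produces far more candidate checks in this model, which is the effect described above. Real workloads with ordered keys and well-separated ranges would behave differently.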
