dmenin commented on issue #3975:
URL: https://github.com/apache/hudi/issues/3975#issuecomment-968689191
Hi @xushiyan,
MOR is not possible because it is not supported by AWS tools like Athena, and
this particular dataset has no field guaranteed to be 100% immutable, and
"near-immutable" fields would go through the same problem. In fact, the date
could be considered near-immutable, as on each load I am upserting over 100k
rows and deleting only a few hundred.
Any other ideas on how to make the "getting small files from partitions" jobs
run faster? And why are there 3 such jobs running sequentially with
different numbers of stages and tasks?
Thanks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]