koochiswathiTR commented on issue #6881: URL: https://github.com/apache/hudi/issues/6881#issuecomment-1273237951
Spark: 3.1.2
Hudi: 0.11.1
AWS EMR: 6.7

It's a Spark Streaming application, and we use only the upsert operation. We have 2000 partitions and upsert 3.2 million records at a time. One micro-batch takes 15 minutes to complete the whole process (upsert + compaction/cleanup, etc.).

Let me know if you need any more information, @xushiyan.
