bhat-vinay commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2015327684
> IIUC this adds additional shuffle and a new job? I'd like to understand how we think this impacts the current insert DAG. Yet to review the new partitioner, will do once I hear back on these.

Yes, there is a sorting stage (a global sort of the input batch), which might add a shuffle. The new job assigns sequentially increasing indexes to the sorted records. The `UpsertSortPartitioner` relies on these indexes to preserve the sorted order of the input batch while still handling small files as efficiently as possible. Not sure if this is what you meant.
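To make the index idea concrete, here is a minimal, Spark-free sketch of the general technique described above: sort the batch globally, give each record a sequentially increasing index, then route records to buckets by contiguous index ranges so the sorted order carries over into the buckets. It is not the code in this PR. The function names and the `capacities` list (how many records each target bucket can take, for example the remaining room in an existing small file followed by new files) are hypothetical and chosen only for illustration.

```python
from typing import Any, List, Tuple

Record = Tuple[str, Any]  # (record key, payload); hypothetical shape


def assign_indexes(records: List[Record]) -> List[Tuple[int, Record]]:
    """Globally sort by key and pair each record with its position."""
    ordered = sorted(records, key=lambda rec: rec[0])
    return list(enumerate(ordered))


def bucket_for_index(index: int, capacities: List[int]) -> int:
    """Map a sequential index to a bucket.

    Bucket i covers a contiguous index range of size capacities[i],
    so lower indexes (smaller keys) always land in earlier buckets.
    """
    upper = 0
    for bucket, capacity in enumerate(capacities):
        upper += capacity
        if index < upper:
            return bucket
    raise ValueError(f"index {index} exceeds total capacity {upper}")


def partition_sorted(records: List[Record], capacities: List[int]) -> List[List[Record]]:
    """Distribute records into buckets while keeping the global sort order."""
    buckets: List[List[Record]] = [[] for _ in capacities]
    for index, record in assign_indexes(records):
        buckets[bucket_for_index(index, capacities)].append(record)
    return buckets


if __name__ == "__main__":
    batch = [("d", 4), ("a", 1), ("c", 3), ("b", 2), ("e", 5)]
    # Assumed capacities: the first bucket stands in for an existing small file.
    for number, bucket in enumerate(partition_sorted(batch, [2, 2, 2])):
        print(number, bucket)
```

Because each bucket owns one contiguous index range, concatenating the buckets in order reproduces the globally sorted batch. In Spark, the analogous indexing step needs to know how many records each preceding partition holds, which is one reason it can show up as a separate job.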
