cshuo commented on PR #12796: URL: https://github.com/apache/hudi/pull/12796#issuecomment-2670323877
> @cshuo, thanks for your review. I understand your comments and would like to research the proposed changes in a separate task. I've added it to the backlog as [HUDI-9055](https://issues.apache.org/jira/browse/HUDI-9055). It would require more profiling and, possibly, analysis of heap dumps.
>
> For now, I've implemented a quick test of your proposed changes, but in both cases I lost a large part of the performance improvement gained. For the non-bucket case, total write time increased from 460 s to 590 s and 520 s, respectively.
>
> The research mentioned above is needed to explain why the proposed changes reduced performance instead of improving it, which is curious behavior.

@geserdugarov The changes I proposed are meant to align the ser/de performance with Flink's internal `RowDataSerializer` for RowData. As for the performance loss, I suggest first checking the stability of the benchmark and making sure no extra variables are involved.
