cshuo commented on PR #12796:
URL: https://github.com/apache/hudi/pull/12796#issuecomment-2670323877

   > @cshuo, thanks for your review. I understand your comments, and would 
like to research the proposed changes in a separate task. I've added it to the 
backlog, [HUDI-9055](https://issues.apache.org/jira/browse/HUDI-9055). It would 
require more profiling and, possibly, analysis of heap dumps.
   > 
   > For now, I've implemented a quick test of your proposed changes, but in both 
cases I lost a large part of the performance gain. For the non-bucket 
case, total write time increased from 460 s to 590 s and 520 s, respectively.
   > 
   > The research mentioned above is needed to explain why the proposed 
changes reduced performance instead of improving it, which is curious 
behavior.
   
   @geserdugarov The changes I proposed are meant to align the ser/de 
performance with Flink's internal `RowDataSerializer` for RowData. As for 
the performance loss, I suggest first checking the benchmark's stability 
and making sure no extra variables are involved.
   


