nonggia.liang created HUDI-4717:
-----------------------------------

             Summary:  CompactionCommitEvent message corrupted when sent by 
compact_task
                 Key: HUDI-4717
                 URL: https://issues.apache.org/jira/browse/HUDI-4717
             Project: Apache Hudi
          Issue Type: Bug
          Components: flink, flink-sql
            Reporter: nonggia.liang
         Attachments: figure 1.png, figure 2.png

When running a Flink application that inserts data into a Hudi table with async 
compaction enabled, we found that after running for some time, compactions 
become abnormal: they are scheduled and executed successfully, but never 
committed. We also observed a mismatch between the number of messages sent by 
compact_task and the number received by compact_commit, as shown in figure 1.

By inspecting the abnormal InputChannel state of the compact_commit operator 
with the Arthas tool, we found that the channel is waiting for a `huge` message 
of size 16M, which is far larger than a normal CompactionCommitEvent object, as 
shown in figure 2.

Currently, in the processElement() method of class CompactFunction, we use the 
collector to send the CompactionCommitEvent message asynchronously, but the 
Collector provided by Flink does not seem to be thread-safe. Could that be the 
cause of the corrupted message received by the compact_commit operator? Should 
we use the MailboxExecutorAdapter to run collector.collect, as 
StreamReadOperator does?
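
To illustrate the proposal, here is a minimal sketch in plain Java with no Flink 
or Hudi dependency. The Collector interface, ListCollector, and the 
single-threaded "mailbox" executor are hypothetical stand-ins chosen for this 
example; they are not the real Flink Collector or MailboxExecutorAdapter APIs. 
The idea shown is only that worker threads do the compaction work, while every 
collect() call is handed to one dedicated thread so the collector is never 
touched concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class MailboxCollectSketch {

    /** Hypothetical stand-in for a Flink-style Collector; not thread-safe. */
    interface Collector<T> {
        void collect(T record);
    }

    /** Backed by a plain ArrayList, so concurrent collect() calls would be unsafe. */
    static final class ListCollector<T> implements Collector<T> {
        final List<T> records = new ArrayList<>();

        @Override
        public void collect(T record) {
            records.add(record);
        }
    }

    /**
     * Runs the given number of simulated compaction tasks on a worker pool and
     * emits one event per task through a single-threaded "mailbox" executor.
     * Returns how many events the collector received.
     */
    static int runCompactions(int tasks) {
        ListCollector<String> collector = new ListCollector<>();
        ExecutorService workers = Executors.newFixedThreadPool(4);
        // Assumption: a single-threaded executor plays the role of the task mailbox.
        ExecutorService mailbox = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                workers.execute(() -> {
                    // The compaction work itself would run here, off the mailbox thread.
                    String event = "commit-event-" + id;
                    // Only the emit step is handed over to the mailbox thread.
                    mailbox.execute(() -> {
                        collector.collect(event);
                        done.countDown();
                    });
                });
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for events", e);
        } finally {
            workers.shutdown();
            mailbox.shutdown();
        }
        return collector.records.size();
    }

    public static void main(String[] args) {
        System.out.println("events received: " + runCompactions(200));
    }
}
```

If collector.collect(event) were called directly from the worker threads 
instead, the ArrayList could lose or corrupt entries, which is analogous to the 
suspected corruption of the serialized CompactionCommitEvent.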

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
