zhangyue19921010 opened a new pull request, #7174: URL: https://github.com/apache/hudi/pull/7174
### Change Logs

Follow-up to PR https://github.com/apache/hudi/pull/5416.

Add a new executor type named `SIMPLE` for the hoodie records writer. The simple executor runs in single-writer, single-reader mode. `SimpleHoodieExecutor` has no inner message queue and no inner lock; it consumes records directly from the iterator and writes them.

Advantages: no additional memory or CPU resources are spent on locking or multithreading.

Disadvantages: some benefits are lost, such as rate limiting. It also can no longer decouple the network read (shuffle read) from the network write (writing objects/files to storage), which may lead to lower throughput compared with DisruptorExecutor.

I also ran a quick benchmark using the hoodie benchmark framework `org.apache.spark.sql.execution.benchmark.BoundInMemoryExecutorBenchmark`. The benchmark minimizes the impact of producer and consumer efficiency as much as possible, so that it focuses on the throughput limit of the inner queue.

The results are:

```
OpenJDK 64-Bit Server VM 1.8.0_342-b07 on Linux 5.10.62-55.141.amzn2.x86_64
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
COW Ingestion:               Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------
BoundInMemory Executor               34661          35143         292         0.3        3466.1       1.0X
Simple Executor                      17347          17796         681         0.6        1734.7       2.0X
Disruptor Executor                   15803          16535         936         0.6        1580.3       2.2X
```

In this benchmark the Simple Executor is about 2.0X faster than the BoundInMemory Executor and close to the Disruptor Executor (2.2X), with minimal resource usage.

The corresponding UTs are also added. Of course, we need to unify all these executor/messageQueue related UTs as much as possible.
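For readers unfamiliar with the queue-less design described above, here is a minimal sketch of the idea. It is not the code in this PR: the class name `SimpleExecutorSketch`, the `execute` method and its parameters are hypothetical and chosen only for illustration. It shows a single thread pulling each record from the input iterator, transforming it, and handing it straight to the writer, with no intermediate queue, lock, or extra thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only; this is NOT Hudi's SimpleHoodieExecutor API.
public class SimpleExecutorSketch {

    // Reads every record from the iterator, transforms it, and passes it to the
    // writer on the calling thread. Returns the number of records written.
    public static <I, O> long execute(Iterator<I> input,
                                      Function<I, O> transform,
                                      Consumer<O> writer) {
        long count = 0;
        while (input.hasNext()) {
            // No queue and no lock: read and write happen back to back, so a
            // slow writer directly stalls the reader (and vice versa).
            writer.accept(transform.apply(input.next()));
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        long n = execute(Arrays.asList(1, 2, 3).iterator(),
                         i -> "record-" + i,
                         written::add);
        System.out.println(n + " records written: " + written);
    }
}
```

Because reading and writing share one thread in this sketch, it also makes the stated disadvantage visible: the read and write sides cannot overlap, which is what a queue-based executor would otherwise allow.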
A new ticket, https://issues.apache.org/jira/browse/HUDI-5106, tracks unifying the executor/messageQueue UTs; that will be done as the next step.

### Impact

No impact

### Risk level (write none, low medium or high below)

low

### Documentation Update

no

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
