zhangyue19921010 opened a new pull request, #7174:
URL: https://github.com/apache/hudi/pull/7174

   ### Change Logs
   
   Follow-up to PR https://github.com/apache/hudi/pull/5416
   
   Adds a new executor type named `SIMPLE` for the Hudi record writer.
   
   This simple executor runs in single-writer, single-reader mode. `SimpleHoodieExecutor` has no inner message queue and no inner lock; it consumes records from the iterator and writes them directly.
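
A minimal sketch of the idea described above, assuming nothing about the real class beyond what this description states. `SimpleExecutorSketch`, `execute`, and the `transform`/`sink` parameters are hypothetical names chosen for illustration; this is not the `SimpleHoodieExecutor` code from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only; not the SimpleHoodieExecutor class added by this PR.
final class SimpleExecutorSketch {

  // Pulls each record from the input iterator, transforms it, and hands it to the
  // sink on the calling thread. No queue, lock, or extra thread is involved.
  static <I, O> long execute(Iterator<I> input, Function<I, O> transform, Consumer<O> sink) {
    long count = 0;
    while (input.hasNext()) {
      sink.accept(transform.apply(input.next()));
      count++;
    }
    return count; // number of records handed to the sink
  }

  public static void main(String[] args) {
    List<String> written = new ArrayList<>();
    Function<Integer, String> transform = i -> "record-" + i;
    Consumer<String> sink = written::add; // stands in for the file writer
    long n = execute(Arrays.asList(1, 2, 3).iterator(), transform, sink);
    System.out.println(n + " records written: " + written);
  }
}
```

Because reading and writing happen in the same loop on one thread, a slow read stalls the write and vice versa, which is the trade-off listed under the disadvantages.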
   
   
   Advantages: no additional memory or CPU resources are spent on locking or multithreading.
   
   Disadvantages: some benefits, such as rate limiting, are lost. The executor also can no longer decouple the network read (shuffle read) from the network write (writing objects/files to storage), which may lead to lower throughput compared with DisruptorExecutor.
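
For contrast, here is a rough sketch of what a queue-based executor buys: a bounded queue lets the reader and the writer run on separate threads, and the reader blocks when the queue is full, which acts as a rate limit. `QueuedExecutorSketch` and everything in it are hypothetical and built on `java.util.concurrent.ArrayBlockingQueue`; this is not Hudi's BoundInMemory or Disruptor executor.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; not one of Hudi's queue-based executors.
final class QueuedExecutorSketch {
  private static final Object END = new Object(); // sentinel marking end of input

  // Assumes records are non-null (ArrayBlockingQueue rejects null elements).
  static <T> List<T> execute(Iterator<T> input, int capacity) {
    BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);

    // Reader thread: put() blocks while the queue is full, throttling the read side.
    Thread producer = new Thread(() -> {
      try {
        try {
          while (input.hasNext()) {
            queue.put(input.next());
          }
        } finally {
          queue.put(END); // always signal completion so the consumer can stop
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    producer.start();

    // Writer side: runs on the calling thread, concurrently with the reader.
    List<T> written = new ArrayList<>();
    try {
      for (Object item = queue.take(); item != END; item = queue.take()) {
        @SuppressWarnings("unchecked")
        T record = (T) item;
        written.add(record); // stands in for writing to storage
      }
      producer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while consuming", e);
    }
    return written;
  }
}
```

The extra thread, the queue, and the blocking hand-off are the memory and CPU costs that the simple executor avoids.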
   
   I also ran a quick benchmark using the Hudi benchmark framework, `org.apache.spark.sql.execution.benchmark.BoundInMemoryExecutorBenchmark`.
   
   The benchmark minimizes the impact of producer and consumer efficiency as much as possible, so that it focuses on the throughput limit of the inner queue.
   
   The results are:
   ```
   OpenJDK 64-Bit Server VM 1.8.0_342-b07 on Linux 5.10.62-55.141.amzn2.x86_64
   Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
   COW Ingestion:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   BoundInMemory Executor                            34661          35143         292          0.3        3466.1       1.0X
   Simple Executor                                   17347          17796         681          0.6        1734.7       2.0X
   Disruptor Executor                                15803          16535         936          0.6        1580.3       2.2X
   ```
   
   The Simple Executor delivers good throughput (about 2.0X the BoundInMemory Executor and close to the Disruptor Executor) with minimal resource usage.
   
   This PR also adds the corresponding unit tests.
   
   Of course, we need to unify all of these executor and message-queue unit tests as much as possible. I created a new ticket for this, https://issues.apache.org/jira/browse/HUDI-5106, and will do it as the next step.
   
   ### Impact
   
   No impact
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   no
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

