FineAndDandy opened a new issue, #4124:
URL: https://github.com/apache/accumulo/issues/4124

   **Is your feature request related to a problem? Please describe.**
   Write operations to an RFile are serialized. When writing large RFiles 
in MapReduce jobs, this can produce very long tails in the jobs. The 
bottleneck is often compression rather than I/O. 
   
   **Describe the solution you'd like**
   Using multiple threads to process multiple blocks in parallel could 
dramatically improve write performance. A dedicated thread would still be 
needed to write completed blocks in order, but that should be feasible. 
The number of blocks in flight could be scaled based on the memory 
available for buffering.
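
   To make the idea concrete, here is a minimal sketch of the pattern, not 
a patch against the actual RFile writer. Everything in it is an assumption 
for illustration: the class name `ParallelBlockWriter`, the `writeBlocks` 
method and its parameters are hypothetical, gzip stands in for whatever 
codec is configured, and the length-prefixed framing is not the RFile 
format. Blocks are compressed on a thread pool, while a single writer 
thread drains a bounded queue of futures so output order matches 
submission order.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not part of Accumulo.
public class ParallelBlockWriter {

  // Compress a single block independently of all others (gzip is a stand-in codec).
  static byte[] compress(byte[] raw) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
      gz.write(raw);
    }
    return buf.toByteArray();
  }

  // maxInFlight bounds the queue of pending blocks, which roughly caps
  // how much compressed data is buffered in memory at once.
  public static void writeBlocks(List<byte[]> blocks, OutputStream sink,
      int compressorThreads, int maxInFlight) throws Exception {
    ExecutorService compressors = Executors.newFixedThreadPool(compressorThreads);
    ExecutorService writer = Executors.newSingleThreadExecutor();
    BlockingQueue<Future<byte[]>> pending = new ArrayBlockingQueue<>(maxInFlight);
    int total = blocks.size();
    try {
      // Dedicated writer: takes futures in submission order, so blocks are
      // written in order even when later blocks finish compressing first.
      Callable<Void> drain = () -> {
        DataOutputStream out = new DataOutputStream(sink);
        for (int i = 0; i < total; i++) {
          byte[] compressed = pending.take().get();
          out.writeInt(compressed.length); // illustrative framing only
          out.write(compressed);
        }
        out.flush();
        return null;
      };
      Future<Void> writerDone = writer.submit(drain);

      for (byte[] block : blocks) {
        Callable<byte[]> task = () -> compress(block);
        Future<byte[]> f = compressors.submit(task);
        // Blocks while the queue is full (backpressure). If the writer has
        // died, get() rethrows its failure instead of waiting forever.
        while (!pending.offer(f, 100, TimeUnit.MILLISECONDS)) {
          if (writerDone.isDone()) {
            writerDone.get();
          }
        }
      }
      writerDone.get(); // wait for the last block and surface any error
    } finally {
      compressors.shutdownNow();
      writer.shutdownNow();
    }
  }
}
```

   In this sketch, a call such as `ParallelBlockWriter.writeBlocks(blocks, 
out, 4, 8)` would use four compression threads with at most eight blocks 
queued. Queueing futures rather than finished blocks is what keeps the 
output ordered without any explicit sequence numbers.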
   
   **Describe alternatives you've considered**
   Adding pipelining to the existing code could be a smaller lift and 
still yield a big performance improvement.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
