[ https://issues.apache.org/jira/browse/ACCUMULO-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170169#comment-15170169 ]

ASF GitHub Bot commented on ACCUMULO-4154:
------------------------------------------

Github user keith-turner commented on the pull request:

    https://github.com/apache/accumulo/pull/76#issuecomment-189532720
  
    Closed because I put the wrong issue number in the commit message.


> Improve batch writer
> --------------------
>
>                 Key: ACCUMULO-4154
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4154
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Keith Turner
>
> The batch writer currently has two drawbacks:
>  * It waits for its memory to be half full and then bins mutations for the
> send threads.  I don't think this is optimal; it would be better to keep
> the send threads busy.  As soon as there are mutations, start working on
> them.  If the send threads cannot keep up, then work will naturally build
> up (without waiting for memory to be half full).
>  * The flush method blocks threads trying to add anything to the batch writer.
> I am thinking of implementing the following model for the batch writer,
> which is similar to how the conditional writer works:
>   * Have a queue that all incoming mutations are added to.
>   * Have a queue per tablet server.
>   * Have a single thread that constantly takes batches of mutations off the
> incoming queue, bins them, and places them on the tablet server queues.
>   * When a send thread becomes idle, have it select and reserve the tablet
> server queue with the most work on it.
>   * When mutations fail, send threads can add them back to the incoming queue.
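The queue model in the list above could be sketched roughly as follows. This is only an illustration of the idea, not Accumulo code: the class BinningSketch, its method names, and the locator function (which stands in for the tablet location lookup) are all hypothetical, and the binning pass is exposed as a method rather than run in its own thread so the steps are easy to follow.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical sketch of the proposed model; none of these names are Accumulo APIs. */
final class BinningSketch<M> {
  // Single queue that all incoming mutations are added to.
  private final BlockingQueue<M> incoming = new LinkedBlockingQueue<>();
  // One queue of pending work per tablet server (guarded by "this").
  private final Map<String, List<M>> serverQueues = new HashMap<>();
  // Servers currently reserved by a send thread (guarded by "this").
  private final Set<String> reserved = new HashSet<>();
  // Assumption: maps a mutation to the server hosting its tablet.
  private final Function<M, String> locator;

  BinningSketch(Function<M, String> locator) {
    this.locator = locator;
  }

  /** Called by writers; never waits for memory to fill up. */
  void add(M mutation) {
    incoming.add(mutation);
  }

  /** One pass of the binning thread: drain a batch and bin it by server. */
  int binOnce(int maxBatch) {
    List<M> batch = new ArrayList<>();
    incoming.drainTo(batch, maxBatch);
    synchronized (this) {
      for (M m : batch) {
        serverQueues.computeIfAbsent(locator.apply(m), k -> new ArrayList<>()).add(m);
      }
    }
    return batch.size();
  }

  /** Idle send thread: reserve the unreserved server with the most queued work, or null. */
  synchronized String reserveBusiest() {
    String best = null;
    int bestSize = 0;
    for (Map.Entry<String, List<M>> e : serverQueues.entrySet()) {
      if (!reserved.contains(e.getKey()) && e.getValue().size() > bestSize) {
        best = e.getKey();
        bestSize = e.getValue().size();
      }
    }
    if (best != null) {
      reserved.add(best);
    }
    return best;
  }

  /** Take everything queued for a server the caller has reserved. */
  synchronized List<M> takeWork(String server) {
    List<M> work = serverQueues.remove(server);
    return work == null ? new ArrayList<>() : work;
  }

  /** Release a reservation; failed mutations go back to the incoming queue. */
  void release(String server, List<M> failed) {
    incoming.addAll(failed);
    synchronized (this) {
      reserved.remove(server);
    }
  }
}
```

In a real writer, binOnce would presumably run in a loop on the dedicated binning thread, and each send thread would loop over reserveBusiest, takeWork, send, and release.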
> To get better flushing behavior, each mutation can be assigned a one-up
> counter value (incremented by one per mutation) as it is added to the batch
> writer.  We can keep track of the minimum in-progress mutation.  Flush can
> inspect this value and wait for the minimum active mutation to reach a
> certain count.
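One way to read the flush idea above is sketched below. It is a minimal illustration under my own assumptions, not the actual implementation: FlushTracker and its methods are hypothetical names, and a plain monitor with a sorted set is used where a real writer might prefer something more concurrent.

```java
import java.util.TreeSet;

/** Hypothetical sketch of counter-based flushing; not an Accumulo API. */
final class FlushTracker {
  private long nextSeq = 0;                                   // one-up counter
  private final TreeSet<Long> inProgress = new TreeSet<>();  // sequence numbers not yet written

  /** Assign the next counter value to a mutation being added. */
  synchronized long begin() {
    long seq = nextSeq++;
    inProgress.add(seq);
    return seq;
  }

  /** Mark a mutation as successfully written and wake any waiting flush. */
  synchronized void complete(long seq) {
    inProgress.remove(seq);
    notifyAll();
  }

  /** Minimum in-progress counter value, or the next value if nothing is pending. */
  synchronized long minInProgress() {
    return inProgress.isEmpty() ? nextSeq : inProgress.first();
  }

  /**
   * Wait until every mutation added before this call has completed.
   * Returns false if the waiting thread was interrupted.
   */
  synchronized boolean flush() {
    long target = nextSeq; // mutations added before this call have seq < target
    try {
      while (minInProgress() < target) {
        wait(); // releases the monitor, so begin() is not blocked meanwhile
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Because flush only waits for the minimum in-progress value to reach the count observed when it was called, mutations added afterwards get higher counter values and do not delay it, and threads calling begin() are not held up while a flush is waiting.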



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
