[ 
https://issues.apache.org/jira/browse/ACCUMULO-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-2766:
-----------------------------------

    Attachment: ACCUMULO-2677-1.patch

This problem seems much worse than I thought.  ACCUMULO-2677-1.patch is a 
potential fix.  I have been running experiments using 
[mutslam|https://github.com/keith-turner/mutslam], a little utility I created 
to test group commit performance.  The patch makes a dramatic difference in 
performance.  

One of the experiments that mutslam runs is the following:

 * create 128 threads
 * each thread has a batch writer (w/ one thread)
 * each thread writes 100 mutations, flushing its batch writer after each mutation
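
The shape of this experiment can be sketched roughly as below.  This is not mutslam's code: {{FlushingWriter}} is a hypothetical stand-in for an Accumulo batch writer, and {{FlushPerMutationExperiment}} and its parameters are names made up for illustration.  It only shows the access pattern (many threads, one writer each, a flush after every mutation) and how the elapsed time could be measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical stand-in for an Accumulo batch writer; NOT the real API. */
interface FlushingWriter extends AutoCloseable {
    void add(String mutation) throws Exception;
    void flush() throws Exception;
    @Override
    void close() throws Exception;
}

class FlushPerMutationExperiment {
    /**
     * Starts numThreads threads. Each thread gets its own writer, adds
     * mutationsPerThread mutations, and flushes after every single one.
     * Returns the elapsed wall-clock time in milliseconds.
     */
    static long run(int numThreads, int mutationsPerThread,
                    Supplier<FlushingWriter> writers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        List<Future<?>> futures = new ArrayList<>();
        long start = System.nanoTime();
        try {
            for (int t = 0; t < numThreads; t++) {
                final int id = t;
                futures.add(pool.submit(() -> {
                    try (FlushingWriter w = writers.get()) {
                        for (int i = 0; i < mutationsPerThread; i++) {
                            w.add("row_" + id + "_" + i); // placeholder mutation
                            w.flush();                    // forces a commit per mutation
                        }
                    }
                    return null;
                }));
            }
            // Wait for every thread; get() rethrows any worker failure.
            for (Future<?> f : futures) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

With the numbers above this would be called as {{FlushPerMutationExperiment.run(128, 100, writers)}}, where {{writers}} supplies a writer backed by a real table.  Because every flush has to wait for the write-ahead log sync, the run time is dominated by how well the tserver groups those syncs.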

W/o the patch this test takes 500 to 580 seconds.   With ACCUMULO-2677-1.patch 
the test takes 23 to 25 seconds.   While running the test I set 
{{tserver.server.threads.minimum=128}} in {{accumulo-site.xml}}.  I ran these 
experiments on a single node w/ 16 cores (hyperthreaded) w/ hadoop-2.2.0.

I am still experimenting.

> Single walog operation may wait for multiple hsync calls
> --------------------------------------------------------
>
>                 Key: ACCUMULO-2766
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2766
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>              Labels: performance
>             Fix For: 1.5.2, 1.6.1
>
>         Attachments: ACCUMULO-2677-1.patch
>
>
> While looking into slow {{hsync}} calls, I noticed an oddity in the way 
> Accumulo processes syncs.  Specifically, because of the way {{closeLock}} is 
> used in {{DfsLogger}}, it seems like the following situation could occur. 
>  
>  # thread B starts executing DfsLogger.LogSyncingTask.run()
>  # thread 1 enters DfsLogger.logFileData()
>  # thread 1 writes to walog
>  # thread 1 locks _closeLock_ 
>  # thread 1 adds sync work to workQueue
>  # thread 1 unlocks _closeLock_
>  # thread B takes sync work off of workQueue
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread 3 enters DfsLogger.logFileData()
>  # thread 3 writes to walog
>  # thread 3 blocks locking _closeLock_
>  # thread 4 enters DfsLogger.logFileData()
>  # thread 4 writes to walog
>  # thread 4 blocks locking _closeLock_
>  # thread B unlocks _closeLock_
>  # thread 4 locks _closeLock_ 
>  # thread 4 adds sync work to workQueue
>  # thread B takes sync work off of workQueue
>  # thread B blocks locking _closeLock_
>  # thread 4 unlocks _closeLock_
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread B unlocks _closeLock_
>  # thread 3 locks _closeLock_
>  # thread 3 adds sync work to workQueue
>  # thread 3 unlocks _closeLock_
> In this situation thread 3 unnecessarily has to wait for an extra {{hsync}} 
> call.  I am not sure whether this situation actually occurs or, if it does, 
> how frequently.  Looking at the code, it seems like it would be nice if sync 
> operations could be queued w/o synchronizing w/ sync operations.
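
The idea in the last sentence of the description can be illustrated with a minimal sketch like the one below.  It is not the attached patch and not {{DfsLogger}} code: {{GroupSyncSketch}}, {{writeAndSync}} and {{syncAction}} are hypothetical names, and the sketch assumes the log never needs to be closed, which is presumably what {{closeLock}} protects in the real code.  Writers enqueue their sync request on a concurrent queue without taking any lock the syncing thread holds during the sync, and the syncing thread drains everything that is waiting and covers it with one sync call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative group-sync sketch; hypothetical, not Accumulo's DfsLogger. */
class GroupSyncSketch {
    private final LinkedBlockingQueue<CountDownLatch> workQueue = new LinkedBlockingQueue<>();
    private final Runnable syncAction; // stands in for the hsync call
    private final AtomicInteger syncCalls = new AtomicInteger();
    private final Thread syncer;

    GroupSyncSketch(Runnable syncAction) {
        this.syncAction = syncAction;
        this.syncer = new Thread(this::syncLoop, "sync-sketch");
        this.syncer.setDaemon(true);
        this.syncer.start();
    }

    private void syncLoop() {
        List<CountDownLatch> batch = new ArrayList<>();
        try {
            while (true) {
                batch.clear();
                batch.add(workQueue.take()); // block until there is work
                workQueue.drainTo(batch);    // pick up everything else already queued
                syncAction.run();            // one sync covers the whole batch
                syncCalls.incrementAndGet();
                for (CountDownLatch waiter : batch) {
                    waiter.countDown();      // release every writer in the batch
                }
            }
        } catch (InterruptedException e) {
            // shutdown() was called; exit the loop
        }
    }

    /** Call after the data has been written to the log; returns once a sync covers it. */
    void writeAndSync() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        workQueue.put(done); // never blocks on the in-progress sync
        done.await();
    }

    int syncCalls() {
        return syncCalls.get();
    }

    void shutdown() {
        syncer.interrupt();
    }
}
```

In this arrangement a request that arrives while a sync is in progress waits for at most that sync plus the next one, since the next batch picks up everything queued in the meantime.  Thread 3 in the scenario above would be queued alongside thread 4 instead of waiting behind it.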



--
This message was sent by Atlassian JIRA
(v6.2#6252)
