Increase the concurrency of transaction logging to edits log
------------------------------------------------------------

                 Key: HADOOP-1942
                 URL: https://issues.apache.org/jira/browse/HADOOP-1942
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur
             Fix For: 0.15.0


For some typical workloads, the throughput of the namenode is bottlenecked by 
the rate at which transactions are logged into the edits log. In the 
current code, a batching scheme means that not every transaction has to 
incur a sync of the edits log to disk. However, the existing batching scheme 
can be improved.

One option is to keep two buffers associated with the edits file. Threads write 
to the primary buffer while holding the FSNamesystem lock. The thread then 
releases the FSNamesystem lock, acquires a new lock called the syncLock, swaps 
the buffers, and flushes the old buffer to the persistent store. Since the 
buffers are swapped, new transactions continue to get logged into the new 
buffer. (Of course, the new transactions cannot complete before this new buffer 
is synced.)
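
A minimal sketch of the two-buffer idea follows. It is not the actual namenode
code. The class name, method names, and transaction-id bookkeeping are all
hypothetical, and bufferLock only stands in for the FSNamesystem lock. The
sketch also assumes the swap itself briefly retakes bufferLock so that it
cannot race with a writer; the description above does not say how the swap is
protected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical double-buffered edit log; an illustration only. */
class DoubleBufferedEditLog {
    // Stands in for the FSNamesystem lock (assumption for this sketch).
    private final Object bufferLock = new Object();
    // Serializes flushes, as the proposed syncLock would.
    private final Object syncLock = new Object();

    private ByteArrayOutputStream primary = new ByteArrayOutputStream();
    private ByteArrayOutputStream secondary = new ByteArrayOutputStream();
    private final OutputStream store;   // the persistent store

    private long lastWrittenTxId = 0;   // guarded by bufferLock
    private long lastSyncedTxId = 0;    // guarded by syncLock
    private int syncCount = 0;          // guarded by syncLock

    DoubleBufferedEditLog(OutputStream store) {
        this.store = store;
    }

    /** Appends a record to the primary buffer and returns its transaction id. */
    long logEdit(byte[] record) {
        synchronized (bufferLock) {
            primary.write(record, 0, record.length);
            return ++lastWrittenTxId;
        }
    }

    /** Returns once the transaction txId has been flushed to the store. */
    void logSync(long txId) {
        synchronized (syncLock) {
            if (txId <= lastSyncedTxId) {
                return; // an earlier flush by another thread already covered it
            }
            ByteArrayOutputStream toFlush;
            long flushedUpTo;
            synchronized (bufferLock) { // held only for the swap, not the I/O
                toFlush = primary;
                primary = secondary;    // writers continue into the empty buffer
                secondary = toFlush;
                flushedUpTo = lastWrittenTxId;
            }
            try {
                toFlush.writeTo(store);
                store.flush(); // a real log would also force the file to disk
            } catch (IOException e) {
                // Error recovery is omitted in this sketch.
                throw new UncheckedIOException(e);
            }
            toFlush.reset();
            lastSyncedTxId = flushedUpTo;
            syncCount++;
        }
    }

    /** Number of flushes actually performed so far. */
    int syncCount() {
        synchronized (syncLock) {
            return syncCount;
        }
    }
}
```

A caller would invoke logEdit() while making its namespace change and then
logSync() with the returned id before acknowledging the client. If several
threads log edits while one flush is in progress, the next flush writes all of
their records at once, and the remaining callers return from logSync() without
doing any I/O.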

This approach does a better job of batching syncs to disk, thus improving 
performance.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
