[jira] [Commented] (HBASE-6980) Parallel Flushing Of Memstores

ramkrishna.s.vasudevan (JIRA) Tue, 16 Oct 2012 23:10:11 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477647#comment-13477647
 ]


ramkrishna.s.vasudevan commented on HBASE-6980:
-----------------------------------------------

bq.#1. It is not clear why we even write a META entry for flushes...
Yes. This entry is not actually used, but it still forms the latest entry. 
Currently, 0.94 and trunk use a map to form the name of the replayEdits file, 
which should carry the maximum seq id of the edits. Previously, as I remember, 
the minimum seq id was used for naming the replayEdits file. 
In one of the issues we discussed the usefulness of the META entry written 
after a flush. We can verify once again and remove it if it is not of much 
use.

bq.we track the min seq id from the current memstore instead of the max seq id 
from the snapshot memstore
The HLog keeps track of the min seq id for the region. So you are suggesting 
that we track only the max seq id whenever an append happens to the HLog? Then 
on flush start we just clear this entry and use that max value for completing 
the flush. Thanks for the insights.
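
To make sure I have read the suggestion correctly, here is a minimal sketch of 
the bookkeeping as I understand it. It is not the actual HLog code: the class 
name RegionSeqIdTracker, the methods onAppend/onFlushStart and the NO_SEQ_ID 
sentinel are all hypothetical names made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; this is not how HLog is implemented.
final class RegionSeqIdTracker {
    // Assumed sentinel meaning "no edits appended since the last flush start".
    static final long NO_SEQ_ID = -1L;

    // Highest seq id appended per region since that region's last flush start.
    private final ConcurrentMap<String, Long> maxSeqIdByRegion =
        new ConcurrentHashMap<>();

    // Called on every append to the log: keep only the max seq id seen so far.
    void onAppend(String regionName, long seqId) {
        maxSeqIdByRegion.merge(regionName, seqId, Math::max);
    }

    // Called when a flush starts: clear the entry and hand back the max value,
    // which the caller would then use when completing the flush.
    long onFlushStart(String regionName) {
        Long max = maxSeqIdByRegion.remove(regionName);
        return max == null ? NO_SEQ_ID : max;
    }
}
```

Appends that arrive after onFlushStart would start a fresh entry for the next 
flush, so the value returned here covers only the edits in the snapshot being 
flushed. Whether that matches what you had in mind is something to confirm.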
 
                
> Parallel Flushing Of Memstores
> ------------------------------
>
>                 Key: HBASE-6980
>                 URL: https://issues.apache.org/jira/browse/HBASE-6980
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>
> For write-dominated workloads, single-threaded memstore flushing is an 
> unnecessary bottleneck. With a single flusher thread, we are basically not 
> set up to take advantage of the aggregate throughput that multi-disk nodes 
> provide.
> * For puts with WAL enabled, the bottleneck is more likely the "single" WAL 
> per region server. So this particular fix may not buy as much unless we 
> unlock that bottleneck with multiple commit logs per region server. (Topic 
> for a separate JIRA-- HBASE-6981).
> * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk 
> imports), we should be able to support much better ingest rates with parallel 
> flushing of memstores.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
