[ https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477575#comment-13477575 ]
Kannan Muthukkaruppan commented on HBASE-6980:
----------------------------------------------
Ramakrishna,
Thanks for your email.
#1. It is not clear why we even write a META entry for flushes...
{code}
private WALEdit completeCacheFlushLogEdit() {
  KeyValue kv = new KeyValue(METAROW, METAFAMILY, null,
      System.currentTimeMillis(), COMPLETE_CACHE_FLUSH);
  WALEdit e = new WALEdit();
  e.add(kv);
  return e;
}
{code}
The replayRecoveredEdits() logic skips over these entries anyway. And the only
reference I see for this special entry in HLog is in unit tests.
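To make the point concrete, here is a minimal, self-contained sketch of what "skips over these entries" amounts to. It is not the HBase code: the Edit class, the replay() method, and the constant value are hypothetical stand-ins for KeyValue, replayRecoveredEdits(), and HLog.METAFAMILY, and only the skip decision is modeled.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReplaySkipSketch {
    // Hypothetical stand-in for the HLog constant; the value is illustrative only.
    static final byte[] METAFAMILY = "METAFAMILY".getBytes(StandardCharsets.UTF_8);

    /** Simplified stand-in for a KeyValue: just a column family and a payload. */
    static final class Edit {
        final byte[] family;
        final String value;

        Edit(byte[] family, String value) {
            this.family = family;
            this.value = value;
        }
    }

    /** Returns the payloads that would be re-applied; flush-marker edits are dropped. */
    static List<String> replay(List<Edit> edits) {
        List<String> applied = new ArrayList<>();
        for (Edit e : edits) {
            if (Arrays.equals(e.family, METAFAMILY)) {
                continue; // flush marker: carries no user data to restore
            }
            applied.add(e.value);
        }
        return applied;
    }

    public static void main(String[] args) {
        byte[] cf = "cf".getBytes(StandardCharsets.UTF_8);
        List<Edit> log = Arrays.asList(
            new Edit(cf, "put-1"),
            new Edit(METAFAMILY, "flush-marker"),
            new Edit(cf, "put-2"));
        System.out.println(replay(log)); // prints [put-1, put-2]
    }
}
```

In this model the marker changes nothing about the replayed state, which is why writing it looks unnecessary.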
#2. Yes, there are currently a lot of comments (related to lastSeqWritten)
before the function HLog.java:startCacheFlush(), but the logic is not very
clear to me. The changes were committed as part of HBASE-3845. I think we
should be able to simplify that logic. I also think I see some potential bugs
there even as it stands now. I will need to spend some more time looking at
this and will write down an update here.
But bottom line, I still don't see any good fundamental reason we need to hold
this lock for the duration of the entire flush (even given the lastSeqWritten
map logic).
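As a rough illustration of the general shape, and not a proposal for the actual HLog change, the sketch below holds a lock only long enough to swap out the in-memory buffer and then does the slow write outside it. All names (NarrowLockFlushSketch, flushLock, persist) are hypothetical, and the lastSeqWritten bookkeeping is deliberately not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class NarrowLockFlushSketch {
    // Hypothetical stand-in for the lock discussed above.
    private final ReentrantLock flushLock = new ReentrantLock();
    private List<String> memstore = new ArrayList<>();
    // Stands in for data already written to disk.
    private final List<String> flushed = new ArrayList<>();

    void append(String edit) {
        flushLock.lock();
        try {
            memstore.add(edit);
        } finally {
            flushLock.unlock();
        }
    }

    /** Snapshots under the lock, then persists the snapshot without holding it. */
    List<String> flush() {
        List<String> snapshot;
        flushLock.lock();
        try {
            snapshot = memstore;          // cheap reference swap
            memstore = new ArrayList<>(); // new appends go to a fresh buffer
        } finally {
            flushLock.unlock();
        }
        persist(snapshot);                // slow part runs outside flushLock
        return snapshot;
    }

    // Synchronized separately so concurrent flushes do not corrupt the list.
    private synchronized void persist(List<String> snapshot) {
        flushed.addAll(snapshot);
    }

    synchronized List<String> flushedSoFar() {
        return new ArrayList<>(flushed);
    }

    public static void main(String[] args) {
        NarrowLockFlushSketch store = new NarrowLockFlushSketch();
        store.append("a");
        store.append("b");
        store.flush();
        store.append("c"); // not blocked by any in-progress persist
        store.flush();
        System.out.println(store.flushedSoFar());
    }
}
```

Whether the real code can be restructured this way depends on what the lastSeqWritten map needs to stay consistent during a flush, which is the part that still needs a closer look.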
> Parallel Flushing Of Memstores
> ------------------------------
>
> Key: HBASE-6980
> URL: https://issues.apache.org/jira/browse/HBASE-6980
> Project: HBase
> Issue Type: New Feature
> Reporter: Kannan Muthukkaruppan
> Assignee: Kannan Muthukkaruppan
>
> For write-dominated workloads, single-threaded memstore flushing is an
> unnecessary bottleneck. With a single flusher thread, we are basically not
> set up to take advantage of the aggregate throughput that multi-disk nodes
> provide.
> * For puts with WAL enabled, the bottleneck is more likely the "single" WAL
> per region server. So this particular fix may not buy as much unless we
> unlock that bottleneck with multiple commit logs per region server. (Topic
> for a separate JIRA-- HBASE-6981).
> * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk
> imports), we should be able to support much better ingest rates with parallel
> flushing of memstores.