[ https://issues.apache.org/jira/browse/HBASE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841321#action_12841321 ]
dhruba borthakur commented on HBASE-2283:
-----------------------------------------

If we say that we restart the region server when a write/sync to the region log fails, then we can defer a fix for #2. If we do that, then we do not need any refactoring of the code at all.

We can solve #1 by putting a sync marker at the end of each transaction and making the HLog.reader handle this marker correctly (details not yet worked out).

> row level atomicity
> --------------------
>
>                 Key: HBASE-2283
>                 URL: https://issues.apache.org/jira/browse/HBASE-2283
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Priority: Blocker
>             Fix For: 0.20.4, 0.21.0
>
>
> The flow during a HRegionServer.put() seems to be the following. [For now,
> let's just consider a single-row Put containing edits to multiple column
> families/columns.]
> HRegionServer.put() does a:
>     HRegion.put();
>     syncWal() (the HDFS sync call). /* this is assuming we have HDFS-200 */
> HRegion.put() does a:
>     for each column family
>     {
>         HLog.append(all edits to the column family);
>         write all edits to Memstore;
>     }
> HLog.append() does a:
>     foreach edit in a single column family {
>         doWrite()
>     }
> doWrite() does a:
>     this.writer.append().
> There seem to be two related issues here that could result in
> inconsistencies.
> Issue #1: A put() does a bunch of HLog.append() calls. These in turn do a
> bunch of "write" calls on the underlying DFS stream. If we crash after
> having written out some of the appends to DFS, recovery will run and apply a
> partial transaction to memstore.
> Issue #2: The updates to memstore should happen after the sync rather than
> before. Otherwise, there is the danger that the write to DFS (sync) fails for
> some reason and we return an error to the client, but we have already applied
> the edits to the memstore. So subsequent reads will serve uncommitted data.
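
The marker idea proposed in the comment for Issue #1 could look roughly like the sketch below. This is only an illustration under stated assumptions: the class WalReplaySketch, the Entry type, and the methods appendTransaction() and replay() are hypothetical names, the log is an in-memory list rather than a DFS stream, and none of this is the HLog API. The writer appends every edit of one put() followed by a marker; the reader buffers edits and applies them only once it sees the marker, so a transaction cut short by a crash is dropped instead of being partially replayed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy log used only to illustrate the marker idea; not HBase code.
public class WalReplaySketch {

    /** One log record: either an edit or the marker that closes a transaction. */
    public static final class Entry {
        final boolean marker;
        final String edit; // e.g. "row1/cf1:a=1"; null for a marker

        private Entry(boolean marker, String edit) {
            this.marker = marker;
            this.edit = edit;
        }

        public static Entry edit(String e) { return new Entry(false, e); }
        public static Entry marker() { return new Entry(true, null); }
    }

    /** Writer side: append all edits of one put(), then the closing marker. */
    public static void appendTransaction(List<Entry> log, List<String> edits) {
        for (String e : edits) {
            log.add(Entry.edit(e));
        }
        log.add(Entry.marker());
    }

    /** Reader side: return only edits of transactions that end with a marker. */
    public static List<String> replay(List<Entry> log) {
        List<String> applied = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (Entry entry : log) {
            if (entry.marker) {
                applied.addAll(pending); // transaction is complete, apply it
                pending.clear();
            } else {
                pending.add(entry.edit); // hold until the marker shows up
            }
        }
        // Edits still pending belong to a partial transaction; they are dropped.
        return applied;
    }

    public static void main(String[] args) {
        List<Entry> log = new ArrayList<>();
        appendTransaction(log, Arrays.asList("row1/cf1:a=1", "row1/cf2:b=2"));
        // Simulate a crash in the middle of a second put(): an edit, no marker.
        log.add(Entry.edit("row2/cf1:a=3"));
        System.out.println(replay(log)); // only the two edits of the first put()
    }
}
```

How the real reader would recognize the marker on disk, and how this interacts with log splitting, is exactly the part the comment leaves as "details not yet worked out".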