row level atomicity
--------------------

                 Key: HBASE-2283
                 URL: https://issues.apache.org/jira/browse/HBASE-2283
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: Kannan Muthukkaruppan
            Priority: Blocker
The flow during a HRegionServer.put() seems to be the following. [For now, let's just consider a single-row Put containing edits to multiple column families/columns.]

HRegionServer.put() does a:

    HRegion.put();
    syncWal();   /* the HDFS sync call; this is assuming we have HDFS-200 */

HRegion.put() does a:

    for each column family {
        HLog.append(all edits to the column family);
        write all edits to Memstore;
    }

HLog.append() does a:

    foreach edit in a single column family {
        doWrite()
    }

doWrite() does a:

    this.writer.append().

There seem to be two related issues here that could result in inconsistencies.

Issue #1: A put() does a bunch of HLog.append() calls. These in turn do a bunch of "write" calls on the underlying DFS stream. If we crash after having written out only some of the appends to DFS, recovery will run and apply a partial transaction to the memstore.

Issue #2: The updates to the memstore should happen after the sync rather than before. Otherwise, there is the danger that the write to DFS (the sync) fails for some reason and we return an error to the client, but we have already applied the edits to the memstore. Subsequent reads will then serve uncommitted data.
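The problematic interleaving is perhaps easier to see in code. The sketch below is a simplified stand-alone Java illustration with hypothetical names (Wal, Memstore, putAsDescribed, putRowAtomically) -- it is not the actual HBase classes or signatures. It contrasts the ordering described above (one log append per column family, memstore updated before the sync) with a row-atomic ordering (one append for the whole row, sync, then memstore).

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentSkipListMap;

    /** Simplified sketch of the two orderings; names are hypothetical, not HBase APIs. */
    public class RowAtomicitySketch {

        /** Stand-in for the write-ahead log: append() buffers, sync() flushes to DFS. */
        interface Wal {
            void append(List<String> edits) throws IOException;  // buffered write to the DFS stream
            void sync() throws IOException;                      // the HDFS sync (HDFS-200)
        }

        /** Stand-in for the in-memory store that reads are served from. */
        static class Memstore {
            final Map<String, String> kvs = new ConcurrentSkipListMap<>();
            void add(String key, String value) { kvs.put(key, value); }
        }

        /**
         * Ordering as described in the issue: one WAL append per column family and the
         * memstore updated before the sync. A crash between appends leaves a partial row
         * in the log (Issue #1); a failed sync leaves uncommitted edits readable (Issue #2).
         */
        static void putAsDescribed(Wal wal, Memstore memstore,
                                   Map<String, List<String>> editsByFamily) throws IOException {
            for (Map.Entry<String, List<String>> family : editsByFamily.entrySet()) {
                wal.append(family.getValue());                        // per-family append
                for (String edit : family.getValue()) {
                    memstore.add(family.getKey() + "/" + edit, edit); // visible before durable
                }
            }
            wal.sync();  // if this throws, the memstore already holds the edits
        }

        /**
         * Row-atomic ordering: all edits for the row go into a single WAL append, the log
         * is synced, and only then is the memstore updated.
         */
        static void putRowAtomically(Wal wal, Memstore memstore,
                                     Map<String, List<String>> editsByFamily) throws IOException {
            List<String> allEdits = new ArrayList<>();
            for (List<String> familyEdits : editsByFamily.values()) {
                allEdits.addAll(familyEdits);
            }
            wal.append(allEdits);  // one append covering the whole row
            wal.sync();            // durable before anything becomes visible
            for (Map.Entry<String, List<String>> family : editsByFamily.entrySet()) {
                for (String edit : family.getValue()) {
                    memstore.add(family.getKey() + "/" + edit, edit);
                }
            }
        }
    }

Under these assumptions, the row-atomic ordering addresses both points: log recovery can only ever replay whole-row appends, and a failed sync surfaces as an error to the client without any of the row's edits ever being visible to readers.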