[ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122417#comment-13122417
 ] 

[email protected] commented on HBASE-4528:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2397
-----------------------------------------------------------


Overall I see this patch as trading off service resiliency in favor of 
performance.

With the current ordering of operations (WAL append and sync prior to memstore 
insert), we ensure that an error during sync is seen by the client and memstore 
consistency is maintained.  Importantly (at least for my goals), this also 
allows us to do some reasoning about when it's necessary to abort the region 
server or when we can take additional actions to try to ride over a transient 
error.  As long as there were no deferred flush edits, we could reason that any 
error on sync was propagated back to the client as a failure and we did not 
need to abort yet.  This is the direction I've been trying to move with 
HBASE-4222/4282 and a partial form of it was already in place prior to that.
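
To make the ordering argument concrete, here is a minimal sketch of the current sequence (WAL append, then sync, then memstore insert). The class and field names are hypothetical stand-ins for illustration and are not the HBase classes; the sync failure is simulated and surfaced as an unchecked exception only to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a region; not the real HRegion/HLog code.
class WalFirstPutSketch {
    /** Minimal write-ahead log whose next sync can be made to fail. */
    static class Wal {
        final List<String> appended = new ArrayList<>();
        boolean failNextSync = false;

        void append(String edit) {
            appended.add(edit);
        }

        void sync() {
            if (failNextSync) {
                failNextSync = false;
                // Simulated transient HDFS error (assumption for the sketch).
                throw new UncheckedIOException(new IOException("simulated sync failure"));
            }
        }
    }

    final Wal wal = new Wal();
    final List<String> memstore = new ArrayList<>();

    /** WAL append and sync happen before the memstore insert. */
    void put(String edit) {
        wal.append(edit);
        wal.sync();          // a failure here propagates to the caller
        memstore.add(edit);  // only reached once the sync has succeeded
    }

    public static void main(String[] args) {
        WalFirstPutSketch region = new WalFirstPutSketch();
        region.put("row1");
        region.wal.failNextSync = true;
        try {
            region.put("row2");
        } catch (UncheckedIOException e) {
            System.out.println("client saw failure; memstore size = " + region.memstore.size());
        }
    }
}
```

Because the memstore is only touched after a successful sync, a failed put leaves it unchanged and the caller sees the error, which is what lets us reason that an abort is not yet required.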

I understand why we want to reorder these operations and move the sync outside 
of the acquired row locks.  From this standpoint, since an error on sync leaves 
the memstore polluted, aborting immediately is the right thing to do.  But I 
don't think it's a desirable behavior.  I think it will lead to more complaints 
from users about observed instability of the system.

The use-case that motivated HBASE-4222 was performing a rolling restart of all 
DataNodes in a cluster, with a running but completely quiescent HBase cluster.
In this case, with no data durability at stake, we really should be able to 
recover.  But instead what will happen is a catastrophic failure of 
RegionServers as each server tries to roll its HLog.  The patch in its current 
state would regress to this behavior, triggering RS aborts even more quickly 
than prior to HBASE-4222 (no HLog close would be attempted).

I would really like to find a way to keep the performance optimization of 
moving the HLog sync outside of the row locks, while still being able to 
guarantee memstore consistency in the case of failure, so that we can still 
reason about whether or not a RS abort is really necessary.

Speaking naively, is it at all feasible that the RWCC.WriteEntry could track 
the KeyValue instances it was used to apply to the memstore?  And these 
references could then be used to attempt a memstore rollback on failure?  Any 
other ways that we can maintain memstore consistency here without giving up and 
aborting?
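
A rough sketch of the rollback idea, under heavy assumptions: the `WriteEntry` here is a hypothetical analogue of RWCC.WriteEntry, KeyValues are modeled as plain strings, and the memstore as a sorted set. It ignores concurrent flushes and reader visibility, which a real implementation would have to handle.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical illustration only; none of these are the real HBase classes.
class MemstoreRollbackSketch {
    /** Analogue of a write entry that remembers what it inserted. */
    static class WriteEntry {
        final List<String> applied = new ArrayList<>();
    }

    final ConcurrentSkipListSet<String> memstore = new ConcurrentSkipListSet<>();
    boolean failNextSync = false;

    /** Stand-in for the WAL sync; fails once when failNextSync is set. */
    void sync() {
        if (failNextSync) {
            failNextSync = false;
            throw new UncheckedIOException(new IOException("simulated sync failure"));
        }
    }

    void put(List<String> keyValues) {
        WriteEntry entry = new WriteEntry();
        // Under the row lock: apply edits, tracking only those actually added.
        for (String kv : keyValues) {
            if (memstore.add(kv)) {
                entry.applied.add(kv);
            }
        }
        // Row lock released here; the sync runs outside it.
        try {
            sync();
        } catch (UncheckedIOException e) {
            // Roll back exactly what this write entry inserted, then report the error.
            for (String kv : entry.applied) {
                memstore.remove(kv);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        MemstoreRollbackSketch region = new MemstoreRollbackSketch();
        region.put(Arrays.asList("r1/c1", "r1/c2"));
        region.failNextSync = true;
        try {
            region.put(Arrays.asList("r2/c1"));
        } catch (UncheckedIOException e) {
            System.out.println("rolled back; memstore = " + region.memstore);
        }
    }
}
```

Tracking only the entries that were newly inserted keeps the rollback from removing edits that belonged to an earlier, successfully synced write.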


/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/2141/#comment5501>

    Personally, I think this is a step in the wrong direction.  I would like to 
see us be _more_ resilient in the face of transient HDFS errors, as long as we 
have sufficient information to reason that we have not compromised correctness.


- Gary


On 2011-10-06 08:08:49, Dhruba Borthakur wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2141/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-06 08:08:49)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This changes the multiPut operation so that the sync to the WAL occurs 
outside the rowlock.
bq.  
bq.  This enhancement is done only to HRegion.put(Put[]) because this is the 
only method that gets invoked from an application. The HRegion.put(Put) is used 
only by unit tests and should possibly be deprecated.
bq.  
bq.  I have attached a unit test. I have not yet run all unit tests, but early 
feedback on this patch will be very helpful.
bq.  
bq.  
bq.  This addresses bug HBASE-4528.
bq.      https://issues.apache.org/jira/browse/HBASE-4528
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179529 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1179529 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179529 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1179529 
bq.    /src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 1179529 
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION 
bq.    /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1179529 
bq.  
bq.  Diff: https://reviews.apache.org/r/2141/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Not yet run the full suite of unit tests.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Dhruba
bq.  
bq.


                
> The put operation can release the rowlock before sync-ing the Hlog
> ------------------------------------------------------------------
>
>                 Key: HBASE-4528
>                 URL: https://issues.apache.org/jira/browse/HBASE-4528
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt, 
> appendNoSyncPut3.txt
>
>
> This allows for better throughput when there are hot rows. A single row 
> update improves from 100 puts/sec/server to 5000 puts/sec/server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
