[
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122417#comment-13122417
]
[email protected] commented on HBASE-4528:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2141/#review2397
-----------------------------------------------------------
Overall I see this patch as trading off service resiliency in favor of
performance.
With the current ordering of operations (WAL append and sync prior to memstore
insert), we ensure that an error during sync is seen by the client and memstore
consistency is maintained. Importantly (at least for my goals), this also
allows us to do some reasoning about when it's necessary to abort the region
server or when we can take additional actions to try to ride over a transient
error. As long as there were no deferred flush edits, we could reason that any
error on sync was propagated back to the client as a failure and we did not
need to abort yet. This is the direction I've been trying to move with
HBASE-4222/4282 and a partial form of it was already in place prior to that.
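To make the ordering argument concrete, here is a minimal toy sketch of the two orderings. It is not HBase code: the Wal class, the list-backed memstore, and both put methods are hypothetical stand-ins, and the sync failure is simulated with a flag.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these names are real HBase APIs. */
class WriteOrderingSketch {
    /** Hypothetical stand-in for the write-ahead log; sync() can be made to fail. */
    static class Wal {
        final List<String> appended = new ArrayList<>();
        boolean failSync = false;

        void append(String edit) {
            appended.add(edit);
        }

        void sync() throws IOException {
            if (failSync) {
                throw new IOException("simulated HDFS sync failure");
            }
        }
    }

    final Wal wal = new Wal();
    final List<String> memstore = new ArrayList<>();

    /** Current ordering: append and sync first, insert into the memstore only on success. */
    void putSyncFirst(String edit) throws IOException {
        wal.append(edit);
        wal.sync();          // a failure here propagates to the caller
        memstore.add(edit);  // never reached if the sync failed
    }

    /** Reordered: the memstore insert happens before the sync. */
    void putSyncLast(String edit) throws IOException {
        wal.append(edit);
        memstore.add(edit);  // already visible in memory
        wal.sync();          // a failure here leaves the edit in the memstore
    }
}
```

With failSync set, putSyncFirst throws and leaves the memstore empty, while putSyncLast throws and leaves the unsynced edit behind in the memstore. That leftover edit is what removes the ability to reason that the failure was fully reported to the client.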
I understand why we want to reorder these operations and move the sync outside
of the acquired row locks. From this standpoint, since an error on sync leaves
the memstore polluted, aborting immediately is the right thing to do. But I
don't think it's a desirable behavior. I think it will lead to more complaints
from users about observed instability of the system.
The use-case that motivated HBASE-4222 was performing a rolling restart of all
DataNodes in a cluster, with a running, but completely quiescent HBase cluster.
In this case, with no data durability at stake, we really should be able to
recover. But instead what will happen is a catastrophic failure of
RegionServers as each server tries to roll its HLog. The patch in its current
state would regress to this behavior, triggering RS aborts even more quickly
than prior to HBASE-4222 (no HLog close would be attempted).
I would really like to find a way to keep the performance optimization of
moving the HLog sync outside of the row locks, while still being able to
guarantee memstore consistency in the case of failure, so that we can still
reason about whether or not a RS abort is really necessary.
Speaking naively, is it at all feasible that the RWCC.WriteEntry could track
the KeyValue instances it's used to apply to the memstore? And these
references could then be used to attempt a memstore rollback on failure? Any
other ways that we can maintain memstore consistency here without giving up and
aborting?
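A rough sketch of what I mean, purely as an illustration. The WriteEntry class here is a hypothetical stand-in and not the real RWCC.WriteEntry, strings stand in for KeyValue instances, and a ConcurrentSkipListSet stands in for the memstore. It also assumes rollback should remove only the entries this write actually added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

/** Toy model only; none of these names are real HBase APIs. */
class MemstoreRollbackSketch {
    /** Hypothetical write entry that remembers what it inserted. */
    static class WriteEntry {
        final List<String> applied = new ArrayList<>();
    }

    final ConcurrentSkipListSet<String> memstore = new ConcurrentSkipListSet<>();

    /** Insert into the memstore and record the reference on the write entry. */
    void apply(WriteEntry w, String kv) {
        // Track only entries this write really added, so a later
        // rollback never removes something that was already present.
        if (memstore.add(kv)) {
            w.applied.add(kv);
        }
    }

    /** Called when the WAL sync fails: undo this write's memstore inserts. */
    void rollback(WriteEntry w) {
        for (String kv : w.applied) {
            memstore.remove(kv);
        }
        w.applied.clear();
    }
}
```

If something like this is workable, a failed sync could be answered with a rollback and a client-visible error instead of an abort. Whether the real memstore can safely remove entries while concurrent readers and flushes are in flight is an open question this sketch does not address.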
/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/2141/#comment5501>
Personally, I think this is a step in the wrong direction. I would like to
see us be _more_ resilient in the face of transient HDFS errors, as long as we
have sufficient information to reason that we have not compromised correctness.
- Gary
On 2011-10-06 08:08:49, Dhruba Borthakur wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2141/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-10-06 08:08:49)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. This changes the multiPut operation so that the sync to the WAL occurs
outside the rowlock.
bq.
bq. This enhancement is done only to HRegion.put(Put[]) because this is the
only method that gets invoked from an application. The HRegion.put(Put) is used
only by unit tests and should possibly be deprecated.
bq.
bq. I have attached a unit test. I have not yet run all unit tests, but early
feedback on this patch will be very helpful.
bq.
bq.
bq. This addresses bug HBASE-4528.
bq. https://issues.apache.org/jira/browse/HBASE-4528
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1179529
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java 1179529
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1179529
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFlusher.java 1179529
bq. /src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 1179529
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestParallelPut.java PRE-CREATION
bq. /src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 1179529
bq.
bq. Diff: https://reviews.apache.org/r/2141/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Not yet run the full suite of unit tests.
bq.
bq.
bq. Thanks,
bq.
bq. Dhruba
bq.
bq.
> The put operation can release the rowlock before sync-ing the Hlog
> ------------------------------------------------------------------
>
> Key: HBASE-4528
> URL: https://issues.apache.org/jira/browse/HBASE-4528
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt,
> appendNoSyncPut3.txt
>
>
> This allows for better throughput when there are hot rows. A single row
> update improves from 100 puts/sec/server to 5000 puts/sec/server.