[
https://issues.apache.org/jira/browse/HBASE-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878749#action_12878749
]
HBase Review Board commented on HBASE-1025:
-------------------------------------------
Message from: "Todd Lipcon" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/179/#review216
-----------------------------------------------------------
Regarding rewriting recovered edits back to the WAL, I'm afraid we may have an
issue with the following situation:
- RS A writes edits 1-100 to log
- RS A fails
- RS B starts recovering log, rewrites edits 1-50
- RS B fails while recovering
- RS C starts up. It *should* replay 1-100. Instead I fear it might only replay
1-50, or replay 1-100 followed by 1-50, either of which would be incorrect.
We should add a test case where the RS fails in the middle of recovering a
prior failure.
Also, replayRecoveredEdits returns the highest seqid from the logs. Is it
possible for storeFiles to have a higher seqid than what's in the log in any
situation? We should add some kind of assert perhaps.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment973>
this returns -1 if there is nothing to replay. Shouldn't it return the
maxSeqId + 1 perhaps?
Do we have a test which makes a region, writes some data, closes the
region, opens the region, writes some data, then recovers logs?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment972>
we used to print the sequence ID after the increment, now we print before
the increment. Maybe change log to "sequenceid for next edit will be NNN"
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment975>
add some javadoc - I think important to note that minSeqId is *exclusive* -
it's a little unintuitive. Perhaps makes more sense to pass in
maxSeqIdFromStoreFiles+1 here, and make minSeqId be "minimum sequence id,
inclusive, to apply"
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment976>
should probably use LOG.error for these
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment974>
I agree, i don't think the splitter would ever write a partial one. We
should treat an IOE here as a real error
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment977>
when would this happen? deserves a comment I think
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment978>
can we rename maxSeqIdInLog to currentEditSeqId or something? since that's
what we're really talking about here. Also reversing this inequality to: if
(currentEditSeqId < storeMaxSeqId>) I think it would read more clearly
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment979>
I think this will crash when the region contains stores that were the
result of an HFOF bulk load (they have no sequence id)
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment980>
maybe rename to restoreEdit() or something to be clear that this is during
restoration only?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment981>
possibly more efficient:
Collections.singletonMap(kv.getFamily(), Collections.singletonList(kv))
though there may be a comparator issue?
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.hbase.org/r/179/#comment970>
comment is now out of date, right?
src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
<http://review.hbase.org/r/179/#comment971>
maybe just Preconditions.checkArgument here so it throws if you try to pass
a non-DFS?
- Todd
> Reconstruction log playback has no bounds on memory used
> --------------------------------------------------------
>
> Key: HBASE-1025
> URL: https://issues.apache.org/jira/browse/HBASE-1025
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 0.21.0
>
> Attachments: 1025-v2.txt, 1025-v3.txt, 1025-v5.patch, 1025.txt
>
>
> Makes a TreeMap and just keeps adding edits without regard for size of edits
> applied; could cause OOME (I've not seen a definitive case though have seen
> an OOME around time of a reconstructionlog replay -- perhaps this the straw
> that broke the fleas antlers?)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.