[jira] [Commented] (HBASE-4645) Edits Log recovery losing data across column families

[email protected] (Commented) (JIRA) Fri, 21 Oct 2011 11:24:57 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132894#comment-13132894
 ]

[email protected] commented on HBASE-4645:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2524/#review2756
-----------------------------------------------------------

Amit, the code changes look solid.

It is unfortunate that there was already a nice test, TestWALReplay.java, for
this exact case, but it was broken and so did not catch this issue.

Furthermore, the test writes to multiple CFs in separate puts (transactions).
It would be good to enhance it to also do cross-CF puts (in a single txn).

src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
<https://reviews.apache.org/r/2524/#comment6197>

    Looks like the original test is busted. It expects to flush only one region
    (in line 271), but here region.close() will cause it to flush all stores :)

- Kannan

On 2011-10-21 18:06:16, Amitanand Aiyer wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2524/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-21 18:06:16)
bq.  
bq.  
bq.  Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl,
bq.  Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas
bq.  Spiegelberg.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Data is lost (for some of the column families) when we replay the logs.
bq.  
bq.  The bug seems to come from the fact that during log replay we only replay
bq.  the logs starting from the maximumSequenceID across ALL the stores. This is
bq.  wrong. If a column family is ahead of the others (because the crash happened
bq.  before all the column families were flushed), then we lose data for the
bq.  column families that have not yet caught up.
bq.  
bq.  The correct logic is to begin the replay from the minimum, taken across
bq.  stores, of each store's maximum sequence ID.
bq.  
bq.  
bq.  This addresses bug hbase-4645.
bq.      https://issues.apache.org/jira/browse/hbase-4645
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8c32839
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 966262b
bq.  
bq.  Diff: https://reviews.apache.org/r/2524/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Initial patch. v1.
bq.  
bq.  mvn test (running).
bq.  
bq.  TBD: add a test case to repro the issue and confirm that the patch fixes it.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Amitanand
bq.  
bq.

> Edits Log recovery losing data across column families
> -----------------------------------------------------
>
>                 Key: HBASE-4645
>                 URL: https://issues.apache.org/jira/browse/HBASE-4645
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89.20100924, 0.92.0
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>
> Data is lost (for some of the column families) when we replay the logs.
> The bug seems to come from the fact that during log replay we only replay
> the logs starting from the maximumSequenceID across *ALL* the stores. This is
> wrong. If a column family is ahead of the others (because the crash happened
> before all the column families were flushed), then we lose data for the
> column families that have not yet caught up.
> The correct logic is to begin the replay from the minimum, taken across
> stores, of each store's maximum sequence ID.
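
The difference between the two replay-start rules in the description can be sketched with a toy model. This is only an illustration, not the code in HRegion.java: the class, method, and variable names are hypothetical, the sequence IDs are made up, and treating the threshold as exclusive (replay only IDs strictly greater than it) is an assumption. The sketch also ignores which store each logged edit belongs to.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these names are hypothetical and are not HRegion's API.
final class ReplayStartSketch {

    // Rule described as buggy: skip everything at or below the largest
    // flushed sequence ID seen in any store.
    static long maxAcrossStores(Map<String, Long> flushedSeqIdByStore) {
        return Collections.max(flushedSeqIdByStore.values());
    }

    // Rule described as correct: skip only what every store has already
    // flushed, i.e. the minimum of the per-store maximums.
    static long minOfStoreMaxima(Map<String, Long> flushedSeqIdByStore) {
        return Collections.min(flushedSeqIdByStore.values());
    }

    // Sequence IDs from the log that would be replayed for a given threshold
    // (assumed exclusive: only IDs strictly greater than it are replayed).
    static List<Long> replayed(long[] loggedSeqIds, long skipUpTo) {
        List<Long> out = new ArrayList<>();
        for (long id : loggedSeqIds) {
            if (id > skipUpTo) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> flushed = new LinkedHashMap<>();
        flushed.put("cf1", 10L); // cf1 was flushed shortly before the crash
        flushed.put("cf2", 4L);  // cf2 had not caught up yet
        long[] log = {3, 5, 7, 9, 11};

        // Max rule replays only 11, so cf2 would miss edits 5, 7 and 9.
        System.out.println(replayed(log, maxAcrossStores(flushed)));
        // Min-of-max rule replays 5, 7, 9 and 11.
        System.out.println(replayed(log, minOfStoreMaxima(flushed)));
    }
}
```

With the made-up numbers above, the max rule replays only edit 11 while the min-of-max rule replays 5, 7, 9 and 11, which is the gap the description attributes to the column family that "has not yet caught up". Both helpers throw on an empty map; a real implementation would need to handle a region with no flushed stores.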

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
