[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-29 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159798#comment-13159798
 ] 

Hudson commented on HBASE-4797:
---

Integrated in HBase-0.92 #163 (See 
[https://builds.apache.org/job/HBase-0.92/163/])
HBASE-4869  Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits
   files with edits older than what region currently has (Jimmy Xiang)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 [availability] Skip recovered.edits files with edits we know older than what 
 region currently has
 -

 Key: HBASE-4797
 URL: https://issues.apache.org/jira/browse/HBASE-4797
 Project: HBase
  Issue Type: Bug
  Components: performance
Reporter: stack
Assignee: Jimmy Xiang
Priority: Critical
  Labels: noob
 Fix For: 0.94.0

 Attachments: 0001-HBASE-4797-[availability]-skip-older-edits.patch, 
 0001-HBASE-4797-[availability]-skip-older-edits.patch, 
 0001-HBASE-4797-availability-skip-files-with-edits-we-kno.patch, 
 0001-HBASE-4797-availability-skip-files-with-edits-we-kno.patch


 Testing 0.92, I crashed out all the servers.  Another bug keeps WALs from 
 being cleaned, so I had 7000 regions to replay.  The distributed split 
 code did a nice job and the cluster came back, but interestingly some hot 
 regions ended up with loads of recovered.edits files -- tens if not 
 hundreds -- to replay against the region (can we bulk load recovered.edits 
 instead of replaying them?).  Each recovered.edits file takes about a 
 second to process (though each seems to hold only 30-odd edits).  The 
 region is unavailable during this time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158997#comment-13158997
 ] 

Hudson commented on HBASE-4797:
---

Integrated in HBase-0.92-security #22 (See 
[https://builds.apache.org/job/HBase-0.92-security/22/])
HBASE-4869  Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits
   files with edits older than what region currently has (Jimmy Xiang)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155788#comment-13155788
 ] 

Hudson commented on HBASE-4797:
---

Integrated in HBase-TRUNK-security #6 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/6/])
HBASE-4797 [availability] Skip recovered.edits files with edits we know 
older than what region currently has (Jimmy Xiang)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155869#comment-13155869
 ] 

Hudson commented on HBASE-4797:
---

Integrated in HBase-TRUNK #2474 (See 
[https://builds.apache.org/job/HBase-TRUNK/2474/])
HBASE-4797 [availability] Skip recovered.edits files with edits we know 
older than what region currently has (Jimmy Xiang)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155292#comment-13155292
 ] 

stack commented on HBASE-4797:
--

@Jimmy Just FYI, since you are new, to trigger the build again, you need to 
re-upload the original patch or a new one (which you did), then (I think) you 
need to cancel and resubmit the patch.





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155298#comment-13155298
 ] 

Jimmy Xiang commented on HBASE-4797:


Thanks!  I canceled and resubmitted the patch.





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155369#comment-13155369
 ] 

Hadoop QA commented on HBASE-4797:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12504777/0001-HBASE-4797-availability-skip-files-with-edits-we-kno.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 66 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/336//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/336//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/336//console

This message is automatically generated.





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155391#comment-13155391
 ] 

Hadoop QA commented on HBASE-4797:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12504779/0001-HBASE-4797-availability-skip-files-with-edits-we-kno.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 66 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/337//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/337//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/337//console

This message is automatically generated.





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155686#comment-13155686
 ] 

Jimmy Xiang commented on HBASE-4797:


Can someone check in the patch?





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-22 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155690#comment-13155690
 ] 

Ted Yu commented on HBASE-4797:
---

I will.





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154704#comment-13154704
 ] 

stack commented on HBASE-4797:
--

bq. The region opening is tried periodically. The waiting interval is about 1/3 
of the assignment time out. I think that's fine.

From the log snippet above though, Jimmy, it seems we are updating the 
znode almost every second.  That's too much?





[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154723#comment-13154723
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/#review3413
---



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
https://reviews.apache.org/r/2906/#comment7642

maxSedId should be named maxSeqId


- Ted


On 2011-11-21 22:38:39, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2906/
bq.  ---
bq.  
bq.  (Updated 2011-11-21 22:38:39)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If there are multiple recovered edits files, I use each file's name to find 
its initial sequence id.  After the files are sorted, a file's maximum possible 
sequence id can be derived from the next file's initial sequence id.  If that 
maximum is smaller than the region's current sequence id, the whole recovered 
edits file is old and is ignored.
bq.  
bq.  
bq.  This addresses bug HBASE-4797.
bq.  https://issues.apache.org/jira/browse/HBASE-4797
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2906/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to TestHRegion; all the tests in that class pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.
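
The skip rule described in the review summary above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name RecoveredEditsSkipSketch and its methods are hypothetical, file names are assumed to be the zero-padded initial sequence ids, and the next file's initial sequence id is taken as a conservative upper bound for the current file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only; names here are hypothetical, not HBase API.
final class RecoveredEditsSkipSketch {

    // Assumption: each file name is the zero-padded initial sequence id of the file.
    static long initialSeqId(String fileName) {
        return Long.parseLong(fileName);
    }

    // Returns the names of the files that still need replaying, oldest first.
    static List<String> filesToReplay(List<String> fileNames, long currentSeqId) {
        // Equal-width, zero-padded names sort lexicographically in sequence id order.
        List<String> sorted = new ArrayList<>(new TreeSet<>(fileNames));
        List<String> replay = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            // The last file has no successor, so its upper bound is unknown.
            long maxPossibleSeqId = Long.MAX_VALUE;
            if (i + 1 < sorted.size()) {
                // Assumption: edits in file i come before the first edit of file i + 1.
                maxPossibleSeqId = initialSeqId(sorted.get(i + 1));
            }
            if (maxPossibleSeqId < currentSeqId) {
                continue; // every edit in this file is older than what the region has
            }
            replay.add(sorted.get(i));
        }
        return replay;
    }
}
```

With files starting at sequence ids 10, 20, 30 and 40 and a current sequence id of 25, only the first file is skipped: its upper bound (20) is below 25, while the file starting at 20 may still hold edits up to 30.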







[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154747#comment-13154747
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--



bq.  On 2011-11-21 23:23:07, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2468
bq.   https://reviews.apache.org/r/2906/diff/2/?file=59652#file59652line2468
bq.  
bq.   maxSedId should be named maxSeqId

Good catch.


- Jimmy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/#review3413
---






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154771#comment-13154771
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--



bq.  On 2011-11-21 22:47:55, Michael Stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2456
bq.   https://reviews.apache.org/r/2906/diff/2/?file=59652#file59652line2456
bq.  
bq.   So, are these already sorted in right order from oldest edit to newest?

All these files are under the same folder. If they follow the name pattern 
defined in HLog, String.format("%019d", seqid), then yes, they are sorted in 
the right order by sequence id.

If this is not true, then the order to reapply these edits is already wrong.
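
To illustrate why the zero-padded name pattern keeps the files in sequence id order, here is a small sketch. The class PaddedSeqIdNames and its methods are hypothetical helpers written for this example; only the String.format("%019d", seqid) pattern comes from the reply above, and sequence ids are assumed to be non-negative.

```java
import java.util.Arrays;

// Illustrative sketch only; this helper class is hypothetical, not part of HBase.
final class PaddedSeqIdNames {

    // Pads a non-negative sequence id to 19 digits, the width of Long.MAX_VALUE.
    static String fileNameFor(long seqId) {
        return String.format("%019d", seqId);
    }

    // Builds the padded names and sorts them as plain strings.
    static String[] sortedNames(long... seqIds) {
        String[] names = new String[seqIds.length];
        for (int i = 0; i < seqIds.length; i++) {
            names[i] = fileNameFor(seqIds[i]);
        }
        Arrays.sort(names); // lexicographic order equals numeric order at equal width
        return names;
    }
}
```

Without the padding, a plain string sort would place "100" before "25" and "25" before "9", so the edits would be replayed out of order.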


bq.  On 2011-11-21 22:47:55, Michael Stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 2475
bq.   https://reviews.apache.org/r/2906/diff/2/?file=59652#file59652line2475
bq.  
bq.   Possilbe should be Possible.
bq.   
bq.   I'd be more assertive in this message.  Maximum possible sequenceid for this log is  + + , skipping ..

Sure, I will fix it.


bq.  On 2011-11-21 22:47:55, Michael Stack wrote:
bq.   src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java, line 2855
bq.   https://reviews.apache.org/r/2906/diff/2/?file=59653#file59653line2855
bq.  
bq.   Any more asserts we can do in here?   Assert we replayed N of the M files?

Sure, I added more test cases.


- Jimmy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/#review3409
---


On 2011-11-21 22:38:39, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2906/
bq.  ---
bq.  
bq.  (Updated 2011-11-21 22:38:39)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If there are multiple recovered edits files, I use each file's name to find 
bq.  its initial sequence id.  After the files are sorted, a file's maximum 
bq.  possible sequence id can be derived from the next file's initial sequence id.  
bq.  If that maximum is smaller than the region's current sequence id, the whole 
bq.  recovered edits file is old and is skipped.
bq.  
bq.  
bq.  This addresses bug HBASE-4797.
bq.  https://issues.apache.org/jira/browse/HBASE-4797
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2906/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to TestHRegion; all tests in that class pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154773#comment-13154773
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/
---

(Updated 2011-11-22 00:32:48.813017)


Review request for hbase, Todd Lipcon and Michael Stack.


Changes
---

Revised patch with changes per review.


Summary
---

If there are multiple recovered edits files, I use each file's name to find 
its initial sequence id.  After the files are sorted, a file's maximum 
possible sequence id can be derived from the next file's initial sequence id.  
If that maximum is smaller than the region's current sequence id, the whole 
recovered edits file is old and is skipped.
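
The skip rule in this summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in HRegion.java. The class RecoveredEditsSkipSketch and its methods are hypothetical. File names are assumed to be the zero-padded initial sequence id. The last file has no successor, so it is always replayed. The strict "smaller than" comparison follows the wording above, and the exact boundary used by the patch may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; class and method names are not from the patch.
final class RecoveredEditsSkipSketch {
    // Assumed naming: the file name is its first sequence id, zero-padded.
    static String fileNameFor(long firstSeqId) {
        return String.format("%019d", firstSeqId);
    }

    // Returns, in sorted order, the files that still need to be replayed.
    static List<String> filesToReplay(List<String> fileNames, long currentSeqId) {
        // Zero-padded names sort lexicographically in sequence id order.
        TreeSet<String> sorted = new TreeSet<>(fileNames);
        List<String> toReplay = new ArrayList<>();
        for (String name : sorted) {
            String next = sorted.higher(name);
            // Upper bound on this file's edits: the next file's initial id.
            // With no next file the bound is unknown, so never skip it.
            long maxPossibleSeqId =
                (next == null) ? Long.MAX_VALUE : Long.parseLong(next);
            if (maxPossibleSeqId < currentSeqId) {
                continue; // every edit in this file is older than the region
            }
            toReplay.add(name);
        }
        return toReplay;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            fileNameFor(10), fileNameFor(20), fileNameFor(30), fileNameFor(40));
        // With a current sequence id of 25, only the file starting at 10 is skipped.
        System.out.println(filesToReplay(files, 25L));
    }
}
```

Because only file names are inspected, a skipped file is never opened, which is where the saving comes from when a region has tens or hundreds of small recovered edits files.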


This addresses bug HBASE-4797.
https://issues.apache.org/jira/browse/HBASE-4797


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 

Diff: https://reviews.apache.org/r/2906/diff


Testing
---

Added a test case to TestHRegion; all tests in that class pass.


Thanks,

Jimmy








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154774#comment-13154774
 ] 

Jimmy Xiang commented on HBASE-4797:


Cool! Thanks.

On Mon, Nov 21, 2011 at 2:51 PM, Kannan Muthukkaruppan (Commented) (JIRA) 








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154783#comment-13154783
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/#review3416
---

Ship it!


Looks good to me.
Thanks for fixing the whitespace too (although it made the patch harder to 
read).
You also left some whitespace in testSkipRecoveredEditsReplay. :)


- Lars


On 2011-11-22 00:32:48, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2906/
bq.  ---
bq.  
bq.  (Updated 2011-11-22 00:32:48)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If there are multiple recovered edits files, I use each file's name to find 
bq.  its initial sequence id.  After the files are sorted, a file's maximum 
bq.  possible sequence id can be derived from the next file's initial sequence id.  
bq.  If that maximum is smaller than the region's current sequence id, the whole 
bq.  recovered edits file is old and is skipped.
bq.  
bq.  
bq.  This addresses bug HBASE-4797.
bq.  https://issues.apache.org/jira/browse/HBASE-4797
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2906/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to TestHRegion; all tests in that class pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154785#comment-13154785
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/
---

(Updated 2011-11-22 01:02:17.373022)


Review request for hbase, Todd Lipcon and Michael Stack.


Changes
---

Removed white spaces in TestHRegion.java


Summary
---

If there are multiple recovered edits files, I use each file's name to find 
its initial sequence id.  After the files are sorted, a file's maximum 
possible sequence id can be derived from the next file's initial sequence id.  
If that maximum is smaller than the region's current sequence id, the whole 
recovered edits file is old and is skipped.


This addresses bug HBASE-4797.
https://issues.apache.org/jira/browse/HBASE-4797


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 

Diff: https://reviews.apache.org/r/2906/diff


Testing
---

Added a test case to TestHRegion; all tests in that class pass.


Thanks,

Jimmy








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154804#comment-13154804
 ] 

Lars Hofhansl commented on HBASE-4797:
--

I'm happy to commit this (pending the test run). Any objections?






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154826#comment-13154826
 ] 

Hadoop QA commented on HBASE-4797:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12504687/0001-HBASE-4797-%5Bavailability%5D-skip-older-edits.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/326//console

This message is automatically generated.






[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154861#comment-13154861
 ] 

stack commented on HBASE-4797:
--

No objection from me.  Jimmy, want to attach patch with --no-prefix so hadoopqa 
runs?








[jira] [Commented] (HBASE-4797) [availability] Skip recovered.edits files with edits we know older than what region currently has

2011-11-21 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13154877#comment-13154877
 ] 

jirapos...@reviews.apache.org commented on HBASE-4797:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2906/#review3425
---


+1 on patch

- ramkrishna


On 2011-11-22 01:02:17, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2906/
bq.  ---
bq.  
bq.  (Updated 2011-11-22 01:02:17)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  If there are multiple recovered edits files, I use each file's name to find 
bq.  its initial sequence id.  After the files are sorted, a file's maximum 
bq.  possible sequence id can be derived from the next file's initial sequence id.  
bq.  If that maximum is smaller than the region's current sequence id, the whole 
bq.  recovered edits file is old and is skipped.
bq.  
bq.  
bq.  This addresses bug HBASE-4797.
bq.  https://issues.apache.org/jira/browse/HBASE-4797
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 8b89661 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 5daa02b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2906/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added a test case to TestHRegion; all tests in that class pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.



