[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187520#comment-13187520
 ] 

Alex Newman commented on HBASE-2600:


I realized that manually editing the generated thrift files might not be the 
best approach. Any suggestions?

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, jenkins.pdf


 This is an idea that Ryan and I have been kicking around on and off for a 
 while now.
 If region names were made of tablename+endrow instead of tablename+startrow,
 then to find the region in the meta tables that contains the wanted row, we'd
 just have to open a scanner at the passed row, and the first row found by the
 scan would be that of the region we need (if it is an offlined parent, we'd
 have to scan to the next row).
 If we redid the meta tables in this format, we'd be using an access that is
 natural to hbase, a scan, as opposed to the perverse, expensive
 getClosestRowBefore we currently have, which has to walk backward in meta to
 find a containing region.
 This issue is about changing the way we name regions.
 If we were using scans, prewarming the client cache would be nearly costless
 (as opposed to what we'd currently have to do, which is first a
 getClosestRowBefore and then a scan from the closest row before forward).
 Converting to the new method, we'd have to run a migration on startup to
 change the content in meta.
 Until now, the randomid component of a region name has been the timestamp of
 region creation.  HBASE-2531 ("32-bit encoding of regionnames waaay too
 susceptible to hash clashes") proposes changing the randomid so that it
 contains the actual name of the directory in the filesystem that hosts the
 region.  If we had this in place, I think it would help with the migration to
 this new way of doing the meta because, as is, the region name in the fs is a
 hash of the regionname; changing the format of the regionname would mean we
 generate a different hash, so we'd need HBASE-2531 to be in place before we
 could make this change.
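
As a rough illustration of the lookup difference described above, the following Java sketch models a meta table as a sorted map. Everything in it is an assumption made for the example: the class and method names, the three sample regions, and the sentinel used for "end of table". Real meta row keys also carry the table name and the id component, which are left out here.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a meta table as a sorted map; not HBase code. */
final class MetaLookupSketch {
    // Hypothetical sentinel that sorts after every sample row ("end of table").
    static final String END_OF_TABLE = "\uffff";

    /** Sample regions A=["", "g"), B=["g", "p"), C=["p", end), keyed by start row. */
    static TreeMap<String, String> byStartRow() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "A");
        meta.put("g", "B");
        meta.put("p", "C");
        return meta;
    }

    /** The same regions keyed by their (exclusive) end row. */
    static TreeMap<String, String> byEndRow() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "A");
        meta.put("p", "B");
        meta.put(END_OF_TABLE, "C");
        return meta;
    }

    /** Start-row keys: need the greatest key <= row, i.e. a backward search. */
    static String findByStartRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-row keys: the first key > row in scan order is the containing region. */
    static String findByEndRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }
}
```

Both lookups return the same region for a given row. The point of the sketch is only that the end-row variant moves forward from the wanted row, which corresponds to opening a scanner, while the start-row variant has to look backward.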

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187585#comment-13187585
 ] 

Hudson commented on HBASE-5153:
---

Integrated in HBase-0.92 #246 (See 
[https://builds.apache.org/job/HBase-0.92/246/])
HBASE-5153  Add retry logic in 
HConnectionImplementation#resetZooKeeperTrackers (Jieshan)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ClosedConnectionException.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java


 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.7

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out


 HBASE-4893 is related to this issue. As described there, if multiple threads
 share the same connection and that connection is aborted in one thread, the
 other threads will get a
 "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
 That change solves the problem of stale connections that could not be
 removed, but the original HTable instance can no longer be used. The
 connection in the HTable should be recreated.
 There are two approaches to solving this:
 1. In user code, on catching an IOE, close the connection and re-create the
 HTable instance. This can be used as a workaround.
 2. On the HBase client side, catch this exception and re-create the connection.
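
A minimal sketch of the first approach (close and re-create in user code) might look like the following. It deliberately does not use the HBase client API: `TableHandle`, `RecreatingTable`, and `flakyFactory` are hypothetical names invented for this example, with `TableHandle` standing in for an HTable bound to a connection that may have been closed.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Wraps a table handle and re-creates it when a call fails with an IOException. */
final class RecreatingTable {
    /** Hypothetical stand-in for a table handle such as HTable; not the HBase API. */
    interface TableHandle {
        String get(String row) throws IOException;
        void close();
    }

    private final Supplier<TableHandle> factory;
    private TableHandle current;

    RecreatingTable(Supplier<TableHandle> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Tries the read up to maxAttempts times, swapping in a fresh handle after each failure. */
    String get(String row, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return current.get(row);
            } catch (IOException e) {
                last = e;
                current.close();          // drop the handle tied to the closed connection
                current = factory.get();  // re-create it before the next attempt
            }
        }
        throw last != null ? last : new IOException("maxAttempts must be at least 1");
    }

    /** Demo factory: the first handle always fails as if its connection were closed. */
    static Supplier<TableHandle> flakyFactory(AtomicInteger created) {
        return () -> {
            final int n = created.incrementAndGet();
            return new TableHandle() {
                @Override
                public String get(String row) throws IOException {
                    if (n == 1) {
                        throw new IOException("connection closed");
                    }
                    return "value-of-" + row;
                }

                @Override
                public void close() {
                    // nothing to release in the demo
                }
            };
        };
    }
}
```

With the demo factory, `new RecreatingTable(RecreatingTable.flakyFactory(counter)).get("r1", 3)` fails once, re-creates the handle, and then succeeds. The wrapper is not thread-safe as written; sharing it between threads would need synchronization around the swap.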





[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

2012-01-17 Thread Nicholas Telford (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Telford updated HBASE-5208:


Attachment: HBASE-5208-004.txt

Test broken out into a single test that doesn't cause the TestSuite to time out.

 Allow setting Scan start/stop row individually in TableInputFormat
 --

 Key: HBASE-5208
 URL: https://issues.apache.org/jira/browse/HBASE-5208
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
 Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, 
 HBASE-5208-003.txt, HBASE-5208-004.txt


 Currently, TableInputFormat initializes a serialized Scan from 
 hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using 
 properties defined in hbase.mapreduce.scan.*. However, of these properties 
 the start row and stop row (arguably the most pertinent) are missing.
 TableInputFormat should permit the specification of a start/stop row as with 
 the other fields using a new pair of properties: 
 hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end
 The primary use-case for this is to permit Oozie and other job management 
 tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a 
 contiguous subset of rows.
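
To make the proposal concrete, here is a small sketch of how the two proposed properties could be turned into a row range. It uses `java.util.Properties` in place of a Hadoop `Configuration`, and `ScanRangeSketch`, `RowRange`, and `fromProperties` are hypothetical names; the attached patches may do this differently. An absent property is mapped to an empty byte array, following the HBase convention that an empty start or stop row means "unbounded".

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Reads the proposed start/stop row properties into a simple row range. */
final class ScanRangeSketch {
    // Property names as proposed in the issue description.
    static final String SCAN_ROW_START = "hbase.mapreduce.scan.row.start";
    static final String SCAN_ROW_END = "hbase.mapreduce.scan.row.end";

    /** Hypothetical holder; a real job would set these values on a Scan. */
    static final class RowRange {
        final byte[] startRow;
        final byte[] stopRow;

        RowRange(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Builds the range; a missing property becomes an empty (unbounded) row. */
    static RowRange fromProperties(Properties conf) {
        return new RowRange(
            toBytes(conf.getProperty(SCAN_ROW_START)),
            toBytes(conf.getProperty(SCAN_ROW_END)));
    }

    private static byte[] toBytes(String value) {
        return value == null ? new byte[0] : value.getBytes(StandardCharsets.UTF_8);
    }
}
```

A job-management tool such as Oozie would then only need to set the two string properties in the job configuration, without calling any Java helper.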





[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187604#comment-13187604
 ] 

Hadoop QA commented on HBASE-5208:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510830/HBASE-5208-004.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -144 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 83 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/791//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/791//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/791//console

This message is automatically generated.





[jira] [Updated] (HBASE-5200) AM.ProcessRegionInTransition() and AM.handleRegion() races thus leaving the region assignment inconsistent.

2012-01-17 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5200:
--

Affects Version/s: 0.90.5
Fix Version/s: 0.90.7
   0.92.1

 AM.ProcessRegionInTransition() and AM.handleRegion() races thus leaving the 
 region assignment inconsistent.
 ---

 Key: HBASE-5200
 URL: https://issues.apache.org/jira/browse/HBASE-5200
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.7


 This is the scenario:
 Consider a case where the balancer is running and is therefore trying to
 close regions on an RS.
 Before the close completes, a master switch happens.
 On master switch, the set of nodes that are in RIT (regions in transition) is
 collected; we first get the node data and start watching the node.
 Only after that is the node data added to RIT.
 If, in this window (before the node is added to RIT), the RS on which close
 was called makes a transition, AM.handleRegion() misses the handling and
 reports that the RIT state was null.
 {code}
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 a66d281d231dfcaea97c270698b26b6f from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 c12e53bfd48ddc5eec507d66821c4d23 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,358 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 59ae13de8c1eb325a0dd51f4902d2052 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 f45bc9614d7575f35244849af85aa078 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 cc3ecd7054fe6cd4a1159ed92fd62641 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 3af40478a17fee96b4a192b22c90d5a2 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 e6096a8466e730463e10d3d61f809b92 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 4806781a1a23066f7baed22b4d237e24 from server 
 HOST-192-168-47-204,20020,1326342744518 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 2012-01-13 10:50:46,359 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 d69e104131accaefe21dcc01fddc7629 from server 
 HOST-192-168-47-205,20020,1326363111288 but region was in  the state null and 
 not in expected PENDING_CLOSE or CLOSING states
 {code}
 In branch, the CLOSING node is created by the RS, which leads to further
 inconsistency.
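
The window described in the scenario can be reproduced in a single-threaded toy model, sketched below. This is not AssignmentManager code: `RitOrderingSketch` and its methods are hypothetical, the callback stands in for a ZooKeeper event delivered once the watch is set, and the second ordering is included only for contrast, not as the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the ordering between setting a watch and adding a region to RIT. */
final class RitOrderingSketch {
    private final Map<String, String> regionsInTransition = new HashMap<>();
    final List<String> warnings = new ArrayList<>();

    /** Models the handler: a CLOSED event is only processed if the region is already in RIT. */
    void handleClosed(String region) {
        String state = regionsInTransition.get(region);
        if (state == null) {
            warnings.add("Received CLOSED for region " + region
                + " but region was in the state null");
            return;
        }
        regionsInTransition.remove(region);
    }

    /** Ordering from the scenario: the watch is set first, the RIT entry is added afterwards. */
    void watchThenRegister(String region, Runnable eventInTheGap) {
        eventInTheGap.run();                       // event arrives while the region is not in RIT
        regionsInTransition.put(region, "CLOSING");
    }

    /** Contrasting ordering: the RIT entry exists before any event can be delivered. */
    void registerThenWatch(String region, Runnable eventInTheGap) {
        regionsInTransition.put(region, "CLOSING");
        eventInTheGap.run();
    }

    boolean isInTransition(String region) {
        return regionsInTransition.containsKey(region);
    }
}
```

In the first ordering the event is dropped with a warning and the region stays in RIT with nothing left to clear it, which matches the log output above. In the second ordering the same event is handled normally.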





[jira] [Commented] (HBASE-5200) AM.ProcessRegionInTransition() and AM.handleRegion() races thus leaving the region assignment inconsistent.

2012-01-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187672#comment-13187672
 ] 

ramkrishna.s.vasudevan commented on HBASE-5200:
---

This issue is applicable in trunk also.





[jira] [Updated] (HBASE-5215) [book] additional clarification on compactions and region merging

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5215:
-

Status: Patch Available  (was: Open)

 [book] additional clarification on compactions and region merging
 -

 Key: HBASE-5215
 URL: https://issues.apache.org/jira/browse/HBASE-5215
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: docbkx_HBASE_5215.patch


 book.xml 
 * added further clarification in Arch chapter on major compactions that 
 region merges do NOT happen.  This is a common question on the dist-list.
 * added link to region-merge section in OpsMgt chapter.
 ops_mgt.xml
 * in Region Merge section, added link to ruby script in HBASE-1621 for region 
 merging.  This was cited in a recent dist-list conversation.





[jira] [Updated] (HBASE-5215) [book] additional clarification on compactions and region merging

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5215:
-

Attachment: docbkx_HBASE_5215.patch





[jira] [Created] (HBASE-5215) [book] additional clarification on compactions and region merging

2012-01-17 Thread Doug Meil (Created) (JIRA)
[book] additional clarification on compactions and region merging
-

 Key: HBASE-5215
 URL: https://issues.apache.org/jira/browse/HBASE-5215
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: docbkx_HBASE_5215.patch

book.xml 
* added further clarification in Arch chapter on major compactions that region 
merges do NOT happen.  This is a common question on the dist-list.
* added link to region-merge section in OpsMgt chapter.

ops_mgt.xml
* in Region Merge section, added link to ruby script in HBASE-1621 for region 
merging.  This was cited in a recent dist-list conversation.





[jira] [Updated] (HBASE-5215) [book] additional clarification on compactions and region merging

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5215:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)





[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

2012-01-17 Thread Nicholas Telford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187683#comment-13187683
 ] 

Nicholas Telford commented on HBASE-5208:
-

Not entirely sure why there are (unrelated) tests failing. Looking at the 
error, they all appear to be caused by the following. Can someone verify 
whether or not this is caused by something in my patch?

java.lang.NumberFormatException: For input string: 18446743988250694508
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:422)
    at java.lang.Long.parseLong(Long.java:468)
    at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
    at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
    at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
    at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

As for the findbugs and JavaDoc issues: JavaDoc is reporting a negative number 
of problems, so I'm disregarding it. Findbugs doesn't seem to be finding 
anything in my new code, although it's difficult to be sure given the volume of 
warnings.
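
For reference, the input in the exception, 18446743988250694508, is larger than Long.MAX_VALUE (9223372036854775807) but still fits in an unsigned 64-bit integer, which is why Long.parseLong rejects it. The sketch below shows two ways such a value can be parsed. The class name is made up for the example, Long.parseUnsignedLong needs Java 8 or later, and nothing here is taken from the Hadoop code in the trace.

```java
import java.math.BigInteger;

/** Shows why a value above Long.MAX_VALUE fails in Long.parseLong and how it can still be read. */
final class UnsignedParseSketch {
    // The input string from the NumberFormatException above.
    static final String VALUE = "18446743988250694508";

    /** True if Long.parseLong accepts the text, false if it throws NumberFormatException. */
    static boolean fitsSignedLong(String text) {
        try {
            Long.parseLong(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Arbitrary-precision parse, so there is no 64-bit signed limit. */
    static BigInteger parseBig(String text) {
        return new BigInteger(text);
    }

    /** Unsigned 64-bit parse (Java 8+); the bits are returned in a signed long. */
    static long parseUnsigned(String text) {
        return Long.parseUnsignedLong(text);
    }
}
```

A value parsed with `parseUnsigned` prints as a negative number unless it is formatted with `Long.toUnsignedString`, so callers that compare or display it need to treat it as unsigned throughout.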





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187684#comment-13187684
 ] 

Hudson commented on HBASE-5153:
---

Integrated in HBase-0.92-security #81 (See 
[https://builds.apache.org/job/HBase-0.92-security/81/])
HBASE-5153  Add retry logic in 
HConnectionImplementation#resetZooKeeperTrackers (Jieshan)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ClosedConnectionException.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187774#comment-13187774
 ] 

Zhihong Yu commented on HBASE-5153:
---

When I ran TestMergeTool on TRUNK, I got:
{code}
main prio=5 tid=102801000 nid=0x100601000 waiting on condition [1005fb000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:302)
at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:328)
at org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:55)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:171)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:230)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:80)
- locked 784a83930 (a org.apache.hadoop.hbase.MasterAddressTracker)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:590)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackersWithRetries(HConnectionManager.java:625)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1711)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:93)
- locked 7854b9898 (a org.apache.hadoop.hbase.MasterAddressTracker)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:590)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackersWithRetries(HConnectionManager.java:625)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1711)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:93)
- locked 7854b98e0 (a org.apache.hadoop.hbase.MasterAddressTracker)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:590)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackersWithRetries(HConnectionManager.java:625)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1711)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:93)
- locked 7854b9928 (a org.apache.hadoop.hbase.MasterAddressTracker)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:590)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackersWithRetries(HConnectionManager.java:625)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1711)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:93)
- locked 78f991ad0 (a org.apache.hadoop.hbase.MasterAddressTracker)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:590)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackersWithRetries(HConnectionManager.java:625)
- locked 790b70180 (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1711)
at 

[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187810#comment-13187810
 ] 

Zhihong Yu commented on HBASE-5153:
---

Reverted from 0.92 and TRUNK due to failed Jenkins builds.

 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.7

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out







[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole

2012-01-17 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187816#comment-13187816
 ] 

Jimmy Xiang commented on HBASE-5196:


Yes, the test suite on 0.90 with the patch passed.

 Failure in region split after PONR could cause region hole
 --

 Key: HBASE-5196
 URL: https://issues.apache.org/jira/browse/HBASE-5196
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.92.0, 0.94.0

 Attachments: 5196-v2.txt, hbase-5196_0.90.txt


 If a region split fails after the PONR (point of no return), it relies on the master's 
 ServerShutdown handler to fix it.  However, if the master doesn't get a chance to fix it, 
 there will be a hole in the region chain.





[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187820#comment-13187820
 ] 

Zhihong Yu commented on HBASE-3845:
---

@Jinchao:
I don't see getSnapshotName() in HLog.java under 0.90.

Can you attach a complete patch?

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
 HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, 
 HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, 
 HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_branch90V1.patch, 
 HBASE-3845_branch90V2.patch, HBASE-3845_trunk_2.patch, 
 HBASE-3845_trunk_3.patch


 (I don't have a test case to prove this yet but I have run it by Dhruba and 
 Kannan internally and wanted to put this up for some feedback.)
 In this discussion let us assume that the region has only one column family. 
 That way I can use region/memstore interchangeably.
 After a memstore flush it is possible for lastSeqWritten to have a 
 log-sequence-id for a region that is not the earliest log-sequence-id for 
 that region's memstore.
 HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
 that we only keep track of the earliest log-sequence-number that is present 
 in the memstore.
 Every time the memstore is flushed we remove the region's entry in 
 lastSeqWritten and wait for the next append to populate this entry 
 again. This is where the problem happens.
 step 1:
 flusher.prepare() snapshots the memstore under 
 HRegion.updatesLock.writeLock().
 step 2:
 as soon as the updatesLock.writeLock() is released new entries will be added 
 into the memstore.
 step 3:
 wal.completeCacheFlush() is called. This method removes the region's entry 
 from lastSeqWritten.
 step 4:
 the next append will create a new entry for the region in lastSeqWritten. 
 But this will be the log seq id of the current append. All the edits that 
 were added in step 2 are missing.
 ==
 as a temporary measure, instead of removing the region's entry in step 3 I 
 will replace it with the log-seq-id of the region-flush-event.
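
 The four steps can be replayed in a toy model to make the gap visible. The sketch below is not the real HLog code: LastSeqWrittenModel and FlushRaceDemo are hypothetical names, the steps run sequentially rather than on separate threads, and it assumes the flush event takes its sequence id at snapshot time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model of the bookkeeping described above; not the real HLog code. */
final class LastSeqWrittenModel {
    final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    private long nextSeqId = 1;

    /** Hands out the next log sequence id. */
    long newSeqId() {
        return nextSeqId++;
    }

    /** Mirrors the described append: record the id only if the region has no entry. */
    long append(String region) {
        long seqId = newSeqId();
        lastSeqWritten.putIfAbsent(region, seqId);
        return seqId;
    }

    /** Mirrors the described completeCacheFlush: drop the region's entry. */
    void completeCacheFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    /** The temporary measure from the description: store the flush event's id instead. */
    void completeCacheFlushByReplacing(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }
}

final class FlushRaceDemo {
    /** Plays steps 1-4 for one region; returns {earliest unflushed edit, tracked value}. */
    static long[] play(boolean replaceInsteadOfRemove) {
        LastSeqWrittenModel wal = new LastSeqWrittenModel();
        String region = "r1";
        wal.append(region);                          // edit that the flush will persist
        long flushSeqId = wal.newSeqId();            // step 1: snapshot taken
        long earliestUnflushed = wal.append(region); // step 2: lands in the new memstore
        if (replaceInsteadOfRemove) {
            wal.completeCacheFlushByReplacing(region, flushSeqId); // step 3, proposed
        } else {
            wal.completeCacheFlushByRemoving(region);              // step 3, as described
        }
        wal.append(region);                          // step 4: next append
        return new long[] { earliestUnflushed, wal.lastSeqWritten.get(region) };
    }

    public static void main(String[] args) {
        long[] removed = play(false);
        long[] replaced = play(true);
        System.out.println("remove:  earliest unflushed=" + removed[0] + ", tracked=" + removed[1]);
        System.out.println("replace: earliest unflushed=" + replaced[0] + ", tracked=" + replaced[1]);
    }
}
```

 When the entry is removed, the tracked value ends up later than the step 2 edit, so that edit is no longer covered. When the entry is replaced with the flush id, the tracked value is earlier than every unflushed edit, which is conservative but safe.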





[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5153:
--

Fix Version/s: (was: 0.90.7)
   0.90.6

 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out







[jira] [Updated] (HBASE-5196) Failure in region split after PONR could cause region hole

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5196:
--

Fix Version/s: 0.90.6
 Hadoop Flags: Reviewed

Integrated to 0.90 branch.

Thanks for the patch, Jimmy.

 Failure in region split after PONR could cause region hole
 --

 Key: HBASE-5196
 URL: https://issues.apache.org/jira/browse/HBASE-5196
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5196-v2.txt, hbase-5196_0.90.txt







[jira] [Updated] (HBASE-5201) Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and HRegionThriftServer

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5201:
--

Release Note: 
New config parameters:

hbase.thrift.selector.threads: Number of selector threads for reading and writing socket
hbase.thrift.worker.threads: Number of threads for processing the thrift calls
hbase.thrift.stop.timeout.seconds: Time to wait for server to stop gracefully
hbase.thrift.accept.queue.size.per.selector: Maximum number of accepted elements per selector

Integrated to TRUNK.

Thanks for the patch Scott

 Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and 
 HRegionThriftServer
 -

 Key: HBASE-5201
 URL: https://issues.apache.org/jira/browse/HBASE-5201
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.94.0

 Attachments: HBASE-5201-v2.txt, HBASE-5201-v3.txt, HBASE-5201-v4.txt, 
 HBASE-5201.txt


 TThreadedSelectorServer is good for RPC-heavy situations because I/O is not 
 limited to one CPU. See
 https://issues.apache.org/jira/browse/Thrift-1167
 I am porting the related classes from thrift trunk (they are not in 
 thrift-0.7.0).
 There is a lot of repeated code in ThriftServer and HRegionThriftServer.
 This code is now moved to a Runnable called ThriftServerRunner.





[jira] [Updated] (HBASE-5201) Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and HRegionThriftServer

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5201:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and 
 HRegionThriftServer
 -

 Key: HBASE-5201
 URL: https://issues.apache.org/jira/browse/HBASE-5201
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.94.0

 Attachments: HBASE-5201-v2.txt, HBASE-5201-v3.txt, HBASE-5201-v4.txt, 
 HBASE-5201.txt







[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187838#comment-13187838
 ] 

Zhihong Yu commented on HBASE-5208:
---

@Nicholas:
See MAPREDUCE-3583 for the cause of NumberFormatException.

 Allow setting Scan start/stop row individually in TableInputFormat
 --

 Key: HBASE-5208
 URL: https://issues.apache.org/jira/browse/HBASE-5208
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
 Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, 
 HBASE-5208-003.txt, HBASE-5208-004.txt


 Currently, TableInputFormat initializes a serialized Scan from 
 hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using 
 properties defined in hbase.mapreduce.scan.*. However, of these properties 
 the start row and stop row (arguably the most pertinent) are missing.
 TableInputFormat should permit the specification of a start/stop row as with 
 the other fields using a new pair of properties: 
 hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end
 The primary use-case for this is to permit Oozie and other job management 
 tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a 
 contiguous subset of rows.
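
 To illustrate what the two proposed properties would select, here is a small sketch that reads them from a java.util.Properties object and tests row keys against the resulting range. RowRange is a hypothetical class, not TableInputFormat, and it assumes the range follows Scan's convention: start row inclusive, stop row exclusive, an empty value meaning unbounded, and keys compared as unsigned bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Toy row range built from the two proposed property names; not TableInputFormat. */
final class RowRange {
    static final String START_KEY = "hbase.mapreduce.scan.row.start";
    static final String STOP_KEY = "hbase.mapreduce.scan.row.end";

    private final byte[] start; // empty means "from the first row"
    private final byte[] stop;  // empty means "to the last row"

    RowRange(Properties conf) {
        this.start = conf.getProperty(START_KEY, "").getBytes(StandardCharsets.UTF_8);
        this.stop = conf.getProperty(STOP_KEY, "").getBytes(StandardCharsets.UTF_8);
    }

    /** True when row lies in [start, stop), comparing bytes as unsigned values. */
    boolean contains(String row) {
        byte[] key = row.getBytes(StandardCharsets.UTF_8);
        boolean afterStart = start.length == 0 || compare(key, start) >= 0;
        boolean beforeStop = stop.length == 0 || compare(key, stop) < 0;
        return afterStart && beforeStop;
    }

    /** Lexicographic comparison of two byte arrays, treating each byte as unsigned. */
    private static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

 A job management tool that can only set configuration strings would set the two properties (for example "row-100" and "row-200", values chosen here purely for illustration) and the job would then see only rows in that contiguous range.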





[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2012-01-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187842#comment-13187842
 ] 

ramkrishna.s.vasudevan commented on HBASE-3845:
---

@Ted
Tomorrow we will be uploading a patch. 

 data loss because lastSeqWritten can miss memstore edits
 

 Key: HBASE-3845
 URL: https://issues.apache.org/jira/browse/HBASE-3845
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
 HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, 
 HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845_5.patch, 
 HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_branch90V1.patch, 
 HBASE-3845_branch90V2.patch, HBASE-3845_trunk_2.patch, 
 HBASE-3845_trunk_3.patch







[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187843#comment-13187843
 ] 

Zhihong Yu commented on HBASE-5208:
---

{code}
+assertTrue(job.isComplete());
{code}
Can we add more validation on top of the above?

Also, TestTableInputFormatScan took 885 seconds on Jenkins. Is there a way to 
shorten it?

 Allow setting Scan start/stop row individually in TableInputFormat
 --

 Key: HBASE-5208
 URL: https://issues.apache.org/jira/browse/HBASE-5208
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
 Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, 
 HBASE-5208-003.txt, HBASE-5208-004.txt







[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187845#comment-13187845
 ] 

Lars Hofhansl commented on HBASE-5153:
--

Was that repeatable? Might be a problem with the test (relying on the fact that 
an HConnection would just give up after one try).

 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out







[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187848#comment-13187848
 ] 

Zhihong Yu commented on HBASE-5153:
---

TestMergeTool hung in several builds.
That was why I ran it on MacBook.

 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out







[jira] [Created] (HBASE-5216) [book] book.xml - added more in Arch when to use hbase section

2012-01-17 Thread Doug Meil (Created) (JIRA)
[book] book.xml - added more in Arch when to use hbase section


 Key: HBASE-5216
 URL: https://issues.apache.org/jira/browse/HBASE-5216
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5216.xml.patch

book.xml - Arch chapter: added an additional paragraph to the "when to use 
HBase" section.  This is based on a lengthy dist-list conversation.





[jira] [Updated] (HBASE-5216) [book] book.xml - added more in Arch when to use hbase section

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5216:
-

Status: Patch Available  (was: Open)

 [book] book.xml - added more in Arch when to use hbase section
 

 Key: HBASE-5216
 URL: https://issues.apache.org/jira/browse/HBASE-5216
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5216.xml.patch







[jira] [Updated] (HBASE-5216) [book] book.xml - added more in Arch when to use hbase section

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5216:
-

Attachment: book_hbase_5216.xml.patch

 [book] book.xml - added more in Arch when to use hbase section
 

 Key: HBASE-5216
 URL: https://issues.apache.org/jira/browse/HBASE-5216
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5216.xml.patch







[jira] [Updated] (HBASE-5216) [book] book.xml - added more in Arch when to use hbase section

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5216:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 [book] book.xml - added more in Arch when to use hbase section
 

 Key: HBASE-5216
 URL: https://issues.apache.org/jira/browse/HBASE-5216
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5216.xml.patch







[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187863#comment-13187863
 ] 

Zhihong Yu commented on HBASE-5179:
---

For hbase-5179v9.patch:
{code}
+  if (currentMetaServer != null
+   && this.serverManager.isServerOnline(currentMetaServer)) {
+// Current meta server is dead, we first split its log and then expire
{code}
Is the above right ?
{code}
+this.fileSystemManager.splitLog(currentMetaServer);
 expireIfOnline(currentMetaServer);
{code}
The splitLog() would end up calling:
{code}
  public void splitLog(final ListServerName serverNames) throws IOException {
{code}
If distributedLogSplitting is true, log splitting would be in progress upon 
return from splitLog().
We need to either disable distributedLogSplitting or introduce synchronization 
mechanism for above.
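
To make the second option concrete, here is a minimal sketch of one possible synchronization mechanism: the assigning thread waits on a latch that the log splitter releases when it is done. This is not HBase code. The class LogSplitBarrier and its methods are hypothetical names made up for illustration, and the real master would need one such barrier per dead server plus proper failure handling.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; none of these names exist in HBase. */
final class LogSplitBarrier {
    private final CountDownLatch done = new CountDownLatch(1);

    /** Called by whichever thread finishes splitting the dead server's logs. */
    void markSplitComplete() {
        done.countDown();
    }

    /**
     * Blocks the assigning thread until splitting has finished or the timeout
     * expires. Returns true only if the split is known to be complete.
     */
    boolean awaitSplit(long timeout, TimeUnit unit) {
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            // Preserve the interrupt and report "not complete" to the caller.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LogSplitBarrier barrier = new LogSplitBarrier();
        // Stand-in for asynchronous (distributed) log splitting.
        Thread splitter = new Thread(barrier::markSplitComplete);
        splitter.start();
        if (barrier.awaitSplit(5, TimeUnit.SECONDS)) {
            System.out.println("logs split; safe to assign regions");
        } else {
            System.out.println("split still in progress; do not assign yet");
        }
        splitter.join();
    }
}
```

The point of the sketch is only that region assignment is gated on an explicit completion signal rather than on splitLog() having returned.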

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 5179-90.txt, 5179-90v2.patch, 5179-90v3.patch, 
 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 
 5179-90v8.patch, 5179-90v9.patch, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 hbase-5179.patch, hbase-5179v5.patch, hbase-5179v6.patch, hbase-5179v7.patch, 
 hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing 
 happen concurrently, the following case may occur:
 1. master completes splitLogAfterStartup()
 2. RegionserverA restarts, and ServerShutdownHandler is processing.
 3. master starts to rebuildUserRegions, and RegionserverA is considered a 
 dead server.
 4. master starts to assign the regions of RegionserverA because it is a dead 
 server as of step 3.
 However, during step 4 (assigning regions), ServerShutdownHandler may still be 
 splitting the log, so this may cause data loss.





[jira] [Resolved] (HBASE-5054) hadoop's classpath takes precedence over hbase's classpath

2012-01-17 Thread Harsh J (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HBASE-5054.


Resolution: Duplicate

This should already be fixed via HBASE-4854

 hadoop's classpath takes precedence over hbase's classpath
 --

 Key: HBASE-5054
 URL: https://issues.apache.org/jira/browse/HBASE-5054
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.90.4
Reporter: Steve Hoffman

 Since hbase shares the metrics framework with core hadoop, and they both use 
 'hadoop-metrics.properties' file on the classpath for configuration, the 
 ordering causes hbase's directories to be shadowed by hadoop's.  What this 
 means is that for me to set hbase's hadoop-metrics.properties, I have to do 
 it in /etc/hadoop/conf since the one in /etc/hbase/conf is later in the 
 classpath.
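
As an illustration of the shadowing described above, the sketch below mimics first-match resource lookup over an ordered classpath: the earliest entry that contains the file wins, and later copies are never consulted. The class ClasspathShadowing and its resolve method are hypothetical helpers written for this example, not part of hadoop or hbase; only the two directory names come from the report.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper that models first-match lookup on an ordered classpath. */
final class ClasspathShadowing {

    /** Returns the first classpath entry that contains the resource, if any. */
    static Optional<String> resolve(List<String> classpath, Set<String> entriesWithResource) {
        for (String entry : classpath) {
            if (entriesWithResource.contains(entry)) {
                return Optional.of(entry);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Ordering as reported: hadoop's conf directory precedes hbase's.
        List<String> classpath = Arrays.asList("/etc/hadoop/conf", "/etc/hbase/conf");
        // Both directories ship a hadoop-metrics.properties file.
        Set<String> haveFile = new HashSet<>(classpath);
        System.out.println(resolve(classpath, haveFile).orElse("not found"));
    }
}
```

With both directories holding a copy of the file, the lookup returns /etc/hadoop/conf, which is why edits to the copy in /etc/hbase/conf have no effect.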
 Running hbase classpath confirms the ordering:
 {quote}
 % hbase classpath
 

[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4218:
---

Attachment: D447.24.patch

mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding 
framework and delta encoding implementation.
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

  Addressing Ted's comments and removing two .rej files that somehow got into 
the patch.

REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  

[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187888#comment-13187888
 ] 

Hadoop QA commented on HBASE-4218:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510870/D447.24.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 137 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/792//console

This message is automatically generated.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 all while having much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
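
For readers unfamiliar with prefix compression of sorted keys, the following is a minimal sketch of the idea, not the encoder from the attached patches. The class PrefixKeySketch and its Entry type are hypothetical names chosen for this example. Each key is stored as the number of leading bytes it shares with the previous key plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of prefix compression over a sorted key list. */
final class PrefixKeySketch {

    /** One encoded key: bytes shared with the previous key, then the rest. */
    static final class Entry {
        final int sharedPrefixLength;
        final byte[] suffix;

        Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Encodes each key relative to the key immediately before it. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = commonPrefix(prev, key);
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds the original keys by replaying prefixes in order. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

Because adjacent sorted keys tend to share long prefixes, most entries carry only a short suffix. The cost is that decoding a key requires the key before it, which is why seeking inside an encoded block needs dedicated support.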
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to the uncompressed buffer in HFileBlock will have bad 
 performance
 - extend comparators to support comparison assuming that the first N bytes are 
 equal (or some fields are equal)
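
The second change can be pictured with a small sketch. It is an assumption about what such a comparator could look like, not the interface proposed in the patch, and PrefixAwareCompare is a hypothetical name. The caller passes the number of leading bytes already known to be equal, and the comparison starts after them.

```java
/** Hypothetical comparator that skips a prefix the caller knows to be equal. */
final class PrefixAwareCompare {

    /**
     * Lexicographic comparison of a and b in unsigned byte order, starting at
     * index commonPrefix. The caller guarantees that the first commonPrefix
     * bytes of a and b are identical.
     */
    static int compareSkippingPrefix(byte[] a, byte[] b, int commonPrefix) {
        int n = Math.min(a.length, b.length);
        for (int i = commonPrefix; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All compared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }
}
```

When the guarantee holds, the result is the same as a full comparison from index 0, but the shared bytes are never touched. With prefix-compressed keys the shared length is already stored, so it comes for free.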
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression





[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4218:
--

Attachment: Delta-encoding-2012-01-17_11_09_09.patch

Appending a patch that can be applied by Hadoop QA.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff






[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510870/D447.24.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 137 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/792//console

This message is automatically generated.)





[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12509638/D447.19.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 131 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/678//console

This message is automatically generated.)





[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12509651/D447.20.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 133 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/680//console

This message is automatically generated.)





[jira] [Created] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Created) (JIRA)
Reenable the thrift tests, and add a new one for getRegionInfo
--

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman


At some point we disabled tests for the thrift server. In addition, it looks 
like the getRegionInfo no longer functions. I'd like to reenable the tests and 
add one for getRegionInfo.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Attachment: 0001-Fixing-thrift-tests.patch

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Priority: Minor  (was: Major)

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Description: At some point we disabled tests for the thrift server. In 
addition, it looks like the getRegionInfo no longer functions. I'd like to 
reenable the tests and add one for getRegionInfo. I had to write this to test 
my changes in HBASE-2600 anyway. I figured I would break it out. We shouldn't 
commit it until we have fixed getting the regioninfo from the thriftserver.  
(was: At some point we disabled tests for the thrift server. In addition, it 
looks like the getRegionInfo no longer functions. I'd like to reenable the 
tests and add one for getRegionInfo.)

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Updated] (HBASE-5212) Fix test TestTableMapReduce against 0.23.

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5212:
--

Attachment: 5212-v2.txt

Patch v2 allows TestTableMapReduce to pass based on hadoop 0.23

 Fix test TestTableMapReduce against 0.23.
 -

 Key: HBASE-5212
 URL: https://issues.apache.org/jira/browse/HBASE-5212
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
 Fix For: 0.92.1

 Attachments: 5212-v2.txt, HBASE-5212.patch


 As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 
 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails 
 on 0.92 branch. There are minor changes to HBase poms required to fix that.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Attachment: 0001-Fixing-thrift-tests-v2.patch

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Status: Patch Available  (was: Open)

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187916#comment-13187916
 ] 

Zhihong Yu commented on HBASE-5212:
---

mapreduce.TestTableMapReduce also passes based on hadoop 1.0

 Fix test TestTableMapReduce against 0.23.
 -

 Key: HBASE-5212
 URL: https://issues.apache.org/jira/browse/HBASE-5212
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
 Fix For: 0.92.1

 Attachments: 5212-v2.txt, HBASE-5212.patch


 As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 
 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails 
 on 0.92 branch. There are minor changes to HBase poms required to fix that.





[jira] [Commented] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187917#comment-13187917
 ] 

jirapos...@reviews.apache.org commented on HBASE-5217:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3514/
---

(Updated 2012-01-17 19:26:07.447407)


Review request for hbase.


Summary
---

At some point we disabled tests for the thrift server. In addition, it looks 
like the getRegionInfo no longer functions. I'd like to reenable the tests and 
add one for getRegionInfo.


This addresses bug HBASE-5217.
https://issues.apache.org/jira/browse/HBASE-5217


Diffs (updated)
-

  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java 12247d0 

Diff: https://reviews.apache.org/r/3514/diff


Testing
---

Ran the tests with my HBASE-2600 changes applied.


Thanks,

Alex



 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Created] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Doug Meil (Created) (JIRA)
[book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch

Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
v2 information in the appendix.







[jira] [Updated] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5218:
-

Attachment: book_hbase_5218.xml.patch

 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Updated] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5218:
-

Status: Patch Available  (was: Open)

 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187922#comment-13187922
 ] 

jirapos...@reviews.apache.org commented on HBASE-5218:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/
---

Review request for hbase, Todd Lipcon, Ted Yu, and Michael Stack.


Summary
---

I'm posting a preliminary version that I'm currently testing on real clusters. 
The tests are flaky on the 0.90 branch (so there is something asynchronous that 
I didn't synchronize properly), and there are a few more TODOs I want to knock 
out before this is ready for full review and consideration for commit. There 
are some problems I need advice on.

Problem 1:

In the unit tests, I have a few cases where I fabricate new regions and try to 
force the overlapping regions to be closed. For some of these, I cannot delete 
a table after it is repaired without causing subsequent tests to fail. I think 
this is due to a few things:

1) The disable table handler uses the in-memory assignment manager state, while 
delete uses the assignment information in META.
2) Currently I'm using the sneaky closeRegion that purposely doesn't go through 
the master and in turn doesn't modify in-memory state, so disable uses 
out-of-date in-memory region assignments. If I use the unassign method instead, 
it sends RIT (region-in-transition) transitions to the master, which then 
attempts to assign the region again, causing timing problems and transient states.

What is a good way to clear the HMaster's assignment manager's assignment data 
for particular regions, or to force it to re-read from META (without modifying 
the 0.90 HBase deployments it is meant to repair)?

Problem 2:

Sometimes tests fail reporting HOLE_IN_REGION_CHAIN and 
SERVER_DOES_NOT_MATCH_META. This means the old and new regions are confused 
with each other and something is still happening asynchronously. I think the 
new region is being assigned and is still transitioning. Does that sound about 
right? To make the unit test deterministic, should hbck wait for these to 
settle, or should only the unit test wait?
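
For illustration only, here is a minimal sketch of the "wait for it to settle" option as a bounded polling helper. It is not code from the patch under review; the class name, the method name, and the idea of passing the settle check as a condition are all assumptions, and the same helper could sit either in hbck or in the unit test.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of HBase or of the patch under review. */
final class SettleWaitSketch {

    /**
     * Polls the condition until it holds or the timeout expires.
     * Returns true if the condition held, false on timeout or interrupt.
     */
    static boolean waitFor(long timeoutMs, long pollMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Subtraction keeps the comparison safe if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "no regions in transition": true after about 50 ms.
        boolean settled = waitFor(1000, 10,
                () -> System.currentTimeMillis() - start >= 50);
        System.out.println("settled = " + settled);
    }
}
```

A test would call such a helper with a condition like "no regions in transition" before asserting on the hbck report, and fail with a clear message if it returns false.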


This addresses bug HBASE-5218.
https://issues.apache.org/jira/browse/HBASE-5218


Diffs
-

  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 6d3401d 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java a3d8b8b 
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 29e8bb2 
  src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java a640d57 
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java dbb97f8 
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 3e8729d 
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 11a1151 
  src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java 4a09ce2 

Diff: https://reviews.apache.org/r/3435/diff


Testing
---

All unit tests pass sometimes.  Some fail sometimes (generally the cases that 
fabricate new regions).  

Not ready for commit.


Thanks,

jmhsieh



 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187923#comment-13187923
 ] 

jirapos...@reviews.apache.org commented on HBASE-5218:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review4263
---



src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
https://reviews.apache.org/r/3435/#comment9627

This causes unit test failure.



src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
https://reviews.apache.org/r/3435/#comment9628

This causes unit test failure.



src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
https://reviews.apache.org/r/3435/#comment9629

This causes unit test failure.



src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
https://reviews.apache.org/r/3435/#comment9630

This causes unit test failure.


- jmhsieh


On 2012-01-09 08:58:17, jmhsieh wrote:
bq.  (Updated 2012-01-09 08:58:17)
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, and Michael Stack.
bq.  
bq.  [...]



 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug 

[jira] [Updated] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5218:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Updated] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.

2012-01-17 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5203:
-

Description: 
HBASE-3584 does not provide fully atomic operation in case of region server 
failures (see explanation there).

What should happen is that either (1) all edits are applied via a single 
WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed 
together.

For #1 it is not clear whether it is advisable to manage multiple *different* 
operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL 
replay on region startup would work, but that replication would need to be 
adapted. The refactoring needed would be non-trivial.

#2 Might actually not work, as another operation could request sync'ing a later 
edit and hence flush these entries out as well.

Addendum:
The attached patch implements #1 and fixes replication to be able to deal with 
different operations being grouped in one WALEdit.

  was:
HBASE-3584 does not not provide fully atomic operation in case of region server 
failures (see explanation there).

What should happen is that either (1) all edits are applied via a single 
WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed 
together.

For #1 it is not clear whether it is advisable to manage multiple *different* 
operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL 
replay on region startup would work, but that replication would need to be 
adapted. The refactoring needed would be non-trivial.

#2 Might actually not work, as another operation could request sync'ing a later 
edit and hence flush these entries out as well.



 Group atomic put/delete operation into a single WALEdit to handle region 
 server failures.
 -

 Key: HBASE-5203
 URL: https://issues.apache.org/jira/browse/HBASE-5203
 Project: HBase
  Issue Type: Sub-task
  Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0

 Attachments: 5203-v3.txt, 5203.txt


 HBASE-3584 does not provide fully atomic operation in case of region 
 server failures (see explanation there).
 What should happen is that either (1) all edits are applied via a single 
 WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed 
 together.
 For #1 it is not clear whether it is advisable to manage multiple *different* 
 operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL 
 replay on region startup would work, but that replication would need to be 
 adapted. The refactoring needed would be non-trivial.
 #2 Might actually not work, as another operation could request sync'ing a 
 later edit and hence flush these entries out as well.
 Addendum:
 The attached patch implements #1 and fixes replication to be able to deal 
 with different operations being grouped in one WALEdit.
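
To make option #1 concrete, here is a toy sketch of why grouping matters. It is not the attached patch and uses no HBase classes: the `Change` and `Edit` types, the helper names, and the assumption that each log record is durable as a whole or not at all are simplifications introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only; none of these types are HBase classes. */
final class GroupedWalEditSketch {

    /** One row-level change; value == null means "delete the row". */
    static final class Change {
        final String row;
        final String value;
        Change(String row, String value) { this.row = row; this.value = value; }
    }

    /** Hypothetical log record, assumed durable as a whole or not at all. */
    static final class Edit {
        final List<Change> changes;
        Edit(List<Change> changes) { this.changes = changes; }
    }

    /** Running example: move the value "v" from row "a" to row "b". */
    static List<Change> moveAToB() {
        return List.of(new Change("a", null), new Change("b", "v"));
    }

    /** Builds the log: one record for everything, or one record per change. */
    static List<Edit> asLog(boolean grouped, List<Change> changes) {
        List<Edit> log = new ArrayList<>();
        if (grouped) {
            log.add(new Edit(changes));
        } else {
            for (Change c : changes) {
                log.add(new Edit(List.of(c)));
            }
        }
        return log;
    }

    /** Replays only the first durableRecords records, as after a crash. */
    static Map<String, String> recover(Map<String, String> initial,
                                       List<Edit> log, int durableRecords) {
        Map<String, String> table = new TreeMap<>(initial);
        for (Edit edit : log.subList(0, durableRecords)) {
            for (Change c : edit.changes) {
                if (c.value == null) {
                    table.remove(c.row);
                } else {
                    table.put(c.row, c.value);
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> initial = Map.of("a", "v");
        // Separate records: a crash after the first record loses the value.
        System.out.println(recover(initial, asLog(false, moveAToB()), 1));
        // Grouped record: recovery sees either the old state or the new one.
        System.out.println(recover(initial, asLog(true, moveAToB()), 0));
        System.out.println(recover(initial, asLog(true, moveAToB()), 1));
    }
}
```

With separate records, recovery after a crash between the two writes applies the delete without the put. With a single grouped record, recovery yields either the state before the operation or the state after it.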





[jira] [Commented] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187929#comment-13187929
 ] 

Hadoop QA commented on HBASE-5217:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12510881/0001-Fixing-thrift-tests-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 82 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/794//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/794//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/794//console

This message is automatically generated.

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Updated] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.

2012-01-17 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5203:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk

 Group atomic put/delete operation into a single WALEdit to handle region 
 server failures.
 -

 Key: HBASE-5203
 URL: https://issues.apache.org/jira/browse/HBASE-5203
 Project: HBase
  Issue Type: Sub-task
  Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0

 Attachments: 5203-v3.txt, 5203.txt


 HBASE-3584 does not provide fully atomic operation in case of region 
 server failures (see explanation there).
 What should happen is that either (1) all edits are applied via a single 
 WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed 
 together.
 For #1 it is not clear whether it is advisable to manage multiple *different* 
 operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL 
 replay on region startup would work, but that replication would need to be 
 adapted. The refactoring needed would be non-trivial.
 #2 Might actually not work, as another operation could request sync'ing a 
 later edit and hence flush these entries out as well.
 Addendum:
 The attached patch implements #1 and fixes replication to be able to deal 
 with different operations being grouped in one WALEdit.





[jira] [Commented] (HBASE-4913) Per-CF compaction Via the Shell

2012-01-17 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187933#comment-13187933
 ] 

Mubarak Seyed commented on HBASE-4913:
--

@Nicolas,
If no one is working on this issue, can I take it and port it from 89-fb? 
Please let me know. Thanks.

 Per-CF compaction Via the Shell
 ---

 Key: HBASE-4913
 URL: https://issues.apache.org/jira/browse/HBASE-4913
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver
Reporter: Nicolas Spiegelberg
 Fix For: 0.94.0








[jira] [Commented] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187937#comment-13187937
 ] 

Alex Newman commented on HBASE-5217:


Expect this build to fail until the dependent jira is done.

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Updated] (HBASE-5217) Reenable the thrift tests, and add a new one for getRegionInfo

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5217:
---

Attachment: -hbase-posix4e #92 Console [Jenkins].pdf

I attached my own Jenkins run. The one failing test is due to HBASE-2600 not 
handling migrations yet.

 Reenable the thrift tests, and add a new one for getRegionInfo
 --

 Key: HBASE-5217
 URL: https://issues.apache.org/jira/browse/HBASE-5217
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor
 Attachments: 0001-Fixing-thrift-tests-v2.patch, 
 0001-Fixing-thrift-tests.patch, -hbase-posix4e #92 Console [Jenkins].pdf


 At some point we disabled tests for the thrift server. In addition, it looks 
 like the getRegionInfo no longer functions. I'd like to reenable the tests 
 and add one for getRegionInfo. I had to write this to test my changes in 
 HBASE-2600 anyway. I figured I would break it out. We shouldn't commit it 
 until we have fixed getting the regioninfo from the thriftserver.





[jira] [Commented] (HBASE-4770) Enable per column family compaction for a region via hbase shell

2012-01-17 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187943#comment-13187943
 ] 

Mubarak Seyed commented on HBASE-4770:
--

@Pritam / @Nicolas,

If no one is working on this issue, can I take it up and port it to trunk? 
Thanks.

 Enable per column family compaction for a region via hbase shell
 

 Key: HBASE-4770
 URL: https://issues.apache.org/jira/browse/HBASE-4770
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Nicolas Spiegelberg
Assignee: Pritam Damania
Priority: Minor

 Pretty self-explanatory.  We have per-table and per-region compaction.  
 However, there are use cases where you have 10+ column families.  It is 
 useful to be able to specify the actual CF you wish to compact.





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Doug Meil (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187947#comment-13187947
 ] 

Doug Meil commented on HBASE-5218:
--

All these 'hbck' comments are probably for HBASE-5128, not this issue.

 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-17 Thread Doug Meil (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187948#comment-13187948
 ] 

Doug Meil commented on HBASE-5128:
--

Hey guys, a bunch of comments just wound up on a documentation ticket I just 
did (HBASE-5218) that I'm pretty sure were intended for this ticket.

 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.92.0, 0.90.5
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh

 The current (0.90.5, 0.92.0rc2) versions of hbck detect most region 
 consistency and table integrity invariant violations.  However, with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete, it will likely 
 deprecate the OfflineMetaRepair tool and subsume several open META-hole 
 related issues.
 Here's the approach (from the comment at the top of the new version of the 
 file).
 {code}
 /**
  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
  * table integrity.
  *
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance.
  *
  * Table integrity checks verify that all possible row keys can resolve to
  * exactly one region of a table.  This means there are no individual degenerate
  * or backwards regions; no holes between regions; and that there are no
  * overlapping regions.
  *
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  *
  * For table integrity repairs, the tables' region directories are scanned
  * for .regioninfo files.  Each table's integrity is then verified.  If there
  * are any orphan regions (regions with no .regioninfo files), or holes, new
  * regions are fabricated.  Backwards regions are sidelined as well as empty
  * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
  * a new region is created and all data is merged into the new region.
  *
  * Table integrity repairs deal solely with HDFS and can be done offline -- the
  * hbase region servers or master do not need to be running.  This phase can be
  * used to completely reconstruct the META table in an offline fashion.
  *
  * Region consistency requires three conditions -- 1) valid .regioninfo file
  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
  * and 3) a region is deployed only at the regionserver that it was assigned to.
  *
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}
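
 The table integrity rules described in the comment above (no degenerate or 
 backwards regions, no holes, no overlaps) can be illustrated with a small 
 standalone sketch. This is not the hbck implementation: the class 
 TableIntegritySketch, the Range type and the message strings are hypothetical, 
 keys are compared as plain Java strings rather than HBase byte arrays, and an 
 empty string is assumed to mean "start of table" or "end of table".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TableIntegritySketch {
    /** Hypothetical stand-in for a region's [start, end) key range; "" means unbounded. */
    public static final class Range {
        final String start;
        final String end;
        public Range(String start, String end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Returns a description of every integrity problem found in one table's regions. */
    public static List<String> check(List<Range> regions) {
        List<String> problems = new ArrayList<>();
        List<Range> valid = new ArrayList<>();
        // Pass 1: set aside individually broken regions.
        for (Range r : regions) {
            boolean bounded = !r.end.isEmpty();
            if (bounded && r.start.equals(r.end)) {
                problems.add("degenerate " + r);   // endkey == startkey
            } else if (bounded && r.start.compareTo(r.end) > 0) {
                problems.add("backwards " + r);    // startkey sorts after endkey
            } else {
                valid.add(r);
            }
        }
        // Pass 2: walk the remaining regions in start-key order.
        valid.sort(Comparator.comparing((Range r) -> r.start));
        String expectedStart = "";   // first key not yet covered by any region
        boolean reachedEnd = false;  // true once a region extends to end of table
        for (Range r : valid) {
            if (reachedEnd || r.start.compareTo(expectedStart) < 0) {
                problems.add("overlap " + r);
            } else if (r.start.compareTo(expectedStart) > 0) {
                problems.add("hole [" + expectedStart + "," + r.start + ")");
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (r.end.compareTo(expectedStart) > 0) {
                expectedStart = r.end;
            }
        }
        if (!reachedEnd) {
            problems.add("hole [" + expectedStart + ",)");
        }
        return problems;
    }
}
```

 A table whose regions are [,m) and [m,) yields no problems, while [,g) and 
 [m,) reports a hole between g and m. The real tool additionally has to decide 
 how to repair each case (fabricate, sideline or merge), which this sketch 
 does not attempt.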





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187949#comment-13187949
 ] 

jirapos...@reviews.apache.org commented on HBASE-2600:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3466/#review4427
---


I love all that removed code!!
I think it's fine to have it with the patch (in fact, removing all that code is 
the main reason why we're doing this, right?)


src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
https://reviews.apache.org/r/3466/#comment9937

This still has the startKey, but it is not needed. Seems we can simplify 
the code further by only requiring the endKey here(?)


- Lars


On 2012-01-17 02:40:00, Alex Newman wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3466/
bq.  ---
bq.  
bq.  (Updated 2012-01-17 02:40:00)
bq.  
bq.  
bq.  Review request for hbase, Michael Stack and Lars Hofhansl.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This is an idea that Ryan and I have been kicking around on and off for a 
while now.
bq.  
bq.  If regionnames were made of tablename+endrow instead of 
tablename+startrow, then in the metatables, doing a search for the region that 
contains the wanted row, we'd just have to open a scanner using passed row and 
the first row found by the scan would be that of the region we need (If 
offlined parent, we'd have to scan to the next row).
bq.  
bq.  If we redid the meta tables in this format, we'd be using an access that 
is natural to hbase, a scan as opposed to the perverse, expensive 
getClosestRowBefore we currently have that has to walk backward in meta finding 
a containing region.
bq.  
bq.  This issue is about changing the way we name regions.
bq.  
bq.  If we were using scans, prewarming client cache would be near costless (as 
opposed to what we'll currently have to do which is first a getClosestRowBefore 
and then a scan from the closestrowbefore forward).
bq.  
bq.  Converting to the new method, we'd have to run a migration on startup 
changing the content in meta.
bq.  
bq.  Up to this, the randomid component of a region name has been the timestamp 
of region creation. HBASE-2531 32-bit encoding of regionnames waaay 
too susceptible to hash clashes proposes changing the randomid so that it 
contains actual name of the directory in the filesystem that hosts the region. 
If we had this in place, I think it would help with the migration to this new 
way of doing the meta because as is, the region name in fs is a hash of 
regionname... changing the format of the regionname would mean we generate a 
different hash... so we'd need hbase-2531 to be in place before we could do 
this change.
bq.  
bq.  
bq.  This addresses bug HBASE-2600.
bq.  https://issues.apache.org/jira/browse/HBASE-2600
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2 
bq.src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821 
bq.src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d 
bq.src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8 
bq.src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8 
bq.src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f 
bq.src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 
99f90b2 
bq.src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6 
bq.src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 
8af8c28 
bq.src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f 
bq.src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828 
bq.
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 
7a7b896 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
b47423c 
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 
c0a4184 
bq.src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8 
bq.
src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 
8f4f4b8 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java
 3a26bbb 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
79373e1 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
a3850e5 
bq.

[jira] [Created] (HBASE-5219) Improve region selection from overloaded servers

2012-01-17 Thread Zhihong Yu (Created) (JIRA)
Improve region selection from overloaded servers


 Key: HBASE-5219
 URL: https://issues.apache.org/jira/browse/HBASE-5219
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Yu


As Jinchao pointed out in HBASE-3724, the current region selection method still 
tends to move many hot regions onto a new server joining the cluster.

This JIRA improves the selection to avoid such a scenario.





[jira] [Commented] (HBASE-3724) Load balancer improvements

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187954#comment-13187954
 ] 

Zhihong Yu commented on HBASE-3724:
---

HBASE-5219 tries to address Jinchao's comment above.

 Load balancer improvements
 --

 Key: HBASE-3724
 URL: https://issues.apache.org/jira/browse/HBASE-3724
 Project: HBase
  Issue Type: Umbrella
Reporter: stack

 Umbrella issue under which we hang all regions related to balancer





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187958#comment-13187958
 ] 

stack commented on HBASE-5218:
--

+1

 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Updated] (HBASE-5220) [book] troubleshooting.xml - adding information about 'zkcli' to Troubleshooting chapter

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5220:
-

Attachment: troubleshooting_hbase_5220.xml.patch

 [book] troubleshooting.xml - adding information about 'zkcli' to 
 Troubleshooting chapter
 

 Key: HBASE-5220
 URL: https://issues.apache.org/jira/browse/HBASE-5220
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: troubleshooting_hbase_5220.xml.patch


 troubleshooting.xml
 * added entry in 'builtin' tools section on zkcli
 * added link in trouble.zookeeper.general to the above entry
 This utility came up on the dist-list recently.





[jira] [Updated] (HBASE-5220) [book] troubleshooting.xml - adding information about 'zkcli' to Troubleshooting chapter

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5220:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 [book] troubleshooting.xml - adding information about 'zkcli' to 
 Troubleshooting chapter
 

 Key: HBASE-5220
 URL: https://issues.apache.org/jira/browse/HBASE-5220
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: troubleshooting_hbase_5220.xml.patch


 troubleshooting.xml
 * added entry in 'builtin' tools section on zkcli
 * added link in trouble.zookeeper.general to the above entry
 This utility came up on the dist-list recently.





[jira] [Created] (HBASE-5220) [book] troubleshooting.xml - adding information about 'zkcli' to Troubleshooting chapter

2012-01-17 Thread Doug Meil (Created) (JIRA)
[book] troubleshooting.xml - adding information about 'zkcli' to 
Troubleshooting chapter


 Key: HBASE-5220
 URL: https://issues.apache.org/jira/browse/HBASE-5220
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: troubleshooting_hbase_5220.xml.patch

troubleshooting.xml
* added entry in 'builtin' tools section on zkcli
* added link in trouble.zookeeper.general to the above entry

This utility came up on the dist-list recently.





[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187956#comment-13187956
 ] 

Hadoop QA commented on HBASE-5212:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510879/5212-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 82 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/795//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/795//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/795//console

This message is automatically generated.

 Fix test TestTableMapReduce against 0.23.
 -

 Key: HBASE-5212
 URL: https://issues.apache.org/jira/browse/HBASE-5212
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
 Fix For: 0.92.1

 Attachments: 5212-v2.txt, HBASE-5212.patch


 As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 
 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails 
 on 0.92 branch. There are minor changes to HBase poms required to fix that.





[jira] [Updated] (HBASE-5220) [book] troubleshooting.xml - adding information about 'zkcli' to Troubleshooting chapter

2012-01-17 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5220:
-

Status: Patch Available  (was: Open)

 [book] troubleshooting.xml - adding information about 'zkcli' to 
 Troubleshooting chapter
 

 Key: HBASE-5220
 URL: https://issues.apache.org/jira/browse/HBASE-5220
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: troubleshooting_hbase_5220.xml.patch


 troubleshooting.xml
 * added entry in 'builtin' tools section on zkcli
 * added link in trouble.zookeeper.general to the above entry
 This utility came up on the dist-list recently.





[jira] [Commented] (HBASE-5176) AssignmentManager: getRegion: logging nit adds a redundant '+'

2012-01-17 Thread Karthik K (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187965#comment-13187965
 ] 

Karthik K commented on HBASE-5176:
--

Just curious, what is the procedure for code review, to help get this into 
the branch?

 AssignmentManager: getRegion: logging nit  adds a redundant '+'  
 -

 Key: HBASE-5176
 URL: https://issues.apache.org/jira/browse/HBASE-5176
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.0.0 , zk 3.4.2 , hbase 0.92.0 rc3
Reporter: Karthik K
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBASE-5176.patch


 From the logs of HMaster: 
 2012-01-10 17:28:24,370 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 -ROOT-,,0.70236052 destination server is + localhost,60020,1326242475275
 Was the '+' intended to be there, as part of some token for log verification, 
 or is it just redundant w.r.t. the following string append?
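
 One plausible way such a message arises is a concatenation operator that ended 
 up inside the string literal. The sketch below only reproduces the reported 
 log text; it is an assumption about the cause, not the actual 
 AssignmentManager source, and the class and method names are hypothetical.

```java
public class LogMessageSketch {
    /** Reproduces the stray '+': the operator was typed inside the string literal. */
    public static String buggy(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is + " + server;
    }

    /** Same message with the redundant '+' removed from the literal. */
    public static String fixed(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is " + server;
    }
}
```

 With the region and server values from the log line above, buggy(...) yields 
 the text seen in the HMaster log, and fixed(...) yields the same text without 
 the extra '+'.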





[jira] [Commented] (HBASE-5218) [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix

2012-01-17 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187970#comment-13187970
 ] 

Jonathan Hsieh commented on HBASE-5218:
---

@Doug

I updated review board -- sorry for my minor dyslexia.


 [book] book.xml - added Arch/Hfile, added link to HFile v2 info in appendix
 ---

 Key: HBASE-5218
 URL: https://issues.apache.org/jira/browse/HBASE-5218
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: book_hbase_5218.xml.patch


 Stack asked me to do this in December:  added link in Arch/HFile to the HFile 
 v2 information in the appendix.





[jira] [Assigned] (HBASE-5176) AssignmentManager: getRegion: logging nit adds a redundant '+'

2012-01-17 Thread Zhihong Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-5176:
-

Assignee: Karthik K

 AssignmentManager: getRegion: logging nit  adds a redundant '+'  
 -

 Key: HBASE-5176
 URL: https://issues.apache.org/jira/browse/HBASE-5176
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.0.0 , zk 3.4.2 , hbase 0.92.0 rc3
Reporter: Karthik K
Assignee: Karthik K
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBASE-5176.patch


 From the logs of HMaster: 
 2012-01-10 17:28:24,370 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 -ROOT-,,0.70236052 destination server is + localhost,60020,1326242475275
 Was the '+' intended to be there, as part of some token for log verification, 
 or is it just redundant w.r.t. the following string append?





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187980#comment-13187980
 ] 

jirapos...@reviews.apache.org commented on HBASE-2600:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3466/
---

(Updated 2012-01-17 20:35:34.796686)


Review request for hbase, Michael Stack and Lars Hofhansl.


Summary
---

This is an idea that Ryan and I have been kicking around on and off for a while 
now.

If regionnames were made of tablename+endrow instead of tablename+startrow, 
then in the metatables, doing a search for the region that contains the wanted 
row, we'd just have to open a scanner using passed row and the first row found 
by the scan would be that of the region we need (If offlined parent, we'd have 
to scan to the next row).

If we redid the meta tables in this format, we'd be using an access that is 
natural to hbase, a scan as opposed to the perverse, expensive 
getClosestRowBefore we currently have that has to walk backward in meta finding 
a containing region.

This issue is about changing the way we name regions.

If we were using scans, prewarming client cache would be near costless (as 
opposed to what we'll currently have to do which is first a getClosestRowBefore 
and then a scan from the closestrowbefore forward).

Converting to the new method, we'd have to run a migration on startup changing 
the content in meta.

Up to this, the randomid component of a region name has been the timestamp of 
region creation. HBASE-2531 32-bit encoding of regionnames waaay too 
susceptible to hash clashes proposes changing the randomid so that it contains 
actual name of the directory in the filesystem that hosts the region. If we had 
this in place, I think it would help with the migration to this new way of 
doing the meta because as is, the region name in fs is a hash of regionname... 
changing the format of the regionname would mean we generate a different 
hash... so we'd need hbase-2531 to be in place before we could do this change.
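
The difference between the two layouts can be sketched with a sorted map 
standing in for the meta table. This is only an illustration under stated 
assumptions: MetaLookupSketch, its methods, the region labels and the LAST 
sentinel are hypothetical, keys are plain strings, and the tablename and 
randomid components of a real region name are left out.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaLookupSketch {
    /** Sentinel standing in for the "end of table" key of the last region (assumption). */
    public static final String LAST = "\uffff";

    /** Start-key layout: rows keyed by each region's first row. */
    public static TreeMap<String, String> startKeyed() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "r1");   // region [ , g)
        meta.put("g", "r2");  // region [g, p)
        meta.put("p", "r3");  // region [p, end)
        return meta;
    }

    /** End-key layout: the same regions keyed by their exclusive end row. */
    public static TreeMap<String, String> endKeyed() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "r1");
        meta.put("p", "r2");
        meta.put(LAST, "r3");
        return meta;
    }

    /** Start-key layout needs the closest entry at or BEFORE the row (a backward search). */
    public static String findByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-key layout needs the first entry AFTER the row, i.e. what a forward scan finds first. */
    public static String findByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }
}
```

Both lookups return the same region for any row (for example "h" resolves to 
r2 either way); the point is that the end-key version only ever moves forward 
from the requested row, which is the access pattern a scanner provides 
naturally. The offlined-parent case mentioned above would require reading one 
more entry, which the sketch does not model.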


This addresses bug HBASE-2600.
https://issues.apache.org/jira/browse/HBASE-2600


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2 
  src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821 
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d 
  src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8 
  src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6 
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 
7a7b896 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
b47423c 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8 
  
src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 
8f4f4b8 
  
src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java
 3a26bbb 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1 
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
a3850e5 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533 
  src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1 
  src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d 
  src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 
67e7a04 
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41 
  src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61 
  src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31 
  src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d 
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java ab80020 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 
dacb936 
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java
 5f97167 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 
6e1211b 
  

[jira] [Updated] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-2600:
---

Attachment: 
0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, jenkins.pdf


 This is an idea that Ryan and I have been kicking around on and off for a 
 while now.
 If regionnames were made of tablename+endrow instead of tablename+startrow, 
 then in the metatables, doing a search for the region that contains the 
 wanted row, we'd just have to open a scanner using passed row and the first 
 row found by the scan would be that of the region we need (If offlined 
 parent, we'd have to scan to the next row).
 If we redid the meta tables in this format, we'd be using an access that is 
 natural to hbase, a scan as opposed to the perverse, expensive 
 getClosestRowBefore we currently have that has to walk backward in meta 
 finding a containing region.
 This issue is about changing the way we name regions.
 If we were using scans, prewarming client cache would be near costless (as 
 opposed to what we'll currently have to do which is first a 
 getClosestRowBefore and then a scan from the closestrowbefore forward).
 Converting to the new method, we'd have to run a migration on startup 
 changing the content in meta.
 Until now, the randomid component of a region name has been the timestamp of 
 region creation. HBASE-2531 ("32-bit encoding of regionnames waaay too 
 susceptible to hash clashes") proposes changing the randomid so that it 
 contains the actual name of the directory in the filesystem that hosts the 
 region. If we had this in place, I think it would help with the migration to 
 this new way of doing the meta because, as is, the region name in fs is a hash 
 of regionname; changing the format of the regionname would mean we generate 
 a different hash, so we'd need HBASE-2531 to be in place before we could make 
 this change.
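
 The lookup difference described above can be sketched with a sorted map 
 standing in for the meta table. This is only an illustration, not HBase 
 code: the class name, the region labels, and the LAST sentinel (used here in 
 place of the empty end key of a table's last region) are all assumptions. 
 Keyed by start row, the lookup needs the closest key at or before the wanted 
 row; keyed by end row, it needs the first key after it, which is what a 
 forward scan gives.

```java
import java.util.TreeMap;

class MetaLookupSketch {
    // Hypothetical sentinel standing in for the empty end key of the last region.
    static final String LAST = "\uffff";

    // Meta keyed by region START row: find the closest key at or before the row
    // (the analogue of getClosestRowBefore).
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        return meta.floorEntry(row).getValue();
    }

    // Meta keyed by region END row (exclusive): the first key strictly after the
    // row, i.e. the first entry a forward scan starting past the row would see.
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        return meta.higherEntry(row).getValue();
    }

    public static void main(String[] args) {
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region[,g)");
        byStart.put("g", "region[g,n)");
        byStart.put("n", "region[n,)");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region[,g)");
        byEnd.put("n", "region[g,n)");
        byEnd.put(LAST, "region[n,)");

        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

 Both lookups return the same region for every row; only the direction of the 
 search changes.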

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5176) AssignmentManager#getRegion: logging nit adds a redundant '+'

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5176:
--

Summary: AssignmentManager#getRegion: logging nit  adds a redundant '+'
(was: AssignmentManager: getRegion: logging nit  adds a redundant '+'  )

 AssignmentManager#getRegion: logging nit  adds a redundant '+'  
 

 Key: HBASE-5176
 URL: https://issues.apache.org/jira/browse/HBASE-5176
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.0.0 , zk 3.4.2 , hbase 0.92.0 rc3
Reporter: Karthik K
Assignee: Karthik K
Priority: Minor
 Fix For: 0.92.0

 Attachments: HBASE-5176.patch


 From the logs of HMaster: 
 2012-01-10 17:28:24,370 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 -ROOT-,,0.70236052 destination server is + localhost,60020,1326242475275
 Was the '+' intended to be there, as part of some token for log verification, 
 or is it just redundant with respect to the following string append?
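
 A minimal sketch of how such a message can arise, assuming the '+' sits 
 inside a string literal next to a real concatenation. The actual 
 AssignmentManager source is not quoted in this thread, so the class and 
 method names below are hypothetical and only reproduce the logged text.

```java
class LogPlusNitSketch {
    // Hypothetical reconstruction: the '+' is part of the literal, so it is logged.
    static String withStrayPlus(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is + " + server;
    }

    // Same message with the stray '+' removed from the literal.
    static String fixed(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is " + server;
    }

    public static void main(String[] args) {
        String region = "-ROOT-,,0.70236052";
        String server = "localhost,60020,1326242475275";
        System.out.println(withStrayPlus(region, server));
        System.out.println(fixed(region, server));
    }
}
```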





[jira] [Updated] (HBASE-5176) AssignmentManager#getRegion: logging nit adds a redundant '+'

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5176:
--

Fix Version/s: (was: 0.92.0)
   0.92.1
   0.94.0

Integrated to 0.92 branch and TRUNK.

Thanks for the patch Karthik.

Thanks for the review Stack.

 AssignmentManager#getRegion: logging nit  adds a redundant '+'  
 

 Key: HBASE-5176
 URL: https://issues.apache.org/jira/browse/HBASE-5176
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.0.0 , zk 3.4.2 , hbase 0.92.0 rc3
Reporter: Karthik K
Assignee: Karthik K
Priority: Minor
 Fix For: 0.94.0, 0.92.1

 Attachments: HBASE-5176.patch






[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187984#comment-13187984
 ] 

Hadoop QA commented on HBASE-2600:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12510898/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 31 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/797//console

This message is automatically generated.

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, jenkins.pdf






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187989#comment-13187989
 ] 

Hudson commented on HBASE-5153:
---

Integrated in HBase-0.92 #248 (See 
[https://builds.apache.org/job/HBase-0.92/248/])
HBASE-5153 revert due to failed Jenkins builds

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ClosedConnectionException.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java


 Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
 ---

 Key: HBASE-5153
 URL: https://issues.apache.org/jira/browse/HBASE-5153
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
 HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
 HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
 HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
 HBASE-5153.patch, TestResults-hbase5153.out


 HBASE-4893 is related to this issue. In that issue we learned that if multiple 
 threads share the same connection, once the connection is aborted in one 
 thread, the other threads get a 
 HConnectionManager$HConnectionImplementation@18fb1f7 closed exception.
 That fix solves the problem of stale connections that can't be removed, but 
 the original HTable instance can't continue to be used. The connection in 
 HTable should be recreated.
 There are two approaches to solving this:
 1. In user code, on catching an IOE, close the connection and re-create the 
 HTable instance. We can use this as a workaround.
 2. On the HBase client side, catch this exception and re-create the connection.
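
 The first approach (close and re-create on an IOE, then retry) can be 
 sketched generically as below. This is not the patch attached to this issue 
 and uses no HBase classes: Client, ClientFactory, and the retry limit are 
 hypothetical stand-ins for an HTable, the code that builds it, and whatever 
 retry policy is chosen.

```java
import java.io.IOException;

class ReconnectRetrySketch {
    /** Hypothetical stand-in for a client handle such as an HTable. */
    interface Client {
        String get(String row) throws IOException;
        default void close() {}
    }

    /** Hypothetical factory that builds a client on a fresh connection. */
    interface ClientFactory {
        Client create() throws IOException;
    }

    private final ClientFactory factory;
    private Client client;

    ReconnectRetrySketch(ClientFactory factory) {
        this.factory = factory;
    }

    /** Tries the read; on an IOException drops the client and retries with a new one. */
    String get(String row, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (client == null) {
                    client = factory.create();
                }
                return client.get(row);
            } catch (IOException e) {
                last = e;
                if (client != null) {
                    client.close();   // release the possibly stale connection
                    client = null;    // force re-creation on the next attempt
                }
            }
        }
        throw last;
    }
}
```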





[jira] [Created] (HBASE-5221) bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout

2012-01-17 Thread Todd Lipcon (Created) (JIRA)
bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout


 Key: HBASE-5221
 URL: https://issues.apache.org/jira/browse/HBASE-5221
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon


Running against an 0.24.0-SNAPSHOT hadoop:
ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-common*.jar: No such file or directory
ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-hdfs*.jar: No such file or directory
ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-mapred*.jar: No such file or directory
The jars are rooted deeper in the hierarchy.
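
To illustrate the difference between globbing only the top-level directory and
searching below it, here is a small Java sketch (bin/hbase itself is a shell
script, so this is not the fix). The class name and the helper are
hypothetical; it walks the whole tree under a Hadoop home and collects jars
whose file name starts with a given prefix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class HadoopJarFinderSketch {
    /** Returns every "<prefix>*.jar" found anywhere under hadoopHome, sorted by path. */
    static List<Path> findJars(Path hadoopHome, String prefix) throws IOException {
        try (Stream<Path> paths = Files.walk(hadoopHome)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString();
                    return name.startsWith(prefix) && name.endsWith(".jar");
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```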





[jira] [Commented] (HBASE-3489) .oldlogs not being cleaned out

2012-01-17 Thread Josh Wymer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188014#comment-13188014
 ] 

Josh Wymer commented on HBASE-3489:
---

After turning replication off on the slave cluster, the .oldlogs were cleaned 
up. So it appears as if hbase thinks that the slave cluster intends to 
replicate as well and doesn't clean the logs.

 .oldlogs not being cleaned out
 --

 Key: HBASE-3489
 URL: https://issues.apache.org/jira/browse/HBASE-3489
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
 Environment: 10 Nodes Write Heavy Cluster
Reporter: Wayne
 Attachments: oldlog.txt


 The .oldlogs folder is never being cleaned up. The 
 hbase.master.logcleaner.ttl has been set to clean up the old logs but the 
 cleanup never kicks in. The limit of 10 files is not the problem. After 
 running for 5 days not a single log file has been deleted, even though the 
 logcleaner is set to 2 days (down from the default of 7 days). We assume that 
 the replication changes, which keep these logs around in case they are 
 needed, are blocking the cleanup. There is no replication defined 
 (knowingly).
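
 The suspected interaction can be modelled as a chain of cleaner checks in 
 which a file is deleted only if every check agrees. The sketch below is a 
 simplified, hypothetical model (none of these names are HBase classes): a 
 TTL check alone would delete a 5-day-old file under a 2-day TTL, but a 
 replication check that still claims the file vetoes the deletion.

```java
import java.util.List;

class LogCleanerChainSketch {
    /** Hypothetical delegate: votes on whether an old log file may be deleted. */
    interface CleanerDelegate {
        boolean isDeletable(long fileModifiedMs, long nowMs);
    }

    /** Deletable once the file is older than the configured TTL. */
    static CleanerDelegate ttl(long ttlMs) {
        return (modified, now) -> now - modified > ttlMs;
    }

    /** Vetoes deletion for as long as replication might still need the file. */
    static CleanerDelegate replication(boolean mayStillNeedLogs) {
        return (modified, now) -> !mayStillNeedLogs;
    }

    /** A file is removed only if every delegate in the chain agrees. */
    static boolean shouldDelete(List<CleanerDelegate> chain, long modified, long now) {
        for (CleanerDelegate delegate : chain) {
            if (!delegate.isDeletable(modified, now)) {
                return false;
            }
        }
        return true;
    }
}
```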
  





[jira] [Commented] (HBASE-4913) Per-CF compaction Via the Shell

2012-01-17 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188020#comment-13188020
 ] 

Nicolas Spiegelberg commented on HBASE-4913:


@Mubarak: go for it.  note that the client-side compaction/flush business logic 
has significantly changed, so it's a non-trivial port.  However, the core 
logic/ideas should be similar across versions, so prior work will be helpful.

 Per-CF compaction Via the Shell
 ---

 Key: HBASE-4913
 URL: https://issues.apache.org/jira/browse/HBASE-4913
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver
Reporter: Nicolas Spiegelberg
 Fix For: 0.94.0








[jira] [Resolved] (HBASE-4770) Enable per column family compaction for a region via hbase shell

2012-01-17 Thread Nicolas Spiegelberg (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg resolved HBASE-4770.


Resolution: Duplicate

Duplicate of HBASE-4913

 Enable per column family compaction for a region via hbase shell
 

 Key: HBASE-4770
 URL: https://issues.apache.org/jira/browse/HBASE-4770
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Nicolas Spiegelberg
Assignee: Pritam Damania
Priority: Minor

 Pretty self-explanatory.  We have per-table and per-region compaction.  
 However, there are use cases where you have 10+ column families.  It is 
 useful to be able to specify the actual CF you wish to compact.





[jira] [Closed] (HBASE-4770) Enable per column family compaction for a region via hbase shell

2012-01-17 Thread Nicolas Spiegelberg (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg closed HBASE-4770.
--

Assignee: (was: Pritam Damania)

 Enable per column family compaction for a region via hbase shell
 

 Key: HBASE-4770
 URL: https://issues.apache.org/jira/browse/HBASE-4770
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Nicolas Spiegelberg
Priority: Minor





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188066#comment-13188066
 ] 

Zhihong Yu commented on HBASE-2600:
---

For HRegionInfo.createRegionName():
{code}
  public static byte [] createRegionName(final byte [] tableName,
  final byte [] startKey, final long regionid, boolean newFormat) {
{code}
I don't see it deprecated in 0.92.

I suggest creating a sub-task in 0.92.1 for this JIRA which deprecates the 
APIs whose semantics are changed in this JIRA.

Some (though few) users may use the above API in their codebase.
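
A rough sketch of what such a deprecation could look like, assuming a
simplified "table,key,id" name layout. The class, the three-argument
signatures, and the createRegionNameFromEndKey name are hypothetical and are
not taken from HRegionInfo or from the attached patch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class RegionNameSketch {
    /**
     * @deprecated hypothetical: start-key based naming, kept so existing callers
     * still compile while they move to the end-key variant.
     */
    @Deprecated
    static byte[] createRegionName(byte[] tableName, byte[] startKey, long regionId) {
        return join(tableName, startKey, regionId);
    }

    /** Hypothetical replacement that names the region by its end key. */
    static byte[] createRegionNameFromEndKey(byte[] tableName, byte[] endKey, long regionId) {
        return join(tableName, endKey, regionId);
    }

    // Simplified layout assumed for this sketch: table ',' key ',' id
    private static byte[] join(byte[] table, byte[] key, long id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(table, 0, table.length);
        out.write(',');
        out.write(key, 0, key.length);
        out.write(',');
        byte[] idBytes = Long.toString(id).getBytes(StandardCharsets.UTF_8);
        out.write(idBytes, 0, idBytes.length);
        return out.toByteArray();
    }
}
```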

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, jenkins.pdf






[jira] [Commented] (HBASE-4913) Per-CF compaction Via the Shell

2012-01-17 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188067#comment-13188067
 ] 

Mubarak Seyed commented on HBASE-4913:
--

Working on port, will attach a patch. Thanks.

 Per-CF compaction Via the Shell
 ---

 Key: HBASE-4913
 URL: https://issues.apache.org/jira/browse/HBASE-4913
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver
Reporter: Nicolas Spiegelberg
 Fix For: 0.94.0








[jira] [Created] (HBASE-5222) Stopping replication via the stop_replication command in hbase shell on a slave cluster doesn't prevent replication but does alter cleanup tasks

2012-01-17 Thread Josh Wymer (Created) (JIRA)
Stopping replication via the stop_replication command in hbase shell on a 
slave cluster doesn't prevent replication but does alter cleanup tasks
--

 Key: HBASE-5222
 URL: https://issues.apache.org/jira/browse/HBASE-5222
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Josh Wymer


The problem showed itself when our /hbase/.oldlogs directory was not being 
cleaned up on our slave cluster. It turns out we had previously disabled 
replication manually by executing stop_replication in the hbase shell. 
The clean old logs task (among others) was honoring that setting, while the 
Replication Sink fails to check the same value. As a result, replication was 
still being written to the slave cluster (via the replication sink), but the 
logs were being kept longer than the default 10 minutes, because hbase retains 
them from the time stop_replication is executed until replication is started 
again.





[jira] [Updated] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-2600:
--

Attachment: 2600-trunk-01-17.txt

Patch rebased for the latest TRUNK.

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
 2600-trunk-01-17.txt, jenkins.pdf






[jira] [Updated] (HBASE-5222) Stopping replication via the stop_replication command in hbase shell on a slave cluster isn't acknowledged in the replication sink

2012-01-17 Thread Josh Wymer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Wymer updated HBASE-5222:
--

Description: After running stop_replication in the hbase shell on our 
slave cluster we saw replication continue for weeks. Turns out that the 
replication sink is missing a check to get the replication state and therefore 
continued to write.  (was: The problem showed itself when our /hbase/.oldlogs 
directory was not being cleaned up on our slave cluster. Turns out we had 
previously disabled replication manually by executing stop_replication in the 
hbase shell. Therefore, the clean old logs task (among others) was honoring 
that while the Replication Sink fails to check the same value. Therefore 
replication was still being written to the slave cluster (via replication sink) 
but the logs were being saved longer than the default 10 minutes as hbase 
stores them when stop_replication has been executed until it has been started 
again.)
Summary: Stopping replication via the stop_replication command in 
hbase shell on a slave cluster isn't acknowledged in the replication sink  
(was: Stopping replication via the stop_replication command in hbase shell on 
a slave cluster doesn't prevent replication but does alter cleanup tasks)

 Stopping replication via the stop_replication command in hbase shell on a 
 slave cluster isn't acknowledged in the replication sink
 

 Key: HBASE-5222
 URL: https://issues.apache.org/jira/browse/HBASE-5222
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Josh Wymer

 After running stop_replication in the hbase shell on our slave cluster we 
 saw replication continue for weeks. Turns out that the replication sink is 
 missing a check to get the replication state and therefore continued to write.
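
 The kind of check described as missing could look roughly like this. It is a 
 sketch only: the class, the shared AtomicBoolean flag, and the method names 
 are assumptions for illustration and do not reflect the actual replication 
 sink code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class ReplicationSinkSketch {
    // Hypothetical shared flag, flipped by stop_replication / start_replication.
    private final AtomicBoolean replicating;
    private final List<String> applied = new ArrayList<>();

    ReplicationSinkSketch(AtomicBoolean replicating) {
        this.replicating = replicating;
    }

    /** Applies edits only while replication is enabled; returns how many were written. */
    int replicateEntries(List<String> edits) {
        if (!replicating.get()) {
            return 0;   // the state check the issue reports as missing
        }
        applied.addAll(edits);
        return edits.size();
    }

    List<String> applied() {
        return applied;
    }
}
```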





[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.

2012-01-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188093#comment-13188093
 ] 

stack commented on HBASE-5212:
--

+1 on patch.

 Fix test TestTableMapReduce against 0.23.
 -

 Key: HBASE-5212
 URL: https://issues.apache.org/jira/browse/HBASE-5212
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
 Fix For: 0.92.1

 Attachments: 5212-v2.txt, HBASE-5212.patch


 As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 
 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails 
 on 0.92 branch. There are minor changes to HBase poms required to fix that.





[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188100#comment-13188100
 ] 

Zhihong Yu commented on HBASE-5212:
---

The patch would break TestLogRolling against hadoop 1.0:
{code}
java.lang.AssertionError: DataNodes 2 default replication 2
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnDatanodeDeath(TestLogRolling.java:335)
{code}
See HBASE-5191

 Fix test TestTableMapReduce against 0.23.
 -

 Key: HBASE-5212
 URL: https://issues.apache.org/jira/browse/HBASE-5212
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
 Fix For: 0.92.1

 Attachments: 5212-v2.txt, HBASE-5212.patch






[jira] [Commented] (HBASE-5201) Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and HRegionThriftServer

2012-01-17 Thread Scott Chen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188102#comment-13188102
 ] 

Scott Chen commented on HBASE-5201:
---

Thanks for the help Zhihong.
I will add the documentation soon.

 Utilize TThreadedSelectorServer and remove redundant code in ThriftServer and 
 HRegionThriftServer
 -

 Key: HBASE-5201
 URL: https://issues.apache.org/jira/browse/HBASE-5201
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Scott Chen
Assignee: Scott Chen
 Fix For: 0.94.0

 Attachments: HBASE-5201-v2.txt, HBASE-5201-v3.txt, HBASE-5201-v4.txt, 
 HBASE-5201.txt


 TThreadedSelectorServer is good for RPC-heavy situations because IO is not 
 limited to one CPU. See
 https://issues.apache.org/jira/browse/Thrift-1167
 I am porting the related classes from thrift trunk (they are not in 
 thrift-0.7.0).
 There is a lot of repeated code in ThriftServer and HRegionThriftServer.
 This code is now moved to a Runnable called ThriftServerRunner.





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188106#comment-13188106
 ] 

Zhihong Yu commented on HBASE-2600:
---

Still need to understand the test failure.
Since TestMetaMigrationRemovingHTD migrates from 0.90 HBase, I wonder if the 
test itself should be maintained in TRUNK.
Migration from 0.90 to 0.94 isn't supported.

 Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
 tablename+ENDROW+randomid
 

 Key: HBASE-2600
 URL: https://issues.apache.org/jira/browse/HBASE-2600
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Alex Newman
 Attachments: 
 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
 2600-trunk-01-17.txt, jenkins.pdf


 This is an idea that Ryan and I have been kicking around on and off for a 
 while now.
 If regionnames were made of tablename+endrow instead of tablename+startrow, 
 then in the metatables, doing a search for the region that contains the 
 wanted row, we'd just have to open a scanner using passed row and the first 
 row found by the scan would be that of the region we need (If offlined 
 parent, we'd have to scan to the next row).
 If we redid the meta tables in this format, we'd be using an access that is 
 natural to hbase, a scan as opposed to the perverse, expensive 
 getClosestRowBefore we currently have that has to walk backward in meta 
 finding a containing region.
 This issue is about changing the way we name regions.
 If we were using scans, prewarming client cache would be near costless (as 
 opposed to what we'll currently have to do which is first a 
 getClosestRowBefore and then a scan from the closestrowbefore forward).
 Converting to the new method, we'd have to run a migration on startup 
 changing the content in meta.
 Up to this, the randomid component of a region name has been the timestamp of 
 region creation.   HBASE-2531 32-bit encoding of regionnames waaay 
 too susceptible to hash clashes proposes changing the randomid so that it 
 contains actual name of the directory in the filesystem that hosts the 
 region.  If we had this in place, I think it would help with the migration to 
 this new way of doing the meta because as is, the region name in fs is a hash 
 of regionname... changing the format of the regionname would mean we generate 
 a different hash... so we'd need hbase-2531 to be in place before we could do 
 this change.
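
To make the lookup argument in the description concrete, here is a toy sketch (not HBase code). It models meta for a single table as a sorted map and compares the two keying schemes. The class name, the LAST sentinel and the region labels are made up for illustration; real region names also carry the table name and an id, which this sketch leaves out.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of meta for one table; keys are region boundary rows. */
final class MetaLookupSketch {

    // Hypothetical sentinel that sorts after every row used in this toy model.
    // It stands in for the open-ended end key of the last region.
    static final String LAST = "\uffff";

    /** Start-key scheme: find the closest key at or before the row (a backward lookup). */
    static String findByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-key scheme: end keys are exclusive, so take the first key after the row (a forward scan). */
    static String findByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table).
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put(LAST, "region-3");

        for (String row : new String[] {"a", "g", "m", "z"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

Both lookups return the same region for every row. The difference is the direction of the search: floorEntry plays the role of getClosestRowBefore, while higherEntry corresponds to opening a scanner at the row and taking the first result.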





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188110#comment-13188110
 ] 

Alex Newman commented on HBASE-2600:


I will in fact fix this, but I am going to need stack's help. I know what the 
issue is.





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188117#comment-13188117
 ] 

Hadoop QA commented on HBASE-2600:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510912/2600-trunk-01-17.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 24 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 81 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/798//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/798//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/798//console

This message is automatically generated.





[jira] [Commented] (HBASE-5176) AssignmentManager#getRegion: logging nit adds a redundant '+'

2012-01-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188118#comment-13188118
 ] 

Hudson commented on HBASE-5176:
---

Integrated in HBase-0.92 #249 (See 
[https://builds.apache.org/job/HBase-0.92/249/])
HBASE-5176  AssignmentManager#getRegion: logging nit  adds a redundant '+' 
(Karthik K)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 AssignmentManager#getRegion: logging nit  adds a redundant '+'  
 

 Key: HBASE-5176
 URL: https://issues.apache.org/jira/browse/HBASE-5176
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.0.0 , zk 3.4.2 , hbase 0.92.0 rc3
Reporter: Karthik K
Assignee: Karthik K
Priority: Minor
 Fix For: 0.94.0, 0.92.1

 Attachments: HBASE-5176.patch


 From the logs of HMaster: 
 2012-01-10 17:28:24,370 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 -ROOT-,,0.70236052 destination server is + localhost,60020,1326242475275
 Was the '+' intended to be there, as part of some token for log verification, 
 or is it just redundant with respect to the string append that follows?
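
For illustration only: the thread does not show the AssignmentManager source line, so the snippet below is a guess at how a stray '+' can end up in the output, namely by sitting inside the string literal next to a real concatenation operator. The class and method names are hypothetical.

```java
/** Sketch of a '+' that has slipped inside a string literal in a log message. */
final class LogMessageSketch {

    /** Assumed buggy form: the literal itself ends with "+ ", so the '+' is printed. */
    static String buggy(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is + " + server;
    }

    /** Same message with the stray '+' removed from the literal. */
    static String fixed(String region, String server) {
        return "Found an existing plan for " + region
            + " destination server is " + server;
    }

    public static void main(String[] args) {
        String region = "-ROOT-,,0.70236052";
        String server = "localhost,60020,1326242475275";
        System.out.println(buggy(region, server));
        System.out.println(fixed(region, server));
    }
}
```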





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188140#comment-13188140
 ] 

stack commented on HBASE-2600:
--

I like what Ted says above about the need to deprecate a call before removing it.

Here's some feedback.  I'm half-way done.  Patch is shaping up nicely.


What is the change in TestHRegionInfo?  You change a startkey to an endkey?  
I'm not sure I follow why this is done.

I like the removal of the testGetClosestBefore from TestMinVersions.java
and of TestGetClosestAtOrBefore.java (hurray!)

I love all the removed code.

So in HConstants, ZEROS is deprecated but NINES is not?  How is NINES still 
used (later I see it used but I'm not sure what it's doing)?

I like removal of META_ROW_DELIMITER

Spacing is wacky here:

-  this.startKey, this.id,
-  !HTableDescriptor.isMetaTable(tableNameAsBytes));
-return Bytes.toString(nameAsBytes);
+   this.endKey,
+   Long.toString(this.id).getBytes(),
+   !HTableDescriptor.isMetaTable(tableNameAsBytes));
+return Bytes.toStringBinary(nameAsBytes);

What happens if the last region in a table is missing for whatever reason?

Is the javadoc on getStartRow in HTableDescriptor right?  It says it's 
returning the first possible region that could match a tablename + searchrow?  
Is it the first possible row in meta?






[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188154#comment-13188154
 ] 

Alex Newman commented on HBASE-2600:


 What is the change in TestHRegionInfo? You change a startkey to an endkey? 
 I'm not sure I follow why this is done.
It's because creatioRegionId now uses the start key instead of the end key. 

 So in HConstants, ZEROS is deprecated but NINES is not? How is NINES used 
 still (later I see it used but I'm not sure what it's doing?)
Fixed.

 Spacing
Fixed.

 What if the last region is missing
The stoprow should catch it.

 Is the javadoc on getStartRow in HTableDescriptor right? It says it's 
 returning first possible region that could match a tablename + searchrow? Is 
 it first possible row in meta?
Oh, you're right. Fixed.


Patch inbound





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-01-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13188153#comment-13188153
 ] 

Lars Hofhansl commented on HBASE-2600:
--

Is HRegionInfo part of the public API?





[jira] [Created] (HBASE-5223) TestMetaReaderEditor is missing call to CatalogTracker.stop()

2012-01-17 Thread Zhihong Yu (Created) (JIRA)
TestMetaReaderEditor is missing call to CatalogTracker.stop()
-

 Key: HBASE-5223
 URL: https://issues.apache.org/jira/browse/HBASE-5223
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
 Attachments: 5223.txt

I noticed that TestMetaReaderEditor hung on 0.92 Jenkins builds - see 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/249/console

It turns out that CatalogTracker.stop() is missing.
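
As a generic sketch of the cleanup pattern (this is not the attached patch, and FakeTracker is a made-up stand-in rather than CatalogTracker), a test can guarantee that a started tracker is stopped by pairing start() with stop() in a finally block. Here the stand-in owns a non-daemon thread, which is one way an unstopped resource can keep a JVM from exiting.

```java
/** Sketch: always stop a started resource, even if the test body throws. */
final class TrackerCleanupSketch {

    /** Hypothetical stand-in for a tracker that owns a background thread. */
    static final class FakeTracker {
        private final Thread worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // runs until interrupted
            } catch (InterruptedException e) {
                // interrupted by stop(): fall through and let the thread end
            }
        });

        void start() {
            worker.start();
        }

        void stop() {
            worker.interrupt();
            try {
                worker.join(); // wait for the background thread to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt
            }
        }

        boolean isAlive() {
            return worker.isAlive();
        }
    }

    /** Runs a placeholder test body and stops the tracker no matter what. */
    static FakeTracker runTestBody() {
        FakeTracker tracker = new FakeTracker();
        tracker.start();
        try {
            // the actual test assertions would go here
        } finally {
            tracker.stop();
        }
        return tracker;
    }

    public static void main(String[] args) {
        FakeTracker tracker = runTestBody();
        System.out.println("worker alive after test: " + tracker.isAlive());
    }
}
```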




