[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v13.patch

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-3741) Make HRegionServer aware of the regions it's opening/closing

2011-12-28 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176517#comment-13176517
 ] 

ramkrishna.s.vasudevan commented on HBASE-3741:
---

@Johnyang
The final fix is in HBASE-4186.  So you may have to upgrade to 0.90.5. 


 Make HRegionServer aware of the regions it's opening/closing
 

 Key: HBASE-3741
 URL: https://issues.apache.org/jira/browse/HBASE-3741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.90.3

 Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix-v3.patch, 
 HBASE-3741-rsfix.patch, HBASE-3741-trunk.patch


 This is a serious issue about a race between regions being opened and closed 
 in region servers. We had this situation where the master tried to unassign a 
 region for balancing, failed, force unassigned it, force assigned it 
 somewhere else, failed to open it on another region server (took too long), 
 and then reassigned it back to the original region server. A few seconds 
 later, the region server processed the first closed and the region was left 
 unassigned.
 This is from the master log:
 {quote}
 2011-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
 Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, 
 regions=574, usedHeap=3918, maxHeap=6973) for region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_CLOSE, ts=1302041477758
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 ...
 2011-04-05 15:14:45,783 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=CLOSED, ts=1302041685733
 2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x42ec2cece810b68 Creating (or updating) unassigned node for 
 1470298961 with OFFLINE state
 ...
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
  
 plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg40,60020,1302041218196
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_OPEN for too long, reassigning 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 ...
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  so generated a random one; 
 hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) 
 available servers
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg42,60020,1300920459477
 2011-04-05 15:15:40,951 DEBUG 
 

[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176523#comment-13176523
 ] 

Hadoop QA commented on HBASE-5100:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508728/hbase-5100.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 77 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestDrainingServer
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/607//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/607//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/607//console

This message is automatically generated.

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If master sending close region to rs and region's split transaction 
 concurrently happen,
 it may cause closed region to opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region is rolled back 
 sucessfully , and then starting compaction and split again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, 

[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176533#comment-13176533
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508734/5064.v13.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 77 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/608//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/608//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/608//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176550#comment-13176550
 ] 

Zhihong Yu commented on HBASE-5100:
---

I think we can remove the try/finally construct and put 
this.journal.add(JournalEntry.CLOSED_PARENT_REGION); in else block of:
{code}
  if (hstoreFilesToSplit == null) {
{code}
@Chunhui:
What do you think ?
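
For illustration, a minimal sketch of that restructuring (method context and surrounding error handling are assumed, not taken from any attached patch):
{code}
List<StoreFile> hstoreFilesToSplit = this.parent.close(false);
if (hstoreFilesToSplit == null) {
  // Closed by a concurrent thread: abandon the split without journaling
  // CLOSED_PARENT_REGION, so rollback will not re-open a region that has
  // already been moved elsewhere.
  throw new IOException("Failed to close region: already closed by " +
    "another thread");
} else {
  // Journal the close only when this split actually performed it.
  this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
}
{code}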

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If master sending close region to rs and region's split transaction 
 concurrently happen,
 it may cause closed region to opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region is rolled back 
 sucessfully , and then starting compaction and split again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176574#comment-13176574
 ] 

Zhihong Yu commented on HBASE-5009:
---

Integrated to 0.92 and TRUNK.

Thanks for the patch Ramkrishna.

Thanks for the comments, Stack, Jinchao and Anoop.

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
   throws IOException {
 if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
 + splitdir);
 if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " +
 splitdir);
   }
 {code}
 Correct me if am wrong? If it is an issue can we change the behaviour of 
 throwing exception?
 Pls suggest.
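
One way the behaviour could be changed, sketched here only to illustrate the discussion (this is not necessarily the committed fix): treat a pre-existing splitdir as leftover state from an earlier failed split and remove it rather than refusing every later split.
{code}
private static void createSplitDir(final FileSystem fs, final Path splitdir)
    throws IOException {
  // A leftover splitdir from an earlier failed split is stale; delete it
  // recursively instead of aborting all subsequent split attempts.
  if (fs.exists(splitdir) && !fs.delete(splitdir, true)) {
    throw new IOException("Failed delete of stale " + splitdir);
  }
  if (!fs.mkdirs(splitdir)) {
    throw new IOException("Failed create of " + splitdir);
  }
}
{code}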





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176625#comment-13176625
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508750/5064.v14.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestMasterObserver
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/609//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/609//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/609//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v2.patch, 5064.v3.patch, 
 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176626#comment-13176626
 ] 

Hudson commented on HBASE-5009:
---

Integrated in HBase-0.92 #211 (See 
[https://builds.apache.org/job/HBase-0.92/211/])
HBASE-5009  Failure of creating split dir if it already exists prevents 
splits from happening further

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/Reference.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
   throws IOException {
 if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
 + splitdir);
 if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " +
 splitdir);
   }
 {code}
 Correct me if am wrong? If it is an issue can we change the behaviour of 
 throwing exception?
 Pls suggest.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v14.patch

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176643#comment-13176643
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508754/5064.v14.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/610//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/610//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/610//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176658#comment-13176658
 ] 

nkeywal commented on HBASE-5064:


once again a java.io.FileNotFoundException: 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/../logs/userlogs/job_20111228130003142_0001/attempt_20111228130003142_0001_m_01_2/log.index
 (No such file or directory)

Will have to understand this. 

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176660#comment-13176660
 ] 

Hudson commented on HBASE-5009:
---

Integrated in HBase-TRUNK #2587 (See 
[https://builds.apache.org/job/HBase-TRUNK/2587/])
HBASE-5009  Failure of creating split dir if it already exists prevents 
splits from happening further

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/Reference.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
   throws IOException {
 if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
 + splitdir);
 if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " +
 splitdir);
   }
 {code}
 Correct me if am wrong? If it is an issue can we change the behaviour of 
 throwing exception?
 Pls suggest.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176677#comment-13176677
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508763/5064.v15.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/611//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/611//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/611//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176678#comment-13176678
 ] 

Hudson commented on HBASE-5009:
---

Integrated in HBase-TRUNK-security #52 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/52/])
HBASE-5009  Failure of creating split dir if it already exists prevents 
splits from happening further

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/Reference.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
   throws IOException {
 if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
 + splitdir);
 if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " +
 splitdir);
   }
 {code}
 Correct me if am wrong? If it is an issue can we change the behaviour of 
 throwing exception?
 Pls suggest.





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-28 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5009:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
   throws IOException {
 if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
 + splitdir);
 if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " +
 splitdir);
   }
 {code}
 Correct me if am wrong? If it is an issue can we change the behaviour of 
 throwing exception?
 Pls suggest.





[jira] [Updated] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5097:
--

Summary: RegionObserver implementation whose preScannerOpen and 
postScannerOpen Impl return null can stall the system initialization through 
NPE  (was: Coprocessor RegionObserver implementation without preScannerOpen and 
postScannerOpen Impl is throwing NPE and so failing the system initialization.)

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we dont have implemention for postScannerOpen the RegionScanner is null 
 and so throwing nullpointer 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Making this defect as blocker.. Pls feel free to change the priority if am 
 wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  Am just a learner.





[jira] [Commented] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176689#comment-13176689
 ] 

Zhihong Yu commented on HBASE-5097:
---

I think the case here is that we should check for s not being null before 
calling addScanner(s).
Some user may deliberately write code where preScannerOpen and postScannerOpen 
implementations return null to stall initialization.

Eugene implemented feature to take off bad-behaving coprocessors through 
hbase.coprocessor.abortonerror config parameter (see 
CoprocessorHost.handleCoprocessorThrowable).
I think we should cover this case as well.

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we dont have implemention for postScannerOpen the RegionScanner is null 
 and so throwing nullpointer 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Making this defect as blocker.. Pls feel free to change the priority if am 
 wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  Am just a learner.





[jira] [Commented] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176701#comment-13176701
 ] 

Lars Hofhansl commented on HBASE-5097:
--

fair enough. So if posyOpenScanner returns null we just won't create a scanner?
Might lead to hard to detect problems. 

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we dont have implemention for postScannerOpen the RegionScanner is null 
 and so throwing nullpointer 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Making this defect as blocker.. Pls feel free to change the priority if am 
 wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  Am just a learner.





[jira] [Issue Comment Edited] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176701#comment-13176701
 ] 

Zhihong Yu edited comment on HBASE-5097 at 12/28/11 4:57 PM:
-

fair enough. So if postOpenScanner returns null we just won't create a scanner?
Might lead to hard to detect problems. 

  was (Author: lhofhansl):
fair enough. So if posyOpenScanner returns null we just won't create a 
scanner?
Might lead to hard to detect problems. 
  
 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we dont have implemention for postScannerOpen the RegionScanner is null 
 and so throwing nullpointer 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Making this defect as blocker.. Pls feel free to change the priority if am 
 wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  Am just a learner.





[jira] [Issue Comment Edited] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176701#comment-13176701
 ] 

Zhihong Yu edited comment on HBASE-5097 at 12/28/11 4:58 PM:
-

fair enough. So if postScannerOpen() returns null we just won't create a 
scanner?
Might lead to hard to detect problems. 

  was (Author: lhofhansl):
fair enough. So if postOpenScanner returns null we just won't create a 
scanner?
Might lead to hard to detect problems. 
  
 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we dont have implemention for postScannerOpen the RegionScanner is null 
 and so throwing nullpointer 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Making this defect as blocker.. Pls feel free to change the priority if am 
 wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  Am just a learner.





[jira] [Commented] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176705#comment-13176705
 ] 

Zhihong Yu commented on HBASE-5097:
---

We should save the non-null s before calling postScannerOpen().
If postScannerOpen() returns null, we can use the saved scanner.
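
A minimal sketch of that idea in HRegionServer.openScanner(), reusing the code from the description (an illustration of the suggestion, not a committed fix):

{code}
r.prepareScanner(scan);
RegionScanner s = null;
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().preScannerOpen(scan);
}
if (s == null) {
  s = r.getScanner(scan);
}
if (r.getCoprocessorHost() != null) {
  RegionScanner savedScanner = s;  // keep the scanner we know is non-null
  s = r.getCoprocessorHost().postScannerOpen(scan, s);
  if (s == null) {
    // Fall back to the saved scanner and warn, so a misbehaving coprocessor
    // does not turn into an NPE later in addScanner().
    LOG.warn("postScannerOpen returned null; using the previously created scanner");
    s = savedScanner;
  }
}
{code}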

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
    r.prepareScanner(scan);
    RegionScanner s = null;
    if (r.getCoprocessorHost() != null) {
      s = r.getCoprocessorHost().preScannerOpen(scan);
    }
    if (s == null) {
      s = r.getScanner(scan);
    }
    if (r.getCoprocessorHost() != null) {
      s = r.getCoprocessorHost().postScannerOpen(scan, s);
    }
 {code}
 If we don't have an implementation for postScannerOpen, the RegionScanner is 
 null and a NullPointerException is thrown: 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker. Please feel free to change the priority if 
 I am wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong; I am just a learner.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176709#comment-13176709
 ] 

Phabricator commented on HBASE-4218:


mcorgan has commented on the revision [jira] [HBASE-4218] HFile data block 
encoding (delta encoding).

  I'm porting the TRIE encoding algorithm over to this new patch, so I am able 
to review a little better in Eclipse than on Review Board.  A couple of things 
I've noticed so far:

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java:32 
The enum nested in a class is unusual.  Would a more typical approach be to 
call it DataBlockEncoding (singular) and make that the enum, eliminating the 
nested Algorithm?

  So you would have DataBlockEncoding.BITSET, etc.

  This would help elsewhere in the codebase, since it would eliminate the 
confusion with the unfortunately named compression Algorithm (GZIP, LZO); a 
rough sketch follows at the end of this comment.
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:121
 This method was added before getKeyValueObject(), so I see why it happened 
this way, but this method should probably be called getKeyValueBuffer() or 
getKeyValueByteBuffer(), and the method below should be called getKeyValue().
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:134
 rename to getKeyValue()

REVISION DETAIL
  https://reviews.facebook.net/D447
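
A rough sketch of the suggested shape (the encoding names beyond BITSET are placeholders, not the actual set in the patch):

{code}
// Top-level enum, replacing the Algorithm enum nested in DataBlockEncodings.
public enum DataBlockEncoding {
  NONE,
  BITSET,
  PREFIX,
  DIFF;
}

// Call sites would then read DataBlockEncoding.BITSET instead of
// DataBlockEncodings.Algorithm.BITSET, avoiding confusion with the
// compression Algorithm enum (GZIP, LZO).
{code}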


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys.  Keys are sorted in HFile and they are usually very 
 similar.  Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory.  It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks.  It should 
 improve performance a lot if key lengths are larger than value lengths.  For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough.  Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication.  That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important design changes will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
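
To make the prefix-compression part concrete, here is a small illustrative sketch (not the patch itself, which works on KeyValue byte buffers): each sorted key is stored as the length of the prefix it shares with the previous key plus the remaining suffix.

{code}
import java.util.ArrayList;
import java.util.List;

public class PrefixEncodingSketch {
  // Encode each key as "<shared-prefix length>:<suffix>" relative to the
  // previous key; keys must be sorted for the shared prefixes to be long.
  static List<String> encode(List<String> sortedKeys) {
    List<String> out = new ArrayList<String>();
    String prev = "";
    for (String key : sortedKeys) {
      int common = 0;
      int max = Math.min(prev.length(), key.length());
      while (common < max && prev.charAt(common) == key.charAt(common)) {
        common++;
      }
      out.add(common + ":" + key.substring(common));
      prev = key;
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<String>();
    keys.add("row0001");
    keys.add("row0002");
    keys.add("row0003");
    // Long shared prefixes collapse to tiny suffixes:
    // prints [0:row0001, 6:2, 6:3]
    System.out.println(encode(keys));
  }
}
{code}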

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176712#comment-13176712
 ] 

Zhihong Yu commented on HBASE-5094:
---

Ram is coming up with a new patch as the previous one didn't fix the problem.

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical

 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the address in the AM.  This address 
 is initially null because it is not yet updated after the region was opened, 
 i.e. the callback after node deletion is not yet done on the master side.
 But removal from RIT is already completed on the master side, so this will 
 trigger a new assignment.
 So there is a small window between when the online region is actually added 
 to the online list and when the ServerShutdownHandler checks the existing 
 address in the AM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-28 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5094:
--

Attachment: (was: 5094.patch)

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical

 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the address in the AM.  This address 
 is initially null because it is not yet updated after the region was opened, 
 i.e. the callback after node deletion is not yet done on the master side.
 But removal from RIT is already completed on the master side, so this will 
 trigger a new assignment.
 So there is a small window between when the online region is actually added 
 to the online list and when the ServerShutdownHandler checks the existing 
 address in the AM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176714#comment-13176714
 ] 

Zhihong Yu commented on HBASE-5041:
---

Sorry for getting back to this so late.
I think we can keep the signature for split, flush and compact methods.

@Shrijeet:
Please attach a new patch so that Hadoop QA can test it.

Thanks

 Major compaction on non existing table does not throw error 
 

 Key: HBASE-5041
 URL: https://issues.apache.org/jira/browse/HBASE-5041
 Project: HBase
  Issue Type: Bug
  Components: regionserver, shell
Affects Versions: 0.90.3
Reporter: Shrijeet Paliwal
Assignee: Shrijeet Paliwal
 Fix For: 0.92.0, 0.94.0, 0.90.6

 Attachments: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch


 Following will not complain even if fubar does not exist
 {code}
 echo major_compact 'fubar' | $HBASE_HOME/bin/hbase shell
 {code}
 The downside for this defect is that major compaction may be skipped due to
 a typo by Ops.
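
A minimal sketch of the kind of guard the attached patch is after (illustrative only, not the patch itself): verify the table exists before asking for the compaction, so a typo fails loudly instead of being silently ignored.

{code}
// conf is an existing HBase client Configuration.
HBaseAdmin admin = new HBaseAdmin(conf);
String tableName = "fubar";
if (!admin.tableExists(tableName)) {
  // Surface the typo immediately rather than skipping the compaction.
  throw new TableNotFoundException(tableName);
}
admin.majorCompact(tableName);
{code}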

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-28 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-3274:
---

Attachment: HBASE-3274.D1053.1.patch

QwertyManiac requested code review of HBASE-3274 [jira] Replace all config 
properties references in code with string constants.
Reviewers: JIRA

  Fixes for source packages avro to io.

  See HBASE-2721 for details. We have fixed the default values in HBASE-3272 
but we should also follow Hadoop to remove all hardcoded strings that refer to 
configuration properties and move them to HConstants.
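
As a small before/after illustration (ZOOKEEPER_QUORUM is an existing HConstants constant; other names touched by the patch may differ):

{code}
// Before: hardcoded property string repeated throughout the code.
// String quorum = conf.get("hbase.zookeeper.quorum");

// After: reference the shared constant in HConstants instead.
String quorum = conf.get(HConstants.ZOOKEEPER_QUORUM);
{code}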

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1053

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
  src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/2199/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Attachments: HBASE-3274.D1053.1.patch

   Original Estimate: 168h
  Time Spent: 2h
  Remaining Estimate: 166h

 See HBASE-2721 for details. We have fixed the default values in HBASE-3272 
 but we should also follow Hadoop to remove all hardcoded strings that refer 
 to configuration properties and move them to HConstants. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176739#comment-13176739
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508763/5064.v15.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.mapred.TestTableMapReduce

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/612//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/612//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/612//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root r

2011-12-28 Thread Jimmy Xiang (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HBASE-5099:
--

Assignee: Jimmy Xiang

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re

2011-12-28 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5099:
---

Attachment: hbase-5099.patch

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re

2011-12-28 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5099:
---

Status: Patch Available  (was: Open)

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-28 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176778#comment-13176778
 ] 

Jimmy Xiang commented on HBASE-5099:


Cool, let me submit a patch.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner

2011-12-28 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176782#comment-13176782
 ] 

Nicolas Spiegelberg commented on HBASE-5031:


@Mikhail: is this pertinent to trunk or only 89-fb?  This is the current 
stopping point in my linear merge.

 [89-fb] Remove hard-coded non-existent host name from TestScanner 
 --

 Key: HBASE-5031
 URL: https://issues.apache.org/jira/browse/HBASE-5031
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D867.1.patch


 TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
 that it is trying to look up. Replacing this with 127.0.0.1:random_port 
 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176783#comment-13176783
 ] 

jirapos...@reviews.apache.org commented on HBASE-5099:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3323/
---

Review request for hbase, Ted Yu and Michael Stack.


Summary
---

Per discussion with Ted (on issues), I put up a patch to run 
tryRecoveringExpiredZKSession() in a separate thread and time it out and fail 
the recovery if it is stuck somewhere.

I added a test to test the abort method.  However, for the mini cluster, 
becomeActiveMaster() doesn't succeed, so the recovery always ends up aborted.  
So the actual successful recovery path is not tested.
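
A minimal sketch of the separate-thread-plus-timeout approach (assuming the existing tryRecoveringExpiredZKSession() method; the variable names and timeout are illustrative):

{code}
// Uses java.util.concurrent: ExecutorService, Executors, Future, Callable,
// TimeUnit, TimeoutException, ExecutionException.
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<Boolean> future = executor.submit(new Callable<Boolean>() {
  public Boolean call() throws Exception {
    return tryRecoveringExpiredZKSession();
  }
});
boolean recovered = false;
try {
  // Fail the recovery if it does not finish within the configured timeout.
  recovered = future.get(zkSessionRecoveryTimeout, TimeUnit.MILLISECONDS);
} catch (TimeoutException te) {
  future.cancel(true);  // recovery is stuck somewhere; give up
} catch (InterruptedException ie) {
  Thread.currentThread().interrupt();
} catch (ExecutionException ee) {
  // recovery itself failed; see the review comments on how to surface the cause
} finally {
  executor.shutdownNow();
}
if (!recovered) {
  // abort the master here: recovery either timed out or failed
}
{code}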


This addresses bug HBASE-5099.
https://issues.apache.org/jira/browse/HBASE-5099


Diffs
-

  src/main/java/org/apache/hadoop/hbase/master/HMaster.java a5935a6 
  src/test/java/org/apache/hadoop/hbase/master/TestMaster.java b7a8270 

Diff: https://reviews.apache.org/r/3323/diff


Testing
---

mvn -PlocalTests -Dtest=TestMaster* clean test


Thanks,

Jimmy



 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176786#comment-13176786
 ] 

Andrew Purtell commented on HBASE-5097:
---

bq. We should save the non-null s before calling postScannerOpen(). If 
postScannerOpen() returns null, we can use the saved scanner.

... and emit a warning. Sounds good.

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
    r.prepareScanner(scan);
    RegionScanner s = null;
    if (r.getCoprocessorHost() != null) {
      s = r.getCoprocessorHost().preScannerOpen(scan);
    }
    if (s == null) {
      s = r.getScanner(scan);
    }
    if (r.getCoprocessorHost() != null) {
      s = r.getCoprocessorHost().postScannerOpen(scan, s);
    }
 {code}
 If we don't have an implementation for postScannerOpen, the RegionScanner is 
 null and a NullPointerException is thrown: 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker. Please feel free to change the priority if 
 I am wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong; I am just a learner.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176798#comment-13176798
 ] 

jirapos...@reviews.apache.org commented on HBASE-5099:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3323/#review4136
---


Nice effort.


src/main/java/org/apache/hadoop/hbase/master/HMaster.java
https://reviews.apache.org/r/3323/#comment9339

We should check the return value here.
A false return value means a timeout.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
https://reviews.apache.org/r/3323/#comment9340

I think we should catch the ExecutionException (ee) coming out of the get() 
call and rethrow ee.getCause(); see the sketch after this review.



src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
https://reviews.apache.org/r/3323/#comment9343

TestMaster doesn't have a timeout parameter.
We'd better add a timeout parameter here.



src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
https://reviews.apache.org/r/3323/#comment9341

Typo: recovery.



src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
https://reviews.apache.org/r/3323/#comment9342

I would expect some assertion here.


- Ted
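
For the ExecutionException point above, a hedged illustration of the handling around the Future.get() call (names are illustrative, not the actual patch):

{code}
try {
  recovered = future.get(timeout, TimeUnit.MILLISECONDS);
} catch (ExecutionException ee) {
  // Rethrow the real failure instead of the wrapper, as suggested above.
  Throwable cause = ee.getCause();
  if (cause instanceof IOException) {
    throw (IOException) cause;
  }
  throw new IOException("ZK session recovery failed", cause);
} catch (TimeoutException te) {
  recovered = false;  // a false result here means the recovery timed out
} catch (InterruptedException ie) {
  throw new InterruptedIOException("Interrupted while recovering the ZK session");
}
{code}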


On 2011-12-28 19:31:45, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3323/
bq.  ---
bq.  
bq.  (Updated 2011-12-28 19:31:45)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Per discussion with Ted (on issues), I put up a patch to run 
tryRecoveringExpiredZKSession() in a separate thread and time it out and fail 
the recovery if it is stuck somewhere.
bq.  
bq.  I added a test to test the abort method.  However, for the mini cluster, 
becomeActiveMaster() doesn't succeed so the abort method ends up always 
aborted.  So the actually success recovery is not tested.
bq.  
bq.  
bq.  This addresses bug HBASE-5099.
bq.  https://issues.apache.org/jira/browse/HBASE-5099
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java a5935a6 
bq.src/test/java/org/apache/hadoop/hbase/master/TestMaster.java b7a8270 
bq.  
bq.  Diff: https://reviews.apache.org/r/3323/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  mvn -PlocalTests -Dtest=TestMaster* clean test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.



 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Konstantin Shvachko (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176805#comment-13176805
 ] 

Konstantin Shvachko commented on HBASE-5064:


Looks like NumberFormatException is the cause of at least some failures.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176814#comment-13176814
 ] 

Zhihong Yu commented on HBASE-5064:
---

Test failure isn't always associated with NumberFormatException.
I ran tests with patch 15 and saw the following in TestTableMapReduce-output:
{code}
2011-12-28 09:28:40,839 DEBUG [pool-1-thread-1] 
client.HConnectionManager$HConnectionImplementation(1228): Cached location for 
mrtest,rrr,1325093287841.463ef8c4d900c4b3c9f1d2df476e1bd9. is sea-lab-1:36329
2011-12-28 09:28:40,840 DEBUG [pool-1-thread-1] 
client.HConnectionManager$HConnectionImplementation(1228): Cached location for 
mrtest,sss,1325093287841.d4448f0cc9806499135566cb47630427. is sea-lab-1:36329
2011-12-28 09:28:40,840 DEBUG [pool-1-thread-1] 
client.HConnectionManager$HConnectionImplementation(1228): Cached location for 
mrtest,ttt,1325093287841.1ca6fb61e8ce69ce8505cfd7b157008c. is sea-lab-1:36329
2011-12-28 09:28:40,840 DEBUG [pool-1-thread-1] 
client.HConnectionManager$HConnectionImplementation(1228): Cached location for 
mrtest,uuu,1325093287841.b74b52fb22bea8b1fd100e225943ab58. is sea-lab-1:36329
2011-12-28 09:28:40,841 DEBUG [pool-1-thread-1] 
client.HConnectionManager$HConnectionImplementation(1228): Cached location for 
mrtest,vvv,1325093287841.f50b26a0272f89c2123d80b8d13592b8. is sea-lab-1:36329
2011-12-28 09:28:40,841 INFO  [pool-1-thread-1] 
mapred.TableInputFormatBase(142): split: 1-sea-lab-1:mmm,
2011-12-28 09:28:41,948 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1859): The connection to 
null was closed by the finalize method.
2011-12-28 09:28:56,283 WARN  [JVM Runner 
jvm_20111228092823016_0001_m_1001648490 spawned.] 
mapred.DefaultTaskController(137): Exit code from task is : 1
2011-12-28 09:28:56,284 WARN  [Thread-368] mapred.TaskRunner(270): 
attempt_20111228092823016_0001_m_00_0 : Child Error
java.io.IOException: Task process exit with nonzero status of 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
2011-12-28 09:29:03,464 WARN  [JVM Runner 
jvm_20111228092823016_0001_m_1912611626 spawned.] 
mapred.DefaultTaskController(137): Exit code from task is : 1
2011-12-28 09:29:03,465 WARN  [Thread-390] mapred.TaskRunner(270): 
attempt_20111228092823016_0001_m_00_0 : Child Error
java.io.IOException: Task process exit with nonzero status of 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
java.lang.Throwable: Child Error
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

java.lang.Throwable: Child Error
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
  at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

2011-12-28 09:29:10,565 WARN  [1318497351@qtp-99120402-1] 
mapred.TaskLogServlet(111): Failed to retrieve stdout log for task: 
attempt_20111228092823016_0001_m_00_0
java.io.FileNotFoundException: 
/home/hduser/trunk/../logs/userlogs/job_20111228092823016_0001/attempt_20111228092823016_0001_m_00_0/log.index
 (No such file or directory)
{code}

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176826#comment-13176826
 ] 

Hadoop QA commented on HBASE-5099:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508783/hbase-5099.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/613//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/613//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/613//console

This message is automatically generated.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log 
 splitting.  SplitLogManager installed the tasks asynchronously, then started 
 to wait for them to complete.
 The task znodes were not actually created; the requests were just queued.
 At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
 expired ZK session.
 During the recovery, a new ZooKeeper connection was created.  However, this 
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needs to wait for 
 the log splitting to complete.
 This waiting holds the ZooKeeper event thread, so the asynchronous split-task 
 creation is never retried, since there is only one event thread, and it is 
 waiting for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176830#comment-13176830
 ] 

nkeywal commented on HBASE-5064:


Yep, maybe the NumberFormatException was hiding the FileNotFoundException 
before (it's also possible to have both errors simultaneously). I haven't 
tried v16 on the whole test suite locally, but it seems to work on my subset. 
Let's see.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v16.patch

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176853#comment-13176853
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508793/5064.v16.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/614//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/614//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/614//console

This message is automatically generated.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176859#comment-13176859
 ] 

jirapos...@reviews.apache.org commented on HBASE-5099:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3323/
---

(Updated 2011-12-28 22:38:05.839771)


Review request for hbase, Ted Yu and Michael Stack.


Changes
---

Thanks for the comment.  Here is a new patch.


Summary
---

Per discussion with Ted (on the issue), I put up a patch that runs 
tryRecoveringExpiredZKSession() in a separate thread, times it out, and fails 
the recovery if it is stuck somewhere.

I added a test for the abort method.  However, in the mini cluster, 
becomeActiveMaster() doesn't succeed, so the recovery always ends up aborting. 
The actual successful recovery is therefore not tested.
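
A minimal sketch of that approach (run the recovery in a worker thread and bound it with a
timeout): the method name is taken from the summary above, everything else here is
illustrative rather than the actual patch.

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedRecoverySketch {
  // Runs the recovery callable in its own thread and gives up after timeoutMs.
  boolean recoverWithTimeout(Callable<Boolean> tryRecoveringExpiredZKSession,
                             long timeoutMs) throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<Boolean> result = executor.submit(tryRecoveringExpiredZKSession);
    try {
      return result.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException te) {
      result.cancel(true);   // recovery is stuck somewhere: interrupt and fail it
      return false;
    } catch (ExecutionException ee) {
      return false;          // recovery threw an exception: treat as failed
    } finally {
      executor.shutdownNow();
    }
  }
}
{code}

Cancelling with interrupt only helps if the recovery code is interruptible; otherwise the
worker thread may linger, which is one reason a timed-out recovery should simply be failed.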


This addresses bug HBASE-5099.
https://issues.apache.org/jira/browse/HBASE-5099


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/master/HMaster.java a5935a6 
  src/test/java/org/apache/hadoop/hbase/master/TestMaster.java b7a8270 

Diff: https://reviews.apache.org/r/3323/diff


Testing
---

mvn -PlocalTests -Dtest=TestMaster* clean test


Thanks,

Jimmy



 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 server the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.  The task znodes were not actually created; the requests
 were just queued.
 At this time, the zookeeper connection expired.  HMaster tried to recover the
 expired ZK session.
 During the recovery, a new zookeeper connection was created.  However, this
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needed to wait for
 the log splitting to complete.
 This waiting holds the zookeeper event thread, so the async create-split-task
 request is never retried: there is only one event thread, and it is waiting
 for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176860#comment-13176860
 ] 

nkeywal commented on HBASE-5064:


fail: java.io.FileNotFoundException: 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/238434d5-7c45-413f-ba99-1440a560e1f1/hadoop-log-dir/userlogs/job_20111228220208254_0001/attempt_20111228220208254_0001_m_02_0/log.index
 (No such file or directory)

enough for today

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176863#comment-13176863
 ] 

Zhihong Yu commented on HBASE-5064:
---

@N:
You didn't use System.[get/set]Property() for hadoop.log.dir
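
For reference, a minimal test-setup sketch of the kind of change being hinted at:
hadoop.log.dir is the real property name, but the per-fork directory layout below is only an
assumption, not the attached patch.

{code:java}
import java.io.File;
import java.util.UUID;

final class HadoopLogDirSetupSketch {
  // Point hadoop.log.dir at a unique directory so parallel surefire JVMs do not collide.
  static void ensureHadoopLogDir() {
    String logDir = System.getProperty("hadoop.log.dir");
    if (logDir == null) {
      logDir = new File(System.getProperty("java.io.tmpdir"),
          "hadoop-logs-" + UUID.randomUUID()).getAbsolutePath();
      System.setProperty("hadoop.log.dir", logDir);
    }
    new File(logDir).mkdirs();
  }
}
{code}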

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v13.patch, 5064.v14.patch, 5064.v14.patch, 
 5064.v15.patch, 5064.v16.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 
 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region server the root

2011-12-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176867#comment-13176867
 ] 

jirapos...@reviews.apache.org commented on HBASE-5099:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3323/#review4138
---



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
https://reviews.apache.org/r/3323/#comment9344

I think declaring ExecutionException is better than declaring Throwable.
See comment below.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
https://reviews.apache.org/r/3323/#comment9345

The call() at line 1429 would only throw 3 types of exceptions: IE, IOE and 
KE.

So the cause should be one of the three types.
I suggest using instanceof to check against each of them and if a match is 
found, cast as that exception type and rethrow.

If there is no match, I suggest using this ctor to throw a new IOException:
IOException(String message, Throwable cause) 
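
A minimal sketch of the rethrow pattern suggested here, assuming the three cause types named
in the comment (InterruptedException, IOException, KeeperException); the names below are
illustrative, not the actual HMaster code.

{code:java}
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.zookeeper.KeeperException;

final class RethrowCauseSketch {
  static boolean waitForRecovery(Future<Boolean> pending, long timeoutMs)
      throws InterruptedException, IOException, KeeperException {
    try {
      return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException te) {
      return false;                       // timed out: treat the recovery as failed
    } catch (ExecutionException ee) {
      Throwable cause = ee.getCause();
      if (cause instanceof InterruptedException) {
        throw (InterruptedException) cause;
      } else if (cause instanceof IOException) {
        throw (IOException) cause;
      } else if (cause instanceof KeeperException) {
        throw (KeeperException) cause;
      }
      // No match: wrap the unexpected cause, as suggested above.
      throw new IOException("Unexpected failure during ZK session recovery", cause);
    }
  }
}
{code}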


- Ted


On 2011-12-28 22:38:05, Jimmy Xiang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3323/
bq.  ---
bq.  
bq.  (Updated 2011-12-28 22:38:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Michael Stack.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Per discussion with Ted (on the issue), I put up a patch that runs 
tryRecoveringExpiredZKSession() in a separate thread, times it out, and fails 
the recovery if it is stuck somewhere.
bq.  
bq.  I added a test for the abort method.  However, in the mini cluster, 
becomeActiveMaster() doesn't succeed, so the recovery always ends up aborting. 
The actual successful recovery is therefore not tested.
bq.  
bq.  
bq.  This addresses bug HBASE-5099.
bq.  https://issues.apache.org/jira/browse/HBASE-5099
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java a5935a6 
bq.src/test/java/org/apache/hadoop/hbase/master/TestMaster.java b7a8270 
bq.  
bq.  Diff: https://reviews.apache.org/r/3323/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  mvn -PlocalTests -Dtest=TestMaster* clean test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.



 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 server the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099-v2.patch, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.  The task znodes were not actually created; the requests
 were just queued.
 At this time, the zookeeper connection expired.  HMaster tried to recover the
 expired ZK session.
 During the recovery, a new zookeeper connection was created.  However, this
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needed to wait for
 the log splitting to complete.
 This waiting holds the zookeeper event thread, so the async create-split-task
 request is never retried: there is only one event thread, and it is waiting
 for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time

2011-12-28 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-5033:
--

Attachment: HBASE-5033-apach-trunk.patch

The patch is based on the Apache trunk. 
This patch has passed all the unit tests except for TestCoprocessorEndpoint, 
which fails with or without this change.

 Opening/Closing store in parallel to reduce region open/close time
 --

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, 
 D933.5.patch, HBASE-5033-apach-trunk.patch


 Region servers currently open/close each store, and each store file within 
 every store, in a sequential fashion, which can make region open/close 
 inefficient. 
 So this diff opens/closes each store in parallel in order to reduce 
 region open/close time. It would also help to reduce the cluster restart time.
 1) Opening each store in parallel
 2) Loading each store file for every store in parallel
 3) Closing each store in parallel
 4) Closing each store file for every store in parallel.
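
A generic thread-pool sketch of the parallel-open idea described above; the Store interface
and openAll() helper are placeholders, not the API used in the attached patches.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelStoreOpenSketch {
  interface Store { void open() throws Exception; }

  static void openAll(List<Store> stores, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Void>> results = new ArrayList<Future<Void>>();
      for (final Store store : stores) {
        results.add(pool.submit(new Callable<Void>() {
          public Void call() throws Exception {
            store.open();   // each store opens independently of the others
            return null;
          }
        }));
      }
      for (Future<Void> f : results) {
        f.get();            // wait for all opens and propagate the first failure
      }
    } finally {
      pool.shutdown();
    }
  }
}
{code}

The same pattern applies to closing stores and to loading the store files within a store.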

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time

2011-12-28 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-5033:
--

Status: Patch Available  (was: Open)

 Opening/Closing store in parallel to reduce region open/close time
 --

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, 
 D933.5.patch, HBASE-5033-apach-trunk.patch


 Region servers currently open/close each store, and each store file within 
 every store, in a sequential fashion, which can make region open/close 
 inefficient. 
 So this diff opens/closes each store in parallel in order to reduce 
 region open/close time. It would also help to reduce the cluster restart time.
 1) Opening each store in parallel
 2) Loading each store file for every store in parallel
 3) Closing each store in parallel
 4) Closing each store file for every store in parallel.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region server the root re

2011-12-28 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5099:
---

Status: Open  (was: Patch Available)

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 server the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099-v2.patch, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.  The task znodes were not actually created; the requests
 were just queued.
 At this time, the zookeeper connection expired.  HMaster tried to recover the
 expired ZK session.
 During the recovery, a new zookeeper connection was created.  However, this
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needed to wait for
 the log splitting to complete.
 This waiting holds the zookeeper event thread, so the async create-split-task
 request is never retried: there is only one event thread, and it is waiting
 for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region server the root re

2011-12-28 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5099:
---

Status: Patch Available  (was: Open)

Updated per review comments.  The failed tests, TestHFileOutputFormat and 
TestTableMapReduce, work fine on my box.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 server the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099-v2.patch, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.  The task znodes were not actually created; the requests
 were just queued.
 At this time, the zookeeper connection expired.  HMaster tried to recover the
 expired ZK session.
 During the recovery, a new zookeeper connection was created.  However, this
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needed to wait for
 the log splitting to complete.
 This waiting holds the zookeeper event thread, so the async create-split-task
 request is never retried: there is only one event thread, and it is waiting
 for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176869#comment-13176869
 ] 

Hadoop QA commented on HBASE-5033:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508798/HBASE-5033-apach-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/616//console

This message is automatically generated.

 Opening/Closing store in parallel to reduce region open/close time
 --

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, 
 D933.5.patch, HBASE-5033-apach-trunk.patch


 Region servers currently open/close each store, and each store file within 
 every store, in a sequential fashion, which can make region open/close 
 inefficient. 
 So this diff opens/closes each store in parallel in order to reduce 
 region open/close time. It would also help to reduce the cluster restart time.
 1) Opening each store in parallel
 2) Loading each store file for every store in parallel
 3) Closing each store in parallel
 4) Closing each store file for every store in parallel.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3924) Improve Shell's CLI help

2011-12-28 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3924:
-

Fix Version/s: (was: 0.92.0)

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.94.0

 Attachments: HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types 
 '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
 Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
  if arg =~ /^--format=(.+)/i
format = $1
if format =~ /^html$/i
   raise NoMethodError.new("Not yet implemented")
elsif format =~ /^console$/i
  # This is default
else
   raise ArgumentError.new("Unsupported format " + arg)
end
found.push(arg)
  elsif arg == '-h' || arg == '--help'
puts cmdline_help
exit
  elsif arg == '-d' || arg == '--debug'
log_level = org.apache.log4j.Level::DEBUG
$fullBackTrace = true
 puts "Setting DEBUG log level..."
  else
# Presume it a script. Save it off for running later below
# after we've set up some environment.
script2run = arg
found.push(arg)
# Presume that any other args are meant for the script.
break
  end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this:
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}    Formatter for outputting results.
 Default: console
  -d | --debug  Set DEBUG log levels.
  -h | --help   This help.
  script-filename [script-options]
 HERE
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3924) Improve Shell's CLI help

2011-12-28 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3924:
-

Attachment: 3924.txt

This is what I committed to trunk.
(Don't think we should burden 0.92 with it)

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.94.0

 Attachments: 3924.txt, HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types 
 '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
 Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
  if arg =~ /^--format=(.+)/i
format = $1
if format =~ /^html$/i
   raise NoMethodError.new("Not yet implemented")
elsif format =~ /^console$/i
  # This is default
else
   raise ArgumentError.new("Unsupported format " + arg)
end
found.push(arg)
  elsif arg == '-h' || arg == '--help'
puts cmdline_help
exit
  elsif arg == '-d' || arg == '--debug'
log_level = org.apache.log4j.Level::DEBUG
$fullBackTrace = true
 puts "Setting DEBUG log level..."
  else
# Presume it a script. Save it off for running later below
# after we've set up some environment.
script2run = arg
found.push(arg)
# Presume that any other args are meant for the script.
break
  end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this:
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}    Formatter for outputting results.
 Default: console
  -d | --debug  Set DEBUG log levels.
  -h | --help   This help.
  script-filename [script-options]
 HERE
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3924) Improve Shell's CLI help

2011-12-28 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3924:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.94.0

 Attachments: 3924.txt, HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types 
 '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
 Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
  if arg =~ /^--format=(.+)/i
format = $1
if format =~ /^html$/i
   raise NoMethodError.new("Not yet implemented")
elsif format =~ /^console$/i
  # This is default
else
   raise ArgumentError.new("Unsupported format " + arg)
end
found.push(arg)
  elsif arg == '-h' || arg == '--help'
puts cmdline_help
exit
  elsif arg == '-d' || arg == '--debug'
log_level = org.apache.log4j.Level::DEBUG
$fullBackTrace = true
 puts "Setting DEBUG log level..."
  else
# Presume it a script. Save it off for running later below
# after we've set up some environment.
script2run = arg
found.push(arg)
# Presume that any other args are meant for the script.
break
  end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this:
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}    Formatter for outputting results.
 Default: console
  -d | --debug  Set DEBUG log levels.
  -h | --help   This help.
  script-filename [script-options]
 HERE
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4218:
---

Attachment: D447.14.patch

mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding 
(delta encoding).
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

  Addressing Ted's comment and Matt's comments.

REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  
src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  
src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
  src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 

[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176873#comment-13176873
 ] 

Hadoop QA commented on HBASE-4218:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508801/D447.14.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 68 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/618//console

This message is automatically generated.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.14.patch, D447.2.patch, D447.3.patch, 
 D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
 D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
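
As a toy illustration of the prefix-compression idea only (this is not the DataBlockEncoder
API from the attached patches), each sorted key can be stored as the length of the prefix it
shares with the previous key plus its remaining suffix:

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

public final class PrefixEncoderSketch {
  // Encodes each key as (sharedPrefixLength, suffixLength, suffixBytes)
  // relative to the previous key in sorted order.
  public static byte[] encode(List<byte[]> sortedKeys) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int common = 0;
      int max = Math.min(prev.length, key.length);
      while (common < max && prev[common] == key[common]) {
        common++;
      }
      out.writeShort(common);               // bytes shared with the previous key
      out.writeShort(key.length - common);  // length of the new suffix
      out.write(key, common, key.length - common);
      prev = key;
    }
    out.flush();
    return bos.toByteArray();
  }
}
{code}

A matching decoder only needs to keep the previously reconstructed key, which is also why
comparisons can often be answered from the shared-prefix length alone.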
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5032) Add other DELETE or DELETE into the delete bloom filter

2011-12-28 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-5032:
--

Description: 
To speed up time range scans we need to seek to the maximum timestamp of the 
requested range, instead of going to the first KV of the (row, column) pair and 
iterating from there. If we don't know the (row, column), e.g. if it is not 
specified in the query, we need to go to end of the current row/column pair 
first, get a KV from there, and do another seek to (row', column', 
timerange_max) from there. We can only skip over to the timerange_max timestamp 
when we know that there are no DeleteColumn records at the top of that 
row/column with a higher timestamp. We can utilize another Bloom filter keyed 
on (row, column) to quickly find that out. (From HBASE-4962)

So the motivation is to save seek ops for scanning time-range queries if we 
know there is no delete for this row/column. 

From the implementation perspective, we already have a delete family 
bloom filter which contains all the 







  was:
Previously, the delete family bloom filter only contains the row key which has 
the delete family. It helps us to avoid the top-row seek operation.

This jira attempts to add the delete column into this delete bloom filter as 
well (rename the delete family bloom filter as delete bloom filter).

The motivation is to save seek ops for scan time-range queries if we know there 
is no delete column for this row/column. 
We can seek directly to the exact timestamp we are interested in, instead of 
seeking to the latest timestamp and keeping skipping to find out whether there 
is any delete column before the interested timestamp.



Summary: Add other DELETE or DELETE  into the delete bloom filter  
(was: Add DELETE COLUMN into the delete bloom filter)

 Add other DELETE or DELETE  into the delete bloom filter
 

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 To speed up time range scans we need to seek to the maximum timestamp of the 
 requested range, instead of going to the first KV of the (row, column) pair 
 and iterating from there. If we don't know the (row, column), e.g. if it is 
 not specified in the query, we need to go to end of the current row/column 
 pair first, get a KV from there, and do another seek to (row', column', 
 timerange_max) from there. We can only skip over to the timerange_max 
 timestamp when we know that there are no DeleteColumn records at the top of 
 that row/column with a higher timestamp. We can utilize another Bloom filter 
 keyed on (row, column) to quickly find that out. (From HBASE-4962)
 So the motivation is to save seek ops for scanning time-range queries if we 
 know there is no delete for this row/column. 
 From the implementation perspective, we already have a delete family 
 bloom filter which contains all the 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5032) Add other DELETE type information into the delete bloom filter to optimize the time range query

2011-12-28 Thread Liyin Tang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-5032:
--

Description: 
To speed up time range scans we need to seek to the maximum timestamp of the 
requested range, instead of going to the first KV of the (row, column) pair and 
iterating from there. If we don't know the (row, column), e.g. if it is not 
specified in the query, we need to go to end of the current row/column pair 
first, get a KV from there, and do another seek to (row', column', 
timerange_max) from there. We can only skip over to the timerange_max timestamp 
when we know that there are no DeleteColumn records at the top of that 
row/column with a higher timestamp. We can utilize another Bloom filter keyed 
on (row, column) to quickly find that out. (From HBASE-4962)

So the motivation is to save seek ops for scanning time-range queries if we 
know there is no delete for this row/column. 

From the implementation perspective, we already have a delete family bloom 
filter which contains all the delete family key values. So we can reuse the 
same bloom filter for all other kinds of delete information such as delete 
columns or delete. 
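
Conceptually, the scan-side check would look something like the sketch below: consult the
(row, column)-keyed delete Bloom filter, and only when it reports that no delete marker can
exist seek straight to the requested maximum timestamp. The types and names here are purely
illustrative, not HBase internals.

{code:java}
// Illustrative types only; this is not the HBase scanner or Bloom filter API.
interface DeleteBloom {
  boolean mightContain(byte[] row, byte[] column);
}

final class TimeRangeSeekSketch {
  // Decide where a time-range scan may seek for a given (row, column).
  static long chooseSeekTimestamp(DeleteBloom deleteBloom, byte[] row, byte[] column,
                                  long newestTs, long timerangeMax) {
    if (!deleteBloom.mightContain(row, column)) {
      // No delete marker can shadow this column: seek straight to the requested range.
      return timerangeMax;
    }
    // A delete may exist above the range; start at the newest KV and skip down as today.
    return newestTs;
  }
}
{code}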







  was:
To speed up time range scans we need to seek to the maximum timestamp of the 
requested range,instead of going to the first KV of the (row, column) pair and 
iterating from there. If we don't know the (row, column), e.g. if it is not 
specified in the query, we need to go to end of the current row/column pair 
first, get a KV from there, and do another seek to (row', column', 
timerange_max) from there. We can only skip over to the timerange_max timestamp 
when we know that there are no DeleteColumn records at the top of that 
row/column with a higher timestamp. We can utilize another Bloom filter keyed 
on (row, column) to quickly find that out. (From HBASE-4962)

So the motivation is to save seek ops for scanning time-range queries if we 
know there is no delete for this row/column. 

From the implementation prospective, we have already have a delete family 
bloom filter which contains all the 







Summary: Add other DELETE type information into the delete bloom filter 
to optimize the time range query  (was: Add other DELETE or DELETE  into the 
delete bloom filter)

 Add other DELETE type information into the delete bloom filter to optimize 
 the time range query
 ---

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 To speed up time range scans we need to seek to the maximum timestamp of the 
 requested range, instead of going to the first KV of the (row, column) pair 
 and iterating from there. If we don't know the (row, column), e.g. if it is 
 not specified in the query, we need to go to end of the current row/column 
 pair first, get a KV from there, and do another seek to (row', column', 
 timerange_max) from there. We can only skip over to the timerange_max 
 timestamp when we know that there are no DeleteColumn records at the top of 
 that row/column with a higher timestamp. We can utilize another Bloom filter 
 keyed on (row, column) to quickly find that out. (From HBASE-4962)
 So the motivation is to save seek ops for scanning time-range queries if we 
 know there is no delete for this row/column. 
 From the implementation perspective, we already have a delete family 
 bloom filter which contains all the delete family key values. So we can reuse 
 the same bloom filter for all other kinds of delete information such as 
 delete columns or delete. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176907#comment-13176907
 ] 

Phabricator commented on HBASE-5010:


Kannan has commented on the revision [jira] [HBASE-5010] [89-fb] Filter HFiles 
based on TTL.

  The compaction code path doesn't yet get the benefit of this optimization. 
Spoke with Mikhail offline, and he'll update the diff to handle this case.

REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.
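
A hedged sketch of the proposed selection step, assuming a placeholder accessor for a file's
newest timestamp; the real change would go through StoreFile.Reader and its time-range
metadata rather than the types used here.

{code:java}
import java.util.ArrayList;
import java.util.List;

final class TtlFileFilterSketch {
  // HFileInfo and maxTimestamp() are placeholders for whatever the store exposes.
  interface HFileInfo { long maxTimestamp(); }

  static List<HFileInfo> filterExpired(List<HFileInfo> files, long ttlMs, long nowMs) {
    long oldestStamp = nowMs - ttlMs;          // same cutoff as ScanWildcardColumnTracker
    List<HFileInfo> live = new ArrayList<HFileInfo>();
    for (HFileInfo f : files) {
      if (f.maxTimestamp() >= oldestStamp) {
        live.add(f);                           // file may still hold unexpired KVs
      }
      // else: every KV in this file is older than the TTL cutoff, skip it entirely
    }
    return live;
  }
}
{code}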

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Jonathan Hsieh (Created) (JIRA)
Add a max number of regions per regionserver limit
--

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh


In a testing environment, a cluster got to a state with more than 1500 regions 
per region server, and essentially because stuck and unavailable.  We could add 
a limit to the number of regions that a region server can serve to prevent this 
from happening.  This looks like it could be implemented in the core or as a 
coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5101:
--

Description: In a testing environment, a cluster got to a state with more 
than 1500 regions per region server, and essentially became stuck and 
unavailable.  We could add a limit to the number of regions that a region 
server can serve to prevent this from happening.  This looks like it could be 
implemented in the core or as a coprocessor.  (was: In a testing environment, a 
cluster got to a state with more than 1500 regions per region server, and 
essentially because stuck and unavailable.  We could add a limit to the number 
of regions that a region server can serve to prevent this from happening.  This 
looks like it could be implemented in the core or as a coprocessor.)

 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176919#comment-13176919
 ] 

Hadoop QA commented on HBASE-5052:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507668/HBASE-5052.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/615//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/615//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/615//console

This message is automatically generated.

 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
 Attachments: HBASE-5052.patch


 When loading a coprocessor from hdfs, the jar file gets copied to a path on 
 the local filesystem, which depends on the region name, and the region start 
 key. The name is cleaned, but not enough, so when you have filesystem 
 unfriendly characters (/?:, etc), the coprocessor is not loaded, and an error 
 is thrown
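
One possible direction, sketched here only to illustrate the problem (this is not the
attached patch): derive the local copy name from a hash of the region name, so that
characters like '/', '?' or ':' in the raw region name never reach the filesystem.

{code:java}
import java.security.MessageDigest;

public final class SafeCoprocessorPath {
  // Returns a filesystem-safe file name for the locally copied coprocessor jar.
  public static String localFileNameFor(String regionName) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5");
    byte[] digest = md.digest(regionName.getBytes("UTF-8"));
    StringBuilder sb = new StringBuilder("coprocessor-");
    for (byte b : digest) {
      sb.append(String.format("%02x", b));   // hex-encode: always filesystem-friendly
    }
    return sb.append(".jar").toString();
  }
}
{code}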

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region server the root

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176923#comment-13176923
 ] 

Hadoop QA commented on HBASE-5099:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508799/hbase-5099-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/617//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/617//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/617//console

This message is automatically generated.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 server the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png, hbase-5099-v2.patch, hbase-5099.patch


 A RS died.  The ServerShutdownHandler kicked in and started the log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.  The task znodes were not actually created; the requests
 were just queued.
 At this time, the zookeeper connection expired.  HMaster tried to recover the
 expired ZK session.
 During the recovery, a new zookeeper connection was created.  However, this
 master became the new master again.  It tried to assign root and meta.
 Because the dead RS was hosting the root region, the master needed to wait for
 the log splitting to complete.
 This waiting holds the zookeeper event thread, so the async create-split-task
 request is never retried: there is only one event thread, and it is waiting
 for the root region to be assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-28 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176924#comment-13176924
 ] 

Hudson commented on HBASE-3924:
---

Integrated in HBase-TRUNK #2588 (See 
[https://builds.apache.org/job/HBase-TRUNK/2588/])
HBASE-3924  Improve Shell's CLI help (Harsh J)

larsh : 
Files : 
* /hbase/trunk/bin/hirb.rb


 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.94.0

 Attachments: 3924.txt, HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types 
 '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
 Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
  if arg =~ /^--format=(.+)/i
format = $1
if format =~ /^html$/i
   raise NoMethodError.new("Not yet implemented")
elsif format =~ /^console$/i
  # This is default
else
   raise ArgumentError.new("Unsupported format " + arg)
end
found.push(arg)
  elsif arg == '-h' || arg == '--help'
puts cmdline_help
exit
  elsif arg == '-d' || arg == '--debug'
log_level = org.apache.log4j.Level::DEBUG
$fullBackTrace = true
 puts "Setting DEBUG log level..."
  else
# Presume it a script. Save it off for running later below
# after we've set up some environment.
script2run = arg
found.push(arg)
# Presume that any other args are meant for the script.
break
  end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this:
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}    Formatter for outputting results.
 Default: console
  -d | --debug  Set DEBUG log levels.
  -h | --help   This help.
  script-filename [script-options]
 HERE
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5097) RegionObserver implementation whose preScannerOpen and postScannerOpen Impl return null can stall the system initialization through NPE

2011-12-28 Thread ramkrishna.s.vasudevan (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-5097:
-

Assignee: ramkrishna.s.vasudevan

 RegionObserver implementation whose preScannerOpen and postScannerOpen Impl 
 return null can stall the system initialization through NPE
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
 s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
 s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we don't have an implementation for postScannerOpen (it returns null), the 
 RegionScanner is null and so a NullPointerException is thrown: 
 {code}
 java.lang.NullPointerException
   at 
 java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker. Please feel free to change the priority if I 
 am wrong. Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong. I am just a learner.
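
 For illustration, a hedged sketch of the kind of guard that would avoid the NPE 
 (not the committed fix; the surrounding openScanner() logic is abbreviated): only 
 replace the scanner when postScannerOpen actually returns one.
 {code:java}
 // Hedged sketch, not the actual patch: never let a null coprocessor result
 // reach addScanner().
 RegionScanner s = null;
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().preScannerOpen(scan);
 }
 if (s == null) {
   s = r.getScanner(scan);
 }
 if (r.getCoprocessorHost() != null) {
   RegionScanner postScanner = r.getCoprocessorHost().postScannerOpen(scan, s);
   if (postScanner != null) {
     // Keep the existing scanner if the coprocessor implementation returned null.
     s = postScanner;
   }
 }
 // ... then proceed to addScanner(s) as before.
 {code}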

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)

2011-12-28 Thread Zhihong Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-5052:
-

Assignee: Andrei Dragomir

 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: HBASE-5052.patch


 When loading a coprocessor from hdfs, the jar file gets copied to a path on 
 the local filesystem, which depends on the region name, and the region start 
 key. The name is cleaned, but not enough, so when you have filesystem-unfriendly 
 characters (/, ?, :, etc.), the coprocessor is not loaded and an error 
 is thrown.
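
 One hedged way to make the local copy path independent of region-name characters 
 is to derive the local file name from a digest of the source path rather than 
 from the region name; a sketch (class and method names are illustrative, not the 
 attached patch):
 {code:java}
 // Hedged sketch: filesystem-safe local path for a copied coprocessor jar.
 import java.io.File;
 import java.security.MessageDigest;

 public final class CoprocessorJarPaths {
   public static File localCopyFor(String hdfsJarPath, String localDir) throws Exception {
     MessageDigest md5 = MessageDigest.getInstance("MD5");
     byte[] digest = md5.digest(hdfsJarPath.getBytes("UTF-8"));
     StringBuilder safeName = new StringBuilder();
     for (byte b : digest) {
       safeName.append(String.format("%02x", b)); // hex characters are always path-friendly
     }
     return new File(localDir, safeName + ".jar");
   }
 }
 {code}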

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)

2011-12-28 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5052:
--

Fix Version/s: 0.94.0
   0.92.0
 Hadoop Flags: Reviewed

 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5052.patch


 When loading a coprocessor from hdfs, the jar file gets copied to a path on 
 the local filesystem, which depends on the region name, and the region start 
 key. The name is cleaned, but not enough, so when you have filesystem-unfriendly 
 characters (/, ?, :, etc.), the coprocessor is not loaded and an error 
 is thrown.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176927#comment-13176927
 ] 

chunhui shen commented on HBASE-5100:
-

@Zhihong
If we remove the try/finally construct, when encountering an exception in 
this.parent.close(false), the rollback of the split would not do 
this.parent.initialize() because there is no JournalEntry.CLOSED_PARENT_REGION.
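
A hedged sketch of the boolean-guarded variant being discussed (the name 
addJournalEntry follows a rename suggested later in this thread; whether an 
exception inside close(false) should also record the entry is exactly the point 
under debate):
{code:java}
// Hedged sketch of one variant from this thread, not the attached patch.
List<StoreFile> hstoreFilesToSplit = null;
boolean addJournalEntry = false; // set only once close() returned real store files
try {
  hstoreFilesToSplit = this.parent.close(false);
  if (hstoreFilesToSplit == null) {
    // Closed by a concurrent thread; abandon the split.
    throw new IOException("Failed to close region: already closed by another thread");
  }
  addJournalEntry = true;
} finally {
  if (addJournalEntry) {
    // Recorded only after a successful close, so rollback re-initializes the
    // parent only when this split transaction actually closed it.
    this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
  }
}
{code}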

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176928#comment-13176928
 ] 

Zhihong Yu commented on HBASE-5052:
---

+1 on patch.
TestCoprocessorEndpoint itself isn't stable.

 The path where a dynamically loaded coprocessor jar is copied on the local 
 file system depends on the region name (and implicitly, the start key)
 -

 Key: HBASE-5052
 URL: https://issues.apache.org/jira/browse/HBASE-5052
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-5052.patch


 When loading a coprocessor from hdfs, the jar file gets copied to a path on 
 the local filesystem, which depends on the region name, and the region start 
 key. The name is cleaned, but not enough, so when you have filesystem-unfriendly 
 characters (/, ?, :, etc.), the coprocessor is not loaded and an error 
 is thrown.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176930#comment-13176930
 ] 

Zhihong Yu commented on HBASE-5100:
---

@Chunhui:
My comment @ 28/Dec/11 06:35 renames the boolean, whose initial value would be 
false.
The renamed boolean carries the negated value compared to that of closedJE.

What do you think ?

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176932#comment-13176932
 ] 

chunhui shen commented on HBASE-5100:
-

@Zhihong
If the initial boolean value is false, when encountering an exception in 
this.parent.close(false), it will still be false, which causes 
JournalEntry.CLOSED_PARENT_REGION not to be added to this.journal.

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176934#comment-13176934
 ] 

Zhihong Yu commented on HBASE-5100:
---

Current form of patch is fine.
How about renaming closedJE as addJournalEntry ?

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176936#comment-13176936
 ] 

chunhui shen commented on HBASE-5100:
-

@Zhihong
Yes, it's better.

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  destination server is + 

[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5010:
---

Attachment: D909.4.patch

mbautin updated the revision [jira] [HBASE-5010] [89-fb] Filter HFiles based 
on TTL.
Reviewers: Kannan, Liyin, JIRA

  Addressing Kannan's comment about doing the same optimization during 
compactions. Adding a compaction test to the unit test, and verifying that we 
don't read expired files using per-CF metrics.

REVISION DETAIL
  https://reviews.facebook.net/D909

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/NonLazyKeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/main/java/org/apache/hadoop/hbase/util/Threads.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.
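
 For illustration, a hedged sketch of the proposed selection step (accessor names 
 are assumptions, not the exact StoreFile.Reader API): skip store files whose 
 newest timestamp is already past the TTL cut-off before creating scanners for 
 them.
 {code:java}
 // Hedged sketch of TTL-based store file selection; accessor names are assumed.
 long oldestUnexpiredTs = EnvironmentEdgeManager.currentTimeMillis() - ttl;
 List<StoreFileScanner> selected = new ArrayList<StoreFileScanner>();
 for (StoreFile sf : storeFiles) {
   long maxTs = sf.getReader().getMaxTimestamp(); // assumed accessor for the file's newest KV
   if (maxTs < oldestUnexpiredTs) {
     continue; // every KV in this file has expired; never open a scanner on it
   }
   selected.add(sf.getReader().getStoreFileScanner(true, true));
 }
 {code}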

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176942#comment-13176942
 ] 

Phabricator commented on HBASE-5010:


mbautin has commented on the revision [jira] [HBASE-5010] [89-fb] Filter 
HFiles based on TTL.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:219 
Assuming that a default scan will go through all KVs. Please let me know if 
this is incorrect.

REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176943#comment-13176943
 ] 

Phabricator commented on HBASE-5010:


mbautin has commented on the revision [jira] [HBASE-5010] [89-fb] Filter 
HFiles based on TTL.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:154 This 
is addressed in the most recent version of the diff.

REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5102) Change the default value of the property hbase.connection.per.config to false in hbase-default.xml

2011-12-28 Thread ramkrishna.s.vasudevan (Created) (JIRA)
Change the default value of  the property hbase.connection.per.config to 
false in hbase-default.xml
-

 Key: HBASE-5102
 URL: https://issues.apache.org/jira/browse/HBASE-5102
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
Priority: Minor


The property hbase.connection.per.config has a default value of true in 
hbase-default.xml. In HConnectionManager we use false as the default value if 
no value is specified. It would be better to make this uniform. 
As per Ted's suggestion, make it false in hbase-default.xml.
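
For reference, a hedged illustration of why the mismatch matters (call site is 
illustrative): the code-side default only takes effect when the key is absent 
from the loaded configuration, e.g. when hbase-default.xml is not on the 
classpath.
{code:java}
// Hedged illustration; the hard-coded false only applies if the key is missing
// from hbase-default.xml / hbase-site.xml.
Configuration conf = HBaseConfiguration.create();
boolean perConfig = conf.getBoolean("hbase.connection.per.config", false);
{code}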

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176944#comment-13176944
 ] 

Hadoop QA commented on HBASE-5010:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508811/D909.4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 17 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/619//console

This message is automatically generated.

 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-28 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176948#comment-13176948
 ] 

ramkrishna.s.vasudevan commented on HBASE-5100:
---

@Chunhui
Good catch. What about addClosedJournalEntry?

Other than that +1 on patch.

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sending a close-region request to the RS and the region's split 
 transaction happen concurrently, a closed region may end up opened. 
 See the detailed code in SplitTransaction#createDaughters
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try{
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
 // The region was closed by a concurrent thread.  We can't continue
 // with the split, instead we must just abandon the split.  If we
 // reopen or split this could cause problems because the region has
 // probably already been moved to a different server, or is in the
 // process of moving to a different server.
 throw new IOException("Failed to close region: already closed by " +
   "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 when rolling back, the JournalEntry.CLOSED_PARENT_REGION causes 
 this.parent.initialize();
 Although this region is not onlined in the regionserver, it may bring some 
 potential problem.
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1
 master log
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176949#comment-13176949
 ] 

Andrew Purtell commented on HBASE-5101:
---

This would disable splitting on the RS at the limit?


 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176951#comment-13176951
 ] 

Zhihong Yu commented on HBASE-5101:
---

@Jon:
Can you describe how the cluster got into this state ?
I guess a lot of region servers crashed ?
Otherwise there should have been better planning for the HFile size limit.

 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176951#comment-13176951
 ] 

Zhihong Yu edited comment on HBASE-5101 at 12/29/11 2:27 AM:
-

@Jon:
Can you describe how the cluster got into this state ?
I guess a lot of region servers crashed ?
Otherwise there should have been better planning for the region size.

  was (Author: zhi...@ebaysf.com):
@Jon:
Can you describe how the cluster got into this state ?
I guess a lot of region servers crashed ?
Otherwise there should have been better planning for the HFile size limit.
  
 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176959#comment-13176959
 ] 

Jonathan Hsieh commented on HBASE-5101:
---

@Andrew 
I think that would be the idea. The hope is to avoid getting region servers 
into trouble and to give an admin some warning when they are approaching 
trouble (maybe reached some percentage of region limit).

@Ted
I've been purposely testing using a stress configuration with heavy write load 
that requires flushes (4 MB), splits (64 MB) and compactions all the 
time. Along the way region servers crash (which is fine -- fault injection is 
part of this workload). 

I've encountered some situations where folks don't know the distribution of 
their row keys (or don't have uniform row key distributions).  This could be a 
useful go-between in situations where region pre-splitting with dynamic 
splitting off may not be effective.  

 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5010:
---

Attachment: D909.5.patch

mbautin updated the revision [jira] [HBASE-5010] [89-fb] Filter HFiles based 
on TTL.
Reviewers: Kannan, Liyin, JIRA

  Actually, this is where I addressed Kannan's comment about compactions:

  https://reviews.facebook.net/D909?vs=2679&id=3015&whitespace=ignore-all

  (see the new line 154 in StoreScanner.java:

  scanners = selectScannersFrom(scanners);

  )

  I have spent quite a bit of time making sure that the unit test is testing 
the optimization during compactions. I am now invoking compactions in two 
different ways, for a total of 6 parameterized instances of the test, but it is 
still really quick. All of our compaction codepaths go through StoreScanner 
constructors, so we should have the optimization in all of them.

  All unit tests pass.
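
  For illustration, a hedged sketch of what such a selection step can look like 
  (field and method names are assumptions about the patch, not its exact code):
{code:java}
// Hedged sketch of a selectScannersFrom(...) style filter; shouldUseScanner(...)
// is the assumed per-scanner check (e.g. time-range / TTL based).
private List<KeyValueScanner> selectScannersFrom(List<? extends KeyValueScanner> allScanners) {
  List<KeyValueScanner> selected = new ArrayList<KeyValueScanner>();
  for (KeyValueScanner scanner : allScanners) {
    if (scanner.shouldUseScanner(scan, columns, oldestUnexpiredTS)) {
      selected.add(scanner); // keep only scanners that may still hold live data
    }
  }
  return selected;
}
{code}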

REVISION DETAIL
  https://reviews.facebook.net/D909

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/NonLazyKeyValueScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/main/java/org/apache/hadoop/hbase/util/Threads.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176966#comment-13176966
 ] 

Hadoop QA commented on HBASE-5010:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508817/D909.5.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 20 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/620//console

This message is automatically generated.

 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
 return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176967#comment-13176967
 ] 

Phabricator commented on HBASE-5010:


mbautin has commented on the revision [jira] [HBASE-5010] [89-fb] Filter 
HFiles based on TTL.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:155 This 
is why we filter out expired StoreFiles on compactions.

REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
      return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176969#comment-13176969
 ] 

Zhihong Yu commented on HBASE-5101:
---

bq. maybe reached some percentage of region limit
How do we calculate the region limit?

 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.
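
As a purely illustrative sketch of the check such a limit would imply (the configuration key, default value, and class name below are hypothetical and not existing HBase settings):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class RegionLimitCheck {
  // Hypothetical key and default; HBASE-5101 has not defined these yet.
  public static final String MAX_REGIONS_KEY = "hbase.regionserver.max.regions";
  public static final int DEFAULT_MAX_REGIONS = 1500;

  /** Returns true if the region server may open one more region. */
  public static boolean canOpenAnotherRegion(Configuration conf,
      int onlineRegionCount) {
    int max = conf.getInt(MAX_REGIONS_KEY, DEFAULT_MAX_REGIONS);
    return onlineRegionCount < max;
  }
}
{code}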

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5101) Add a max number of regions per regionserver limit

2011-12-28 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176970#comment-13176970
 ] 

Zhihong Yu commented on HBASE-5101:
---

HBASE-4365 is related to this JIRA.

 Add a max number of regions per regionserver limit
 --

 Key: HBASE-5101
 URL: https://issues.apache.org/jira/browse/HBASE-5101
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

 In a testing environment, a cluster got to a state with more than 1500 
 regions per region server, and essentially became stuck and unavailable.  We 
 could add a limit to the number of regions that a region server can serve to 
 prevent this from happening.  This looks like it could be implemented in the 
 core or as a coprocessor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4218:
---

Attachment: D447.15.patch

mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding 
(delta encoding).
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

  Simplifying user-facing data block encoding knobs:
  * DATA_BLOCK_ENCODING specifies block encoding type
  * ENCODE_IN_CACHE_ONLY can be set to true to avoid encoding data blocks on 
disk. This is false by default (i.e. we encode blocks everywhere by default if 
DATA_BLOCK_ENCODING is specified).
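
A hedged sketch of what setting those two knobs on a column family could look like from client code, assuming they are stored as plain column-family attributes (HColumnDescriptor.setValue is the existing generic API; the patch may expose typed setters instead, and the encoding value shown is only an example):

{code:java}
import org.apache.hadoop.hbase.HColumnDescriptor;

public class EncodingKnobsExample {
  public static HColumnDescriptor configureEncoding(HColumnDescriptor family) {
    // Generic attribute API; attribute names taken from the description above.
    family.setValue("DATA_BLOCK_ENCODING", "PREFIX");  // encoding type, or NONE
    family.setValue("ENCODE_IN_CACHE_ONLY", "false");  // default: encode on disk too
    return family;
  }
}
{code}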


REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  
src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  
src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  

[jira] [Commented] (HBASE-4608) HLog Compression

2011-12-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176984#comment-13176984
 ] 

jirapos...@reviews.apache.org commented on HBASE-4608:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/
---

(Updated 2011-12-29 04:38:25.385999)


Review request for hbase, Eli Collins and Todd Lipcon.


Changes
---

Added tests. Fixed code issues as mentioned by Todd.


Summary
---

Here's what I have so far. Things are written, and should work. I need to 
rework the test cases to test this, and put something in the config file to 
enable/disable. Obviously this isn't ready for commit at the moment, but I can 
get those two things done pretty quickly.

Obviously the dictionary is incredibly simple at the moment, I'll come up with 
something cooler sooner. Let me know how this looks.
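
To make the dictionary idea concrete, here is a toy sketch of the write-side/read-side mapping: repeated values such as table names, region ids, and CF names are written once and then referenced by a short index. The real SimpleDictionary/Compressor in this patch will differ in details such as byte[] handling and dictionary size limits.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy dictionary: repeated entries are stored once and then referenced by index. */
public class ToyWalDictionary {
  private final Map<String, Short> toIndex = new HashMap<String, Short>();
  private final List<String> entries = new ArrayList<String>();

  /** Write side: returns the index for an entry, adding it on first sight. */
  public short findOrAdd(String entry) {
    Short idx = toIndex.get(entry);
    if (idx == null) {
      idx = (short) entries.size();
      entries.add(entry);
      toIndex.put(entry, idx);
    }
    return idx;
  }

  /** Read side: resolves an index back to its entry. */
  public String lookup(short index) {
    return entries.get(index);
  }
}
{code}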


This addresses bug HBase-4608.
https://issues.apache.org/jira/browse/HBase-4608


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5120a3c 
  
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 24407af 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221 
  
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
 d9cd6de 
  
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java
 cbef70f 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef 
  
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestSimpleDictionary.java
 PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 
59910bf 
  
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2740/diff


Testing
---


Thanks,

Li



 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Attachments: 4608v1.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2011-12-28 Thread Li Pi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176989#comment-13176989
 ] 

Li Pi commented on HBASE-4608:
--

This seems like a good time to mention that, at this point, the patch is working.

There is still some refactoring to do to make it prettier, and room for
optimization, but please test out the compressor with a realistic load and
see how much improvement it gains.

Compressor.java contains a command-line compression tool that you can
use. Just run it against an HLog and compare the sizes of the input and
the compressed output.

On Wed, Dec 28, 2011 at 8:38 PM, jirapos...@reviews.apache.org


 HLog Compression
 

 Key: HBASE-4608
 URL: https://issues.apache.org/jira/browse/HBASE-4608
 Project: HBase
  Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
 Attachments: 4608v1.txt


 The current bottleneck to HBase write speed is replicating the WAL appends 
 across different datanodes. We can speed up this process by compressing the 
 HLog. Current plan involves using a dictionary to compress table name, region 
 id, cf name, and possibly other bits of repeated data. Also, HLog format may 
 be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-28 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176975#comment-13176975
 ] 

Phabricator commented on HBASE-5010:


Kannan has commented on the revision [jira] [HBASE-5010] [89-fb] Filter HFiles 
based on TTL.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:209 scan & 
 columns arguments no longer used?
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:224 and 
we are using this.scan & this.columns here?


REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Zhihong Yu
 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
 D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
      return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176974#comment-13176974
 ] 

Hadoop QA commented on HBASE-4218:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508818/D447.15.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 68 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/621//console

This message is automatically generated.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.14.patch, D447.15.patch, D447.2.patch, 
 D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, 
 D447.8.patch, D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms can provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 It also has much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
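
A minimal sketch of the prefix (delta) trick described above, assuming keys arrive in sorted order: each key is written as the length of the prefix it shares with the previous key plus the remaining suffix. The real encoders in the patch (e.g. PrefixKeyDeltaEncoder) add varints, timestamp diffs and bitfields on top of this.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class PrefixEncodingSketch {
  /** Writes each key as (sharedPrefixLength, suffixLength, suffixBytes). */
  public static byte[] encode(byte[][] sortedKeys) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    byte[] previous = new byte[0];
    for (byte[] key : sortedKeys) {
      int common = 0;
      int max = Math.min(previous.length, key.length);
      while (common < max && previous[common] == key[common]) {
        common++;
      }
      out.writeInt(common);                        // bytes shared with the previous key
      out.writeInt(key.length - common);           // length of the new suffix
      out.write(key, common, key.length - common); // the suffix itself
      previous = key;
    }
    out.flush();
    return bytes.toByteArray();
  }
}
{code}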

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-28 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176976#comment-13176976
 ] 

Mikhail Bautin commented on HBASE-4218:
---

Just a quick note from an offline conversation with Kannan: we need to support 
modifying data block encoding column family settings. In the most recent 
version of the patch 
(https://reviews.facebook.net/D447?vs=&id=3237&whitespace=ignore-all) there are 
the following user-facing column family settings:

* DATA_BLOCK_ENCODING - specifies data block encoding type or NONE
* ENCODE_IN_CACHE_ONLY - boolean (false by default). If true, data blocks are 
only encoded in cache but not on disk

We removed the encoded scanner flag, and we use encoded scanners by default 
any time we use data block encoding.

Given the above column family settings, we need to unit-test at least the 
following transitions:
# Switching from no data block encoding to a data block encoding everywhere, 
and vice versa
# Switching from no data block encoding to a data block encoding in cache only, 
and vice versa
# Flipping the in cache only flag but keeping the data block encoding type 
the same
# Switching from one data block encoding everywhere to another one
# Switching from one data block encoding in cache only to another one
# Switching to a different data block encoding and flipping the in cache only 
flag.
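
One hedged way a test could drive those transitions, using the two attribute names above; everything else (the class and method names, the encoding value, and the HBaseAdmin disable/modify/enable cycle) is assumed for illustration rather than taken from the patch:

{code:java}
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class EncodingTransitionSketch {
  /** Applies one (encoding, inCacheOnly) combination to an existing family. */
  public static void switchEncoding(HBaseAdmin admin, String tableName,
      HColumnDescriptor family, String encoding, boolean inCacheOnly)
      throws Exception {
    family.setValue("DATA_BLOCK_ENCODING", encoding);
    family.setValue("ENCODE_IN_CACHE_ONLY", Boolean.toString(inCacheOnly));
    admin.disableTable(tableName);
    admin.modifyColumn(tableName, family);
    admin.enableTable(tableName);
    // A test would now scan the table and verify all rows are still readable.
  }
}
{code}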


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.14.patch, D447.15.patch, D447.2.patch, 
 D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, 
 D447.8.patch, D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms can provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 It also has much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase two important changes in design will be 
 needed:
 -solidify interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that N first bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



