[jira] [Updated] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-07 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4335:
-

Attachment: 4335-v3.txt

New patch. It breaks SplitTransaction.execute into three parts, partly to 
make the phases clear and partly so that a test can exercise each phase 
independently.

Also added a test. The test uses phaseI and phaseIII directly and mocks part 
of phaseII (the phase that brings the daughters online and updates .META.).

I could validate that if I change the order back to what it was before this 
patch, the client does indeed reach the wrong region when querying past the 
split key and would (before HBASE-4334) silently return an empty result set.

Let me know what you think about this change.

TestSplitTransaction and the new TestEndToEndSplitTransaction pass.

 Splits can create temporary holes in .META. that confuse clients and 
 regionservers
 --

 Key: HBASE-4335
 URL: https://issues.apache.org/jira/browse/HBASE-4335
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: Joe Pallas
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4335-v2.txt, 4335-v3.txt, 4335.txt


 When a SplitTransaction is performed, three updates are done to .META.:
 1. The parent region is marked as splitting (and hence offline)
 2. The first daughter region is added (same start key as parent)
 3. The second daughter region is added (split key is start key)
 (later, the original parent region is deleted, but that's not important to 
 this discussion)
 Steps 2 and 3 are actually done concurrently by 
 SplitTransaction.DaughterOpener threads.  While the master is notified when a 
 split is complete, the only visibility that clients have is whether the 
 daughter regions have appeared in .META.
 If the second daughter is added to .META. first, then .META. will contain the 
 (offline) parent region followed by the second daughter region.  If the 
 client looks up a key that is greater than (or equal to) the split, the 
 client will find the second daughter region and use it.  If the key is less 
 than the split key, the client will find the parent region and see that it is 
 offline, triggering a retry.
 If the first daughter is added to .META. before the second daughter, there is 
 a window during which .META. has a hole: the first daughter effectively hides 
 the parent region (same start key), but there is no entry for the second 
 daughter.  A region lookup will find the first daughter for all keys in the 
 parent's range, but the first daughter does not include keys at or beyond the 
 split key.
 See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
 suggestions for mitigating this in the client and regionserver.
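
The ordering problem described above can be illustrated with a toy model. 
This is only a sketch under stated assumptions, not HBase code: the class 
MetaHoleSketch, its Region type, the row() key format and the keys "a", "m" 
and "z" are all hypothetical. It assumes that a lookup takes the last .META. 
row at or before the requested key, and that a daughter with the same start 
key as the parent sorts after the parent.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a .META. lookup; none of these names are HBase APIs. */
final class MetaHoleSketch {

    /** A region covering [startKey, endKey). */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;
        final boolean offline;

        Region(String name, String startKey, String endKey, boolean offline) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.offline = offline;
        }

        boolean contains(String key) {
            return key.compareTo(startKey) >= 0 && key.compareTo(endKey) < 0;
        }
    }

    /** Rows sort by start key, then region id, so a newer region with the same start key sorts last. */
    static String row(String startKey, int regionId) {
        return String.format("%s,%08d", startKey, regionId);
    }

    /** .META. after the parent [a, z) went offline, with the chosen daughters added (split key "m"). */
    static TreeMap<String, Region> metaWith(boolean firstDaughter, boolean secondDaughter) {
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put(row("a", 1), new Region("parent", "a", "z", true));
        if (firstDaughter) {
            meta.put(row("a", 2), new Region("daughterA", "a", "m", false));
        }
        if (secondDaughter) {
            meta.put(row("m", 2), new Region("daughterB", "m", "z", false));
        }
        return meta;
    }

    /** Takes the last row at or before the key, as the assumed client lookup does. */
    static String locate(TreeMap<String, Region> meta, String key) {
        Map.Entry<String, Region> entry = meta.floorEntry(row(key, 99999999));
        if (entry == null) {
            return "NOT_FOUND";
        }
        Region region = entry.getValue();
        if (region.offline) {
            return "RETRY";
        }
        return region.contains(key) ? region.name : "WRONG_REGION:" + region.name;
    }

    public static void main(String[] args) {
        // Second daughter added first: no hole.
        System.out.println(locate(metaWith(false, true), "q")); // daughterB
        System.out.println(locate(metaWith(false, true), "c")); // RETRY
        // First daughter added first: keys past the split key hit the wrong region.
        System.out.println(locate(metaWith(true, false), "q")); // WRONG_REGION:daughterA
    }
}
```

In this model, adding the second daughter before the first never exposes a 
wrong region: a lookup finds either the correct daughter or the offline 
parent, which triggers a retry.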

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4480) Testing script to simplfy local testing

2011-10-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122574#comment-13122574
 ] 

ramkrishna.s.vasudevan commented on HBASE-4480:
---

@Scott
Nice one.  Thanks a lot.

 Testing script to simplfy local testing
 ---

 Key: HBASE-4480
 URL: https://issues.apache.org/jira/browse/HBASE-4480
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Priority: Minor
  Labels: test
 Attachments: runtest.sh, runtest2.sh


 As mentioned by http://search-hadoop.com/m/r2Ab624ES3e and 
 http://search-hadoop.com/m/cZjDH1ykGIA it would be nice if we could have a 
 script that would handle more of the finer points of running/checking our 
 test suite.
 This script should:
 (1) Allow people to determine which tests are hanging/taking a long time to 
 run
 (2) Allow rerunning of particular tests to make sure it wasn't an artifact of 
 running the whole suite that caused the failure
 (3) Allow people to specify to run just unit tests or also integration tests 
 (essentially wrapping calls to 'maven test' and 'maven verify').
 This script should just be a convenience script - running tests directly from 
 maven should not be impacted.
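
Requirements (2) and (3) above amount to choosing a Maven phase and 
optionally narrowing the run to one test. The following is a rough sketch of 
that choice only, not the attached runtest.sh: the class and method names are 
hypothetical, it assumes the Maven executable is invoked as mvn, and it 
assumes the Surefire -Dtest property is used to select a single test.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds the Maven command line a wrapper script might run. */
final class TestRunCommandSketch {

    /**
     * @param includeIntegrationTests true to run "verify" (unit and integration tests),
     *                                false to run "test" (unit tests only)
     * @param singleTest              test class to rerun on its own, or null for the whole suite
     */
    static List<String> mavenCommand(boolean includeIntegrationTests, String singleTest) {
        List<String> command = new ArrayList<>();
        command.add("mvn");
        command.add(includeIntegrationTests ? "verify" : "test");
        if (singleTest != null && !singleTest.isEmpty()) {
            // Assumption: Surefire's "test" property selects the test to run.
            command.add("-Dtest=" + singleTest);
        }
        return command;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", mavenCommand(false, "TestSplitTransaction")));
        System.out.println(String.join(" ", mavenCommand(true, null)));
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder, and timing 
each invocation would be one way to approach requirement (1).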





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122585#comment-13122585
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2425
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
https://reviews.apache.org/r/2251/#comment5519

Yes, Ted.  I too was thinking of unifying both deleteNode() APIs.  
I was wondering what value could be passed as expectedVersion when the 
version need not be checked.  Can we pass -1, and skip the check whenever 
-1 is passed for expectedVersion?


- ramkrishna


On 2011-10-06 17:55:05, ramkrishna vasudevan wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2251/
bq.  ---
bq.  
bq.  (Updated 2011-10-06 17:55:05)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fix for handling HBASE-4539 and HBASE-4540.
bq.  Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler scenarios.
bq.  Also addresses Ted's comments.
bq.  
bq.  
bq.  This addresses bug HBASE-4540.
bq.  https://issues.apache.org/jira/browse/HBASE-4540
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179238 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2251/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Yes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  ramkrishna
bq.  
bq.



 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by the master, or to OPENING 
 state by RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it 
 is in OPENED state, but fails.
 - Though the delete fails, the handler removes the node from RIT and adds 
 RS1 as the owner of R1 in the master's memory.
 - Now when RS2 completes opening the region, the master is not able to 
 complete the open, because the region has already been removed from RIT.
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 

[jira] [Updated] (HBASE-4550) When master passed regionserver different address , because regionserver didn't create new zookeeper znode, as a result stop-hbase.sh is hang

2011-10-07 Thread wanbin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wanbin updated HBASE-4550:
--

Fix Version/s: (was: 0.90.4)

 When master passed regionserver different address , because regionserver 
 didn't create new zookeeper znode,  as  a result stop-hbase.sh is hang
 ---

 Key: HBASE-4550
 URL: https://issues.apache.org/jira/browse/HBASE-4550
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.3
Reporter: wanbin
   Original Estimate: 2h
  Remaining Estimate: 2h

 When the master passes a regionserver a different address, the regionserver 
 does not create a new ZooKeeper znode, but the master stores the new address 
 in ServerManager. When stop-hbase.sh is called, the path received by 
 RegionServerTracker.nodeDeleted is the old address, so 
 serverManager.expireServer is not called and stop-hbase.sh hangs.
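
The mismatch described above can be sketched with a toy model. This is an 
illustration under stated assumptions, not the real ServerManager or 
RegionServerTracker: the class StaleZnodeSketch, its methods and the server 
names are hypothetical. It assumes the master tracks online servers by the 
address it assigned, and that shutdown waits until that set is empty.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the address mismatch; not the real HBase classes. */
final class StaleZnodeSketch {

    /** Servers the master believes are online, keyed by the address the master assigned. */
    private final Set<String> onlineServers = new HashSet<>();

    /** The master records the (possibly rewritten) address of a starting regionserver. */
    void registerServer(String masterAssignedName) {
        onlineServers.add(masterAssignedName);
    }

    /**
     * Called when a regionserver znode disappears. The znode still carries the
     * name the regionserver started with.
     *
     * @return true if a matching server was found and expired
     */
    boolean nodeDeleted(String znodeName) {
        if (!onlineServers.contains(znodeName)) {
            return false; // unknown name: nothing is expired
        }
        onlineServers.remove(znodeName);
        return true;
    }

    /** In this model, shutdown can only finish once no server is considered online. */
    boolean canStop() {
        return onlineServers.isEmpty();
    }

    public static void main(String[] args) {
        StaleZnodeSketch master = new StaleZnodeSketch();
        master.registerServer("10.0.0.5,60020,1");              // address chosen by the master
        System.out.println(master.nodeDeleted("rs1,60020,1"));  // false: znode has the old name
        System.out.println(master.canStop());                   // false: shutdown keeps waiting
    }
}
```

In this model the hang disappears if the znode is recreated under the 
master-assigned name, which appears to be what the issue title suggests.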





[jira] [Updated] (HBASE-4550) When master passed regionserver different address , because regionserver didn't create new zookeeper znode, as a result stop-hbase.sh is hang

2011-10-07 Thread wanbin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wanbin updated HBASE-4550:
--

Attachment: patch





[jira] [Commented] (HBASE-4550) When master passed regionserver different address , because regionserver didn't create new zookeeper znode, as a result stop-hbase.sh is hang

2011-10-07 Thread wanbin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122715#comment-13122715
 ] 

wanbin commented on HBASE-4550:
---

I have fixed this problem. Could somebody check the patch? Thanks.





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122829#comment-13122829
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
---

(Updated 2011-10-07 14:27:20.231903)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
---

LOG.debug(zkw.prefix("Successfully deleted unassigned node for region " +
    regionName + " in expected state " + expectedState));
@Ted - I have not removed this log so that it can be used for debugging.
Refactored the testcase and made it much simpler so that it doesn't take much 
time.


Summary
---

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler 
scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945 

Diff: https://reviews.apache.org/r/2251/diff


Testing
---

Yes


Thanks,

ramkrishna




[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122842#comment-13122842
 ] 

Ted Yu commented on HBASE-4540:
---

For Ram's comment @ 07/Oct/11 07:22
Since -1 is a possible return value from ZKAssign methods, I think we should 
use other values such as -2.

 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by the master, or to OPENING 
 state by RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it 
 is in OPENED state, but fails.
 - Though the delete fails, the handler removes the node from RIT and adds 
 RS1 as the owner of R1 in the master's memory.
 - Now when RS2 completes opening the region, the master is not able to 
 complete the open, because the region has already been removed from RIT.
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
 2011-10-05 20:50:48,290 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 2011-10-05 20:50:53,743 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
 Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
 2011-10-05 20:50:54,397 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 {code}
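
The sequence in the issue description above can be modelled in a few lines. 
This is a sketch under stated assumptions, not the attached patch: the class 
OpenedHandlerSketch and all of its members are hypothetical, and the checked 
variant is only one possible shape of a fix, in which the in-memory state is 
updated only when the znode delete succeeds.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the race in the issue; not the real OpenedRegionHandler. */
final class OpenedHandlerSketch {
    final Map<String, String> znodeState = new HashMap<>();    // region -> znode state
    final Set<String> regionsInTransition = new HashSet<>();   // RIT
    final Map<String, String> owner = new HashMap<>();         // region -> server in master memory

    /** Deletes the znode only if it is in the expected state. */
    boolean deleteNodeIfInState(String region, String expectedState) {
        if (!expectedState.equals(znodeState.get(region))) {
            return false;
        }
        znodeState.remove(region);
        return true;
    }

    /** Behaviour described in the issue: the delete result is ignored. */
    void handleOpenedIgnoringResult(String region, String server) {
        deleteNodeIfInState(region, "OPENED");
        regionsInTransition.remove(region);
        owner.put(region, server);
    }

    /** Assumed fix: touch RIT and the owner map only when the delete succeeded. */
    boolean handleOpenedChecked(String region, String server) {
        if (!deleteNodeIfInState(region, "OPENED")) {
            return false;
        }
        regionsInTransition.remove(region);
        owner.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        OpenedHandlerSketch master = new OpenedHandlerSketch();
        master.znodeState.put("R1", "OPENING");   // RS2 has already started opening R1
        master.regionsInTransition.add("R1");
        master.handleOpenedIgnoringResult("R1", "RS1");
        System.out.println(master.regionsInTransition.contains("R1")); // false: R1 lost from RIT
        System.out.println(master.owner.get("R1"));                    // RS1: stale owner
    }
}
```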





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122860#comment-13122860
 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
---

@Ted
I uploaded the patch before you commented. In that patch I had used -1.
If we are going to use -2 or some other negative value, is it OK to add 
something like this to the javadoc?
   * @param expectedVersion of the znode that is to be deleted.
   *        If expectedVersion need not be compared while deleting the znode,
   *        pass -2 (NEGATIVE_VERSION)
Is that OK, Ted?
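
A unified deleteNode() with such a sentinel might look roughly like the 
sketch below. It is an illustration only, not the code in the patch: the 
class VersionedDeleteSketch and its in-memory map stand in for ZooKeeper, and 
the constant name NEGATIVE_VERSION with the value -2 is taken from the 
proposal in this thread rather than from any released API.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory stand-in for versioned znodes; not the real ZKUtil or ZKAssign. */
final class VersionedDeleteSketch {

    /** Sentinel proposed in the thread: skip the version comparison entirely. */
    static final int NEGATIVE_VERSION = -2;

    private final Map<String, Integer> versions = new HashMap<>(); // path -> current version

    void create(String path, int version) {
        versions.put(path, version);
    }

    /**
     * @param expectedVersion version the znode must have to be deleted, or
     *                        NEGATIVE_VERSION if the version need not be compared
     * @return true if the node existed and was deleted
     */
    boolean deleteNode(String path, int expectedVersion) {
        Integer actual = versions.get(path);
        if (actual == null) {
            return false; // no such node
        }
        if (expectedVersion != NEGATIVE_VERSION && actual != expectedVersion) {
            return false; // version mismatch: leave the node alone
        }
        versions.remove(path);
        return true;
    }

    public static void main(String[] args) {
        VersionedDeleteSketch zk = new VersionedDeleteSketch();
        zk.create("/unassigned/R1", 3);
        System.out.println(zk.deleteNode("/unassigned/R1", 2));                // false
        System.out.println(zk.deleteNode("/unassigned/R1", NEGATIVE_VERSION)); // true
    }
}
```

Using a single named constant keeps other negative values free for later 
use, which matches the preference expressed elsewhere in this thread.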





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122862#comment-13122862
 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
---

Would it be better to document it as any value less than some threshold, 
maybe 0 or -1, instead of going with one specific value?





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122884#comment-13122884
 ] 

Ted Yu commented on HBASE-4540:
---

We may designate some negative value for other purpose in the future. 
I think using one known value is recommended. 
The Javadoc addition above is nice. 

 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by the master, or to OPENING state 
 by RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it is 
 in OPENED state, but fails.
 - Though the delete fails, the handler removes the node from RIT and adds RS1 
 as the owner of R1 in the master's memory.
 - Now when RS2 completes opening the region, the master is not able to open 
 the region because the region has already been deleted from RIT.
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
 2011-10-05 20:50:48,290 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 2011-10-05 20:50:53,743 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
 Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
 2011-10-05 20:50:54,397 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 {code}
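
The sequence above comes down to one ordering problem: the handler updates the master's in-memory state even though its state-checked delete of the znode failed. A minimal sketch of the safer ordering follows. It uses an in-memory stand-in rather than ZooKeeper or HBase classes, and every name in it (FakeZNode, OpenedHandlerSketch, handleOpened) is hypothetical, not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for an unassigned-region znode; not the ZooKeeper API. */
final class FakeZNode {
    private String state;
    private int version;
    private boolean deleted;

    FakeZNode(String state) {
        this.state = state;
    }

    /** Every state change bumps the version, as a data update would. */
    synchronized void setState(String newState) {
        state = newState;
        version++;
    }

    synchronized int version() {
        return version;
    }

    /** Deletes only if the node is still in the expected state at the expected version. */
    synchronized boolean deleteIfUnchanged(String expectedState, int expectedVersion) {
        if (deleted || !state.equals(expectedState) || version != expectedVersion) {
            return false;
        }
        deleted = true;
        return true;
    }
}

/** Sketch of a handler that touches master memory only after the delete succeeds. */
final class OpenedHandlerSketch {
    final Map<String, String> regionsInTransition = new HashMap<>();
    final Map<String, String> regionOwner = new HashMap<>();

    boolean handleOpened(String region, String server, FakeZNode node, int versionSeenAtOpened) {
        if (!node.deleteIfUnchanged("OPENED", versionSeenAtOpened)) {
            // The znode was moved on by someone else; leave RIT and ownership untouched.
            return false;
        }
        regionsInTransition.remove(region);
        regionOwner.put(region, server);
        return true;
    }
}
```

With this ordering, a stale handler for RS1 that arrives after the znode has moved to OFFLINE or OPENING returns without removing R1 from the regions in transition, so the later OPENED event from RS2 can still be processed.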





[jira] [Updated] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found

2011-10-07 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4547:
-

Attachment: 4547-part2.txt

Need this piece where we test existence before doing delete when updating a 
file.  TestFSTableDescriptors was failing.

 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547-part2.txt, 4547.txt


 I've been running tests before commit and found the following happens with 
 some regularity, sporadic of course, but they fail fairly frequently:
 {code}
 Failed tests:   
 testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
   testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but 
 was:<1>
   testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): 
 expected:<2> but was:<1>
 {code}
 Looking, it seems like we fail to find .tableinfo in the tests that modify 
 the table schema while the table is online.
 The update of a table schema just does an overwrite.  In the tests we 
 sometimes fail to find the newly written file, or we get an EOFException 
 reading it.
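
An in-place overwrite leaves a window in which a reader sees either no file or a partially written one, which matches the symptoms described. One common way to narrow that window is to write a sibling temporary file and then rename it over the target. The sketch below shows the idea on a local filesystem with java.nio.file. It is not necessarily what the attached patches do, HDFS rename semantics differ, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper illustrating write-then-rename; not HBase code. */
final class TableInfoWriterSketch {
    /** Writes content to a sibling temp file, then moves it over the target. */
    static void replaceFile(Path target, String content) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        try {
            // Whether an existing target is replaced here is implementation specific;
            // on POSIX filesystems this maps to rename(2), which replaces it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback is not atomic, but still never exposes a half-written target.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A reader that opens the target path sees either the old complete content or the new complete content, never a truncated file, as long as the atomic move is supported.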





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122920#comment-13122920
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
---

(Updated 2011-10-07 16:13:33.022073)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
---

If we do not want to compare the version of znode while deleting we can pass -2 
to the deleteNode api.
Uploaded the patch with the change.


Summary
---

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler 
scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
 1179945 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
 1179945 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
 1179945 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
 1179945 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2251/diff


Testing
---

Yes


Thanks,

ramkrishna




[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-10-07 Thread Tim Sell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122938#comment-13122938
 ] 

Tim Sell commented on HBASE-1744:
-

Is "run the HBase Thrift2 server" ok?

What do you mean by experience? Do you mean examples of usage in different 
languages?

 Thrift server to match the new java api.
 

 Key: HBASE-1744
 URL: https://issues.apache.org/jira/browse/HBASE-1744
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Tim Sell
Assignee: Tim Sell
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-1744.2.patch, HBASE-1744.3.patch, 
 HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.preview.1.patch, 
 thriftexperiment.patch


 This mutateRows, etc. is a little confusing compared to the new cleaner java 
 client.
 Thinking of ways to make a thrift client that is just as elegant. Something 
 like:
 void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
 with:
 struct TColumn {
   1:Bytes family,
   2:Bytes qualifier,
   3:i64 timestamp
 }
 struct TPut {
   1:Bytes row,
   2:map<TColumn, Bytes> values
 }
 This creates more verbose rpc than if the columns in TPut were just 
 map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
 still be intuitive from, say, python.
 Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122947#comment-13122947
 ] 

Ted Yu commented on HBASE-1744:
---

"Run the HBase Thrift2 server" is fine.
By experience, I meant that usage in different programming languages would be 
nice to share.





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122957#comment-13122957
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2431
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
https://reviews.apache.org/r/2251/#comment5524

Space between } and catch, please.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
https://reviews.apache.org/r/2251/#comment5525

Should we expose this constant as public ?
How about naming this constant DONT_COMPARE_VERSION or 
NO_VERSION_COMPARISON ?


- Ted



[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122973#comment-13122973
 ] 

Ted Yu commented on HBASE-4540:
---

@Ramkrishna:
ZKAssign.transitionNode() is already using -1 to indicate no version comparison.
Your patch @ 07/Oct/11 14:27 should be good.

Sorry for the confusion.
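
The convention being discussed, a single sentinel value that disables the version comparison, can be sketched as below. This is an in-memory illustration only: the class and the constant name ANY_VERSION are hypothetical, and the sketch does not reproduce ZKAssign or ZKUtil. ZooKeeper's own delete call treats -1 as "match any version", which is the behaviour modelled here.

```java
/** Hypothetical versioned node showing a "skip the version check" sentinel. */
final class VersionedNodeSketch {
    /** Sentinel meaning "do not compare versions"; -1 follows the convention mentioned above. */
    static final int ANY_VERSION = -1;

    private int version = 0;      // a freshly created node starts at version 0
    private boolean exists = true;

    /** Simulates a data update, which bumps the version. */
    void update() {
        version++;
    }

    /** Deletes the node if it exists and the expected version matches or is ANY_VERSION. */
    boolean delete(int expectedVersion) {
        if (!exists) {
            return false;
        }
        if (expectedVersion != ANY_VERSION && expectedVersion != version) {
            return false;
        }
        exists = false;
        return true;
    }
}
```

Because real versions start at 0 and only grow, any negative value is free to act as a sentinel; using one named constant keeps callers from inventing others.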





[jira] [Commented] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-07 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122974#comment-13122974
 ] 

Jonathan Hsieh commented on HBASE-4335:
---

@Lars some nits / suggestions on v3.

TestEndToEndSplitTransaction needs license.

Maybe more descriptive function names for phaseI, phaseII, phaseIII?

Any reason for the (overly?) general Class... instead of just taking a single 
Class and checking for null when no exceptions expected?  Or maybe just make 
'test' return boolean and assertTrue/assertFalse?






 Splits can create temporary holes in .META. that confuse clients and 
 regionservers
 --

 Key: HBASE-4335
 URL: https://issues.apache.org/jira/browse/HBASE-4335
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: Joe Pallas
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4335-v2.txt, 4335-v3.txt, 4335.txt


 When a SplitTransaction is performed, three updates are done to .META.:
 1. The parent region is marked as splitting (and hence offline)
 2. The first daughter region is added (same start key as parent)
 3. The second daughter region is added (split key is start key)
 (later, the original parent region is deleted, but that's not important to 
 this discussion)
 Steps 2 and 3 are actually done concurrently by 
 SplitTransaction.DaughterOpener threads.  While the master is notified when a 
 split is complete, the only visibility that clients have is whether the 
 daughter regions have appeared in .META.
 If the second daughter is added to .META. first, then .META. will contain the 
 (offline) parent region followed by the second daughter region.  If the 
 client looks up a key that is greater than (or equal to) the split, the 
 client will find the second daughter region and use it.  If the key is less 
 than the split key, the client will find the parent region and see that it is 
 offline, triggering a retry.
 If the first daughter is added to .META. before the second daughter, there is 
 a window during which .META. has a hole: the first daughter effectively hides 
 the parent region (same start key), but there is no entry for the second 
 daughter.  A region lookup will find the first daughter for all keys in the 
 parent's range, but the first daughter does not include keys at or beyond the 
 split key.
 See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
 suggestions for mitigating this in the client and regionserver.
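
The two orderings described above can be reproduced with a toy catalog. The sketch below is a simplification and not HBase code: it keys the catalog by start key in a TreeMap (the real .META. is keyed by region name), uses floorEntry as a stand-in for getClosestRowBefore, and all class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the catalog lookup; not HBase code. All names are hypothetical. */
final class MetaHoleSketch {
    /** A region covering [startKey, endKey); an empty endKey means "no upper bound". */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;
        final boolean offline;

        Region(String name, String startKey, String endKey, boolean offline) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.offline = offline;
        }

        boolean contains(String key) {
            return key.compareTo(startKey) >= 0
                && (endKey.isEmpty() || key.compareTo(endKey) < 0);
        }
    }

    // Keyed by start key, so a daughter with the parent's start key replaces (hides) the parent.
    private final TreeMap<String, Region> meta = new TreeMap<>();

    void add(Region r) {
        meta.put(r.startKey, r);
    }

    /** Closest entry at or before the key, with no end-key check. */
    Region locate(String key) {
        Map.Entry<String, Region> e = meta.floorEntry(key);
        return e == null ? null : e.getValue();
    }
}
```

With an offline parent covering everything and a split key of "m", adding only the first daughter makes locate("x") return that daughter even though it does not contain "x". Adding only the second daughter instead makes locate("x") correct and locate("c") return the offline parent, which a client would treat as a reason to retry.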





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122992#comment-13122992
 ] 

ramkrishna.s.vasudevan commented on HBASE-4540:
---

If a node exists, its version starts from 0. 
Thanks, Ted, for the confirmation.  I will wait one day for further reviews 
and make changes accordingly; if there are none, I will go with the patch @ 
07/Oct/11 14:27.
I will take care of the space between } and catch.





[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122994#comment-13122994
 ] 

Ted Yu commented on HBASE-4536:
---

If the number of columns for the underlying family is not huge, the option of 
translating the family delete marker is attractive.

 Allow CF to retain deleted rows
 ---

 Key: HBASE-4536
 URL: https://issues.apache.org/jira/browse/HBASE-4536
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0


 Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
 of versions.
 However, if a client deletes a row, all versions older than the delete 
 tombstone will be removed at the next major compaction (and even at memstore 
 flush - see HBASE-4241).
 There should be a way to retain those versions to guard against software error.
 I see two options here:
 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED.
 2. Fold this into the parent change, i.e. keep the minimum number of 
 versions even past the delete marker.
 #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
 (from a user viewpoint).
 Comments? Any other options?





[jira] [Commented] (HBASE-4333) Client does not check for holes in .META.

2011-10-07 Thread Joe Pallas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123032#comment-13123032
 ] 

Joe Pallas commented on HBASE-4333:
---

Are you satisfied that this is not an issue for scanners?  If so, I'm okay with 
closing this.

 Client does not check for holes in .META.
 -

 Key: HBASE-4333
 URL: https://issues.apache.org/jira/browse/HBASE-4333
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Joe Pallas

 If there is a temporary hole in .META., the client may get the wrong region 
 from HConnection.locateRegion.  
 HConnectionManager.HConnectionImplementation.locateRegionInMeta should check 
 the end key of the region found with getClosestRowBefore, just as it checks 
 the offline status, when it looks at the region info.
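
The suggested check can be sketched in isolation: after finding the closest entry at or before the row, compare the row against that entry's end key before trusting the result. The code below is an illustration under simplifying assumptions, not the HConnectionManager implementation. It keys a TreeMap by region start key, and its class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical lookup that validates the end key of the region it finds. */
final class CheckedLocateSketch {
    /** Maps region start key to region end key; "" as an end key means unbounded. */
    private final TreeMap<String, String> endKeyByStartKey = new TreeMap<>();

    void addRegion(String startKey, String endKey) {
        endKeyByStartKey.put(startKey, endKey);
    }

    /** Returns the start key of the region containing rowKey, or empty if there is a hole. */
    Optional<String> locate(String rowKey) {
        Map.Entry<String, String> e = endKeyByStartKey.floorEntry(rowKey);
        if (e == null) {
            return Optional.empty();
        }
        String endKey = e.getValue();
        boolean inRange = endKey.isEmpty() || rowKey.compareTo(endKey) < 0;
        return inRange ? Optional.of(e.getKey()) : Optional.empty();
    }
}
```

An empty result would be handled like the offline case: the caller retries the lookup rather than sending the request to a region that cannot hold the row.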





[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.

2011-10-07 Thread Tim Sell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Sell updated HBASE-1744:


Attachment: HBASE-1744.6.patch

Added a new patch, with a few more tests and an incomplete, trivial Java example.

 Thrift server to match the new java api.
 

 Key: HBASE-1744
 URL: https://issues.apache.org/jira/browse/HBASE-1744
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Tim Sell
Assignee: Tim Sell
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-1744.2.patch, HBASE-1744.3.patch, 
 HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
 HBASE-1744.preview.1.patch, thriftexperiment.patch


 This mutateRows, etc.. is a little confusing compared to the new cleaner java 
 client.
 Thinking of ways to make a thrift client that is just as elegant. something 
 like:
 void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
 with:
 struct TColumn {
   1:Bytes family,
   2:Bytes qualifier,
   3:i64 timestamp
 }
 struct TPut {
   1:Bytes row,
   2:map<TColumn, Bytes> values
 }
 This creates more verbose rpc than if the columns in TPut were just 
 map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
 still be intuitive from say python.
 Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123059#comment-13123059
 ] 

jirapos...@reviews.apache.org commented on HBASE-4377:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/
---

(Updated 2011-10-07 18:46:54.806909)


Review request for hbase, Michael Stack and Andrew Purtell.


Changes
---

Updates with nits and separated tests into different classes so that we can 
rely on new jvms to avoid OO file handle errors intermittently encountered when 
shutting down and restarting mini clusters.


Summary
---

commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh j...@cloudera.com
Date:   Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only


This patch rebuilds a new .META. table by reading all the .regioninfo files in 
the hbase main directory.  It depends on the yet to be committed HBASE-4515 
(either my version or Gary's version), HBASE-4509, and HBASE-4506.  

Some follow on work includes backporting to 0.90, auto-patching true holes, and 
adding documentation.


This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 154ac32 
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2126/diff


Testing
---

An earlier version of this code (backported to 0.90) was used to diagnose and 
repair a cluster that had 2700 inconsistencies due to failed splits (the 
cluster was underprovisioned memory-wise, and on restart, some regions 
would start splitting and then die due to OOMEs).  This was not actually used 
on a live cluster -- it was used to reconstruct a .META. from .regioninfo's 
laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, 
this bails out but dumps info on holes, suggests some fixes, and displays sets 
of overlapping regions.  It is up to the user to merge regions, to create 
.regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line 
up, and fail without making modifications if holes or overlaps exist.
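For illustration, the "bail out on holes or overlaps" check can be sketched as below. This is only a sketch under stated assumptions: keys are plain strings instead of byte arrays, an empty end key means "to the end of the table", and the class and method names are hypothetical rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; not the HBaseFsck / OfflineMetaRepair code. */
final class RegionChainCheck {
    private RegionChainCheck() {}

    /** A region covering [start, end); an empty end means "to the end of the table". */
    static final class Range {
        final String start;
        final String end;
        Range(String start, String end) { this.start = start; this.end = end; }
    }

    /** Returns one message per hole or overlap; an empty list means the ranges line up. */
    static List<String> findProblems(List<Range> regions) {
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));

        List<String> problems = new ArrayList<>();
        String coveredUpTo = "";       // the table starts at the empty key
        boolean coveredToEnd = false;  // set once a region with an empty end key is seen

        for (Range r : sorted) {
            if (coveredToEnd || r.start.compareTo(coveredUpTo) < 0) {
                problems.add("overlap: region starting at '" + r.start + "'");
            } else if (r.start.compareTo(coveredUpTo) > 0) {
                problems.add("hole: ['" + coveredUpTo + "', '" + r.start + "')");
            }
            if (r.end.isEmpty()) {
                coveredToEnd = true;
            } else if (r.end.compareTo(coveredUpTo) > 0) {
                coveredUpTo = r.end;
            }
        }
        if (!coveredToEnd) {
            problems.add("hole: ['" + coveredUpTo + "', end of table)");
        }
        return problems;
    }
}
```

A rebuild along these lines would write the new .META. only when findProblems returns an empty list, and otherwise print the messages and leave everything untouched.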


Thanks,

jmhsieh



 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 hbase-4377-trunk.v2.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123058#comment-13123058
 ] 

jirapos...@reviews.apache.org commented on HBASE-4377:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2126/
---

(Updated 2011-10-07 18:47:01.208741)


Review request for hbase, Michael Stack and Andrew Purtell.


Summary
---

commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh j...@cloudera.com
Date:   Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only


This patch rebuilds a new .META. table by reading all the .regioninfo files in 
the hbase main directory.  It depends on the yet to be committed HBASE-4515 
(either my version or Gary's version), HBASE-4509, and HBASE-4506.  

Some follow on work includes backporting to 0.90, auto-patching true holes, and 
adding documentation.


This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377


Diffs
-

  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b9c850d 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 154ac32 
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f5be448 
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2126/diff


Testing
---

An earlier version of this code (backported to 0.90) was used to diagnose and 
repair a cluster that had 2700 inconsistencies due to failed splits (the 
cluster was underprovisioned memory-wise, and on restart, some regions 
would start splitting and then die due to OOMEs).  This was not actually used 
on a live cluster -- it was used to reconstruct a .META. from .regioninfo's 
laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, 
this bails out but dumps info on holes, suggests some fixes, and displays sets 
of overlapping regions.  It is up to the user to merge regions, to create 
.regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line 
up, and fail without making modifications if holes or overlaps exist.


Thanks,

jmhsieh



 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 hbase-4377-trunk.v2.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123080#comment-13123080
 ] 

jirapos...@reviews.apache.org commented on HBASE-4377:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2287/
---

Review request for hbase and Ted Yu.


Summary
---

Backport to 0.90

commit 89862b73c6358e27220b87b0362599d86ab0fe4a
Author: Jonathan Hsieh j...@cloudera.com
Date:   Wed Sep 28 10:18:11 2011 -0700

HBASE-4377 [hbck] Offline rebuild .META. from fs data only



This addresses bug HBASE-4377.
https://issues.apache.org/jira/browse/HBASE-4377


Diffs
-

  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java ef246c3 
  src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 
  src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2287/diff


Testing
---

Note, the assertion test result is different in the failure cases due to 
HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty 
meta, trunk branch looks at hdfs dirs, and returns 1).

This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) is 
applied. 

I believe if that patch is not applied, I could modify the test code to force 
some explicit HConnection deletions.


Thanks,

jmhsieh



 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 hbase-4377-trunk.v2.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.





[jira] [Created] (HBASE-4554) Allow set/unset arbitrary table attributes from shell.

2011-10-07 Thread Mingjie Lai (Created) (JIRA)
Allow set/unset arbitrary table attributes from shell.
--

 Key: HBASE-4554
 URL: https://issues.apache.org/jira/browse/HBASE-4554
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Mingjie Lai
Assignee: Mingjie Lai
 Fix For: 0.92.0


A table/region-level coprocessor (RegionObserver) can be configured by 
setting an HTD attribute whose name matches Coprocessor$*. 

The current shell command (alter) does not support setting or unsetting 
arbitrary table attributes. We need this in order to configure region-level 
coprocessors for a table. 

Proposed new shell:
{code}
hbase shell > alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 
'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'

hbase shell > describe 't1'
 {NAME => 't1', COPROCESSOR$1 => 
'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => 
'134217728', ...}

hbase shell > alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1

hbase shell > describe 't1'
 {NAME => 't1', MAX_FILESIZE => '134217728', ...}
{code}
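As a rough illustration of the attribute value used in the example above, the sketch below splits it on '|' into four fields. The field meanings (jar path, class name, priority, arguments) and the requirement that the priority be an integer are assumptions of this sketch, and the class name CoprocessorSpec is hypothetical; HBase's own parsing may differ.

```java
/** Illustrative sketch only; HBase's own parsing may differ. */
final class CoprocessorSpec {
    final String jarPath;
    final String className;
    final int priority;
    final String args;

    private CoprocessorSpec(String jarPath, String className, int priority, String args) {
        this.jarPath = jarPath;
        this.className = className;
        this.priority = priority;
        this.args = args;
    }

    /** Parses a value shaped like "path|class|priority|args" (assumed layout). */
    static CoprocessorSpec parse(String value) {
        // A limit of -1 keeps the trailing empty field after the last '|'.
        String[] parts = value.split("\\|", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException(
                    "expected 4 '|'-separated fields, got " + parts.length);
        }
        return new CoprocessorSpec(parts[0].trim(), parts[1].trim(),
                Integer.parseInt(parts[2].trim()), parts[3].trim());
    }
}
```

For the value shown in the proposal, parse would yield the jar path hdfs://cp/foo.jar, the class org.apache.hadoop.hbase.sample, priority 1 and an empty argument string.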





[jira] [Commented] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123123#comment-13123123
 ] 

Ted Yu commented on HBASE-4335:
---

How about calling the first phase createDaughtersPhase, second 
openDaughtersPhase and the third phase transitionZKNodePhase ?

 Splits can create temporary holes in .META. that confuse clients and 
 regionservers
 --

 Key: HBASE-4335
 URL: https://issues.apache.org/jira/browse/HBASE-4335
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: Joe Pallas
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4335-v2.txt, 4335-v3.txt, 4335.txt


 When a SplitTransaction is performed, three updates are done to .META.:
 1. The parent region is marked as splitting (and hence offline)
 2. The first daughter region is added (same start key as parent)
 3. The second daughter region is added (split key is start key)
 (later, the original parent region is deleted, but that's not important to 
 this discussion)
 Steps 2 and 3 are actually done concurrently by 
 SplitTransaction.DaughterOpener threads.  While the master is notified when a 
 split is complete, the only visibility that clients have is whether the 
 daughter regions have appeared in .META.
 If the second daughter is added to .META. first, then .META. will contain the 
 (offline) parent region followed by the second daughter region.  If the 
 client looks up a key that is greater than (or equal to) the split, the 
 client will find the second daughter region and use it.  If the key is less 
 than the split key, the client will find the parent region and see that it is 
 offline, triggering a retry.
 If the first daughter is added to .META. before the second daughter, there is 
 a window during which .META. has a hole: the first daughter effectively hides 
 the parent region (same start key), but there is no entry for the second 
 daughter.  A region lookup will find the first daughter for all keys in the 
 parent's range, but the first daughter does not include keys at or beyond the 
 split key.
 See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
 suggestions for mitigating this in the client and regionserver.
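 The ordering problem described above can be reproduced with a toy model. The sketch below assumes .META. behaves like a sorted map keyed by region start key (so the first daughter replaces the parent entry, which approximates "hides") and that a lookup returns the closest entry at or before the row. All names are hypothetical and keys are non-empty strings for simplicity.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the split window; not HBase code. */
final class MetaHoleDemo {
    private MetaHoleDemo() {}

    static final class Region {
        final String name;
        final String startKey;
        final String endKey;
        final boolean offline;
        Region(String name, String startKey, String endKey, boolean offline) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
            this.offline = offline;
        }
    }

    /** Mimics a "closest row at or before" lookup keyed by region start key. */
    static Region locate(TreeMap<String, Region> meta, String row) {
        Map.Entry<String, Region> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** True if row lies in the region's [startKey, endKey) range. */
    static boolean covers(Region r, String row) {
        return row.compareTo(r.startKey) >= 0 && row.compareTo(r.endKey) < 0;
    }
}
```

 With an offline parent [a, z) in the map, adding the second daughter [m, z) first means a lookup for "q" finds that daughter and a lookup for "c" finds the offline parent, which triggers a retry. Adding the first daughter [a, m) first means a lookup for "q" returns a region for which covers is false, which is the hole.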





[jira] [Updated] (HBASE-4551) Small fixes to compile against 0.23-SNAPSHOT

2011-10-07 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-4551:
---

Attachment: hbase-4551.txt

new revision also removes the places where we set the lease timeouts in the NN 
to non-standard values. Now that we use the recoverLease API instead of the 
appendFile API, we don't need to do this. I ran the modified tests and they 
still pass on 0.20.

 Small fixes to compile against 0.23-SNAPSHOT
 

 Key: HBASE-4551
 URL: https://issues.apache.org/jira/browse/HBASE-4551
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.92.0

 Attachments: hbase-4551.txt, hbase-4551.txt


 - fix pom.xml to properly pull the test artifacts
 - fix TestHLog to not use the private cluster.getNameNode() API





[jira] [Commented] (HBASE-4547) TestAdmin failing in 0.92 because .tableinfo not found

2011-10-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123175#comment-13123175
 ] 

Hudson commented on HBASE-4547:
---

Integrated in HBase-0.92 #51 (See 
[https://builds.apache.org/job/HBase-0.92/51/])
HBASE-4547 TestAdmin failing in 0.92 because .tableinfo not found

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java


 TestAdmin failing in 0.92 because .tableinfo not found
 --

 Key: HBASE-4547
 URL: https://issues.apache.org/jira/browse/HBASE-4547
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4547-part2.txt, 4547.txt


 I've been running tests before commit and found that the following failures 
 are sporadic but happen fairly frequently:
 {code}
 Failed tests:   
 testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
   testForceSplit(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but was:<1>
   testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but was:<1>
 {code}
 Looking, it seems like we fail to find .tableinfo in the tests that modify 
 table schema while table is online.
 The update of a table schema just does an overwrite.  In the tests we 
 sometimes fail to find the newly written file or we get an EOFException reading it.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123197#comment-13123197
 ] 

jirapos...@reviews.apache.org commented on HBASE-4377:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2287/#review2440
---



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5546

Minor suggestion: IOException may occur more than once. Would logging all 
such IOException's before bailing out make user experience better ?
Basically we just need to track the last such IOException in a variable and 
bail out at line 283 if the variable isn't null.
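A minimal sketch of that suggestion, assuming the per-item work can be modelled as tasks that may throw IOException. The IoTask interface, the class name, and the use of java.util.logging are stand-ins chosen for this example and are not what HBaseFsck uses.

```java
import java.io.IOException;
import java.util.List;
import java.util.logging.Logger;

/** Illustrative sketch only; names are hypothetical. */
final class DeferredFailure {
    private static final Logger LOG = Logger.getLogger(DeferredFailure.class.getName());

    private DeferredFailure() {}

    /** A unit of work that may fail with an IOException. */
    interface IoTask {
        void run() throws IOException;
    }

    /** Runs every task, logs each IOException, and rethrows the last one at the end. */
    static void runAll(List<IoTask> tasks) throws IOException {
        IOException last = null;
        for (IoTask task : tasks) {
            try {
                task.run();
            } catch (IOException e) {
                LOG.warning("task failed: " + e.getMessage());
                last = e;  // remember it, but keep going so the user sees every failure
            }
        }
        if (last != null) {
            throw last;    // bail out only after all failures have been logged
        }
    }
}
```

The trade-off is that later tasks still run after an earlier failure, which is only appropriate when the tasks are independent and read-only.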



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5545

Naming rd as rootdir would make the code more readable.



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5548

I think rebuildMeta() should check the return value from generatePuts().
Otherwise we would encounter NPE at line 405 below.
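The guard being asked for might look like the sketch below. The signatures are assumptions: generatePuts() is modelled as a Supplier that may return null, and the puts are plain strings. The real methods in HBaseFsck have different parameters and types.

```java
import java.util.List;
import java.util.function.Supplier;

/** Illustrative sketch only; signatures are hypothetical. */
final class NullGuardSketch {
    private NullGuardSketch() {}

    /**
     * Returns false without touching metaSink when no puts could be generated,
     * instead of dereferencing a null list later on.
     */
    static boolean rebuildMeta(Supplier<List<String>> generatePuts, List<String> metaSink) {
        List<String> puts = generatePuts.get();
        if (puts == null) {
            return false;  // generation failed; report failure to the caller
        }
        metaSink.addAll(puts);
        return true;
    }
}
```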



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5549

Do you plan to add this logic in another JIRA ?



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5550

false should be returned if puts is null.



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/2287/#comment5552

I think LOG.info() should be used here.


- Ted


On 2011-10-07 19:04:44, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2287/
bq.  ---
bq.  
bq.  (Updated 2011-10-07 19:04:44)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Backport to 0.90
bq.  
bq.  commit 89862b73c6358e27220b87b0362599d86ab0fe4a
bq.  Author: Jonathan Hsieh j...@cloudera.com
bq.  Date:   Wed Sep 28 10:18:11 2011 -0700
bq.  
bq.  HBASE-4377 [hbck] Offline rebuild .META. from fs data only
bq.  
bq.  
bq.  
bq.  This addresses bug HBASE-4377.
bq.  https://issues.apache.org/jira/browse/HBASE-4377
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 
ef246c3 
bq.src/main/java/org/apache/hadoop/hbase/util/Bytes.java 13ad026 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b04aab6 
bq.src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java f792720 
bq.src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2287/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Note, the assertion test result is different in the failure cases due to 
HBASE-451 changes. (0.90 returns 0 tables since it does a meta scan on empty 
meta, trunk branch looks at hdfs dirs, and returns 1).
bq.  
bq.  This version passes after HBASE-4508 (backport HBASE-3777 to 0.90 branch) 
is applied. 
bq.  
bq.  I believe if that patch is not applied, I could modify the test code to 
force some explicit HConnection deletions.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.



 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 hbase-4377-trunk.v2.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.


[jira] [Commented] (HBASE-4554) Allow set/unset arbitrary table attributes from shell.

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123241#comment-13123241
 ] 

Ted Yu commented on HBASE-4554:
---

This would be a useful feature.
COPROCESSOR$1 needs to be defined as a constant if we follow example from 
src/main/ruby/hbase/admin.rb:
{code}
   if method == "table_att"
     htd.setMaxFileSize(JLong.valueOf(arg[MAX_FILESIZE])) if arg[MAX_FILESIZE]
{code}
I think the table_att method targets known table attributes.
For HBASE-4554, we can introduce a new method, e.g. table_dyn_att, which accepts 
two parameters, KEY and VALUE:
{code}
hbase> alter 't1', {METHOD => 'table_dyn_att', KEY => 'COPROCESSOR$1', VALUE => 
'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|' }
{code}

 Allow set/unset arbitrary table attributes from shell.
 --

 Key: HBASE-4554
 URL: https://issues.apache.org/jira/browse/HBASE-4554
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Mingjie Lai
Assignee: Mingjie Lai
 Fix For: 0.92.0


 Table/region level coprocessor -- RegionObserver -- can be configured by 
 setting a HTD's attribute which matches Coprocessor$*. 
 The current shell command (alter) does not support setting or unsetting 
 arbitrary table attributes. We need this in order to configure region-level 
 coprocessors for a table. 
 Proposed new shell:
 {code}
 hbase shell > alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 
 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'
 hbase shell > describe 't1'
  {NAME => 't1', COPROCESSOR$1 => 
 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => 
 '134217728', ...}
 hbase shell > alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1
 hbase shell > describe 't1'
  {NAME => 't1', MAX_FILESIZE => '134217728', ...}
 {code}





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123254#comment-13123254
 ] 

Ted Yu commented on HBASE-1744:
---

src/examples/thrift2/DemoClient.java needs a license header.


 Thrift server to match the new java api.
 

 Key: HBASE-1744
 URL: https://issues.apache.org/jira/browse/HBASE-1744
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Tim Sell
Assignee: Tim Sell
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-1744.2.patch, HBASE-1744.3.patch, 
 HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
 HBASE-1744.preview.1.patch, thriftexperiment.patch


 This mutateRows, etc.. is a little confusing compared to the new cleaner java 
 client.
 Thinking of ways to make a thrift client that is just as elegant. something 
 like:
 void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
 with:
 struct TColumn {
   1:Bytes family,
   2:Bytes qualifier,
   3:i64 timestamp
 }
 struct TPut {
   1:Bytes row,
   2:map<TColumn, Bytes> values
 }
 This creates more verbose rpc than if the columns in TPut were just 
 map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
 still be intuitive from say python.
 Presumably the goal of a thrift gateway is to be easy first.





[jira] [Updated] (HBASE-4549) Add thrift API to read version and build date of HBase

2011-10-07 Thread Song Liu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Song Liu updated HBASE-4549:


Attachment: patch-hbase-4549.txt

The patch passes the new test in TestThriftServer. 



 Add thrift API to read version and build date of HBase 
 ---

 Key: HBASE-4549
 URL: https://issues.apache.org/jira/browse/HBASE-4549
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Song Liu
Priority: Minor
 Attachments: patch-hbase-4549.txt

   Original Estimate: 2h
  Remaining Estimate: 2h

 Adding an API to get the HBase server version and build date will help 
 clients communicate correctly with different versions of the server. 
 The VersionInfo class can be reused to provide the required information. 





[jira] [Created] (HBASE-4555) TestShell seems passed, but actually errors seen in test output file

2011-10-07 Thread Mingjie Lai (Created) (JIRA)
TestShell seems passed, but actually errors seen in test output file


 Key: HBASE-4555
 URL: https://issues.apache.org/jira/browse/HBASE-4555
 Project: HBase
  Issue Type: Test
  Components: test
Reporter: Mingjie Lai


While writing test cases for HBASE-4554, I noticed that TestShell appears to 
pass even though the output file contains error messages.

{code}
---
 T E S T S
---
Running org.apache.hadoop.hbase.client.TestShell

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.252 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
{code}

Error messages in org.apache.hadoop.hbase.client.TestShell-output.txt:

{code}
...
  6) Error:
test_alter_should_support_shortcut_DELETE_alter_specs(Hbase::AdminAlterTableTest):
ArgumentError: There should be at least one argument but the table name

/home/mlai/git/hbase-private/src/test/ruby/../../main/ruby/hbase/admin.rb:307:in
 `alter'
./src/test/ruby/hbase/admin_test.rb:271:in 
`test_alter_should_support_shortcut_DELETE_alter_specs'
org/jruby/RubyProc.java:268:in `call'
org/jruby/RubyKernel.java:2038:in `send'
org/jruby/RubyArray.java:1572:in `each'
org/jruby/RubyArray.java:1572:in `each'

  7) Error:
test_split_should_work(Hbase::AdminMethodsTest):
ArgumentError: wrong number of arguments (1 for 2)
./src/test/ruby/hbase/admin_test.rb:99:in `test_split_should_work'
org/jruby/RubyProc.java:268:in `call'
org/jruby/RubyKernel.java:2038:in `send'
org/jruby/RubyArray.java:1572:in `each'
org/jruby/RubyArray.java:1572:in `each'

192 tests, 259 assertions, 1 failures, 6 errors
Done with tests! Shutting down the cluster...
2011-10-07 16:46:14,760 INFO  [main] hbase.HBaseTestingUtility(551): Shutting 
down minicluster
2011-10-07 16:46:14,760 DEBUG [main] util.JVMClusterUtil(214): Shutting down 
HBase Cluster
{code}
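One possible way to surface the problem is to scan the captured output for the Ruby summary line and fail when it reports failures or errors. The sketch below assumes the summary keeps the "N tests, N assertions, N failures, N errors" shape shown in the output above; the class and method names are hypothetical and this is not how TestShell is implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not part of TestShell. */
final class RubyTestSummary {
    private static final Pattern SUMMARY = Pattern.compile(
            "(\\d+) tests, (\\d+) assertions, (\\d+) failures, (\\d+) errors");

    private RubyTestSummary() {}

    /** Returns failures + errors from the last summary line found, or -1 if there is none. */
    static int problemCount(String output) {
        Matcher m = SUMMARY.matcher(output);
        int problems = -1;
        while (m.find()) {
            problems = Integer.parseInt(m.group(3)) + Integer.parseInt(m.group(4));
        }
        return problems;
    }
}
```

A wrapper test could then assert that problemCount returns 0 and treat -1 (no summary found) as a failure as well.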





[jira] [Updated] (HBASE-4554) Allow set/unset coprocessor table attributes from shell.

2011-10-07 Thread Mingjie Lai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingjie Lai updated HBASE-4554:
---

Summary: Allow set/unset coprocessor table attributes from shell.  (was: 
Allow set/unset arbitrary table attributes from shell.)

Renamed the jira title from ``Allow set/unset arbitrary table attributes from 
shell.'' to ``Allow set/unset coprocessor table attributes from shell.''.

 Allow set/unset coprocessor table attributes from shell.
 

 Key: HBASE-4554
 URL: https://issues.apache.org/jira/browse/HBASE-4554
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Mingjie Lai
Assignee: Mingjie Lai
 Fix For: 0.92.0


 Table/region level coprocessors -- RegionObservers -- can be configured by 
 setting an HTableDescriptor (HTD) attribute whose name matches Coprocessor$*.
 The current shell command -- alter -- cannot set or unset arbitrary table 
 attributes. We need this in order to configure region level coprocessors on a 
 table.
 Proposed new shell:
 {code}
 hbase shell> alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'
 hbase shell> describe 't1'
  {NAME => 't1', COPROCESSOR$1 => 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => '134217728', ...}
 hbase shell> alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1
 hbase shell> describe 't1'
  {NAME => 't1', MAX_FILESIZE => '134217728', ...}
 {code}
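
The attribute value in the proposal packs several fields into one string separated by `|`. The sketch below splits such a value, assuming the layout is `jar path|class name|priority|arguments` as the example suggests; the struct, its field names and `parse_coprocessor_spec` are hypothetical and not part of the HBase shell.

```ruby
# Hypothetical parser for a coprocessor attribute value; the field names are
# assumptions based on the example value in the proposal.
CoprocessorSpec = Struct.new(:jar_path, :class_name, :priority, :args)

def parse_coprocessor_spec(value)
  # A negative limit keeps the trailing empty field after the last '|'.
  jar_path, class_name, priority, args = value.split('|', -1)
  if class_name.nil? || class_name.strip.empty?
    raise ArgumentError, "class name is required: #{value.inspect}"
  end
  CoprocessorSpec.new(jar_path.strip,
                      class_name.strip,
                      priority.to_s.strip.empty? ? nil : Integer(priority.strip),
                      args.to_s)
end

spec = parse_coprocessor_spec('hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|')
puts spec.jar_path
puts spec.class_name
puts spec.priority
```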





[jira] [Commented] (HBASE-4554) Allow set/unset coprocessor table attributes from shell.

2011-10-07 Thread Mingjie Lai (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123315#comment-13123315
 ] 

Mingjie Lai commented on HBASE-4554:


@Ted. I changed the title so this jira only deals with coprocessor-related HTD 
attributes (no longer arbitrary ones). I still prefer using the existing 
table_att method so we can add/change multiple attributes at one time.

{code}
alter 't1', METHOD => 'table_att', 'COPROCESSOR$1' => 'cp1', 'COPROCESSOR$2' => 'cp2'
{code}

Code change would be something like: 
{code}
--- a/src/main/ruby/hbase/admin.rb
+++ b/src/main/ruby/hbase/admin.rb
@@ -359,6 +359,16 @@ module Hbase
   htd.setReadOnly(JBoolean.valueOf(arg[READONLY])) if arg[READONLY]
   htd.setMemStoreFlushSize(JLong.valueOf(arg[MEMSTORE_FLUSHSIZE])) if arg[MEMSTORE_FLUSHSIZE]
   htd.setDeferredLogFlush(JBoolean.valueOf(arg[DEFERRED_LOG_FLUSH])) if arg[DEFERRED_LOG_FLUSH]
+
+  # set a coprocessor attribute
+  if arg.kind_of?(Hash)
+    arg.each do |key, value|
+      k = String.new(key) # prepare to strip
+      k.strip!
+      htd.setValue(k, value) if (k =~ /coprocessor\$[0-9]*/i)
+    end
+  end
+
{code}
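
Note that the pattern in the sketch above is unanchored and accepts zero digits, so a key such as `MY_COPROCESSOR$` would also match. A stricter variant might look like the following; it works on a plain Hash rather than an HTableDescriptor, and the names `COPROCESSOR_KEY` and `coprocessor_attributes` are hypothetical.

```ruby
# Anchored variant of the key test: the whole key must be COPROCESSOR$<digits>.
COPROCESSOR_KEY = /\Acoprocessor\$[0-9]+\z/i

# Hypothetical helper: pick the coprocessor attributes out of the argument
# hash passed to alter, stripping surrounding whitespace from the keys.
def coprocessor_attributes(arg)
  return {} unless arg.is_a?(Hash)
  selected = {}
  arg.each do |key, value|
    k = key.to_s.strip
    selected[k] = value if k =~ COPROCESSOR_KEY
  end
  selected
end

args = {
  'METHOD' => 'table_att',
  ' COPROCESSOR$1 ' => 'cp1',
  'coprocessor$2' => 'cp2',
  'MY_COPROCESSOR$' => 'ignored'
}
p coprocessor_attributes(args)
```

Only the two well-formed keys survive the filter; the caller would then pass each pair to `htd.setValue` as in the patch.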






[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123330#comment-13123330
 ] 

jirapos...@reviews.apache.org commented on HBASE-4218:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2308/
---

Review request for hbase.


Summary
---

Delta encoding for key values.


This addresses bug HBASE-4218.
https://issues.apache.org/jira/browse/HBASE-4218


Diffs
-

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockDeltaEncoder.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 1180113
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1180113

[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread Jacek Migdal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Migdal updated HBASE-4218:


Affects Version/s: 0.94.0
   Status: Patch Available  (was: Open)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
  Labels: compression

 A compression scheme for keys. Keys are sorted in an HFile and are usually 
 very similar. Because of that, it is possible to design better compression 
 than general purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length = ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 This comes with much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to the uncompressed buffer in HFileBlock will have bad 
 performance
 - extend comparators to support comparison assuming that the first N bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
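
The prefix compression part of the idea can be illustrated in a few lines. This is only a sketch of the general technique on plain byte strings, not the encoders in the attached patch: each key is stored as the number of leading bytes it shares with the previous key plus the remaining suffix. All method names are made up for the illustration.

```ruby
# Number of leading bytes that two strings have in common.
def common_prefix_length(a, b)
  limit = [a.bytesize, b.bytesize].min
  i = 0
  i += 1 while i < limit && a.getbyte(i) == b.getbyte(i)
  i
end

# Encode sorted keys as [shared_prefix_length, suffix] pairs.
def prefix_encode(sorted_keys)
  previous = ''.b
  sorted_keys.map do |key|
    key = key.b
    shared = common_prefix_length(previous, key)
    previous = key
    [shared, key.byteslice(shared, key.bytesize - shared)]
  end
end

# Rebuild the keys by reusing the shared prefix of the previous key.
def prefix_decode(entries)
  previous = ''.b
  entries.map do |shared, suffix|
    previous = previous.byteslice(0, shared) + suffix
  end
end

keys = ['row0001/cf:count', 'row0001/cf:total', 'row0002/cf:count']
encoded = prefix_encode(keys)
p encoded
p prefix_decode(encoded) == keys
```

Because the keys are sorted, most entries share a long prefix with their predecessor, which is where the savings come from. A comparator that knows the shared length can also skip those bytes when comparing neighbouring keys.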





[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread Jacek Migdal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Migdal updated HBASE-4218:


Status: Open  (was: Patch Available)






[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread Jacek Migdal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacek Migdal updated HBASE-4218:


Attachment: open-source.diff

Delta encoding source code.

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
  Labels: compression
 Attachments: open-source.diff







[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread Jacek Migdal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123337#comment-13123337
 ] 

Jacek Migdal commented on HBASE-4218:
-

Performance results on production data.

 CopyKeyDeltaEncoder:
   Compression performance:   1136.33 MB/s (+/- 60.91 MB/s)
   Decompression performance: 373.29 MB/s (+/- 281.22 MB/s)
 BitsetKeyDeltaEncoder:
   Compression performance:   147.57 MB/s (+/- 0.58 MB/s)
   Decompression performance: 166.78 MB/s (+/- 54.81 MB/s)
 PrefixKeyDeltaEncoder:
   Compression performance:   293.94 MB/s (+/- 1.97 MB/s)
   Decompression performance: 233.61 MB/s (+/- 91.97 MB/s)
 FastDiffDeltaEncoder:
   Compression performance:   203.47 MB/s (+/- 0.37 MB/s)
   Decompression performance: 196.77 MB/s (+/- 43.22 MB/s)
 DiffKeyDeltaEncoder:
   Compression performance:   187.74 MB/s (+/- 0.24 MB/s)
   Decompression performance: 163.13 MB/s (+/- 12.17 MB/s)
 LZO:
   Compression performance:   260.35 MB/s (+/- 0.76 MB/s)
   Decompression performance: 173.45 MB/s (+/- 76.13 MB/s)
 CopyKeyDeltaEncoder
   Saved bytes:           -4
   Key compression ratio: -0.00 %
   All compression ratio: -0.00 %
   LZO compressed size:   152019
   LZO compression ratio: 85.79 %
 BitsetKeyDeltaEncoder
   Saved bytes:           747061
   Key compression ratio: 75.46 %
   All compression ratio: 69.82 %
   LZO compressed size:   124438
   LZO compression ratio: 88.37 %
 PrefixKeyDeltaEncoder
   Saved bytes:           831602
   Key compression ratio: 84.00 %
   All compression ratio: 77.72 %
   LZO compressed size:   117285
   LZO compression ratio: 89.04 %
 FastDiffDeltaEncoder
   Saved bytes:           935275
   Key compression ratio: 94.47 %
   All compression ratio: 87.41 %
   LZO compressed size:   94360
   LZO compression ratio: 91.18 %
 DiffKeyDeltaEncoder
   Saved bytes:           909175
   Key compression ratio: 91.84 %
   All compression ratio: 84.97 %
   LZO compressed size:   96597
   LZO compression ratio: 90.97 %
 Total KV prefix length: 8
 Total key length:       91
 Total key redundancy:   781606
 Total value length:     8
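
For readers checking the figures above: the ratios are consistent with a sample of roughly 990,000 key bytes and 1,070,000 total bytes, which would correspond to 10,000 KeyValues of 91 + 8 key and prefix bytes plus 8 value bytes each. These totals are inferred from the reported numbers, not stated in the comment, so treat the sketch below as an assumption-laden cross-check.

```ruby
# Assumed totals, inferred from the reported ratios (not given in the comment).
KEY_BYTES = 990_000     # 10,000 * (91 key + 8 prefix) bytes
ALL_BYTES = 1_070_000   # 10,000 * (91 key + 8 prefix + 8 value) bytes

def percent(part, whole)
  (100.0 * part / whole).round(2)
end

saved = 747_061      # BitsetKeyDeltaEncoder "Saved bytes"
lzo_size = 124_438   # BitsetKeyDeltaEncoder "LZO compressed size"

puts percent(saved, KEY_BYTES)                 # key compression ratio
puts percent(saved, ALL_BYTES)                 # all compression ratio
puts percent(ALL_BYTES - lzo_size, ALL_BYTES)  # LZO compression ratio
```

Under these assumptions the three printed values match the reported 75.46, 69.82 and 88.37 for BitsetKeyDeltaEncoder, and the same arithmetic reproduces the other encoders' rows.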

DeltaEncodingSeekPerformance


 BlockDeltaEncoder onDisk='NONE' inCache='NONE' inMemory=false
   Read speed:  63.99 (MB/s)
   Seeks per second: 54901.21 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='BITSET' inMemory=false
   Read speed:  46.73 (MB/s)
   Seeks per second: 13570.50 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='PREFIX' inMemory=false
   Read speed:  55.88 (MB/s)
   Seeks per second: 20298.89 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='DIFF' inMemory=false
   Read speed:  54.39 (MB/s)
   Seeks per second: 15082.79 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='FAST_DIFF' inMemory=false
   Read speed:  54.12 (MB/s)
   Seeks per second: 15432.61 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='NONE' inMemory=true
   Read speed:  64.37 (MB/s)
   Seeks per second: 56779.82 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='BITSET' inMemory=true
   Read speed:  35.42 (MB/s)
   Seeks per second: 46170.87 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='PREFIX' inMemory=true
   Read speed:  43.54 (MB/s)
   Seeks per second: 60108.48 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='DIFF' inMemory=true
   Read speed:  40.62 (MB/s)
   Seeks per second: 48779.68 (#/s)
 BlockDeltaEncoder onDisk='NONE' inCache='FAST_DIFF' inMemory=true
   Read speed:  40.76 (MB/s)
   Seeks per second: 57291.22 (#/s)



[jira] [Commented] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-07 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123342#comment-13123342
 ] 

Lars Hofhansl commented on HBASE-4335:
--

I like those names. Will do.
@Jon. Initially I expected multiple different exceptions to be thrown, hence 
the general class approach. You're right, here it does not make sense.




 Splits can create temporary holes in .META. that confuse clients and 
 regionservers
 --

 Key: HBASE-4335
 URL: https://issues.apache.org/jira/browse/HBASE-4335
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: Joe Pallas
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4335-v2.txt, 4335-v3.txt, 4335.txt


 When a SplitTransaction is performed, three updates are done to .META.:
 1. The parent region is marked as splitting (and hence offline)
 2. The first daughter region is added (same start key as parent)
 3. The second daughter region is added (split key is start key)
 (later, the original parent region is deleted, but that's not important to 
 this discussion)
 Steps 2 and 3 are actually done concurrently by 
 SplitTransaction.DaughterOpener threads.  While the master is notified when a 
 split is complete, the only visibility that clients have is whether the 
 daughter regions have appeared in .META.
 If the second daughter is added to .META. first, then .META. will contain the 
 (offline) parent region followed by the second daughter region.  If the 
 client looks up a key that is greater than (or equal to) the split, the 
 client will find the second daughter region and use it.  If the key is less 
 than the split key, the client will find the parent region and see that it is 
 offline, triggering a retry.
 If the first daughter is added to .META. before the second daughter, there is 
 a window during which .META. has a hole: the first daughter effectively hides 
 the parent region (same start key), but there is no entry for the second 
 daughter.  A region lookup will find the first daughter for all keys in the 
 parent's range, but the first daughter does not include keys at or beyond the 
 split key.
 See HBASE-4333 and HBASE-4334 for details on how this causes problems and 
 suggestions for mitigating this in the client and regionserver.
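
The two orderings described above can be replayed with a toy model of .META. This is a sketch, not HBase code: regions are plain structs, the lookup returns the entry with the greatest start key at or before the requested row, and for equal start keys the most recently added entry wins, modelling the first daughter hiding the parent. All names are hypothetical.

```ruby
Region = Struct.new(:name, :start_key, :end_key, :offline) do
  # An empty end key would mean "up to the end of the table".
  def contains?(row)
    row >= start_key && (end_key.empty? || row < end_key)
  end
end

# Closest-row-before style lookup over a list of .META. entries.
def locate(meta, row)
  candidates = meta.select { |r| r.start_key <= row }
  candidates.each_with_index.max_by { |r, i| [r.start_key, i] }&.first
end

parent = Region.new('parent', 'a', 'z', true)
first  = Region.new('daughter1', 'a', 'm', false)
second = Region.new('daughter2', 'm', 'z', false)

# Second daughter added first: 'q' finds daughter2, 'c' finds the offline
# parent, which triggers a retry.
meta = [parent, second]
puts locate(meta, 'q').name
puts locate(meta, 'c').offline

# First daughter added first: 'q' finds daughter1, which does not contain it.
meta = [parent, first]
hit = locate(meta, 'q')
puts hit.name
puts hit.contains?('q')
```

The last lookup is the hole: the client is handed an online region that does not cover the requested key.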





[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123354#comment-13123354
 ] 

jirapos...@reviews.apache.org commented on HBASE-4070:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2029/#review2396
---



src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
https://reviews.apache.org/r/2029/#comment5499

Add a comment noting that these three declarations are only used by 
testRegionServerCoprocessorsReported.




src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
https://reviews.apache.org/r/2029/#comment5500

Add a comment noting that this declaration is only used by 
testMasterCoprocessorsReported.


- Eugene


On 2011-10-05 21:45:30, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2029/
bq.  ---
bq.  
bq.  (Updated 2011-10-05 21:45:30)
bq.  
bq.  
bq.  Review request for hbase and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Proposed fix for HBASE-4070. 
bq.  
bq.  
bq.  This addresses bug HBASE-4070.
bq.  https://issues.apache.org/jira/browse/HBASE-4070
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon abeb850 
bq.    src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon be6fceb 
bq.    src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 01bc1dd 
bq.    src/main/java/org/apache/hadoop/hbase/HServerLoad.java 0c680e4 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java a55a4b1 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java dbae4fd 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java f80d232 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 3840279 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java eda5a9b 
bq.  
bq.  Diff: https://reviews.apache.org/r/2029/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Two new tests: testRegionServerCoprocessorReported() and testMasterServerCoprocessorsReported() included in a new source file src/test/java/o.a.h.h/coprocessor/TestCoprocessorReporting.java.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



 [Coprocessors] Improve region server metrics to report loaded coprocessors to 
 master
 

 Key: HBASE-4070
 URL: https://issues.apache.org/jira/browse/HBASE-4070
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.3
Reporter: Mingjie Lai
Assignee: Eugene Koontz
 Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, 
 master-web-ui.jpg, rs-status-web-ui.jpg


 HBASE-3512 is about listing loaded cp classes at shell. To make it more 
 generic, we need a way to report this piece of information from region to 
 master (or just at region server level). So later on, we can display the 
 loaded class names at shell as well as web console. 





[jira] [Updated] (HBASE-4335) Splits can create temporary holes in .META. that confuse clients and regionservers

2011-10-07 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4335:
-

Attachment: 4335-v4.txt

Renamed the phaseX methods and addressed nits.

 Splits can create temporary holes in .META. that confuse clients and 
 regionservers
 --

 Key: HBASE-4335
 URL: https://issues.apache.org/jira/browse/HBASE-4335
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: Joe Pallas
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4335-v2.txt, 4335-v3.txt, 4335-v4.txt, 4335.txt







[jira] [Commented] (HBASE-4333) Client does not check for holes in .META.

2011-10-07 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123360#comment-13123360
 ] 

Lars Hofhansl commented on HBASE-4333:
--

The standard ClientScanner will do the right thing, i.e. scan until the end of 
the region and then use that last key to find the next region. The scan 
starting at the next region will then fail and retry.


 Client does not check for holes in .META.
 -

 Key: HBASE-4333
 URL: https://issues.apache.org/jira/browse/HBASE-4333
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: Joe Pallas

 If there is a temporary hole in .META., the client may get the wrong region 
 from HConnection.locateRegion.  
 HConnectionManager.HConnectionImplementation.locateRegionInMeta should check 
 the end key of the region found with getClosestRowBefore, just as it checks 
 the offline status, when it looks at the region info.
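
The proposed check can be sketched as follows. This is an illustration on plain structs rather than the HConnectionManager code; `RegionInfo`, `RegionLookupRetry` and `check_located_region` are hypothetical names, and an empty end key is assumed to mean the last region of the table.

```ruby
RegionInfo = Struct.new(:name, :start_key, :end_key, :offline)

# Raised when the caller should re-read .META. and try again.
class RegionLookupRetry < StandardError; end

# Validate a region returned by a closest-row-before style lookup.
def check_located_region(region, row)
  raise RegionLookupRetry, "#{region.name} is offline" if region.offline
  past_end = !region.end_key.empty? && row >= region.end_key
  raise RegionLookupRetry, "#{row} is past the end key of #{region.name}" if past_end
  region
end

daughter1 = RegionInfo.new('daughter1', 'a', 'm', false)
puts check_located_region(daughter1, 'c').name
begin
  check_located_region(daughter1, 'q')
rescue RegionLookupRetry => e
  puts "retry: #{e.message}"
end
```

With the end key check in place, a lookup that lands in the hole is treated like a lookup that finds an offline region: the client retries instead of using the wrong region.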





[jira] [Reopened] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-4488:
--


Reopening for the related change to Store.compactStore

 Store could miss rows during flush
 --

 Key: HBASE-4488
 URL: https://issues.apache.org/jira/browse/HBASE-4488
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.92.0

 Attachments: 4488.txt


 While looking at HBASE-4344 I found that my change HBASE-4241 contains a 
 critical mistake:
 The while(scanner.next(kvs)) loop is incorrect and might miss the last edits.
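
To illustrate why such a loop can drop the final batch, here is a small sketch. It assumes a scanner whose next(list) fills the list and returns whether more data remains after this call; BatchScanner and both drain methods are hypothetical stand-ins, not the actual Store code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ScannerLoopSketch {
    /** Hypothetical scanner: next() appends one batch and reports whether more remain. */
    static final class BatchScanner {
        private final Iterator<List<String>> batches;

        BatchScanner(List<List<String>> batches) {
            this.batches = batches.iterator();
        }

        boolean next(List<String> out) {
            if (batches.hasNext()) {
                out.addAll(batches.next());
            }
            return batches.hasNext();
        }
    }

    /** Flawed pattern: the batch delivered by the call that returns false is never processed. */
    static List<String> drainWithWhile(BatchScanner scanner) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        while (scanner.next(kvs)) {
            written.addAll(kvs);
            kvs.clear();
        }
        return written;
    }

    /** Processes every batch first and only then checks the return value. */
    static List<String> drainWithDoWhile(BatchScanner scanner) {
        List<String> written = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(kvs);
            written.addAll(kvs);
            kvs.clear();
        } while (more);
        return written;
    }
}
```

For three single-row batches the while variant writes only the first two rows, while the do/while variant writes all three.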





[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123366#comment-13123366
 ] 

jirapos...@reviews.apache.org commented on HBASE-4218:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2308/#review2460
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
https://reviews.apache.org/r/2308/#comment5565

Should be 'bytes are required'



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
https://reviews.apache.org/r/2308/#comment5564

The value of i should be included in the exception.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
https://reviews.apache.org/r/2308/#comment5566

Can this logic be written without recursion?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
https://reviews.apache.org/r/2308/#comment5567

Should this exception be called DeltaEncoderBufferTooSmallException?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java
https://reviews.apache.org/r/2308/#comment5568

Would arePartsEqual be a better name?


- Ted


On 2011-10-08 00:51:01, Jacek Migdal wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2308/
bq.  ---
bq.  
bq.  (Updated 2011-10-08 00:51:01)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Delta encoding for key values.
bq.  
bq.  
bq.  This addresses bug HBASE-4218.
bq.  https://issues.apache.org/jira/browse/HBASE-4218
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BitsetKeyDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CompressionState.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/CopyKeyDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncodedBlock.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderAlgorithms.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DeltaEncoderToSmallBufferException.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/DiffKeyDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/FastDiffDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/deltaencoder/PrefixKeyDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockDeltaEncoder.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java 1180113
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/EmptyBlockDeltaEncoder.java PRE-CREATION
bq.

[jira] [Updated] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4488:
-

Attachment: 4488-add.txt

Everybody OK with the addendum?


 Store could miss rows during flush
 --

 Key: HBASE-4488
 URL: https://issues.apache.org/jira/browse/HBASE-4488
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.92.0

 Attachments: 4488-add.txt, 4488.txt


 While looking at HBASE-4344 I found that my change HBASE-4241 contains a 
 critical mistake:
 The while(scanner.next(kvs)) loop is incorrect and might miss the last edits.





[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-10-07 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123368#comment-13123368
 ] 

Lars Hofhansl commented on HBASE-4536:
--

Yet another option is to take the smallest time of any of the store files and 
remove the family markers if they are older than that. The markers may survive 
two compactions in that case, but eventually they'll be removed.

In addition, to address Jon's need, I think we can add a raw flag to the Scan 
object. If true, the scan will retrieve all available rows, including deleted 
rows and delete markers. With the rest of the changes from this patch, that 
would be really easy to do.
(I assume I'd have to increment the SCAN_VERSION, correct?)
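
As a rough illustration of what such a raw flag would mean, the sketch below models a scan over a list of cells. It is not the proposed patch: Cell, scan and isMasked are hypothetical, and a delete marker is simply assumed to mask every put on the same row with an equal or older timestamp.

```java
import java.util.ArrayList;
import java.util.List;

public class RawScanSketch {
    /** Hypothetical cell: either a put or a delete marker at a given timestamp. */
    static final class Cell {
        final String row;
        final long ts;
        final boolean deleteMarker;

        Cell(String row, long ts, boolean deleteMarker) {
            this.row = row;
            this.ts = ts;
            this.deleteMarker = deleteMarker;
        }
    }

    /** True if some delete marker on the same row is at least as new as the put. */
    static boolean isMasked(Cell put, List<Cell> cells) {
        for (Cell d : cells) {
            if (d.deleteMarker && d.row.equals(put.row) && d.ts >= put.ts) {
                return true;
            }
        }
        return false;
    }

    /** A raw scan returns everything; a normal scan hides markers and masked puts. */
    static List<Cell> scan(List<Cell> cells, boolean raw) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            if (raw) {
                out.add(c);
            } else if (!c.deleteMarker && !isMasked(c, cells)) {
                out.add(c);
            }
        }
        return out;
    }
}
```

For a put at timestamp 1, a delete marker at 2 and a put at 3 on the same row, the normal scan in this model yields one cell and the raw scan yields all three.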


 Allow CF to retain deleted rows
 ---

 Key: HBASE-4536
 URL: https://issues.apache.org/jira/browse/HBASE-4536
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0


 Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
 of versions.
 However, if a client deletes a row, all versions older than the delete 
 tombstone will be removed at the next major compaction (and even at memstore 
 flush - see HBASE-4241).
 There should be a way to retain those versions to guard against software error.
 I see two options here:
 1. Add a new flag to HColumnDescriptor. Something like RETAIN_DELETED.
 2. Fold this into the parent change. I.e. keep the minimum number of 
 versions even past the delete marker.
 #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
 (from a user viewpoint).
 Comments? Any other options?





[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-10-07 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123372#comment-13123372
 ] 

gaojinchao commented on HBASE-4511:
---

@RAM 
For this case, there are two things we can look at:
1. Why can't the region server exit?
2. If the master's verification of meta/root fails, does the master need to 
crash, or should it wait for ServerShutdownHandler?


 There is data loss when master failovers
 

 Key: HBASE-4511
 URL: https://issues.apache.org/jira/browse/HBASE-4511
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Critical
 Fix For: 0.92.0

 Attachments: 
 org.apache.hadoop.hbase.master.TestMasterFailover-output.rar


 It goes like this:
 The master crashed; at the same time the RS hosting meta was crashing, but the 
 RS did not exit.
 The master starts up again and finds all living RSs. 
 The master's verification of meta fails, because this RS is crashing.
 The master reassigns meta, but it does not split the HLog. 
 So some meta data is lost.
 Below are the logs of a failing failover test case. 
 //The log says that we want to kill an RS
 2011-09-28 19:54:45,694 INFO  [Thread-988] regionserver.HRegionServer(1443): 
 STOPPED: Killing for unit test
 2011-09-28 19:54:45,694 INFO  [Thread-988] master.TestMasterFailover(1007): 
 RS 192.168.2.102,54385,1317264874629 killed 
 //The RS didn't crash. 
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.HMaster(458): Registering server found up in zk: 
 192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.ServerManager(232): Registering 
 server=192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of 
 znode /hbase/unassigned/1028785192 because node does not exist (not an error)
 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 //Meta verification failed and the master reassigned meta. So all the regions 
 in meta are lost.
 2011-09-28 19:54:51,773 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,277 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,782 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or 
 updating) unassigned node for 1028785192 with OFFLINE state
 2011-09-28 

[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-10-07 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123377#comment-13123377
 ] 

ramkrishna.s.vasudevan commented on HBASE-4511:
---

@Gao
This problem occurred in a test case. Can we reproduce it on a real cluster? It 
would be great if we could, so that we are clear about the actual problem.



[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123378#comment-13123378
 ] 

Ted Yu commented on HBASE-4469:
---

+1 on patch.
Nice job.

 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking to a row/col in the HFile, we go to the 
 top of the row in order to check for a row delete marker (delete family). 
 However, if the bloom filter is enabled for the column family, then whenever 
 a delete family operation is done on a row, the row has already been added to 
 the bloom filter. We can take advantage of this fact to avoid seeking to the 
 top of the row.
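
The idea can be sketched as follows, purely as an illustration: RowBloom is a toy Bloom filter written for this example (not HBase's implementation), and mustSeekToTopOfRow is a hypothetical helper. The point is only that a negative Bloom answer is definitive, so the seek can be skipped, while a positive answer may be a false positive and still requires the seek.

```java
import java.util.BitSet;

public class DeleteFamilyBloomSketch {
    /** Toy row-level Bloom filter with three derived hash slots per row. */
    static final class RowBloom {
        private static final int HASHES = 3;
        private final BitSet bits;
        private final int size;

        RowBloom(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        private int slot(String row, int seed) {
            // Integer overflow is intentional; floorMod keeps the slot non-negative.
            return Math.floorMod(row.hashCode() * 31 + seed * 0x9E3779B9, size);
        }

        void add(String row) {
            for (int i = 0; i < HASHES; i++) {
                bits.set(slot(row, i));
            }
        }

        /** May return true for rows never added, but never false for an added row. */
        boolean mightContain(String row) {
            for (int i = 0; i < HASHES; i++) {
                if (!bits.get(slot(row, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /**
     * If the filter says the row is absent, this file cannot hold a delete
     * family marker for it, so the seek to the top of the row can be skipped.
     */
    static boolean mustSeekToTopOfRow(RowBloom bloom, String row) {
        return bloom.mightContain(row);
    }
}
```

An empty filter never asks for the seek, and any row that was added always does.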





[jira] [Commented] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Jerry Chen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123379#comment-13123379
 ] 

Jerry Chen commented on HBASE-4488:
---

I recall seeing some unit tests written with the same incorrect while loop as 
well. 





[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123380#comment-13123380
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/
---

(Updated 2011-10-08 05:13:32.657832)


Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.


Changes
---

This updated patch is the same as the one uploaded at 07/Oct/11 14:27.
Reverted the change of passing -2 to skip the version comparison, and addressed 
Ted's comment to add spaces.


Summary
---

Fix for handling HBASE-4539 and HBASE-4540.
Ran all the testcases.  Added one new testcase to verify OpenedRegionHandler 
scenarios.
Also addresses Ted's comments.


This addresses bug HBASE-4540.
https://issues.apache.org/jira/browse/HBASE-4540


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION

Diff: https://reviews.apache.org/r/2251/diff


Testing
---

Yes


Thanks,

ramkrishna



 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-4540_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by the master, or to OPENING state 
 by RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it's 
 in OPENED state, but fails.
 - Though the delete fails, it removes the region from RIT and adds RS1 as the 
 owner of R1 in the master's memory.
 - Now when RS2 completes opening the region, the master is not able to open 
 the region, as the region has already been deleted from RIT.
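
One way to picture the missing atomicity is the sketch below. It is not the attached patch: the znode is modelled as an AtomicReference, and OpenedHandlerSketch, ZState and handleOpened are hypothetical names. The handler only touches the master's in-memory state when the conditional delete of the OPENED znode actually succeeded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class OpenedHandlerSketch {
    /** Simplified znode states; DELETED stands in for the node being removed. */
    enum ZState { OFFLINE, OPENING, OPENED, DELETED }

    final AtomicReference<ZState> znode;
    final Map<String, String> regionsInTransition = new HashMap<>();
    final Map<String, String> assignments = new HashMap<>();

    OpenedHandlerSketch(ZState initial) {
        this.znode = new AtomicReference<>(initial);
    }

    /**
     * Deletes the znode only if it is still OPENED, and updates the in-memory
     * bookkeeping only when that delete succeeded.
     */
    boolean handleOpened(String region, String server) {
        boolean deleted = znode.compareAndSet(ZState.OPENED, ZState.DELETED);
        if (!deleted) {
            // Another actor moved the node (e.g. to OFFLINE or OPENING);
            // leave RIT and the assignment map untouched.
            return false;
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }
}
```

In the scenario above the znode is already in OPENING when the stale handler runs, so in this model handleOpened returns false and R1 stays in RIT for RS2 to complete.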
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
 2011-10-05 20:50:48,290 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 2011-10-05 20:50:53,743 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
 Scanned 1 catalog row(s) and gc'd 0 unreferenced parent 

[jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-07 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123381#comment-13123381
 ] 

jirapos...@reviews.apache.org commented on HBASE-4540:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2461
---

Ship it!


- Ted





 

[jira] [Commented] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123383#comment-13123383
 ] 

Lars Hofhansl commented on HBASE-4488:
--

I'll open another jira and fix all those.





[jira] [Resolved] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-4488.
--

Resolution: Fixed





[jira] [Commented] (HBASE-4488) Store could miss rows during flush

2011-10-07 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123384#comment-13123384
 ] 

Lars Hofhansl commented on HBASE-4488:
--

Created HBASE-4556.





[jira] [Created] (HBASE-4556) Fix all incorrect uses of InternalScanner.next(...)

2011-10-07 Thread Lars Hofhansl (Created) (JIRA)
Fix all incorrect uses of InternalScanner.next(...)
---

 Key: HBASE-4556
 URL: https://issues.apache.org/jira/browse/HBASE-4556
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


There are cases all over the code where InternalScanner.next(...) is not used 
correctly.

I see this a lot:
{code}
// Incorrect: the results filled in by the final call, which returns false, are never processed.
while(scanner.next(...)) {
}
{code}

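The correct pattern processes the results of every call, including the last one that returns false:
{code}
boolean more = false;
do {
   more = scanner.next(...);
   // process the results here, even when more is false
} while (more);
{code}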
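
The difference between the two loops can be seen in a small self-contained sketch. It does not use the real HBase classes: RowScanner, scannerOver, drainWithWhile and drainWithDoWhile are hypothetical names made up for illustration, and plain strings stand in for KeyValues. The only behavior assumed from InternalScanner.next(...) is that it fills the passed-in list with the current row and returns true only if more rows follow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ScannerLoopSketch {

    /** Hypothetical stand-in for InternalScanner; not the real HBase interface. */
    interface RowScanner {
        /** Appends the next row's values to results; returns true if more rows follow. */
        boolean next(List<String> results);
    }

    /** Builds a toy scanner over a fixed list of rows (assumption: one row per call). */
    static RowScanner scannerOver(List<List<String>> rows) {
        Iterator<List<String>> it = rows.iterator();
        return results -> {
            if (it.hasNext()) {
                results.addAll(it.next());
            }
            return it.hasNext();
        };
    }

    /** Incorrect pattern: the row delivered by the final call is dropped. */
    static List<String> drainWithWhile(RowScanner scanner) {
        List<String> seen = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        while (scanner.next(kvs)) {
            seen.addAll(kvs);
            kvs.clear();
        }
        return seen;
    }

    /** Correct pattern: process the results of every call, then check the flag. */
    static List<String> drainWithDoWhile(RowScanner scanner) {
        List<String> seen = new ArrayList<>();
        List<String> kvs = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(kvs);
            seen.addAll(kvs);
            kvs.clear();
        } while (more);
        return seen;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("r1/a", "r1/b"),
            Arrays.asList("r2/a"),
            Arrays.asList("r3/a"));
        System.out.println(drainWithWhile(scannerOver(rows)));   // [r1/a, r1/b, r2/a]
        System.out.println(drainWithDoWhile(scannerOver(rows))); // [r1/a, r1/b, r2/a, r3/a]
    }
}
```

With three rows, the while loop never sees the third one, which matches the "might miss the last edits" symptom described in HBASE-4488.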

