[jira] [Updated] (HBASE-6120) Few logging improvements around enabling tables

2012-05-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6120:
---

Description: 
- A few log statements in the Enable/Disable/Create table handler event classes 
misspell the word "Attempting" as "Attemping".
- Even when an enable operation fails, the trailing message is a mere INFO 
with a state of 'false'. This isn't as visible as I'd like it to be when 
diagnosing logs for issues. I've put it in a proper if-else for this case.

  was:
- A few log statements in the Enable/Disable/Create table handler event classes 
misspell the word "Attempting" as "Attemping".
- Even when an enable operation fails, the trailing message is a mere WARN 
with a state of 'false'. This isn't as visible as I'd like it to be when 
diagnosing logs for issues. I've put it in a proper if-else for this case.


Fixed error in description: s/WARN/INFO

> Few logging improvements around enabling tables
> ---
>
> Key: HBASE-6120
> URL: https://issues.apache.org/jira/browse/HBASE-6120
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HBASE-6120.patch
>
>
> - A few log statements in the Enable/Disable/Create table handler event 
> classes misspell the word "Attempting" as "Attemping".
> - Even when an enable operation fails, the trailing message is a mere INFO 
> with a state of 'false'. This isn't as visible as I'd like it to be when 
> diagnosing logs for issues. I've put it in a proper if-else for this case.
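
The if-else described in the issue can be pictured with a minimal sketch. The attached patch is not reproduced here, so everything below is an assumption made for illustration: the class name, the method name, the message wording, and the choice of WARN for the failure branch are all hypothetical.

```java
// Hypothetical sketch; names and messages are illustrative, not HBase code.
final class EnableTableLogging {

  /** Returns the level and message that would be logged for the outcome. */
  static String describeOutcome(String tableName, boolean done) {
    if (done) {
      // Success stays at INFO.
      return "INFO: Table '" + tableName + "' was successfully enabled.";
    } else {
      // Failure gets its own branch and a higher level (WARN is assumed here)
      // so that it stands out when scanning logs.
      return "WARN: Table '" + tableName + "' wasn't successfully enabled.";
    }
  }

  public static void main(String[] args) {
    System.out.println(describeOutcome("t1", true));
    System.out.println(describeOutcome("t1", false));
  }
}
```

The point of the split is that a failed enable no longer hides behind an INFO line that merely prints a 'false' state.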

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284601#comment-13284601
 ] 

Zhihong Yu commented on HBASE-6088:
---

Makes sense.
Please make the other suggested changes to the comment.

>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47:28,142 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> java.io.IOException: Failed create of ephemeral 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   at 
> org.apache.hadoop

[jira] [Assigned] (HBASE-5837) hbase shell deleteall to .META. allows insertion of malformed rowkey.

2012-05-28 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh reassigned HBASE-5837:
-

Assignee: Kevin Odell

> hbase shell deleteall to .META. allows insertion of malformed rowkey.
> -
>
> Key: HBASE-5837
> URL: https://issues.apache.org/jira/browse/HBASE-5837
> Project: HBase
>  Issue Type: Bug
>  Components: master, shell
>Affects Versions: 0.90.6
>Reporter: Jonathan Hsieh
>Assignee: Kevin Odell
>
> When using the hbase shell to manipulate meta entries, one is allowed to 
> 'delete' malformed rows (entries with fewer than two ASCII 44 ',' characters).  
> When this happens, HBase servers may go down and the cluster will not be 
> restartable without manual intervention.  
> The delete results in a durable malformed rowkey in .META.'s memstore, 
> .META.'s HLog, and eventually .META.'s HFiles.  Subsequent scans of meta 
> (such as when an HMaster starts) fail in the scanner because the comparator 
> fails.  In the case of an HMaster startup, it causes an abort that kills the 
> HMaster process.
> {code}
> 12/04/18 22:07:34 FATAL master.HMaster: Unhandled exception. Starting 
> shutdown.
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.IllegalArgumentException: No 44 in 
> , length=47, offset=54
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:979)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1894)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1834)
> at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> Caused by: java.lang.IllegalArgumentException: No 44 in 
> , length=47, offset=54
> at 
> org.apache.hadoop.hbase.KeyValue.getRequiredDelimiterInReverse(KeyValue.java:1300)
> at 
> org.apache.hadoop.hbase.KeyValue$MetaKeyComparator.compareRows(KeyValue.java:1846)
> at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:130)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:257)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:114)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:2435)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:2391)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:2408)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1870)
> ... 6 more
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> at $Proxy9.next(Unknown Source)
> at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:264)
> at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
> at 
> org.apache.hadoop.hbase.catalog.MetaReader.fullScanOfResults(MetaReader.java:220)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:1580)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:221)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:422)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:295)
> 12/04/18 22:07:34 INFO master.HMaster: Aborting 
> {code}
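
A guard of the kind the description implies could look like the sketch below. It is only an illustration of the "at least two ASCII 44 characters" rule stated in the issue; the class and method names are hypothetical and are not part of HBase or of any patch for this issue.

```java
// Hypothetical validator; not an HBase API.
final class MetaRowKeyCheck {
  static final byte DELIMITER = 44; // ASCII ','

  /** Counts the delimiter bytes in a candidate meta row key. */
  static int countDelimiters(byte[] row) {
    int count = 0;
    for (byte b : row) {
      if (b == DELIMITER) {
        count++;
      }
    }
    return count;
  }

  /** A row key with fewer than two delimiters is treated as malformed. */
  static boolean hasRequiredDelimiters(byte[] row) {
    return countDelimiters(row) >= 2;
  }

  public static void main(String[] args) {
    java.nio.charset.Charset ascii = java.nio.charset.StandardCharsets.US_ASCII;
    System.out.println(hasRequiredDelimiters("table,startkey,1337823505339".getBytes(ascii)));
    System.out.println(hasRequiredDelimiters("bogus".getBytes(ascii)));
  }
}
```

Rejecting such keys before the delete is written would keep the malformed row out of the memstore, HLog, and HFiles in the first place.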





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284595#comment-13284595
 ] 

rajeshbabu commented on HBASE-6088:
---

@Ted

bq.The second state should be RS_ZK_REGION_SPLIT.
As part of createNodeSplitting we transition from RS_ZK_REGION_SPLITTING to 
RS_ZK_REGION_SPLITTING.

{code}
  int transitionNodeSplitting(final ZooKeeperWatcher zkw, final HRegionInfo parent,
      final ServerName serverName, final int version)
      throws KeeperException, IOException {
    return ZKAssign.transitionNode(zkw, parent, serverName,
        EventType.RS_ZK_REGION_SPLITTING, EventType.RS_ZK_REGION_SPLITTING, version);
  }
{code}

That's why I mentioned RS_ZK_REGION_SPLITTING as the second state.
Please correct me if I am wrong.
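
To make the version argument in the call concrete, here is a toy in-memory model of a versioned node. It is not the ZooKeeper or ZKAssign API; the class, its methods, and the -1 return convention are assumptions chosen for illustration. It only shows that a transition from a state to the same state still bumps the version, and that a caller holding a stale version is rejected.

```java
// Toy model of a versioned node; hypothetical, not ZooKeeper or HBase code.
final class VersionedNode {
  private String state;
  private int version;

  VersionedNode(String initialState) {
    this.state = initialState;
    this.version = 0;
  }

  /**
   * Moves the node from expectedState to newState if both the current state
   * and the version match. Returns the new version, or -1 on a mismatch.
   */
  synchronized int transition(String expectedState, String newState, int expectedVersion) {
    if (!state.equals(expectedState) || version != expectedVersion) {
      return -1; // stale state or stale version
    }
    state = newState;
    version++; // every successful write bumps the version
    return version;
  }

  public static void main(String[] args) {
    String splitting = "RS_ZK_REGION_SPLITTING";
    VersionedNode node = new VersionedNode(splitting);
    System.out.println(node.transition(splitting, splitting, 0)); // same state, version moves on
    System.out.println(node.transition(splitting, splitting, 0)); // stale version is rejected
  }
}
```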


[jira] [Updated] (HBASE-6107) Distributed log splitting hangs even there is no task under /hbase/splitlog

2012-05-28 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6107:
---

Status: Patch Available  (was: Open)

> Distributed log splitting hangs even there is no task under /hbase/splitlog
> ---
>
> Key: HBASE-6107
> URL: https://issues.apache.org/jira/browse/HBASE-6107
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.96.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-6107.patch, hbase-6107_v3-new.patch, 
> hbase_6107_v2.patch, hbase_6107_v3.patch
>
>
> Sometimes the master web UI shows that distributed log splitting is in 
> progress, waiting for one last task to finish.  However, in ZK, there is no 
> task under /hbase/splitlog at all.





[jira] [Updated] (HBASE-6107) Distributed log splitting hangs even there is no task under /hbase/splitlog

2012-05-28 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6107:
---

Attachment: hbase-6107_v3-new.patch





[jira] [Updated] (HBASE-6107) Distributed log splitting hangs even there is no task under /hbase/splitlog

2012-05-28 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6107:
---

Status: Open  (was: Patch Available)





[jira] [Commented] (HBASE-5837) hbase shell deleteall to .META. allows insertion of malformed rowkey.

2012-05-28 Thread Kevin Odell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284587#comment-13284587
 ] 

Kevin Odell commented on HBASE-5837:


I am working on a patch for this.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, leading to possible data loss

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284584#comment-13284584
 ] 

Hudson commented on HBASE-6065:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343338)

 Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Log for flush would append a non-sequential edit in the hlog, leading to 
> possible data loss
> ---
>
> Key: HBASE-6065
> URL: https://issues.apache.org/jira/browse/HBASE-6065
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6065.patch, HBASE-6065v2.patch
>
>
> After a region flush completes, we append a log edit to the hlog file 
> through HLog#completeCacheFlush.
> {code}
> public void completeCacheFlush(final byte [] encodedRegionName,
>     final byte [] tableName, final long logSeqId, final boolean isMetaRegion)
> {
>   ...
>   HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
>       System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
>   ...
> }
> {code}
> When we make the hlog key, we use the seqId passed in as a parameter, which 
> is generated by HLog#startCacheFlush.
> As a result, we may append an edit with a lower seq id than the last edit in 
> the hlog file.
> If it is the last edit in the file, it may cause data loss, because:
> {code}
> HRegion#replayRecoveredEditsIfAny {
>   ...
>   maxSeqId = Math.abs(Long.parseLong(fileName));
>   if (maxSeqId <= minSeqId) {
>     String msg = "Maximum sequenceid for this log is " + maxSeqId
>         + " and minimum sequenceid for the region is " + minSeqId
>         + ", skipped the whole file, path=" + edits;
>     LOG.debug(msg);
>     continue;
>   }
>   ...
> }
> {code}
> We may skip the split log file, because we use the last edit's seq id as 
> its file name and treat this seqId as the max seq id in the log file.
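
The failure mode described above can be reproduced with plain numbers. The sketch below is a standalone illustration, not HBase code: the class and method names are hypothetical, and the sequence ids are invented. Only the `maxSeqId <= minSeqId` comparison is taken from the quoted snippet.

```java
// Standalone illustration with invented sequence ids; not HBase code.
final class RecoveredEditsSkipCheck {

  /** Same comparison as the quoted check: skip when the name-derived max is not above the region's min. */
  static boolean wouldSkip(long fileNameSeqId, long regionMinSeqId) {
    return fileNameSeqId <= regionMinSeqId;
  }

  /** The file is named after the seq id of its last edit, as the description states. */
  static long fileNameFromLastEdit(long[] editSeqIds) {
    return editSeqIds[editSeqIds.length - 1];
  }

  /** The real maximum seq id across all edits in the file. */
  static long trueMax(long[] editSeqIds) {
    long max = Long.MIN_VALUE;
    for (long id : editSeqIds) {
      max = Math.max(max, id);
    }
    return max;
  }

  public static void main(String[] args) {
    // Assumed scenario: the flush started at seq id 100, edits 101..103
    // arrived during the flush, and the flush-complete edit is appended
    // last with seq id 100.
    long[] edits = {101L, 102L, 103L, 100L};
    long regionMinSeqId = 100L; // the region has flushed up to 100

    System.out.println(wouldSkip(fileNameFromLastEdit(edits), regionMinSeqId));
    System.out.println(wouldSkip(trueMax(edits), regionMinSeqId));
  }
}
```

With the file named after its last edit, the whole file is skipped even though it holds edits 101 to 103, which the region has not yet persisted. Naming it after the true maximum would keep it.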





[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284585#comment-13284585
 ] 

Hudson commented on HBASE-5916:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-5916 RS restart just before master intialization we make the cluster 
non operative (Rajesh) (Revision 1343324)

 Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


> RS restart just before master intialization we make the cluster non operative
> -
>
> Key: HBASE-5916
> URL: https://issues.apache.org/jira/browse/HBASE-5916
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5916_94.patch, HBASE-5916_trunk.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, 
> HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, 
> HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, 
> HBASE-5916_trunk_v7.patch, HBASE-5916_trunk_v8.patch, 
> HBASE-5916_trunk_v9.patch, HBASE-5916v8.patch
>
>
> Consider a case where the master is being restarted.  An RS that was alive 
> when the master restart began is itself restarted before the master 
> initializes the ServerShutDownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case, when the RS tries to register with the master, the master will 
> try to expire the server, but the server cannot be expired because the 
> serverShutdownHandler is not yet enabled.
> This can happen when a single RS is restarted or when all the RSs are 
> restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
> if (existingServer.getStartcode() < serverName.getStartcode()) {
>   LOG.info("Triggering server recovery; existingServer " +
>     existingServer + " looks stale, new server:" + serverName);
>   expireServer(existingServer);
> }
> {code}
> If another RS is brought up, the cluster returns to normal.
> This may be a rare corner case.
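
The interaction between the startcode comparison and the handler flag can be modelled in a few lines. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, and the real ServerManager logic is more involved than a single boolean.

```java
// Simplified model; hypothetical names, not the actual ServerManager code.
final class StaleServerModel {
  private boolean serverShutdownHandlerEnabled = false;

  void enableServerShutdownHandler() {
    serverShutdownHandlerEnabled = true;
  }

  /**
   * Returns true only when the existing entry is stale (older startcode)
   * and the shutdown handler is enabled, so that expiry can proceed.
   */
  boolean expireIfStale(long existingStartcode, long newStartcode) {
    if (existingStartcode >= newStartcode) {
      return false; // the existing entry is not older, nothing to expire
    }
    if (!serverShutdownHandlerEnabled) {
      return false; // stale, but expiry cannot proceed yet
    }
    return true;
  }

  public static void main(String[] args) {
    StaleServerModel master = new StaleServerModel();
    System.out.println(master.expireIfStale(100L, 200L)); // handler not enabled yet
    master.enableServerShutdownHandler();
    System.out.println(master.expireIfStale(100L, 200L)); // handler enabled
  }
}
```

In the window before the flag is set, the stale entry for the restarted RS stays in place, which matches the situation the description reports.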





[jira] [Commented] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284581#comment-13284581
 ] 

Hudson commented on HBASE-6094:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-6094 [refGuide] Improvements to new contributor docs (Revision 
1343407)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-site/src/docbkx/developer.xml


> [refGuide] Improvements to new contributor docs
> ---
>
> Key: HBASE-6094
> URL: https://issues.apache.org/jira/browse/HBASE-6094
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ian Varley
>Assignee: Ian Varley
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: book_hbase_6094.xml.patch
>
>
> developer.xml
> * Expanded explanation around git & svn, and mentioning the EGit plugin
> * Expanded explanation of setting up the eclipse project
> * Extra section about basic compilation using maven and eclipse
> * Fix to tarball command that makes it maven2 compatible
> * Greatly expanded section about contributing docs, and clarification that 
> pushing generated site is only for those with permissions





[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284580#comment-13284580
 ] 

Hudson commented on HBASE-6032:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-6032 Port HFileBlockIndex improvement from HBASE-5987 (Revision 
1343413)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HConstants.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockWithScanInfo.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksScanned.java


> Port HFileBlockIndex improvement from HBASE-5987
> 
>
> Key: HBASE-6032
> URL: https://issues.apache.org/jira/browse/HBASE-6032
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>Assignee: Zhihong Yu
> Fix For: 0.96.0
>
> Attachments: 6032-ports-5987-v2.txt, 6032-ports-5987.txt
>
>
> Excerpt from HBASE-5987:
> First, we propose to look ahead by one more block index so that the 
> HFileScanner knows the start key value of the next data block. If the 
> target key value for the scan (reSeekTo) is "smaller" than the start kv of 
> the next data block, the target key value is very likely in the current 
> data block (if it is not in the current data block, then the start kv of 
> the next data block should be returned. +Indexing on the start key has some 
> defects here+), and the scanner shall NOT query the HFileBlockIndex in this 
> case. Conversely, if the target key value is "bigger", it shall query the 
> HFileBlockIndex. This improvement should reduce the hotness of the 
> HFileBlockIndex and avoid some unnecessary IdLock contention and index block 
> cache lookups.
> This JIRA is to port the fix to HBase trunk, etc.





[jira] [Commented] (HBASE-5987) HFileBlockIndex improvement

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284582#comment-13284582
 ] 

Hudson commented on HBASE-5987:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-6032 Port HFileBlockIndex improvement from HBASE-5987 (Revision 
1343413)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HConstants.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockWithScanInfo.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksScanned.java


> HFileBlockIndex improvement
> ---
>
> Key: HBASE-5987
> URL: https://issues.apache.org/jira/browse/HBASE-5987
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D3237.1.patch, D3237.2.patch, D3237.3.patch, 
> D3237.4.patch, D3237.5.patch, D3237.6.patch, D3237.7.patch, D3237.8.patch, 
> screen_shot_of_sequential_scan_profiling.png
>
>
> We recently found a performance problem: reads are quite slow when 
> multiple requests read the same block of data or index. 
> From the profiling, one of the causes is the IdLock contention, which has been 
> addressed in HBASE-5898. 
> Another issue is that the HFileScanner keeps asking the HFileBlockIndex 
> for the data block location of each target key value during the scan 
> process (reSeekTo), even when the target key value is already in the 
> current data block. This makes certain index blocks very HOT, 
> especially during a sequential scan.
> To solve this issue, we propose the following solutions:
> First, we propose to look ahead by one more block index so that the 
> HFileScanner knows the start key value of the next data block. If the 
> target key value for the scan (reSeekTo) is "smaller" than the start kv of 
> the next data block, the target key value is very likely in the current 
> data block (if it is not in the current data block, then the start kv of 
> the next data block should be returned. +Indexing on the start key has some 
> defects here+), and the scanner shall NOT query the HFileBlockIndex in this 
> case. Conversely, if the target key value is "bigger", it shall query the 
> HFileBlockIndex. This improvement should reduce the hotness of the 
> HFileBlockIndex and avoid some unnecessary IdLock contention and index block 
> cache lookups.
> Second, we propose to push this idea a little further: the 
> HFileBlockIndex shall index on the last key value of each data block instead 
> of on the start key value. The motivation is to solve the HBASE-4443 
> issue (avoid seeking to the "previous" block when the key you are interested 
> in is the first one of a block) as well as +the defects mentioned above+.
> For example, if the target key value is "smaller" than the start key value of 
> data block N, there is no way to know for sure whether the target key value 
> is in data block N or N-1, so the scanner has to seek from data block N-1. 
> However, if the block index is based on the last key value of each data block 
> and the target key value is between the last key value of data block N-1 and 
> that of data block N, then the target key value must be in data block N. 
> As long as HBase only supports forward scans, it makes more sense to index on 
> the last key value than on the start key value. 
> Thanks Kannan and Mikhail for the insightful discussions and suggestions.
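
The first proposal in the description can be sketched as follows. This is a minimal, self-contained illustration and not the HBase implementation: the class `LookaheadReseekSketch`, its string keys, and the `indexLookups` counter are hypothetical stand-ins for HFileScanner, KeyValue, and the traffic to HFileBlockIndex.

```java
import java.util.Arrays;

/** Hypothetical sketch of the look-ahead idea; not the actual HFileScanner code. */
public class LookaheadReseekSketch {
  // Start key of each data block, sorted ascending (stand-in for HFileBlockIndex).
  private final String[] blockStartKeys;
  // Block the scanner is currently positioned on.
  private int currentBlock = 0;
  // Counts how often the (contended) block index is consulted.
  private int indexLookups = 0;

  public LookaheadReseekSketch(String[] sortedBlockStartKeys) {
    this.blockStartKeys = sortedBlockStartKeys.clone();
  }

  /** Returns the number of the block the scanner ends up on for the target key. */
  public int reseekTo(String target) {
    // Look ahead: start key of the next data block, or null for the last block.
    String nextStart = currentBlock + 1 < blockStartKeys.length
        ? blockStartKeys[currentBlock + 1] : null;
    // A forward scan normally guarantees target >= current start key; it is
    // checked here only so the sketch stays correct for arbitrary input.
    boolean atOrAfterCurrent = target.compareTo(blockStartKeys[currentBlock]) >= 0;
    boolean beforeNext = nextStart == null || target.compareTo(nextStart) < 0;
    if (atOrAfterCurrent && beforeNext) {
      return currentBlock; // stay in the current block, no index query
    }
    currentBlock = lookupInIndex(target);
    return currentBlock;
  }

  // Finds the last block whose start key is <= target.
  private int lookupInIndex(String target) {
    indexLookups++;
    int pos = Arrays.binarySearch(blockStartKeys, target);
    int block = pos >= 0 ? pos : -pos - 2; // insertion point minus one
    return Math.max(block, 0);
  }

  public int getIndexLookups() {
    return indexLookups;
  }
}
```

In this sketch, repeated reseeks that stay inside one block never touch the index; only a target at or past the next block's start key triggers a lookup.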





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284583#comment-13284583
 ] 

Hudson commented on HBASE-6118:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #30 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/30/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343338)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Updated] (HBASE-5977) Usage of modules

2012-05-28 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-5977:
---

Attachment: Potential-HBase-Module-Descriptions-v1.pdf
Potential-HBase-Modules-v1.pdf

Jesse, Stack and I have discussed this from a few different angles to try to 
identify some of the reasons for creating modules.  The main benefit of modules 
is to isolate complex implementations behind simple interfaces.  The main 
drawback is that modules add overhead in the form of more things to open in 
eclipse and more jar files in the build.

Pasting from HBASE-5720 some arguments for creating a "codec" module that 
contains wrapper classes for individual HFile block types:
* make it more testable, like a normal in-memory data structure without having 
to set up heavyweight testing environments
* separate the encoding concerns from IO concerns. After the checksum happens, 
encoders/decoders should not even know what an IOException is
* strongly discourage people from modifying anything in the codec packages 
without knowing what they're getting into
* ensure the main project code only references the interfaces and not any codec 
internals (see if main project compiles without codecs in classpath)
* make it easier for contributors to develop and profile the codecs without 
having to become experts in all aspects of hbase
* help to simplify the main project. Imagine if the gzip or snappy internals 
were sprinkled throughout the regionserver code. Yikes.
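
To make the "simple interface, no IOException" argument concrete, here is a minimal sketch. All names (`CodecModuleSketch`, `BlockCodec`, `DeltaCodec`) are hypothetical and the delta encoding is only a toy; none of this is taken from the attached documents or from HBASE-5720.

```java
import java.nio.ByteBuffer;

public class CodecModuleSketch {
  /**
   * Hypothetical interface the main project would compile against.
   * It works purely on in-memory buffers, so no IOException appears.
   */
  public interface BlockCodec {
    ByteBuffer encode(ByteBuffer raw);
    ByteBuffer decode(ByteBuffer encoded);
  }

  /** Toy implementation that would live inside the (hypothetical) codec module. */
  public static final class DeltaCodec implements BlockCodec {
    @Override
    public ByteBuffer encode(ByteBuffer raw) {
      ByteBuffer in = raw.duplicate(); // leave the caller's position untouched
      ByteBuffer out = ByteBuffer.allocate(in.remaining());
      byte prev = 0;
      while (in.hasRemaining()) {
        byte b = in.get();
        out.put((byte) (b - prev)); // store the difference to the previous byte
        prev = b;
      }
      out.flip();
      return out;
    }

    @Override
    public ByteBuffer decode(ByteBuffer encoded) {
      ByteBuffer in = encoded.duplicate();
      ByteBuffer out = ByteBuffer.allocate(in.remaining());
      byte prev = 0;
      while (in.hasRemaining()) {
        prev = (byte) (prev + in.get()); // running sum restores the original byte
        out.put(prev);
      }
      out.flip();
      return out;
    }
  }
}
```

Because the codec is an ordinary in-memory data structure, a round-trip test needs nothing beyond a byte array, which is the testability benefit the first bullet describes.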

Attaching Potential-HBase-Modules-v1.pdf and 
Potential-HBase-Module-Descriptions-v1.pdf to illustrate a possible roadmap for 
extracting modules.  We currently have hbase-server, and are first going to 
"pull up" some files into hbase-common.  Eventually we may "push down" an 
integration-test module.  

Extracting these modules can't really be done all at once, so this is just a 
roadmap meant to start discussion.  For example, there's probably an 
opportunity to isolate some of regionserver and master code, but they also 
share a lot.  This v1 doc shows a push down of master code out of the server 
module, but we probably need to think through that in more detail.

* Link to dependency chart: 
https://docs.google.com/presentation/d/16Kf9FAFjtneWwCnpy9Bql4QhXmORf7U9uJLoRobePHQ/edit
* Link to description doc: 
https://docs.google.com/document/d/1RHrUa9qWGvIR6ZmqVYP17rS7JTPSzCFCPKNjTo-XY38/edit


> Usage of modules 
> -
>
> Key: HBASE-5977
> URL: https://issues.apache.org/jira/browse/HBASE-5977
> Project: HBase
>  Issue Type: Brainstorming
>  Components: build
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
> Attachments: Potential-HBase-Module-Descriptions-v1.pdf, 
> Potential-HBase-Modules-v1.pdf
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> With HBASE-4336, HBase will have the ability to add multiple modules for 
> different aspects of the codebase (less tests, see HBASE-4336 for details). 
> We need to set a policy for when modules should be used versus putting the 
> code into a single existing module or dispersed across modules. 





[jira] [Commented] (HBASE-5987) HFileBlockIndex improvement

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284578#comment-13284578
 ] 

Hudson commented on HBASE-5987:
---

Integrated in HBase-TRUNK #2941 (See 
[https://builds.apache.org/job/HBase-TRUNK/2941/])
HBASE-6032 Port HFileBlockIndex improvement from HBASE-5987 (Revision 
1343413)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HConstants.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockWithScanInfo.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksScanned.java


> HFileBlockIndex improvement
> ---
>
> Key: HBASE-5987
> URL: https://issues.apache.org/jira/browse/HBASE-5987
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D3237.1.patch, D3237.2.patch, D3237.3.patch, 
> D3237.4.patch, D3237.5.patch, D3237.6.patch, D3237.7.patch, D3237.8.patch, 
> screen_shot_of_sequential_scan_profiling.png
>
>





[jira] [Commented] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284577#comment-13284577
 ] 

Hudson commented on HBASE-6094:
---

Integrated in HBase-TRUNK #2941 (See 
[https://builds.apache.org/job/HBase-TRUNK/2941/])
HBASE-6094 [refGuide] Improvements to new contributor docs (Revision 
1343407)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-site/src/docbkx/developer.xml


> [refGuide] Improvements to new contributor docs
> ---
>
> Key: HBASE-6094
> URL: https://issues.apache.org/jira/browse/HBASE-6094
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ian Varley
>Assignee: Ian Varley
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: book_hbase_6094.xml.patch
>
>
> developer.xml
> * Expanded explanation around git & svn, and mentioning the EGit plugin
> * Expanded explanation of setting up the eclipse project
> * Extra section about basic compilation using maven and eclipse
> * Fix to tarball command that makes it maven2 compatible
> * Greatly expanded section about contributing docs, and clarification that 
> pushing generated site is only for those with permissions





[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284576#comment-13284576
 ] 

Hudson commented on HBASE-6032:
---

Integrated in HBase-TRUNK #2941 (See 
[https://builds.apache.org/job/HBase-TRUNK/2941/])
HBASE-6032 Port HFileBlockIndex improvement from HBASE-5987 (Revision 
1343413)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/HConstants.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockWithScanInfo.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksScanned.java


> Port HFileBlockIndex improvement from HBASE-5987
> 
>
> Key: HBASE-6032
> URL: https://issues.apache.org/jira/browse/HBASE-6032
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>Assignee: Zhihong Yu
> Fix For: 0.96.0
>
> Attachments: 6032-ports-5987-v2.txt, 6032-ports-5987.txt
>
>





[jira] [Resolved] (HBASE-6047) Put.has() can't determine result correctly

2012-05-28 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu resolved HBASE-6047.
---

Resolution: Fixed

> Put.has() can't determine result correctly
> --
>
> Key: HBASE-6047
> URL: https://issues.apache.org/jira/browse/HBASE-6047
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Wang Qiang
>Assignee: Alex Newman
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 
> 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 
> 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, 
> PutTest.java
>
>
> The public method 'has(byte [] family, byte [] qualifier)' internally invokes 
> the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] 
> value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], 
> ignoreTS=true, ignoreValue=true', but there's a logical error in the body: 
> it enters the block
> {code}
> else if (ignoreValue) {
>   for (KeyValue kv: list) {
>     if (Arrays.equals(kv.getFamily(), family) && 
>         Arrays.equals(kv.getQualifier(), qualifier)
>         && kv.getTimestamp() == ts) {
>       return true;
>     }
>   }
> }
> {code}
> The expression 'kv.getTimestamp() == ts' in the if condition should only 
> be evaluated when 'ignoreTS=false'; otherwise, the following code returns 
> false!
> {code}
> Put put = new Put(Bytes.toBytes("row-01"));
> put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"),
>   1234567L, Bytes.toBytes("value-01"));
> System.out.println(put.has(Bytes.toBytes("family-01"),
>   Bytes.toBytes("qualifier-01")));
> {code}
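
The check described above could be written so that each flag guards its own comparison. The following is a self-contained sketch under that assumption, not the committed patch: `PutHasSketch` and `Cell` are hypothetical stand-ins for Put and KeyValue, and the three-argument `has` overload is included only to exercise the `ignoreTS=false` path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PutHasSketch {
  /** Hypothetical stand-in for KeyValue. */
  static final class Cell {
    final byte[] family;
    final byte[] qualifier;
    final long ts;
    final byte[] value;

    Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
      this.family = family;
      this.qualifier = qualifier;
      this.ts = ts;
      this.value = value;
    }
  }

  private final List<Cell> cells = new ArrayList<>();

  public void add(byte[] family, byte[] qualifier, long ts, byte[] value) {
    cells.add(new Cell(family, qualifier, ts, value));
  }

  /** Matches on family and qualifier only. */
  public boolean has(byte[] family, byte[] qualifier) {
    return has(family, qualifier, 0L, new byte[0], true, true);
  }

  /** Matches on family, qualifier, and timestamp. */
  public boolean has(byte[] family, byte[] qualifier, long ts) {
    return has(family, qualifier, ts, new byte[0], false, true);
  }

  private boolean has(byte[] family, byte[] qualifier, long ts, byte[] value,
      boolean ignoreTS, boolean ignoreValue) {
    for (Cell c : cells) {
      if (Arrays.equals(c.family, family)
          && Arrays.equals(c.qualifier, qualifier)
          && (ignoreTS || c.ts == ts) // timestamp compared only when requested
          && (ignoreValue || Arrays.equals(c.value, value))) {
        return true;
      }
    }
    return false;
  }
}
```

With this shape, the reporter's scenario (a cell added with timestamp 1234567L, then queried by family and qualifier alone) matches, because the timestamp comparison is skipped when `ignoreTS` is true.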





[jira] [Commented] (HBASE-6109) Improve RIT performances during assignment on large clusters

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284573#comment-13284573
 ] 

Zhihong Yu commented on HBASE-6109:
---

It would be nice to have a test for NotifiableConcurrentSkipListMap.
{code}
+  public void waitListUpdate(long timeout) throws InterruptedException {
+synchronized (internalList){
{code}
Since internalList is actually a Map, name the above method waitForUpdate() ?
{code}
+  public void clear() {
+if (!internalList.isEmpty()) {
+  synchronized (internalList) {
{code}
Is it possible that internalList becomes empty after entering the synchronized 
block ?
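
If the answer is yes, one option is to move the emptiness test inside the monitor so that the check and the clear cannot be separated by another synchronized mutator. A minimal sketch, assuming every mutation goes through a method that synchronizes on the same object; `ClearSketch` and its `put` method are hypothetical, and only the `internalList` name comes from the patch.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class ClearSketch {
  private final ConcurrentSkipListMap<String, String> internalList =
      new ConcurrentSkipListMap<>();

  public void put(String key, String value) {
    synchronized (internalList) {
      internalList.put(key, value);
      internalList.notifyAll(); // wake up threads waiting for an update
    }
  }

  /** Clears the map; returns true only if something was actually removed. */
  public boolean clear() {
    synchronized (internalList) {
      // The check now runs under the same lock as the clear itself.
      if (internalList.isEmpty()) {
        return false;
      }
      internalList.clear();
      internalList.notifyAll();
      return true;
    }
  }

  public int size() {
    return internalList.size();
  }
}
```

The cost is that an empty map now takes the lock once per call, which is usually acceptable unless `clear()` sits on a hot path.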

For Locker,
{code}
+ * An utility class to manage a set of lock. Each lock is identified by a String who serves
{code}
the above should read 'A utility class to manage a set of locks. Each lock is 
identified by a String which serves'
{code}
+public class Locker {
+  private static final Log LOG = LogFactory.getLog(AssignmentManager.class);
{code}
It should be Locker.class
{code}
+  private static final int NB_CONCURRENT_LOCK = 1000;
{code}
The constant should be named NB_CONCURRENT_LOCKS.
{code}
+   * Return a lock for the given key. The lock is already lockek.
{code}
The last word should be locked.
{code}
+  String message = "Can't release the lock for " + key;
{code}
It would be nice to say more about the reason.
{code}
-synchronized (regionsInTransition) {
-  nodes.removeAll(regionsInTransition.keySet());
-}
+// no lock concurrent access ok: some threads may be adding/removing items but its java-valid
+nodes.removeAll(regionsInTransition.keySet());
{code}
Looking at batchRemove() in 
http://www.docjar.com/html/api/java/util/ArrayList.java.html around line 669, I 
don't see any synchronization.
This means the existence check of elements from nodes in 
regionsInTransition.keySet() may not be deterministic.
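
One way to give removeAll a fixed set to test against is to copy the key set first. This is only a sketch under the assumption that a point-in-time copy is acceptable for the caller; `SnapshotRemoveSketch` and its method are hypothetical, and the copy itself still iterates a weakly consistent view, so concurrent changes made during the copy may or may not be included.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

public class SnapshotRemoveSketch {
  /** Returns the nodes that were not in transition when the snapshot was taken. */
  public static List<String> withoutRegionsInTransition(
      List<String> nodes, ConcurrentSkipListMap<String, String> regionsInTransition) {
    // Copy the live key-set view once; later map updates no longer affect it.
    Set<String> snapshot = new HashSet<>(regionsInTransition.keySet());
    List<String> result = new ArrayList<>(nodes);
    // Every membership test now runs against the same immutable-in-practice set.
    result.removeAll(snapshot);
    return result;
  }
}
```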


> Improve RIT performances during assignment on large clusters
> 
>
> Key: HBASE-6109
> URL: https://issues.apache.org/jira/browse/HBASE-6109
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 6109.v7.patch
>
>
> The main points in this patch are:
>  - lowering the number of copies of the RIT list
>  - lowering the number of synchronizations
>  - synchronizing on a region rather than on everything
> It also contains:
>  - some fixes around the RIT notification: the list was sometimes modified 
> without a corresponding 'notify'.
>  - some test flakiness corrections, actually unrelated to this patch.





[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly

2012-05-28 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284572#comment-13284572
 ] 

Alex Newman commented on HBASE-6047:


Interesting. I hate to be dense. But I'm not sure what else I can do on this 
jira?

> Put.has() can't determine result correctly
> --
>
> Key: HBASE-6047
> URL: https://issues.apache.org/jira/browse/HBASE-6047
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Wang Qiang
>Assignee: Alex Newman
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 
> 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly-v2.patch, 
> 0001-HBASE-6047.-Put.has-can-t-determine-result-correctly.patch, 6047-92.txt, 
> PutTest.java
>
>





[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284567#comment-13284567
 ] 

Zhihong Yu commented on HBASE-6032:
---

Integrated patch v2 to trunk.

> Port HFileBlockIndex improvement from HBASE-5987
> 
>
> Key: HBASE-6032
> URL: https://issues.apache.org/jira/browse/HBASE-6032
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>Assignee: Zhihong Yu
> Fix For: 0.96.0
>
> Attachments: 6032-ports-5987-v2.txt, 6032-ports-5987.txt
>
>





[jira] [Commented] (HBASE-6109) Improve RIT performances during assignment on large clusters

2012-05-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284565#comment-13284565
 ] 

stack commented on HBASE-6109:
--

@nkeywal Trunk should have settled now.  Can you redo your patch so it's 
against the hbase root dir?

{code}+  final private Locker locker = new Locker();{code}

Is this a generic locker?  Should it be named for what it's locking?

NotifiableConcurrentSkipListMap needs a class comment.  It seems like it's for 
use in a very particular circumstance.  It needs explaining.

Does it need to be public?  Only used in master package?   Perhaps make it 
package private then?

internalList is a bad name for the internal delegate instance.  Is 'delegatee' 
a better name than internalList?

For sure this is ok?

{code}
+while (!this.master.isStopped() &&
+  // no lock concurrent access ok: sequentially consistent
+  this.regionsInTransition.containsKey(hri.getEncodedName())) {
+  this.regionsInTransition.waitForListUpdate();
 }
{code}

We check that RIT contains the name, but then call waitForListUpdate in a 
separate statement.  What if the region we are looking for is removed between 
the check and the waitForListUpdate invocation?
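
One common way to close that window is to do both the membership check and the wait while holding the same monitor that the mutators notify on, and to re-check in a loop. The sketch below assumes that design; `NotifiableMapSketch` and `waitUntilAbsent` are hypothetical names and this is not the patch's NotifiableConcurrentSkipListMap.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class NotifiableMapSketch<K, V> {
  private final ConcurrentSkipListMap<K, V> delegatee = new ConcurrentSkipListMap<>();

  public void put(K key, V value) {
    synchronized (delegatee) {
      delegatee.put(key, value);
      delegatee.notifyAll();
    }
  }

  public V remove(K key) {
    synchronized (delegatee) {
      V old = delegatee.remove(key);
      delegatee.notifyAll(); // waiters re-check their condition under the lock
      return old;
    }
  }

  /**
   * Blocks until the key is absent or the timeout elapses.
   * Returns true if the key is absent, false on timeout.
   */
  public boolean waitUntilAbsent(K key, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    synchronized (delegatee) {
      // Check and wait happen under one monitor, so a remove() cannot slip
      // in between them and have its notification missed.
      while (delegatee.containsKey(key)) {
        long remaining = deadline - System.currentTimeMillis();
        if (remaining <= 0) {
          return false;
        }
        delegatee.wait(remaining);
      }
      return true;
    }
  }
}
```

The loop also covers spurious wakeups and notifications caused by unrelated keys, since the condition is evaluated again after every wakeup.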

Will this log be annoying?

{code}
+  LOG.info("regionState=" + regionState +
+" failoverProcessedRegions.containsKey(encodedRegionName)=" + failoverProcessedRegions.containsKey(encodedRegionName));
{code}

This too... '+  LOG.info("et=" + et);'?

.. and this '+LOG.info("regionInfo.isMetaTable()=" + regionInfo.isMetaTable());'?

Add the region removed to the log message here? '+  LOG.debug("Removed region from reopening regions because it was closed");'?

Sometimes your indents are off.  For example:

{code}
-synchronized (regionsInTransition) {
+  // We need a lock here as we're going to do a put later and we don't want multiple state
+  //  creation
+Reentran
{code}

There are gratuitous reformattings of code.

Is this true:

{code}
+  // no lock concurrency ok: there is a write when we update the timestamp but it's ok
+  //  as its the only one updating this field
+  RegionState rs = this.regionsInTransition.get(e.getKey());
{code}

How is it enforced?

Check these...

{code}
+}finally {
 or here
+  }else {
{code}

These need a space after the closing curly brace.  Sometimes you do it and 
sometimes you don't.



I reviewed half of the patch.

It looks great.  Nice stuff N.

> Improve RIT performances during assignment on large clusters
> 
>
> Key: HBASE-6109
> URL: https://issues.apache.org/jira/browse/HBASE-6109
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 6109.v7.patch
>
>
> The main points in this patch are:
>  - lowering the number of copy of the RIT list
>  - lowering the number of synchronization
>  - synchronizing on a region rather than on everything
> It also contains:
>  - some fixes around the RIT notification: the list was sometimes modified 
> without a corresponding 'notify'.
>  - some tests flakiness correction, actually unrelated to this patch.





[jira] [Resolved] (HBASE-6094) [refGuide] Improvements to new contributor docs

2012-05-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6094.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Assignee: Ian Varley  (was: Doug Meil)
 Hadoop Flags: Reviewed

Committed to trunk.  Thanks for the patch Ian (Didn't push it out yet).

> [refGuide] Improvements to new contributor docs
> ---
>
> Key: HBASE-6094
> URL: https://issues.apache.org/jira/browse/HBASE-6094
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ian Varley
>Assignee: Ian Varley
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: book_hbase_6094.xml.patch
>
>
> developer.xml
> * Expanded explanation around git & svn, and mentioning the EGit plugin
> * Expanded explanation of setting up the eclipse project
> * Extra section about basic compilation using maven and eclipse
> * Fix to tarball command that makes it maven2 compatible
> * Greatly expanded section about contributing docs, and clarification that 
> pushing generated site is only for those with permissions





[jira] [Commented] (HBASE-6120) Few logging improvements around enabling tables

2012-05-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284561#comment-13284561
 ] 

stack commented on HBASE-6120:
--

+1 on patch

> Few logging improvements around enabling tables
> ---
>
> Key: HBASE-6120
> URL: https://issues.apache.org/jira/browse/HBASE-6120
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HBASE-6120.patch
>
>
> - A few log statements in the Enable/Disable/Create table handler event 
> classes have a typo in the word "Attempting" (it's misspelled "Attemping").
> - Even upon an enable operation's failure, the trailing message is a mere INFO 
> with a state of 'false'. This isn't as visible as I'd like it to be when 
> diagnosing logs for issues. I've put it in a proper if-else for this case.
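The if-else mentioned in the second point might look roughly like the sketch below. It uses java.util.logging so that it runs standalone (HBase itself logs through commons-logging), and the class `EnableLogSketch`, the method `logOutcome`, the message text and the choice of WARNING for the failure branch are all assumptions for illustration, not the contents of HBASE-6120.patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: logs the outcome of an enable at a level that matches the result. */
class EnableLogSketch {
  private static final Logger LOG = Logger.getLogger(EnableLogSketch.class.getName());

  /** Logs the outcome and returns the level that was used. */
  static Level logOutcome(String tableName, boolean done) {
    if (done) {
      LOG.info("Table '" + tableName + "' was successfully enabled.");
      return Level.INFO;
    } else {
      // Assumed level: a failure is raised above INFO so it stands out when scanning logs.
      LOG.warning("Table '" + tableName + "' wasn't successfully enabled.");
      return Level.WARNING;
    }
  }
}
```

Splitting the branches this way avoids a single INFO line whose only failure signal is a trailing 'false'.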





[jira] [Updated] (HBASE-6105) Sweep all INFO level logging and aggressively drop to DEBUG, and from DEBUG to TRACE

2012-05-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6105:
-

  Tags: noob
Labels: noob  (was: )

> Sweep all INFO level logging and aggressively drop to DEBUG, and from DEBUG 
> to TRACE
> 
>
> Key: HBASE-6105
> URL: https://issues.apache.org/jira/browse/HBASE-6105
> Project: HBase
>  Issue Type: Task
>Affects Versions: 0.96.0
>Reporter: Andrew Purtell
>  Labels: noob
>
> Speaking with Arjen from Facebook ops at HBaseCon, I asked if given one 
> single request for improving HBase operability, what would that be. The 
> answer was to be less verbose at INFO log level. For example, with many 
> regions opening, anomalous events can be difficult to pick out among the 5-6 
> INFO level messages per region deployment. Where multiple INFO level messages 
> are printed in close succession, we should consider coalescing them. For all 
> INFO level messages, we should be aggressive about demoting them to DEBUG 
> level. And, since we are now increasing the verbosity at DEBUG level, the 
> same considerations should be applied there, with coalescing and demotion of 
> really detailed/low level logging to TRACE.
> Consider making this a blocker.
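As a rough illustration of the coalescing and demotion suggested above, the sketch below folds several per-region messages into one INFO summary and pushes the per-step detail down to a finer level. It uses java.util.logging, with FINE standing in for DEBUG; the class `RegionOpenLogSketch`, its methods and the step names are made up for the example and do not come from HBase.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: one INFO line per region open, detail demoted to FINE. */
class RegionOpenLogSketch {
  private static final Logger LOG = Logger.getLogger(RegionOpenLogSketch.class.getName());

  /** Builds the single summary line that replaces one INFO line per step. */
  static String summarize(String region, List<String> steps, long elapsedMs) {
    return "Opened region " + region + " in " + elapsedMs + "ms (" + steps.size() + " steps)";
  }

  static void logRegionOpen(String region, List<String> steps, long elapsedMs) {
    // Per-step detail is demoted; the guard avoids building strings nobody will read.
    if (LOG.isLoggable(Level.FINE)) {
      for (String step : steps) {
        LOG.fine("region=" + region + " step=" + step);
      }
    }
    LOG.info(summarize(region, steps, elapsedMs));
  }
}
```

With many regions opening at once, this leaves one INFO line per region rather than five or six, which is the kind of reduction the description asks for.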





[jira] [Commented] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284557#comment-13284557
 ] 

stack commented on HBASE-6115:
--

+1

> NullPointerException is thrown when root and meta table regions are assigning 
> to another RS.
> 
>
> Key: HBASE-6115
> URL: https://issues.apache.org/jira/browse/HBASE-6115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.94.1
>
> Attachments: HBASE-6115_0.94.patch
>
>
> Let's suppose we have two region servers, RS1 and RS2.
> If the region server (RS1) hosting the root and meta regions goes down, the master will 
> assign them to another region server, RS2. At that point a 
> NullPointerException is thrown.
> {code}
> 2012-05-04 20:19:52,912 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,914 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,916 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception 
> running postOpenDeployTasks; region=1028785192
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
> at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
> at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing .META.,,1.1028785192: disabling compactions & flushes
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region .META.,,1.1028785192
> {code}





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284549#comment-13284549
 ] 

Zhihong Yu commented on HBASE-6088:
---

{code}
+   * While transitioning node from RS_ZK_REGION_SPLITTING to
+   * RS_ZK_REGION_SPLITTING during region split,if zookeper went down split always
+   * fails for the region.Avoid this by HBASE-6088 fix. 
+   * This test case to test the znode is deleted(if created) or not in roll back.
{code}
The second state should be RS_ZK_REGION_SPLIT.
Here is the rewritten paragraph:
{code}
+   * While transitioning node from RS_ZK_REGION_SPLITTING to
+   * RS_ZK_REGION_SPLIT during region split, if zookeeper goes down the split always
+   * fails for the region. HBASE-6088 fixes this scenario. 
+   * This test case verifies the znode is deleted (if created) or not in roll back.
{code}


>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47

[jira] [Commented] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284550#comment-13284550
 ] 

Zhihong Yu commented on HBASE-6115:
---

+1 from me.

> NullPointerException is thrown when root and meta table regions are assigning 
> to another RS.
> 
>
> Key: HBASE-6115
> URL: https://issues.apache.org/jira/browse/HBASE-6115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.94.1
>
> Attachments: HBASE-6115_0.94.patch
>
>
> Let's suppose we have two region servers, RS1 and RS2.
> If the region server (RS1) hosting the root and meta regions goes down, the master will 
> assign them to another region server, RS2. At that point a 
> NullPointerException is thrown.
> {code}
> 2012-05-04 20:19:52,912 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,914 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,916 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception 
> running postOpenDeployTasks; region=1028785192
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
> at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
> at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing .META.,,1.1028785192: disabling compactions & flushes
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region .META.,,1.1028785192
> {code}





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-05-28 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284532#comment-13284532
 ] 

Jonathan Hsieh commented on HBASE-6055:
---

Jesse,

Thanks for the writeup, I find having a single doc with the design summary 
really helpful and ideally something we do for all major new features. I've 
read through the document carefully, let it steep for a few days, and had some 
design-level questions.  I've skimmed HBASE-50 and will read more of the 
history more carefully later this evening.  

What is the read mechanism for snapshots like?  Does the snapshot act like a 
read-only table or is there some special external mechanism needed to read the 
data from a snapshot?  You mention having to rebuild in-memory state by 
replaying wals -- is this a recovery situation or needed in normal reads?

What does the representation of a snapshot look like in terms of META and file 
system contents?  At some point we may get called upon to repair these, and I want 
to make sure there are enough breadcrumbs for this to be possible.

I'm still thinking about the two-phase part -- I think it is necessary for 
marking success or initiating failure recovery, but I'm skeptical at the moment 
about why barriering writes is necessary.  How does this buy you more 
consistency?  Aren't we still inconsistent at the prepare point now instead?   
Can we just write the special snapshotting hlog entry at initiation of prepare, 
allowing writes to continue, then add data elsewhere (META) to mark success 
in commit?  We could then have some compaction/flush time logic clean up failed 
attempt markers?





> Snapshots in HBase 0.96
> ---
>
> Key: HBASE-6055
> URL: https://issues.apache.org/jira/browse/HBASE-6055
> Project: HBase
>  Issue Type: New Feature
>  Components: client, master, regionserver, zookeeper
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 0.96.0
>
> Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has 
> drastically changed, opening as a new ticket.





[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284517#comment-13284517
 ] 

Hudson commented on HBASE-5916:
---

Integrated in HBase-0.94 #222 (See 
[https://builds.apache.org/job/HBase-0.94/222/])
HBASE-5916 RS restart just before master intialization we make the cluster 
non operative(RajeshBabu) (Revision 1343326)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


> RS restart just before master intialization we make the cluster non operative
> -
>
> Key: HBASE-5916
> URL: https://issues.apache.org/jira/browse/HBASE-5916
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5916_94.patch, HBASE-5916_trunk.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, 
> HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, 
> HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, 
> HBASE-5916_trunk_v7.patch, HBASE-5916_trunk_v8.patch, 
> HBASE-5916_trunk_v9.patch, HBASE-5916v8.patch
>
>
> Consider a case where my master is getting restarted.  RS that was alive when 
> the master restart started, gets restarted before the master initializes the 
> ServerShutDownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case when the RS tries to register with the master, the master will 
> try to expire the server but the server cannot be expired as still the 
> serverShutdownHandler is not enabled.
> This case may happen when only one RS gets restarted or when all the RSs 
> get restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
>   if (existingServer.getStartcode() < serverName.getStartcode()) {
> LOG.info("Triggering server recovery; existingServer " +
>   existingServer + " looks stale, new server:" + serverName);
> expireServer(existingServer);
>   }
> {code}
> If another RS is brought up then the cluster comes back to normalcy.
> May be a very corner case.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, leading to possible data loss

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284516#comment-13284516
 ] 

Hudson commented on HBASE-6065:
---

Integrated in HBase-0.94 #222 (See 
[https://builds.apache.org/job/HBase-0.94/222/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343336)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Log for flush would append a non-sequential edit in the hlog, leading to 
> possible data loss
> ---
>
> Key: HBASE-6065
> URL: https://issues.apache.org/jira/browse/HBASE-6065
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6065.patch, HBASE-6065v2.patch
>
>
> After completing flush region, we will append a log edit in the hlog file 
> through HLog#completeCacheFlush.
> {code}
> public void completeCacheFlush(final byte [] encodedRegionName,
>   final byte [] tableName, final long logSeqId, final boolean 
> isMetaRegion)
> {
> ...
> HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
> System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
> ...
> }
> {code}
> When we make the hlog key, we use the seqId from the parameter, which is 
> generated by HLog#startCacheFlush.
> Here, we may append an edit with a lower seq id than the last edit in the hlog file.
> If it is the last edit in the file, it may cause data loss, 
> because:
> {code}
> HRegion#replayRecoveredEditsIfAny{
> ...
> maxSeqId = Math.abs(Long.parseLong(fileName));
>   if (maxSeqId <= minSeqId) {
> String msg = "Maximum sequenceid for this log is " + maxSeqId
> + " and minimum sequenceid for the region is " + minSeqId
> + ", skipped the whole file, path=" + edits;
> LOG.debug(msg);
> continue;
>   }
> ...
> }
> {code}
> We may skip the split log file, because we use the last edit's seq id as 
> its file name, and consider this seqId as the max seq id in this log file.
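The failure mode described above can be reproduced with plain numbers. In the sketch below, a recovered-edits file holds edits 101, 102 and 103 followed by a flush-complete edit that carries the older seq id 100, and the region's minimum seq id is assumed to be 100. The class `RecoveredEditsSketch` and its methods are hypothetical, and naming the file after the true maximum is shown only as one possible remedy, not necessarily what HBASE-6065.patch does.

```java
import java.util.List;

/** Illustrative only: shows how naming a file after its last edit can cause a skip. */
class RecoveredEditsSketch {
  /** Names the file after the last edit, as the description says is done. */
  static long nameFromLastEdit(List<Long> seqIds) {
    return seqIds.get(seqIds.size() - 1);
  }

  /** Alternative naming: the largest sequence id actually present in the file. */
  static long nameFromMaxEdit(List<Long> seqIds) {
    long max = Long.MIN_VALUE;
    for (long id : seqIds) {
      max = Math.max(max, id);
    }
    return max;
  }

  /** Mirrors the quoted check: skip the file when its "max" is not above the region's min. */
  static boolean wouldSkip(long maxSeqIdFromFileName, long minSeqId) {
    return maxSeqIdFromFileName <= minSeqId;
  }
}
```

With those inputs, the file named after its last edit is called 100 and is skipped even though edits 101 to 103 still need replaying; named after its maximum it is called 103 and is replayed.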





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284515#comment-13284515
 ] 

Hudson commented on HBASE-6118:
---

Integrated in HBase-0.94 #222 (See 
[https://builds.apache.org/job/HBase-0.94/222/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343336)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Updated] (HBASE-6068) Secure HBase cluster : Client not able to call some admin APIs

2012-05-28 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6068:
---

Attachment: HBASE-6068-v3.patch

backupMasterAddressesZNode and rsZNode are checked just for Children. This 
doesn't require auth on children too.

> Secure HBase cluster : Client not able to call some admin APIs
> --
>
> Key: HBASE-6068
> URL: https://issues.apache.org/jira/browse/HBASE-6068
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Anoop Sam John
>Assignee: Matteo Bertozzi
> Attachments: HBASE-6068-v0.patch, HBASE-6068-v1.patch, 
> HBASE-6068-v2.patch, HBASE-6068-v3.patch
>
>
> In case of secure cluster, we allow the HBase clients to read the zk nodes by 
> providing the global read permissions to all for certain nodes. These nodes 
> are the master address znode, root server znode and the clusterId znode. In 
> ZKUtil.createACL() , we can see these node names are specially handled.
> But there are some other client side admin APIs which make a read call into 
> the zookeeper from the client. These include the isTableEnabled() call (maybe 
> some others; this is the one I have seen).  Here the client directly reads a node in the 
> zookeeper (the node created for this table) and the data is matched to know 
> whether the table is enabled or not.
> Now in the secure cluster case any client can read the zookeeper nodes which it needs 
> for its normal operation, like the master address and root server address.  
> But what if the client calls this API [isTableEnabled()]?





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284510#comment-13284510
 ] 

Hudson commented on HBASE-6118:
---

Integrated in HBase-TRUNK #2940 (See 
[https://builds.apache.org/job/HBase-TRUNK/2940/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343338)

 Result = SUCCESS
ramkrishna : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Commented] (HBASE-6065) Log for flush would append a non-sequential edit in the hlog, leading to possible data loss

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284511#comment-13284511
 ] 

Hudson commented on HBASE-6065:
---

Integrated in HBase-TRUNK #2940 (See 
[https://builds.apache.org/job/HBase-TRUNK/2940/])
HBASE-6118 Add a testcase for HBASE-6065 (Ashutosh) (Revision 1343338)

 Result = SUCCESS
ramkrishna : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java


> Log for flush would append a non-sequential edit in the hlog, leading to 
> possible data loss
> ---
>
> Key: HBASE-6065
> URL: https://issues.apache.org/jira/browse/HBASE-6065
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6065.patch, HBASE-6065v2.patch
>
>
> After completing flush region, we will append a log edit in the hlog file 
> through HLog#completeCacheFlush.
> {code}
> public void completeCacheFlush(final byte [] encodedRegionName,
>   final byte [] tableName, final long logSeqId, final boolean 
> isMetaRegion)
> {
> ...
> HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
> System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
> ...
> }
> {code}
> When we make the hlog key, we use the seqId from the parameter, which is 
> generated by HLog#startCacheFlush.
> Here, we may append an edit with a lower seq id than the last edit in the hlog file.
> If it is the last edit in the file, it may cause data loss, 
> because:
> {code}
> HRegion#replayRecoveredEditsIfAny{
> ...
> maxSeqId = Math.abs(Long.parseLong(fileName));
>   if (maxSeqId <= minSeqId) {
> String msg = "Maximum sequenceid for this log is " + maxSeqId
> + " and minimum sequenceid for the region is " + minSeqId
> + ", skipped the whole file, path=" + edits;
> LOG.debug(msg);
> continue;
>   }
> ...
> }
> {code}
> We may skip the split log file, because we use the last edit's seq id as 
> its file name, and consider this seqId as the max seq id in this log file.





[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284512#comment-13284512
 ] 

Hudson commented on HBASE-5916:
---

Integrated in HBase-TRUNK #2940 (See 
[https://builds.apache.org/job/HBase-TRUNK/2940/])
HBASE-5916 RS restart just before master intialization we make the cluster 
non operative (Rajesh) (Revision 1343324)

 Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


> RS restart just before master intialization we make the cluster non operative
> -
>
> Key: HBASE-5916
> URL: https://issues.apache.org/jira/browse/HBASE-5916
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5916_94.patch, HBASE-5916_trunk.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, 
> HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, 
> HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, 
> HBASE-5916_trunk_v7.patch, HBASE-5916_trunk_v8.patch, 
> HBASE-5916_trunk_v9.patch, HBASE-5916v8.patch
>
>
> Consider a case where the master is being restarted.  An RS that was alive when 
> the master restart began gets restarted before the master initializes the 
> ServerShutDownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case, when the RS tries to register with the master, the master will 
> try to expire the server, but the server cannot be expired because the 
> serverShutdownHandler is not yet enabled.
> This can happen when I have only one RS and it gets restarted, or when all the RSs 
> get restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
>   if (existingServer.getStartcode() < serverName.getStartcode()) {
> LOG.info("Triggering server recovery; existingServer " +
>   existingServer + " looks stale, new server:" + serverName);
> expireServer(existingServer);
>   }
> {code}
> If another RS is brought up, the cluster returns to normal.
> This may be a rare corner case.
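The situation described above, and one possible way to handle it, can be sketched as follows. This is only an illustration under stated assumptions: `DeferredExpiryDemo` and all of its members are hypothetical names, not HBase APIs, and the approach (remembering expiry requests until the handler is enabled) is not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of deferring server expiry; names are not HBase APIs. */
final class DeferredExpiryDemo {
    private boolean shutdownHandlerEnabled = false;
    private final List<String> pendingExpiry = new ArrayList<>();
    private final List<String> expired = new ArrayList<>();

    /** Expire immediately if the handler is enabled, otherwise remember the request. */
    void expireServer(String serverName) {
        if (!shutdownHandlerEnabled) {
            // Without this branch the request would simply be lost, which is
            // the stuck state the description reports.
            pendingExpiry.add(serverName);
            return;
        }
        expired.add(serverName);
    }

    /** Enable the handler and process every expiry that was requested too early. */
    void enableShutdownHandler() {
        shutdownHandlerEnabled = true;
        expired.addAll(pendingExpiry);
        pendingExpiry.clear();
    }

    List<String> expiredServers() {
        return new ArrayList<>(expired);
    }

    List<String> pendingServers() {
        return new ArrayList<>(pendingExpiry);
    }

    public static void main(String[] args) {
        DeferredExpiryDemo master = new DeferredExpiryDemo();
        // A restarted RS registers before the handler is enabled, so the stale
        // server entry cannot be expired yet.
        master.expireServer("rs1,60020,100");
        System.out.println("pending before enable: " + master.pendingServers());
        master.enableShutdownHandler();
        System.out.println("expired after enable:  " + master.expiredServers());
    }
}
```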





[jira] [Updated] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6118:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem and thought it would be good 
> to contribute it.





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284497#comment-13284497
 ] 

ramkrishna.s.vasudevan commented on HBASE-6118:
---

I committed this to 0.94 and trunk.
Thanks for the patch Ashutosh and Ted.

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem and thought it would be good 
> to contribute it.





[jira] [Updated] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6115:
--

Attachment: HBASE-6115_0.94.patch

Patch for 0.94.  This JIRA is applicable only to 0.94.

> NullPointerException is thrown when root and meta table regions are assigning 
> to another RS.
> 
>
> Key: HBASE-6115
> URL: https://issues.apache.org/jira/browse/HBASE-6115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.94.1
>
> Attachments: HBASE-6115_0.94.patch
>
>
> Let's suppose we have two region servers, RS1 and RS2.
> If the region server (RS1) hosting the root and meta regions goes down, the master will 
> assign them to the other region server, RS2. At that point we received a 
> NullPointerException.
> {code}
> 2012-05-04 20:19:52,912 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,914 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,916 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception 
> running postOpenDeployTasks; region=1028785192
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
> at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
> at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing .META.,,1.1028785192: disabling compactions & flushes
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region .META.,,1.1028785192
> {code}





[jira] [Comment Edited] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284485#comment-13284485
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-5916 at 5/28/12 5:32 PM:


Committed to trunk and 0.94.
Thanks for the review Chunhui, Ted and Stack.
Thanks to Rajesh for his patches also.

  was (Author: ram_krish):
Committed to trunk and 0.94.
Thanks for the review Chunhui, Ted and Stack.
Thanks for Rajesh for his patches also.
  
> RS restart just before master intialization we make the cluster non operative
> -
>
> Key: HBASE-5916
> URL: https://issues.apache.org/jira/browse/HBASE-5916
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5916_94.patch, HBASE-5916_trunk.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, 
> HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, 
> HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, 
> HBASE-5916_trunk_v7.patch, HBASE-5916_trunk_v8.patch, 
> HBASE-5916_trunk_v9.patch, HBASE-5916v8.patch
>
>
> Consider a case where the master is being restarted.  An RS that was alive when 
> the master restart began gets restarted before the master initializes the 
> ServerShutDownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case, when the RS tries to register with the master, the master will 
> try to expire the server, but the server cannot be expired because the 
> serverShutdownHandler is not yet enabled.
> This can happen when I have only one RS and it gets restarted, or when all the RSs 
> get restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
>   if (existingServer.getStartcode() < serverName.getStartcode()) {
> LOG.info("Triggering server recovery; existingServer " +
>   existingServer + " looks stale, new server:" + serverName);
> expireServer(existingServer);
>   }
> {code}
> If another RS is brought up, the cluster returns to normal.
> This may be a rare corner case.





[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284485#comment-13284485
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

Committed to trunk and 0.94.
Thanks for the review Chunhui, Ted and Stack.
Thanks for Rajesh for his patches also.

> RS restart just before master intialization we make the cluster non operative
> -
>
> Key: HBASE-5916
> URL: https://issues.apache.org/jira/browse/HBASE-5916
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5916_94.patch, HBASE-5916_trunk.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, 
> HBASE-5916_trunk_3.patch, HBASE-5916_trunk_4.patch, 
> HBASE-5916_trunk_v5.patch, HBASE-5916_trunk_v6.patch, 
> HBASE-5916_trunk_v7.patch, HBASE-5916_trunk_v8.patch, 
> HBASE-5916_trunk_v9.patch, HBASE-5916v8.patch
>
>
> Consider a case where the master is being restarted.  An RS that was alive when 
> the master restart began gets restarted before the master initializes the 
> ServerShutDownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case, when the RS tries to register with the master, the master will 
> try to expire the server, but the server cannot be expired because the 
> serverShutdownHandler is not yet enabled.
> This can happen when I have only one RS and it gets restarted, or when all the RSs 
> get restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
>   if (existingServer.getStartcode() < serverName.getStartcode()) {
> LOG.info("Triggering server recovery; existingServer " +
>   existingServer + " looks stale, new server:" + serverName);
> expireServer(existingServer);
>   }
> {code}
> If another RS is brought up, the cluster returns to normal.
> This may be a rare corner case.





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284482#comment-13284482
 ] 

ramkrishna.s.vasudevan commented on HBASE-6118:
---

Thanks for the trunk testcase, Ted.  I had thought of preparing one before the commit. Can I 
commit this?

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem and thought it would be good 
> to contribute it.





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284470#comment-13284470
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

I am planning to commit this. Please provide your comments.

>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting did not happen for a long time due to a ZK exception while 
> creating the RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47:28,142 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> java.io.IOException: Failed create of ephemeral 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   at 

[jira] [Updated] (HBASE-6120) Few logging improvements around enabling tables

2012-05-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6120:
---

Status: Patch Available  (was: Open)

> Few logging improvements around enabling tables
> ---
>
> Key: HBASE-6120
> URL: https://issues.apache.org/jira/browse/HBASE-6120
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HBASE-6120.patch
>
>
> - A few log statements across the Enable/Disable/Create table handler event 
> classes have a typo in the word "Attempting" (it's misspelled "Attemping").
> - Even upon an enable operation's failure, the trailing message is a mere WARN 
> with a state of 'false'. This isn't as visible as I'd like it to be when 
> diagnosing logs for issues. I've put it in a proper if-else for this case.
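The if-else mentioned in the second point could look roughly like the sketch below. It is an assumption-laden illustration, not the attached patch: `EnableTableLogDemo`, `logEnableResult`, and the message texts are hypothetical, and `java.util.logging` with `Level.SEVERE` is used only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of logging the enable result at different levels. */
final class EnableTableLogDemo {
    private static final Logger LOG =
        Logger.getLogger(EnableTableLogDemo.class.getName());

    /**
     * Logs the outcome of an enable operation and returns the level used.
     * Success is logged at INFO; failure is logged at SEVERE so that it
     * stands out when scanning logs (the level choice is an assumption).
     */
    static Level logEnableResult(String tableName, boolean done) {
        if (done) {
            LOG.info("Table '" + tableName + "' was successfully enabled.");
            return Level.INFO;
        } else {
            LOG.severe("Table '" + tableName + "' was not enabled; the operation failed.");
            return Level.SEVERE;
        }
    }

    public static void main(String[] args) {
        logEnableResult("usertable", true);
        logEnableResult("usertable", false);
    }
}
```

Compared with a single message that prints the state as 'false', the failure branch here has its own wording and level, which makes it easier to find with a level filter.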





[jira] [Updated] (HBASE-6120) Few logging improvements around enabling tables

2012-05-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6120:
---

Attachment: HBASE-6120.patch

Patch that fixes both of the issues described in the description.

> Few logging improvements around enabling tables
> ---
>
> Key: HBASE-6120
> URL: https://issues.apache.org/jira/browse/HBASE-6120
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HBASE-6120.patch
>
>
> - A few log statements across the Enable/Disable/Create table handler event 
> classes have a typo in the word "Attempting" (it's misspelled "Attemping").
> - Even upon an enable operation's failure, the trailing message is a mere WARN 
> with a state of 'false'. This isn't as visible as I'd like it to be when 
> diagnosing logs for issues. I've put it in a proper if-else for this case.





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284464#comment-13284464
 ] 

Hadoop QA commented on HBASE-6088:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12529969/HBASE-6088_94_2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2024//console

This message is automatically generated.

>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting did not happen for a long time due to a ZK exception while 
> creating the RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47:28,142 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; 

[jira] [Created] (HBASE-6120) Few logging improvements around enabling tables

2012-05-28 Thread Harsh J (JIRA)
Harsh J created HBASE-6120:
--

 Summary: Few logging improvements around enabling tables
 Key: HBASE-6120
 URL: https://issues.apache.org/jira/browse/HBASE-6120
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.94.0
Reporter: Harsh J
Priority: Trivial


- A few log statements across the Enable/Disable/Create table handler event classes 
have a typo in the word "Attempting" (it's misspelled "Attemping").
- Even upon an enable operation's failure, the trailing message is a mere WARN 
with a state of 'false'. This isn't as visible as I'd like it to be when 
diagnosing logs for issues. I've put it in a proper if-else for this case.





[jira] [Updated] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-6088:
--

Attachment: HBASE-6088_94_2.patch

>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting did not happen for a long time due to a ZK exception while 
> creating the RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47:28,142 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> java.io.IOException: Failed create of ephemeral 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.ap

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284461#comment-13284461
 ] 

rajeshbabu commented on HBASE-6088:
---

Updated patch for 94.

>  Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> 
>
> Key: HBASE-6088
> URL: https://issues.apache.org/jira/browse/HBASE-6088
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6088_94.patch, HBASE-6088_94_2.patch, 
> HBASE-6088_trunk.patch, HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch
>
>
> Region splitting not happened for long time due to ZK exception while 
> creating RS_ZK_SPLITTING node
> {noformat}
> 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 26668ms for sessionid 
> 0x1377a75f41d0012, closing socket connection and attempting reconnect
> 2012-05-24 01:45:41,464 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
> ZooKeeper exception: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> {noformat}
> {noformat}
> 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
> cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
> synced till here 189365
> 2012-05-24 01:45:48,474 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> java.io.IOException: Failed setting SPLITTING znode on 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
> KeeperErrorCode = BadVersion for 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   ... 5 more
> 2012-05-24 01:45:48,476 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
> failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
> {noformat}
> {noformat}
> 2012-05-24 01:47:28,141 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
> not a retry
> 2012-05-24 01:47:28,142 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of 
> ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
> create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
> java.io.IOException: Failed create of ephemeral 
> /hbase/unassigned/bd1079bf948c672e493432020dc0e144
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.ex

[jira] [Updated] (HBASE-6119) Region server logs its own address at the end of getMaster()

2012-05-28 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6119:
--

Attachment: 6119-trunk.txt

Trivial patch for trunk.

> Region server logs its own address at the end of getMaster()
> 
>
> Key: HBASE-6119
> URL: https://issues.apache.org/jira/browse/HBASE-6119
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhihong Yu
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 6119-trunk.txt
>
>
> I saw the following in region server log where a.ebay.com is region server 
> itself:
> {code}
> 2012-05-28 08:56:35,315 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at 
> a.ebay.com/10.115.13.20:60020
> {code}
> We should be logging the address of master

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6119) Region server logs its own address at the end of getMaster()

2012-05-28 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6119:
--

 Priority: Minor  (was: Major)
Fix Version/s: 0.96.0

> Region server logs its own address at the end of getMaster()
> 
>
> Key: HBASE-6119
> URL: https://issues.apache.org/jira/browse/HBASE-6119
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhihong Yu
>Priority: Minor
> Fix For: 0.96.0
>
>
> I saw the following in region server log where a.ebay.com is region server 
> itself:
> {code}
> 2012-05-28 08:56:35,315 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at 
> a.ebay.com/10.115.13.20:60020
> {code}
> We should be logging the address of master





[jira] [Created] (HBASE-6119) Region server logs its own address at the end of getMaster()

2012-05-28 Thread Zhihong Yu (JIRA)
Zhihong Yu created HBASE-6119:
-

 Summary: Region server logs its own address at the end of 
getMaster()
 Key: HBASE-6119
 URL: https://issues.apache.org/jira/browse/HBASE-6119
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Yu


I saw the following in region server log where a.ebay.com is region server 
itself:
{code}
2012-05-28 08:56:35,315 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at 
a.ebay.com/10.115.13.20:60020
{code}
We should be logging the address of master
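
A minimal sketch of the behaviour the report asks for, assuming the server knows the address of the master it has just connected to. The class and method names below are hypothetical and are not taken from HRegionServer or from the attached patch; the host names are made up.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration only; not the HRegionServer code or the attached patch.
final class MasterAddressLogDemo {

    // Builds the log message from the master's address rather than the server's own address.
    static String connectedMessage(InetSocketAddress masterAddress) {
        return "Connected to master at "
            + masterAddress.getHostString() + ":" + masterAddress.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved avoids any DNS lookup; the host name and port are invented.
        InetSocketAddress master =
            InetSocketAddress.createUnresolved("master.example.com", 60000);
        System.out.println(connectedMessage(master));
    }
}
```

The point of the sketch is only that the value passed to the log statement should be the master's address; how the real code obtains that address is not shown in this thread.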





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284440#comment-13284440
 ] 

Zhihong Yu commented on HBASE-6118:
---

I applied the patch manually to trunk and the new test passed.
Will attach patch for trunk soon.

I renamed the test case.

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Updated] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6118:
--

Attachment: 6118-trunk.txt

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: 6118-trunk.txt, HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Commented] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284422#comment-13284422
 ] 

Hadoop QA commented on HBASE-6118:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12529962/HBASE-6118_0.94.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2022//console

This message is automatically generated.

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Updated] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Ashutosh Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Jindal updated HBASE-6118:
---

Attachment: HBASE-6118_0.94.patch

Uploaded patch for 0.94.

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Updated] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread Ashutosh Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Jindal updated HBASE-6118:
---

Fix Version/s: (was: 0.94.1)
 Hadoop Flags: Reviewed
   Status: Patch Available  (was: Open)

> Add a testcase for HBASE-6065
> -
>
> Key: HBASE-6118
> URL: https://issues.apache.org/jira/browse/HBASE-6118
> Project: HBase
>  Issue Type: Test
>Reporter: ramkrishna.s.vasudevan
>Assignee: Ashutosh Jindal
> Attachments: HBASE-6118_0.94.patch
>
>
> It would be nice to have a testcase for HBASE-6065.  Internally we have 
> written a test case to simulate the problem.  Thought that it would be better 
> to contribute the same.





[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284416#comment-13284416
 ] 

Zhihong Yu commented on HBASE-6059:
---

I would like to hear opinions on the current solution from people who are more 
familiar with store files.

> Replaying recovered edits would make deleted data exist again
> -
>
> Key: HBASE-6059
> URL: https://issues.apache.org/jira/browse/HBASE-6059
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, 
> HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we use the minSeqId of the Stores. This may 
> cause deleted data to appear again.
> Let's see how it happens. Suppose the region has two families (cf1, cf2):
> 1. Put one piece of data into the region (put r1,cf1:q1,v1).
> 2. Move the region from server A to server B.
> 3. Delete the data put in step 1 (delete r1).
> 4. Flush this region.
> 5. Run a major compaction on this region.
> 6. Move the region from server B to server A.
> 7. Abort server A.
> 8. After the region is online, we can get the deleted data (r1,cf1:q1,v1).
> (When we replay recovered edits, we use the minSeqId of the Stores. Because cf2 
> has no store files, its seqId is 0, so the edit log entry for the put is 
> replayed to the region.)
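
The scenario in the description can be modelled with a small sketch. It assumes each store records the highest sequence id it has persisted, and that an edit is replayed when its sequence id is greater than the chosen threshold. The class, the method names and the concrete sequence ids are hypothetical. The per-store comparison is shown only for contrast and is not claimed to be what the attached patches do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the scenario; not HBase code and not the attached patch.
final class ReplaySeqIdDemo {

    // Region-wide threshold: the smallest persisted sequence id across all stores.
    static long minSeqIdAcrossStores(Map<String, Long> maxSeqIdPerStore) {
        long min = Long.MAX_VALUE;
        for (long seqId : maxSeqIdPerStore.values()) {
            min = Math.min(min, seqId);
        }
        return min;
    }

    // Replay decision using the single region-wide minimum, as described in the report.
    static boolean replayedWithRegionMin(Map<String, Long> maxSeqIdPerStore, long editSeqId) {
        return editSeqId > minSeqIdAcrossStores(maxSeqIdPerStore);
    }

    // For contrast: compare the edit against the store (family) it belongs to.
    static boolean replayedWithStoreSeqId(Map<String, Long> maxSeqIdPerStore,
                                          String family, long editSeqId) {
        return editSeqId > maxSeqIdPerStore.get(family);
    }

    public static void main(String[] args) {
        Map<String, Long> maxSeqIdPerStore = new LinkedHashMap<>();
        maxSeqIdPerStore.put("cf1", 2L); // flushed and compacted after the delete (seq id 2)
        maxSeqIdPerStore.put("cf2", 0L); // no store files, so its seq id is 0

        long putSeqId = 1L; // the original put r1,cf1:q1,v1
        System.out.println(replayedWithRegionMin(maxSeqIdPerStore, putSeqId));
        System.out.println(replayedWithStoreSeqId(maxSeqIdPerStore, "cf1", putSeqId));
    }
}
```

With these invented numbers the first check reports that the old put would be replayed (the threshold is 0 because of cf2), while the per-store check would skip it.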





[jira] [Resolved] (HBASE-6111) CLONE - Map tasks not local to RS

2012-05-28 Thread Tim Robertson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Robertson resolved HBASE-6111.
--

Resolution: Won't Fix

Closing this, as it is an issue in our environment that we're investigating, and 
we are also using a version of the code that is rather different from current 
trunk (e.g. trunk TableInputFormatBase does a reverseDNS(...) whereas the old 
one simply uses what is in the InetSocketAddress host address).

> CLONE - Map tasks not local to RS
> -
>
> Key: HBASE-6111
> URL: https://issues.apache.org/jira/browse/HBASE-6111
> Project: HBase
>  Issue Type: Bug
>  Components: mapred, master, regionserver
>Affects Versions: 0.20.2, 0.90.4
> Environment: DN, TT and RS running on the same nodes, all using CDH3. 
>  Ganglia monitoring everything.
>Reporter: Tim Robertson
>
> I have started seeing this issue in our environment.  HBASE-1672 was closed 
> as non reproducible, so I cloned it here.
> I have a 367M record table, compressed with snappy, and running a vanilla MR 
> SCAN with no filters spawns 441 Mappers.  The cluster currently has 216 slots 
> for mappers, and the first wave all report 100% data-local mappers.  As the 
> second wave of mappers come up they don't get run locally to the RS and data 
> locality drops.
> This kills our environment, as it saturates the network at 120M which is very 
> clear on ganglia.
> I am really happy to help diagnose this, but need some guidance on what to 
> do.  I don't know enough yet about how task assignment works in MR to 
> determine why the machines are picking up random tasks for their second 
> effort and not one for the local RS.





[jira] [Created] (HBASE-6118) Add a testcase for HBASE-6065

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-6118:
-

 Summary: Add a testcase for HBASE-6065
 Key: HBASE-6118
 URL: https://issues.apache.org/jira/browse/HBASE-6118
 Project: HBase
  Issue Type: Test
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
 Fix For: 0.94.1


It would be nice to have a testcase for HBASE-6065.  Internally we have written 
a test case to simulate the problem.  Thought that it would be better to 
contribute the same.





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284373#comment-13284373
 ] 

ramkrishna.s.vasudevan commented on HBASE-6055:
---

Nice doc Jesse.

> Snapshots in HBase 0.96
> ---
>
> Key: HBASE-6055
> URL: https://issues.apache.org/jira/browse/HBASE-6055
> Project: HBase
>  Issue Type: New Feature
>  Components: client, master, regionserver, zookeeper
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 0.96.0
>
> Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has 
> drastically changed, opening as a new ticket.





[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284369#comment-13284369
 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
---

@Stack
Could you take a look at this solution and patch?
@Ted
Is this ok to be committed? My concern was with creating an empty store file 
now. 

> Replaying recovered edits would make deleted data exist again
> -
>
> Key: HBASE-6059
> URL: https://issues.apache.org/jira/browse/HBASE-6059
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, 
> HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we use the minSeqId of the Stores. This may 
> cause deleted data to appear again.
> Let's see how it happens. Suppose the region has two families (cf1, cf2):
> 1. Put one piece of data into the region (put r1,cf1:q1,v1).
> 2. Move the region from server A to server B.
> 3. Delete the data put in step 1 (delete r1).
> 4. Flush this region.
> 5. Run a major compaction on this region.
> 6. Move the region from server B to server A.
> 7. Abort server A.
> 8. After the region is online, we can get the deleted data (r1,cf1:q1,v1).
> (When we replay recovered edits, we use the minSeqId of the Stores. Because cf2 
> has no store files, its seqId is 0, so the edit log entry for the put is 
> replayed to the region.)





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-05-28 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284366#comment-13284366
 ] 

gaojinchao commented on HBASE-6055:
---

Hi Jesse, are you working on this feature? I am interested in it and will study 
your code.
One question: when we are creating snapshots, do we need to stop the balancer?

> Snapshots in HBase 0.96
> ---
>
> Key: HBASE-6055
> URL: https://issues.apache.org/jira/browse/HBASE-6055
> Project: HBase
>  Issue Type: New Feature
>  Components: client, master, regionserver, zookeeper
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 0.96.0
>
> Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has 
> drastically changed, opening as a new ticket.





[jira] [Created] (HBASE-6117) Revisit default condition added to Switch cases in Trunk

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-6117:
-

 Summary: Revisit default condition added to Switch cases in Trunk
 Key: HBASE-6117
 URL: https://issues.apache.org/jira/browse/HBASE-6117
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
 Fix For: 0.96.0


We found that in some cases the default case in a switch block was just throwing 
IllegalArgumentException. There are cases where we may get some other state for 
which we should not throw IllegalArgumentException.
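
As a rough illustration of the concern, the sketch below contrasts a switch whose default branch rejects every unlisted state with one that tolerates them. The enum, its values and the handler names are hypothetical and do not correspond to specific HBase event types or to any patch on this issue.

```java
// Hypothetical enum and handlers; names are illustrative, not HBase's event types.
final class SwitchDefaultDemo {

    enum RegionEvent { OPENING, OPENED, CLOSING, CLOSED, SPLITTING }

    // Strict variant: any state not listed explicitly is treated as an error.
    static String handleStrict(RegionEvent event) {
        switch (event) {
            case OPENING:
                return "opening";
            case OPENED:
                return "opened";
            default:
                throw new IllegalArgumentException("Unexpected event: " + event);
        }
    }

    // Lenient variant: states that can legitimately occur are ignored instead of rejected.
    static String handleLenient(RegionEvent event) {
        switch (event) {
            case OPENING:
                return "opening";
            case OPENED:
                return "opened";
            default:
                return "ignored " + event;
        }
    }

    public static void main(String[] args) {
        System.out.println(handleLenient(RegionEvent.SPLITTING));
        try {
            handleStrict(RegionEvent.SPLITTING);
        } catch (IllegalArgumentException e) {
            System.out.println("strict handler rejected: " + e.getMessage());
        }
    }
}
```

Which states deserve the lenient treatment is exactly what the issue proposes to revisit; the sketch does not attempt to answer that.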





[jira] [Created] (HBASE-6116) Allow parallel HDFS writes for HLogs.

2012-05-28 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-6116:


 Summary: Allow parallel HDFS writes for HLogs.
 Key: HBASE-6116
 URL: https://issues.apache.org/jira/browse/HBASE-6116
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


In HDFS-1783 I adapted Dhruba's changes to be used in Hadoop trunk.
This issue will include the necessary reflection changes to optionally enable 
this for the WALs in HBase.





[jira] [Commented] (HBASE-5959) Add other load balancers

2012-05-28 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284348#comment-13284348
 ] 

Elliott Clark commented on HBASE-5959:
--

Right now the two most variable metrics (read and write) are not used; they 
have a 0 coefficient.  I'll need to add in a lot more plumbing to get a moving 
average going, and this patch is getting a little too long in the tooth as it is.

They are the whole point (write requests were the original impetus for this 
patch).  Every time someone talks about load balancing they have a different 
need.  That need is not encompassed by one metric or one single best fit.  If 
it were, then the default load balancer would be enough.

Tuning this is a pretty difficult challenge (if the defaults are deviated 
from); however, with the complexity comes a more powerful ability to move 
read/write/memory load.
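
To make the idea of per-metric coefficients concrete, here is a minimal sketch of a weighted cost sum, assuming each metric has already been normalized to a value between 0 and 1. The metric ordering, the numbers and the class name are hypothetical and are not taken from the attached patches.

```java
// Hypothetical cost model; the metrics and weights are illustrative, not the patch's.
final class WeightedCostDemo {

    // Combines normalized per-metric costs into one score using per-metric weights.
    static double totalCost(double[] costs, double[] weights) {
        if (costs.length != weights.length) {
            throw new IllegalArgumentException("costs and weights must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < costs.length; i++) {
            total += weights[i] * costs[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed order: region count skew, read request skew, write request skew.
        double[] costs = {0.5, 0.9, 0.8};
        // A 0 weight means the read and write metrics do not influence the score yet.
        double[] weights = {1.0, 0.0, 0.0};
        System.out.println(totalCost(costs, weights));
    }
}
```

Raising the read or write weight above 0 is what would let those metrics steer the balancer, which is where the tuning difficulty mentioned above comes in.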

> Add other load balancers
> 
>
> Key: HBASE-5959
> URL: https://issues.apache.org/jira/browse/HBASE-5959
> Project: HBase
>  Issue Type: New Feature
>  Components: master
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-5959-0.patch, HBASE-5959-1.patch, 
> HBASE-5959-2.patch, HBASE-5959-3.patch, HBASE-5959-6.patch, 
> HBASE-5959-7.patch, HBASE-5959-8.patch, HBASE-5959-9.patch, 
> HBASE-5959.D3189.1.patch, HBASE-5959.D3189.2.patch, HBASE-5959.D3189.3.patch, 
> HBASE-5959.D3189.4.patch, HBASE-5959.D3189.5.patch, HBASE-5959.D3189.6.patch, 
> HBASE-5959.D3189.7.patch
>
>
> Now that balancers are pluggable we should give some options.





[jira] [Commented] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284343#comment-13284343
 ] 

rajeshbabu commented on HBASE-6115:
---

This is not happening during master startup. But while processing SSH, after 
calling assign for root, there is no waiting mechanism to make sure the root 
assignment has completed.


> NullPointerException is thrown when root and meta table regions are assigning 
> to another RS.
> 
>
> Key: HBASE-6115
> URL: https://issues.apache.org/jira/browse/HBASE-6115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.94.1
>
>
> Let's suppose we have two region servers, RS1 and RS2.
> If the region server (RS1) hosting the root and meta regions goes down, the 
> master will assign them to another region server, RS2. At that time we 
> received a NullPointerException.
> {code}
> 2012-05-04 20:19:52,912 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,914 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f;
>  serverName=
> 2012-05-04 20:19:52,916 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception 
> running postOpenDeployTasks; region=1028785192
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
> at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
> at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
> at 
> org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing .META.,,1.1028785192: disabling compactions & flushes
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region .META.,,1.1028785192
> {code}





[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284327#comment-13284327
 ] 

ramkrishna.s.vasudevan commented on HBASE-5682:
---

@Lars
See HBASE-6115.  As we are not waiting for the root location to come up we get 
NPE now.

> Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 
> only)
> --
>
> Key: HBASE-5682
> URL: https://issues.apache.org/jira/browse/HBASE-5682
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5682-all-v2.txt, 5682-all-v3.txt, 5682-all-v4.txt, 
> 5682-all.txt, 5682-v2.txt, 5682.txt
>
>
> Just realized that without this, HBASE-4805 is broken.
> I.e., there's no point keeping a persistent HConnection around if it can be 
> rendered permanently unusable when the ZK connection is lost temporarily.
> Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to 
> backport).
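
To illustrate the concern in the description above, here is a minimal sketch of a long-lived handle that lazily replaces a lost session instead of staying broken. It is not the attached patch and does not use the HBase or ZooKeeper APIs; `SessionSketch`, `FlakySessionSketch`, and `RecoverableConnectionSketch` are hypothetical names introduced only for illustration.

```java
import java.util.function.Supplier;

/** Hypothetical session type; stands in for a coordination-service handle. */
interface SessionSketch {
    boolean isAlive();
}

/** Hypothetical session that can be expired by hand to simulate connection loss. */
final class FlakySessionSketch implements SessionSketch {
    private volatile boolean alive = true;

    void expire() {
        alive = false;
    }

    @Override
    public boolean isAlive() {
        return alive;
    }
}

/** Keeps a persistent handle usable by replacing a dead session on demand. */
final class RecoverableConnectionSketch {
    private final Supplier<SessionSketch> factory;
    private SessionSketch session;
    private int reconnects;

    RecoverableConnectionSketch(Supplier<SessionSketch> factory) {
        this.factory = factory;
        this.session = factory.get();
    }

    /** Returns a live session, creating a fresh one if the current one was lost. */
    synchronized SessionSketch session() {
        if (!session.isAlive()) {
            session = factory.get();
            reconnects++;
        }
        return session;
    }

    /** Number of times a lost session has been replaced so far. */
    synchronized int reconnects() {
        return reconnects;
    }
}

final class RecoverableConnectionDemo {
    public static void main(String[] args) {
        RecoverableConnectionSketch conn =
                new RecoverableConnectionSketch(FlakySessionSketch::new);
        FlakySessionSketch first = (FlakySessionSketch) conn.session();
        first.expire(); // simulate a temporary connection loss
        SessionSketch second = conn.session();
        System.out.println("replaced: " + (second != first)
                + ", reconnects: " + conn.reconnects());
    }
}
```

The design choice in this sketch is to recover on the next use rather than in a background thread, so callers holding the wrapper never see a permanently dead handle.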





[jira] [Commented] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284305#comment-13284305
 ] 

rajeshbabu commented on HBASE-6115:
---

This happens because we do not wait for the root region location to be set in 
the root region tracker.
{code}
  this.rootRegionTracker.waitRootRegionLocation(this.rpcTimeout);
{code}

This line is present in 0.90, 0.92, and trunk but is missing in 0.94.
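
As a rough illustration of why the wait matters, the sketch below contrasts a non-blocking read, which may see null while the region is still being reassigned, with a bounded wait. It is not the HBase implementation; `LocationTrackerSketch`, its methods, and the server name are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a location tracker; not the HBase RootRegionTracker. */
final class LocationTrackerSketch {
    private final CountDownLatch published = new CountDownLatch(1);
    private volatile String location; // null until a server publishes itself

    /** Called once the region has been assigned; releases all waiters. */
    void publish(String serverName) {
        location = serverName;
        published.countDown();
    }

    /** Non-blocking read: may return null while the region is still unassigned. */
    String peek() {
        return location;
    }

    /** Blocks up to timeoutMs for a location; returns null on timeout or interrupt. */
    String waitForLocation(long timeoutMs) {
        try {
            return published.await(timeoutMs, TimeUnit.MILLISECONDS) ? location : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        }
    }
}

final class RootLocationWaitDemo {
    public static void main(String[] args) throws InterruptedException {
        LocationTrackerSketch tracker = new LocationTrackerSketch();
        // Simulate the master assigning the region shortly after the lookup starts.
        Thread assigner = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            tracker.publish("rs2.example.org:60020");
        });
        assigner.start();
        System.out.println("peek without waiting: " + tracker.peek());
        System.out.println("after bounded wait: " + tracker.waitForLocation(5_000));
        assigner.join();
    }
}
```

A caller that uses `peek()` and dereferences the result right away can hit a null during reassignment, whereas the bounded wait either returns a location or reports a timeout that the caller still has to handle.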

> NullPointerException is thrown when root and meta table regions are assigning 
> to another RS.
> 
>
> Key: HBASE-6115
> URL: https://issues.apache.org/jira/browse/HBASE-6115
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.0
> Reporter: rajeshbabu
> Assignee: rajeshbabu
> Priority: Minor
> Fix For: 0.94.1
>
>
> Let's suppose we have two region servers, RS1 and RS2.
> If the region server (RS1) hosting the root and meta regions goes down, the 
> master assigns them to the other region server, RS2. At that point, we 
> received a NullPointerException.
> {code}
> 2012-05-04 20:19:52,912 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f; serverName=
> 2012-05-04 20:19:52,914 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f; serverName=
> 2012-05-04 20:19:52,916 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception running postOpenDeployTasks; region=1028785192
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
>     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing .META.,,1.1028785192: disabling compactions & flushes
> 2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region .META.,,1.1028785192
> {code}





[jira] [Created] (HBASE-6115) NullPointerException is thrown when root and meta table regions are assigning to another RS.

2012-05-28 Thread rajeshbabu (JIRA)
rajeshbabu created HBASE-6115:
-

Summary: NullPointerException is thrown when root and meta table regions are assigning to another RS.
Key: HBASE-6115
URL: https://issues.apache.org/jira/browse/HBASE-6115
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: rajeshbabu
Assignee: rajeshbabu
Priority: Minor
Fix For: 0.94.1


Let's suppose we have two region servers, RS1 and RS2.
If the region server (RS1) hosting the root and meta regions goes down, the 
master assigns them to the other region server, RS2. At that point, we 
received a NullPointerException.
{code}
2012-05-04 20:19:52,912 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f; serverName=
2012-05-04 20:19:52,914 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@25de152f; serverName=
2012-05-04 20:19:52,916 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception running postOpenDeployTasks; region=1028785192
java.lang.NullPointerException
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1483)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
    at org.apache.hadoop.hbase.catalog.MetaEditor.put(MetaEditor.java:98)
    at org.apache.hadoop.hbase.catalog.MetaEditor.putToCatalogTable(MetaEditor.java:88)
    at org.apache.hadoop.hbase.catalog.MetaEditor.updateLocation(MetaEditor.java:259)
    at org.apache.hadoop.hbase.catalog.MetaEditor.updateMetaLocation(MetaEditor.java:221)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1625)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:241)
2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing .META.,,1.1028785192: disabling compactions & flushes
2012-05-04 20:19:52,920 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region .META.,,1.1028785192
{code}
