[jira] [Commented] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-04-17 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256249#comment-13256249
 ] 

Anoop Sam John commented on HBASE-5798:
---

I now have a doubt about my patch.
Should we track the skipped regions, or the regions included in the first run?


> NPE running hbck on 0.94 out of reportTablesInFlux
> --
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.96.0
>Reporter: stack
>Assignee: Anoop Sam John
> Attachments: HBASE-5798_94.patch, HBASE-5798_trunk.patch
>
>
> Got this playing w/ hbck going against the 0.94RC:
> {code}
> 12/04/16 17:03:14 INFO util.HBaseFsck: getHTableDescriptors == tableNames => []
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:553)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:344)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:380)
>     at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3033)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-04-16 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254879#comment-13254879
 ] 

Anoop Sam John commented on HBASE-5798:
---

Jon, I can provide a patch tomorrow addressing both of the points I 
mentioned, if that is OK with you.


> NPE running hbck on 0.94 out of reportTablesInFlux
> --
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.96.0
>Reporter: stack
>Assignee: Jonathan Hsieh
>





[jira] [Commented] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-04-16 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254852#comment-13254852
 ] 

Anoop Sam John commented on HBASE-5798:
---

@Jon
Yes, I also don't like adding a null check... :)
Also, what about point 2: when HBCK reruns after the fix, can we set timelag = 0?


> NPE running hbck on 0.94 out of reportTablesInFlux
> --
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Jonathan Hsieh
>





[jira] [Commented] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-04-16 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254837#comment-13254837
 ] 

Anoop Sam John commented on HBASE-5798:
---

@Ram, yes, this is the same issue. I found the reason.
The scenario in our test is as follows.
There is one table, and one region of that table was not assigned to any RS. 
The HBCK tool fixes this issue, and after that HBCK runs again.
At this point getHTableDescriptors() does not find any table in the cluster 
and returns null, so in reportTablesInFlux() the call 
errors.print("Number of Tables: " + allTables.length); throws an NPE.

Why does getHTableDescriptors() return no tables at this point, even though 
one table exists in the cluster? Because this table was modified recently: HBCK 
just changed the HRegionInfo of the table's region by assigning it to one 
of the RS.

For the fix:
1. I think we need a null check in reportTablesInFlux().
2. When HBCK reruns after the fix, can we set timelag = 0?
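
The two proposals above can be illustrated with a small standalone sketch. It does not use the real HBaseFsck code: the class TablesInFluxSketch and the methods stableTables() and report() are hypothetical names, and the timelag rule is simplified to "a table counts only if its last modification is at least timelag milliseconds old". It shows how a recently modified table drops out of the result, how timelag = 0 brings it back, and how the report line can tolerate a null array.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TablesInFluxSketch {
    /**
     * Hypothetical stand-in for the table lookup: keeps only tables whose last
     * modification is at least timelagMs old. Returns an empty array, never null.
     */
    static String[] stableTables(Map<String, Long> lastModifiedMs, long nowMs, long timelagMs) {
        List<String> stable = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastModifiedMs.entrySet()) {
            if (nowMs - e.getValue() >= timelagMs) {
                stable.add(e.getKey());
            }
        }
        return stable.toArray(new String[0]);
    }

    /** Null-safe version of the report line (proposal 1). */
    static String report(String[] allTables) {
        int count = (allTables == null) ? 0 : allTables.length;
        return "Number of Tables: " + count;
    }

    public static void main(String[] args) {
        Map<String, Long> lastModified = new LinkedHashMap<>();
        lastModified.put("t1", 9_000L); // modified one second before "now"
        long now = 10_000L;
        // With a 60 s timelag the freshly repaired table is treated as in flux.
        System.out.println(report(stableTables(lastModified, now, 60_000L)));
        // With timelag = 0 (proposal 2) the table is included again.
        System.out.println(report(stableTables(lastModified, now, 0L)));
        // A null array no longer causes an NPE.
        System.out.println(report(null));
    }
}
```

Returning an empty array from the lookup would make the null check unnecessary, which may be closer to what Jon prefers; both variants are shown only for comparison.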

> NPE running hbck on 0.94 out of reportTablesInFlux
> --
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Jonathan Hsieh
>





[jira] [Commented] (HBASE-5635) If getTaskList() returns null splitlogWorker is down. It wont serve any requests.

2012-04-15 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254535#comment-13254535
 ] 

Anoop Sam John commented on HBASE-5635:
---

With the patch, the variable below is no longer of any use:
{code}
this.zkretries = conf.getLong("hbase.splitlog.zk.retries", 3);
{code}


> If getTaskList() returns null splitlogWorker is down. It wont serve any 
> requests. 
> --
>
> Key: HBASE-5635
> URL: https://issues.apache.org/jira/browse/HBASE-5635
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.1
>Reporter: Kristam Subba Swathi
> Attachments: HBASE-5635.1.patch, HBASE-5635.2.patch, HBASE-5635.patch
>
>
> During the hlog split operation, if all the ZooKeepers are down, the 
> paths will be returned as null and the split worker thread will exit.
> This region server will then not be able to acquire any other tasks, since the 
> split worker thread has exited.
> Please find the attached code for more details:
> {code}
> private List getTaskList() {
> for (int i = 0; i < zkretries; i++) {
>   try {
> return (ZKUtil.listChildrenAndWatchForNewChildren(this.watcher,
> this.watcher.splitLogZNode));
>   } catch (KeeperException e) {
> LOG.warn("Could not get children of znode " +
> this.watcher.splitLogZNode, e);
> try {
>   Thread.sleep(1000);
> } catch (InterruptedException e1) {
>   LOG.warn("Interrupted while trying to get task list ...", e1);
>   Thread.currentThread().interrupt();
>   return null;
> }
>   }
> }
> {code}
> in the org.apache.hadoop.hbase.regionserver.SplitLogWorker 
>  
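
One reading of the comment above is that the patched loop no longer bounds its retries, which would explain why zkretries becomes unused. The following is a minimal standalone sketch of such a loop, not the actual patch: the class RetryUntilInterruptedSketch and the fetch parameter are hypothetical, with fetch standing in for the ZooKeeper children lookup. The method returns null only when the thread is interrupted, so a temporary outage does not end the worker.

```java
import java.util.List;
import java.util.concurrent.Callable;

final class RetryUntilInterruptedSketch {
    /**
     * Keeps calling fetch until it succeeds. Returns null only if the calling
     * thread is interrupted. fetch is a hypothetical stand-in for the lookup
     * that throws while ZooKeeper is unreachable.
     */
    static List<String> getTaskList(Callable<List<String>> fetch, long sleepMs) {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                return fetch.call();
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return null;
            } catch (Exception e) {
                // Transient failure: wait a little, then try again.
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }
}
```

Callers still have to handle a null result, but in this sketch null means "shutting down" rather than "ZooKeeper was unavailable for a few seconds".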





[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-15 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254283#comment-13254283
 ] 

Anoop Sam John commented on HBASE-5781:
---

Yes, I think the finally code can be removed. [HBCK is a short-lived 
process anyway.] Thanks, Jon, for the patch. Sorry I could not check it last night.
@Lars, we are eagerly waiting for good news from you regarding the 0.94 release. :)

> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: hbase-5781.patch
>
>
> After running hbck in the cluster, it was found that one region was not 
> assigned.
> So hbck -fix was used to fix this,
> but the assignment didn't happen since the ZooKeeper session was closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNo

[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254164#comment-13254164
 ] 

Anoop Sam John commented on HBASE-5781:
---

@Jon
Yes, removing this close in the finally block lets the test run and fix the 
issues that HBCK finds. :)

I will take a detailed look at the code on Monday so that we can close the 
issue. Or, if you are providing a patch, that is fine too. ;)

> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>
> After running hbck in the cluster, it was found that one region was not 
> assigned.
> So hbck -fix was used to fix this,
> but the assignment didn't happen since the ZooKeeper session was closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)

[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254146#comment-13254146
 ] 

Anoop Sam John commented on HBASE-5781:
---

In HBaseFsckRepair.waitUntilAssigned:
{code}
finally {
  try {
    connection.close();
  } catch (IOException ioe) {
    throw ioe;
  }
}
{code}

As per my observation, this close caused the exception. This method is 
called in assignmentRepair, regionConsistencyRepair and metaRepair.

If the HBCK fix needs to repair all of these kinds of issues, or at least two 
of them, the close happens before the other fixes run. I have not done a 
detailed check; this is just an observation from the logs.

Meanwhile, I tested the current RC for the 0.94 version.
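
The effect described above can be reproduced in isolation. The sketch below does not use HBase classes: FakeConnection, repairAndClose(), repair() and repairAll() are hypothetical names, and the assumption is simply that several repairs share one connection. When the helper closes the connection in its finally block, only the first repair succeeds; when the caller keeps the connection open, all of them do.

```java
import java.io.Closeable;
import java.io.IOException;

final class SharedConnectionSketch {
    /** Hypothetical stand-in for a shared cluster connection; not an HBase class. */
    static final class FakeConnection implements Closeable {
        private boolean closed = false;

        void assign(String region) throws IOException {
            if (closed) {
                throw new IOException("session closed, cannot assign " + region);
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Problematic pattern: the helper closes a connection it does not own. */
    static void repairAndClose(FakeConnection conn, String region) throws IOException {
        try {
            conn.assign(region);
        } finally {
            conn.close();
        }
    }

    /** Alternative: the helper leaves the connection open for later repairs. */
    static void repair(FakeConnection conn, String region) throws IOException {
        conn.assign(region);
    }

    /** Runs one repair per region on the same connection and counts successes. */
    static int repairAll(FakeConnection conn, String[] regions, boolean closeInHelper) {
        int fixed = 0;
        for (String region : regions) {
            try {
                if (closeInHelper) {
                    repairAndClose(conn, region);
                } else {
                    repair(conn, region);
                }
                fixed++;
            } catch (IOException e) {
                // Later repairs fail once the shared connection has been closed.
            }
        }
        return fixed;
    }
}
```

In this sketch the caller that created the connection would close it once, after the last repair.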


> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>
> After running hbck in the cluster, it was found that one region was not 
> assigned.
> So hbck -fix was used to fix this,
> but the assignment didn't happen since the ZooKeeper session was closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTrac

[jira] [Commented] (HBASE-5360) [uberhbck] Add options for how to handle offline split parents.

2012-04-12 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253154#comment-13253154
 ] 

Anoop Sam John commented on HBASE-5360:
---

@Jon
Is HBASE-5719 handling this issue?

> [uberhbck] Add options for how to handle offline split parents. 
> 
>
> Key: HBASE-5360
> URL: https://issues.apache.org/jira/browse/HBASE-5360
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.1, 0.94.0
>Reporter: Jonathan Hsieh
>
> In a recent case, we attempted to repair a cluster that suffered from 
> HBASE-4238 that had about 6-7 generations of "leftover" split data.  The hbck 
> repair options in a development version of HBASE-5128 treat HDFS as ground 
> truth but didn't check SPLIT and OFFLINE flags only found in meta.  The net 
> effect was that it essentially attempted to merge many regions back into its 
> eldest generation's parent's range.  
> More safeguards to prevent "mega-merges" are being added on HBASE-5128.
> This issue would automate the handling of the "mega-merge" avoiding cases 
> such as "lingering grandparents".  The strategy here would be to add more 
> checks against .META., and perform part of the catalog janitor's 
> responsibilities for lingering grandparents.  This would potentially include 
> options to sideline regions, deleting grandparent regions, min size for 
> sidelining, and mechanisms for cleaning .META..  
> Note: There already exists a mechanism to reload these regions -- the bulk 
> load mechanisms in LoadIncrementalHFiles can be used to re-add grandparents 
> (automatically splitting them if necessary) to HBase.





[jira] [Commented] (HBASE-5488) Fixed OfflineMetaRepair bug

2012-04-12 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253136#comment-13253136
 ] 

Anoop Sam John commented on HBASE-5488:
---

Is this fix needed for 0.94 also?

> Fixed OfflineMetaRepair bug 
> 
>
> Key: HBASE-5488
> URL: https://issues.apache.org/jira/browse/HBASE-5488
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.2
>
> Attachments: HBASE-5488-branch92.patch, HBASE-5488-trunk.patch, 
> HBASE-5488_branch90.txt
>
>
> I want to use the "OfflineMetaRepair" tool and found that nobody has fixed 
> this bug. I will make a patch.
> > 12/01/05 23:23:30 ERROR util.HBaseFsck: Bailed out due to:
> > java.lang.IllegalArgumentException: Wrong FS: hdfs:// 
> > us01-ciqps1-name01.carrieriq.com:9000/hbase/M2M-INTEGRATION-MM_TION-13
> > 25190318714/0003d2ede27668737e192d8430dbe5d0/.regioninfo,
> > expected: file:///
> >at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:352)
> >at
> > org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
> >at
> > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:368)
> >at
> > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> >at
> > org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:126)
> >at
> > org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:284)
> >at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:398)
> >at
> > org.apache.hadoop.hbase.util.HBaseFsck.loadMetaEntry(HBaseFsck.java:256)
> >at
> > org.apache.hadoop.hbase.util.HBaseFsck.loadTableInfo(HBaseFsck.java:284)
> >at
> > org.apache.hadoop.hbase.util.HBaseFsck.rebuildMeta(HBaseFsck.java:402)
> >at
> > org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair.main(OfflineMetaRe





[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]

2012-04-12 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253133#comment-13253133
 ] 

Anoop Sam John commented on HBASE-4379:
---

@Jon
From the code, I think this case is not covered in the new HBCK either. As per 
the patch, it will report an error. Just as we fix 
FIRST_REGION_STARTKEY_NOT_EMPTY, can we fix this case also?

> [hbck] Does not complain about tables with no end region [Z,]
> -
>
> Key: HBASE-4379
> URL: https://issues.apache.org/jira/browse/HBASE-4379
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch, 
> hbase-4379.v2.patch
>
>
> hbck does not detect or have an error condition when the last region of a 
> table is missing (end key != '').
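
The condition in the description can be written down as a small check. This is a standalone sketch, not HBCK code: the Region class and the method lastRegionEndKeyIsEmpty() are hypothetical, and the regions are assumed to be already sorted by start key. A table whose last region ends at a non-empty key, such as [M,Z), is missing its final region [Z,).

```java
import java.util.Arrays;
import java.util.List;

final class LastRegionEndKeySketch {
    /** Minimal stand-in for region boundaries; not the HBase HRegionInfo class. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * Returns true when the last region (input sorted by start key) has an
     * empty end key, meaning the region chain is closed at the end.
     * An empty list is treated as not closed.
     */
    static boolean lastRegionEndKeyIsEmpty(List<Region> sortedRegions) {
        if (sortedRegions.isEmpty()) {
            return false;
        }
        Region last = sortedRegions.get(sortedRegions.size() - 1);
        return last.endKey.isEmpty();
    }

    public static void main(String[] args) {
        List<Region> missingTail = Arrays.asList(new Region("", "M"), new Region("M", "Z"));
        List<Region> complete = Arrays.asList(new Region("", "M"), new Region("M", ""));
        System.out.println(lastRegionEndKeyIsEmpty(missingTail)); // last region [Z,) is missing
        System.out.println(lastRegionEndKeyIsEmpty(complete));
    }
}
```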





[jira] [Commented] (HBASE-4094) improve hbck tool to fix more hbase problem

2012-04-12 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253125#comment-13253125
 ] 

Anoop Sam John commented on HBASE-4094:
---

This may be closed as a duplicate, since HBASE-5128 now handles these valid scenarios.

{quote}
14.ERROR_CODE.LAST_REGION_ENDKEY_NOT_EMPTY(new add)--->treat it as a hole 
problem(ERROR_CODE.HOLE_IN_REGION_CHAIN). 
{quote}
This check is not there now, but there is another issue, HBASE-4379, for it.

> improve hbck tool to fix more hbase problem
> ---
>
> Key: HBASE-4094
> URL: https://issues.apache.org/jira/browse/HBASE-4094
> Project: HBase
>  Issue Type: New Feature
>  Components: master
>Affects Versions: 0.90.3
>Reporter: feng xu
> Fix For: 0.90.7
>
> Attachments: HbaseFsck_TableChain.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>






[jira] [Commented] (HBASE-3505) hbck should be able to fix case where region is missing from META but on FS

2012-04-12 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252492#comment-13252492
 ] 

Anoop Sam John commented on HBASE-3505:
---

Uber HBCK handles this case now, right? Can this issue be closed?

> hbck should be able to fix case where region is missing from META but on FS
> ---
>
> Key: HBASE-3505
> URL: https://issues.apache.org/jira/browse/HBASE-3505
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Attachments: hbase-3505.txt
>
>






[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row

2012-03-29 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242048#comment-13242048
 ] 

Anoop Sam John commented on HBASE-5664:
---

Previously there was no point in providing a CP hook, as no seek was available 
in RegionScanner. Now with HBASE-5520, reseek() is provided in RegionScanner, 
and the CP hook can use it to fast forward.
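
To make the fast forward idea concrete, here is a rough, self-contained sketch. It does not use the HBase API: the sorted set of row keys stands in for the region, and `hintHook` is a hypothetical stand-in for the proposed CP hook that, for a filtered-out row, may return a row key to jump to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class FastForwardSketch {
    /** Number of rows the last scan() call actually looked at. */
    static int rowsVisited;

    /**
     * Scans the rows in order. When the filter rejects a row, the hook is
     * asked for a hint; a hint that lies ahead of the current row is used
     * to jump forward instead of stepping to the very next row.
     */
    static List<String> scan(NavigableSet<String> rows,
                             Predicate<String> filter,
                             UnaryOperator<String> hintHook) {
        List<String> results = new ArrayList<>();
        rowsVisited = 0;
        String current = rows.isEmpty() ? null : rows.first();
        while (current != null) {
            rowsVisited++;
            if (filter.test(current)) {
                results.add(current);
                current = rows.higher(current);
                continue;
            }
            String hint = hintHook.apply(current);
            if (hint != null && hint.compareTo(current) > 0) {
                current = rows.ceiling(hint);   // fast forward to the hinted row
            } else {
                current = rows.higher(current); // no usable hint: plain next row
            }
        }
        return results;
    }
}
```

With rows a1, a2, a3, b1, b2, c1, a filter that rejects rows starting with "a", and a hook that always hints "b", the scan looks at a1 and then jumps straight to b1, so a2 and a3 are never read.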

> CP hooks in Scan flow for fast forward when filter filters out a row
> 
>
> Key: HBASE-5664
> URL: https://issues.apache.org/jira/browse/HBASE-5664
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.1
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.96.0
>
>
> In HRegion.nextInternal(int limit, String metric)
>   We have while(true) loop so as to fetch a next result which satisfies 
> filter condition. When Filter filters out the current fetched row we call 
> nextRow(byte [] currentRow) before going with the next row.
> {code}
> if (results.isEmpty() || filterRow()) {
> // this seems like a redundant step - we already consumed the row
> // there're no left overs.
> // the reasons for calling this method are:
> // 1. reset the filters.
> // 2. provide a hook to fast forward the row (used by subclasses)
> nextRow(currentRow);
> {code}
> // 2. provide a hook to fast forward the row (used by subclasses)
> We can provide same feature of fast forward support for the CP also.





[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row

2012-03-29 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242047#comment-13242047
 ] 

Anoop Sam John commented on HBASE-5664:
---

Previously, seeking was not possible on the RegionScanner. Now with HBASE-5520, 
a CP can fast forward with reseek() on the RegionScanner.
Use case of this new hook


> CP hooks in Scan flow for fast forward when filter filters out a row
> 
>
> Key: HBASE-5664
> URL: https://issues.apache.org/jira/browse/HBASE-5664
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.1
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.96.0
>
>
> In HRegion.nextInternal(int limit, String metric)
>   We have while(true) loop so as to fetch a next result which satisfies 
> filter condition. When Filter filters out the current fetched row we call 
> nextRow(byte [] currentRow) before going with the next row.
> {code}
> if (results.isEmpty() || filterRow()) {
> // this seems like a redundant step - we already consumed the row
> // there're no left overs.
> // the reasons for calling this method are:
> // 1. reset the filters.
> // 2. provide a hook to fast forward the row (used by subclasses)
> nextRow(currentRow);
> {code}
> // 2. provide a hook to fast forward the row (used by subclasses)
> We can provide same feature of fast forward support for the CP also.





[jira] [Commented] (HBASE-5617) Provide coprocessor hooks in put flow while rollbackMemstore.

2012-03-28 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240981#comment-13240981
 ] 

Anoop Sam John commented on HBASE-5617:
---

{quote}
Do we need two hooks for memstore rollback ?

{quote}
+1 for both pre and post hooks. Depending on the use case, one can use either 
the pre or the post hook.

> Provide coprocessor hooks in put flow while rollbackMemstore.
> -
>
> Key: HBASE-5617
> URL: https://issues.apache.org/jira/browse/HBASE-5617
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-5617_1.patch
>
>
> With coprocessor hooks in the put flow, we have the provision to create new 
> puts to other tables or regions.  These puts can be done with writeToWal set 
> to false.
> In 0.94 and above the puts are first written to the memstore and then to the 
> WAL.  If the WAL append or sync fails, the memstore is rolled back.  
> Now the problem is that if the put in the main flow fails, there 
> is no way to roll back the 
> puts that happened in the prePut.
> We can add coprocessor hooks like pre/postRollBackMemStore.  Is any one 
> hook enough here?





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-23 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236441#comment-13236441
 ] 

Anoop Sam John commented on HBASE-5564:
---

{quote}
In bulkload, if multiple records are having same timestamp, then the last KV 
entry processed by reducer only will be persisted (TreeSet in Reducer)
{quote}

It is the 1st KV processed by the Reducer that is kept, right?

Yes, I agree with you: it might not be possible to predict on the reducer side 
which one is the latest.

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Fix For: 0.96.0
>
> Attachments: 5564.lint, HBASE-5564_trunk.1.patch, 
> HBASE-5564_trunk.1.patch, HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-20 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233417#comment-13233417
 ] 

Anoop Sam John commented on HBASE-5564:
---

Comment from Jesse Yates
{quote}
The question is, if you have a TSV file with the same row key, which value 
should be considered the most recent version? Should any of them - maybe that 
is actually a problem and we want to have a warning/error when that occurs?
{quote}

Do we need to handle this? The issue is the TreeSet used by PutSortReducer and 
KeyValueSortReducer, as mentioned by Laxman. 
In normal data insertion using Puts, all the duplicate values go into the 
memstore (and finally into HFiles), and during a scan the last entered one is 
retrieved. In the bulk load case only the 1st entry gets inserted, because the 
data structure (the TreeSet) drops the duplicates. Is this a behaviour 
mismatch?  That depends on which entry in the TSV file is to be considered the 
most recent version. It is a mismatch if we say that the last entry in the 
file is the most recent version.
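
A small standalone illustration of the TreeSet behaviour described above. This is only a sketch: `Kv` is a made-up stand-in for a KeyValue, compared on row and timestamp only, and it is not the reducer code itself. When an element compares equal to one already in the set, `TreeSet.add()` returns false and the set keeps the element that was added first.

```java
import java.util.Comparator;
import java.util.TreeSet;

class DuplicateKvSketch {
    /** Hypothetical stand-in for a KeyValue; only row and ts take part in ordering. */
    static final class Kv {
        final String row;
        final long ts;
        final String value;

        Kv(String row, long ts, String value) {
            this.row = row;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Orders by row, then timestamp; the value is deliberately ignored. */
    static final Comparator<Kv> KEY_ORDER =
        Comparator.comparing((Kv kv) -> kv.row).thenComparingLong(kv -> kv.ts);

    /** Adds entries with an identical key in arrival order and returns the survivor's value. */
    static String survivingValue(String... valuesInArrivalOrder) {
        TreeSet<Kv> sorted = new TreeSet<>(KEY_ORDER);
        for (String v : valuesInArrivalOrder) {
            // For the 2nd and later entries add() returns false: the set is unchanged.
            sorted.add(new Kv("row1", 100L, v));
        }
        return sorted.first().value;
    }
}
```

Calling `survivingValue("first", "second")` gives back "first", which is the opposite of what a reader would see after two Puts with the same key.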


> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, 
> HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233238#comment-13233238
 ] 

Anoop Sam John commented on HBASE-5564:
---

@Laxman
ImportTsv
{code}
+// If timestamp option is not specified, use current system time.
+long timstamp = conf.getLong(TIMESTAMP_CONF_KEY, System.currentTimeMillis());
+
+// Set it back to replace invalid timestamp (non-numeric) with current system time
+conf.setLong(TIMESTAMP_CONF_KEY, timstamp);
{code}

Doing this will use the same timestamp across all the mappers. Is that the 
intention of this change? With it, conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0) 
in TsvImporterMapper will always find a value in conf.
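
To show the effect I mean, here is a minimal sketch with a plain Map standing in for the job configuration (this is not Hadoop's Configuration class, and the key string and method names are only illustrative). Once the driver writes the resolved timestamp back, the mapper-side default of 0 is never used.

```java
import java.util.Map;

class TimestampConfSketch {
    // Illustrative key name for this sketch only.
    static final String TIMESTAMP_CONF_KEY = "importtsv.timestamp";

    /** Map-based stand-in for a getLong(key, default) lookup. */
    static long getLong(Map<String, String> conf, String key, long defaultValue) {
        String raw = conf.get(key);
        return raw == null ? defaultValue : Long.parseLong(raw);
    }

    /** Driver side: resolve the timestamp once and write it back into the conf. */
    static long resolveInDriver(Map<String, String> conf, long now) {
        long timestamp = getLong(conf, TIMESTAMP_CONF_KEY, now);
        conf.put(TIMESTAMP_CONF_KEY, Long.toString(timestamp));
        return timestamp;
    }

    /** Mapper side: the default 0 is only seen if the driver never set the key. */
    static long readInMapper(Map<String, String> conf) {
        return getLong(conf, TIMESTAMP_CONF_KEY, 0L);
    }
}
```

Every mapper that receives a copy of this conf reads the same value, so all KVs from all splits carry one timestamp.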

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner

2012-03-16 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231843#comment-13231843
 ] 

Anoop Sam John commented on HBASE-5520:
---

@Lars
The coprocessor preScannerNext() method is called from the 
HRegionServer:
{code}
// Call coprocessor. Get region info from scanner.
HRegion region = getRegion(s.getRegionInfo().getRegionName());
if (region != null && region.getCoprocessorHost() != null) {
  Boolean bypass = region.getCoprocessorHost().preScannerNext(s,
      results, nbRows);
{code}

So by the time the coprocessor calls this seek, no region operation will have 
been started yet. Please correct me if I am wrong.


> Support reseek() at RegionScanner
> -
>
> Key: HBASE-5520
> URL: https://issues.apache.org/jira/browse/HBASE-5520
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, 
> HBASE-5520_3.patch
>
>
> reseek() is not supported currently at the RegionScanner level. We can 
> support the same.
> This is created following the discussion under HBASE-2038





[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner

2012-03-15 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230907#comment-13230907
 ] 

Anoop Sam John commented on HBASE-5520:
---

@Lars
Thanks for your thoughts and suggestion.
Yes, we felt that this would be a nice feature for coprocessor users.

Now we support seeking to the begin boundary of a row.
Would it be nice to also support seeking to the end of a given row?

The coprocessor may not know the exact row ID to seek to, but it knows to skip 
up to some row ID. In that case, allowing a seek to the end boundary of a row 
would be useful. What do you say?

We don't have this use case ourselves now, but it might be useful for some 
other coprocessor users.
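
The difference between the two kinds of seek can be sketched on a plain sorted set of row keys. This is only an illustration with made-up method names, not the RegionScanner API: seeking to the begin boundary lands on the given row if it exists, while seeking to the end boundary lands on the first row after it.

```java
import java.util.TreeSet;

class RowBoundarySketch {
    /** Begin boundary: first row at or after the given row key (null if none). */
    static String seekToRowStart(TreeSet<String> rows, String row) {
        return rows.ceiling(row);
    }

    /** End boundary: first row strictly after the given row key (null if none). */
    static String seekPastRow(TreeSet<String> rows, String row) {
        return rows.higher(row);
    }
}
```

With rows r1, r3 and r5, `seekToRowStart(rows, "r3")` returns r3 while `seekPastRow(rows, "r3")` returns r5. For a key that is not present, such as r2, both return r3.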

> Support reseek() at RegionScanner
> -
>
> Key: HBASE-5520
> URL: https://issues.apache.org/jira/browse/HBASE-5520
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, 
> HBASE-5520_3.patch
>
>
> reseek() is not supported currently at the RegionScanner level. We can 
> support the same.
> This is created following the discussion under HBASE-2038





[jira] [Commented] (HBASE-5584) Coprocessor hooks can be called in the respective handlers

2012-03-15 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230890#comment-13230890
 ] 

Anoop Sam John commented on HBASE-5584:
---

+1 on additional hooks.
Different use cases can be supported. Users can pick whichever hook 
they need  :)
Thanks Andrew

> Coprocessor hooks can be called in the respective handlers
> --
>
> Key: HBASE-5584
> URL: https://issues.apache.org/jira/browse/HBASE-5584
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
>
> Following points can be changed w.r.t to coprocessors
> -> Call preCreate, postCreate, preEnable, postEnable, etc. in their 
> respective handlers
> -> Currently it is called in the HMaster thus making the postApis async w.r.t 
> the handlers
> -> Similar is the case with the balancer.
> with current behaviour once we are in the postEnable(for eg) we any way need 
> to wait for the main enable handler to 
> be completed.
> We should ensure that we dont wait in the main thread so again we need to 
> spawn a thread and wait on that.
> On the other hand if the pre and post api is called on the handlers then only 
> that handler thread will be
> used in the pre/post apis
> If the above said plan is ok i can prepare a patch for all such related 
> changes.





[jira] [Commented] (HBASE-5517) Region Server Coprocessor : Suggestion for change when next() call with nbRows>1

2012-03-05 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222409#comment-13222409
 ] 

Anoop Sam John commented on HBASE-5517:
---

In HRegionServer.next(final long scannerId, int nbRows):
{code}
for (int i = 0; i < nbRows
    && currentScanResultSize < maxScannerResultSize; i++) {
  requestCount.incrementAndGet();
{code}
Here, if next() is called with nbRows=10, we count it as 10 requests coming to 
the RS, i.e. as 10 different operations on the RS. In that case, wouldn't it be 
better to contact the CP 10 times rather than once?  Correct me if I am wrong... :)
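
Roughly what I have in mind, as a standalone sketch. `RowHook` and its methods are hypothetical names and not the existing coprocessor interface, an Iterator stands in for the region scanner, and the result size limit from the real loop is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PerRowHookSketch {
    /** Hypothetical per-row observer; not an HBase interface. */
    interface RowHook {
        void preRow(int index);

        void postRow(int index, String row);
    }

    static final AtomicInteger requestCount = new AtomicInteger();

    /** Fetches up to nbRows rows and calls the hook around every single row. */
    static List<String> next(Iterator<String> scanner, int nbRows, RowHook hook) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < nbRows && scanner.hasNext(); i++) {
            requestCount.incrementAndGet(); // each row is counted as one request
            hook.preRow(i);                 // chance to act (e.g. reseek) before the row
            String row = scanner.next();
            results.add(row);
            hook.postRow(i, row);           // chance to act after the row
        }
        return results;
    }
}
```

This way the number of hook invocations matches the number of requests counted, instead of one pre/post pair for the whole batch.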

> Region Server Coprocessor : Suggestion for change when next() call with 
> nbRows>1
> 
>
> Key: HBASE-5517
> URL: https://issues.apache.org/jira/browse/HBASE-5517
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>
> Originated from the discussion under HBASE-2038 [Coprocessor based IHBase]
> Currently preNext() and postNext() will be called once for a next() call into 
> HRegionServer.
> But if the next() is being called with nbRows>1, co processor should provide 
> a chance to do some operation before, after every next() calls into region as 
> part of call next(int scannerId, int nbRows).
> In case of usage of coprocessor with IHBase, before making any calls of 
> next() into a Region, we need to make a reseek() to a row based on the index 
> information.





[jira] [Commented] (HBASE-5520) Support seek() reseek() at RegionScanner

2012-03-05 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222341#comment-13222341
 ] 

Anoop Sam John commented on HBASE-5520:
---

We can support seek() and reseek() only at the row boundary level.
We can take any of the below approaches:
1. The APIs use only the rowkey and timestamp from the KeyValue passed.
2. Check at the RegionScannerImpl level that the passed KV does not carry a CF 
or qualifier. If it does, throw an exception. The KV may carry only the rowkey 
and the timestamp. [It is ok. Timestamp can be there...]
3. Don't bother; let the seek happen. But this may be dangerous?
4. We can give seek() and reseek() at the RegionScanner the signatures 
seek( byte[] rowKey ) and reseek( byte[] rowKey ), so that the seek always goes 
to the first KV of the row in every CF. [ if the CF contains that key ]
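
Approaches 1 and 2 could look something like the sketch below. `Key` is a made-up minimal stand-in for the key part of a KeyValue, and the method names are assumptions for illustration, not RegionScannerImpl code. Approach 4 avoids the question entirely, since the caller can only pass a row key.

```java
class RowBoundarySeekSketch {
    /** Hypothetical stand-in for a KeyValue key; not HBase's KeyValue class. */
    static final class Key {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;

        Key(byte[] row, byte[] family, byte[] qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /** Approach 1: use only the row from the key and ignore family and qualifier. */
    static byte[] rowIgnoringRest(Key key) {
        return key.row;
    }

    /** Approach 2: reject any key that carries a family or qualifier. */
    static byte[] rowOnlyOrThrow(Key key) {
        if (key.family.length != 0 || key.qualifier.length != 0) {
            throw new IllegalArgumentException(
                "seek is supported only at row boundaries");
        }
        return key.row;
    }
}
```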

> Support seek() reseek() at RegionScanner
> 
>
> Key: HBASE-5520
> URL: https://issues.apache.org/jira/browse/HBASE-5520
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>
> seek() reseek() is not supported currently at the RegionScanner level. We can 
> support the same.
> This is created following the discussion under HBASE-2038





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-05 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222340#comment-13222340
 ] 

Anoop Sam John commented on HBASE-2038:
---

Or maybe we can give seek() and reseek() at the RegionScanner the signatures 
seek( byte[] rowKey ) and reseek( byte[] rowKey )?
Then the seek would always go to the first KV of the row in every CF. [ if the 
CF contains that key ]

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as coprocessor. It also serve as a 
> good goalpost for coprocessor environment design -- there should be enough of 
> it so region level indexing can be reimplemented as a coprocessor without any 
> loss of functionality. 





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-05 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222300#comment-13222300
 ] 

Anoop Sam John commented on HBASE-2038:
---

@Lars
I have created HBASE-5520 for the support of seek() and reseek() at the 
RegionScanner.  As I mentioned in the comment, we need row boundary seeks only. 
Yes, it might be complex with respect to the other kinds of seeks. Can we 
support seek() and reseek() only at the row boundary level at the RegionScanner?
We can take any of the below approaches:
1. The APIs use only the rowkey and timestamp from the KeyValue passed.
2. Check at the RegionScannerImpl level that the passed KV does not carry a CF 
or qualifier. If it does, throw an exception.  The KV may carry only the rowkey 
and the timestamp. [It is ok. Timestamp can be there...]
3. Don't bother; let the seek happen. But this may be dangerous?

Please give your valuable suggestions.

Ram and I have started working on this.


From the coprocessor preNext() we can call reseek with 
KeyValue.createFirstOnRow(final byte [] row)

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as coprocessor. It also serve as a 
> good goalpost for coprocessor environment design -- there should be enough of 
> it so region level indexing can be reimplemented as a coprocessor without any 
> loss of functionality. 





[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-05 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1390#comment-1390
 ] 

Anoop Sam John commented on HBASE-5512:
---

@Lars
  As you said under HBASE-2038, I don't think we can use this approach for the 
IHBase work :(  The main question is how the filter can return 
INCLUDE_AND_SEEK.  What we ideally wanted is to say, at the row level, include 
this row and seek to the next row. But at the row level the filter only has a 
boolean return: either include the row or not.

I feel we should make an attempt to support seek() and reseek() at the region 
scanner level.

> Add support for INCLUDE_AND_SEEK_USING_HINT
> ---
>
> Key: HBASE-5512
> URL: https://issues.apache.org/jira/browse/HBASE-5512
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
>Assignee: Lars Hofhansl
> Attachments: 5512-v2.txt, 5512.txt
>
>
> This came up from HBASE-2038
> From Anoop:
> - What we wanted from the filter is include a row and then seek to the next 
> row which we are interested in. I cant see such a facility with our Filter 
> right now. Correct me if I am wrong. So suppose we already seeked to one row 
> and this need to be included in the result, then the Filter should return 
> INCLUDE. Then when the next next() call happens, then only we can return a 
> SEEK_USING_HINT. So one extra row reading is needed. This might create even 
> one unwanted HFileBlock fetch (who knows).
> Can we add reseek() at higher level?
> From Lars:
> Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the 
> INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, 
> I'm happy to do that, if that's the route we want to go with this.





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List servers) API

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221812#comment-13221812
 ] 

Anoop Sam John commented on HBASE-5510:
---

@Ted
 Addressed your comments in the new patch.
Thanks

> Change in LB.randomAssignment(List servers) API
> ---
>
> Key: HBASE-5510
> URL: https://issues.apache.org/jira/browse/HBASE-5510
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBase-5510.patch, HBase-5510_2.patch
>
>
>  In LB there is randomAssignment(List) API which will be 
> used by AM to assign
>  a region from a down RS. [This will be also used in other cases like call to 
> assign() API from client]
>  I feel it would be better to pass the HRegionInfo also into this method. 
> When the LB making a choice for a region
>  assignment, when one RS is down, it would be nice that the LB knows for 
> which region it is doing this server selection.
> +Scenario+
>  While one RS down, we wanted the regions to get moved to other RSs but a set 
> of regions stay together. We are having custom load balancer but with the 
> current way of LB interface this is not possible. Another way is I can allow 
> a random assignment of the regions at the RS down time. Later with a cluster 
> balance I can balance the regions as I need. But this might make regions 
> assign 1st to one RS and then again move to another. Also for some time 
> period my business use case can not get satisfied.
> Also I have seen some issue in JIRA which speaks about making sure that Root 
> and META regions always sit in some specific RSs. With the current LB API 
> this wont be possible in future.





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221804#comment-13221804
 ] 

Anoop Sam John commented on HBASE-2038:
---

Created https://issues.apache.org/jira/browse/HBASE-5517 for the co processor 
change

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as coprocessor. It also serve as a 
> good goalpost for coprocessor environment design -- there should be enough of 
> it so region level indexing can be reimplemented as a coprocessor without any 
> loss of functionality. 





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221797#comment-13221797
 ] 

Anoop Sam John commented on HBASE-2038:
---

{quote}
@HBASE-5521: I started working on that, but I am starting to question the 
usefulness.
A filter is per KeyValue (at least the method that allows for seeking). So, 
many KeyValues flow through the Filter for a single row, and the filter needs 
to seek separately for each ColumnFamily (as explained above and on the mailing 
list).
So the gain from this would be fairly minimal (which I guess is why we do not 
have this).
For example a row with many columns would need to issue many INCLUDE's and only 
for the last KeyValue (and how would it know it's the last?) issue 
INCLUDE_AND_SEEK..
{quote}

Lars, I was also thinking about this yesterday after seeing the patch. I wanted 
to try a test case run before commenting :) 

Regarding your 1st comment: in the seek() scenario discussed above we need a 
row boundary seek. Yes, all the stores (the memstore and all store files in 
each store) need to be seeked to the needed point. Let me look more into this 
on Monday. We had made small changes and tested this once; I mean, we were able 
to seek to row boundaries.

Thanks a lot Lars for your work and suggestions.

@Ram: Yes, we can file a Jira for coprocessor support for next( int nbrows)?


> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as coprocessor. It also serve as a 
> good goalpost for coprocessor environment design -- there should be enough of 
> it so region level indexing can be reimplemented as a coprocessor without any 
> loss of functionality. 





[jira] [Commented] (HBASE-5512) Add support for INCLUDE_AND_SEEK_USING_HINT

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221559#comment-13221559
 ] 

Anoop Sam John commented on HBASE-5512:
---

ScanQueryMatcher
} else if (filterResponse == ReturnCode.INCLUDE_AND_SEEK_NEXT_USING_HINT) {
+
   }
Do you need to set filterSeek=true here?

> Add support for INCLUDE_AND_SEEK_USING_HINT
> ---
>
> Key: HBASE-5512
> URL: https://issues.apache.org/jira/browse/HBASE-5512
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
>Assignee: Lars Hofhansl
> Attachments: 5512.txt
>
>
> This came up from HBASE-2038
> From Anoop:
> - What we want from the filter is to include a row and then seek to the next 
> row we are interested in. I can't see such a facility in our Filter right 
> now. Correct me if I am wrong. Suppose we have already seeked to a row and 
> it needs to be included in the result; the Filter should then return 
> INCLUDE. Only when the next next() call happens can we return 
> SEEK_USING_HINT. So one extra row read is needed. This might even cause one 
> unwanted HFileBlock fetch (who knows).
> Can we add reseek() at a higher level?
> From Lars:
> Yep, for that we'd need to add INCLUDE_AND_SEEK_USING_HINT (similar to the 
> INCLUDE_AND_SEEK_NEXT_ROW that we already have). Shouldn't be hard to add, 
> I'm happy to do that, if that's the route we want to go with this.
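
The extra row read described above can be illustrated with a toy model. This is only a sketch under stated assumptions: SeekHintSketch, its ReturnCode enum, and the int-array "store" are hypothetical stand-ins, not HBase classes, and a real scanner seeks across HFiles rather than an array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: these are NOT HBase classes. Rows are ints in a sorted array.
class SeekHintSketch {
    enum ReturnCode { INCLUDE, SEEK_USING_HINT, INCLUDE_AND_SEEK_USING_HINT }

    // Rows touched by the most recent scan() call.
    static int rowsRead;

    // Scans sortedRows and returns the wanted rows. When combinedCode is false,
    // the filter can only answer INCLUDE for a wanted row, so the seek hint has
    // to wait until the following row has been read.
    static List<Integer> scan(int[] sortedRows, TreeSet<Integer> wanted, boolean combinedCode) {
        List<Integer> included = new ArrayList<>();
        rowsRead = 0;
        int i = 0;
        while (i < sortedRows.length) {
            int row = sortedRows[i];
            rowsRead++;
            ReturnCode code;
            if (!wanted.contains(row)) {
                code = ReturnCode.SEEK_USING_HINT;
            } else if (combinedCode) {
                code = ReturnCode.INCLUDE_AND_SEEK_USING_HINT;
            } else {
                code = ReturnCode.INCLUDE;
            }
            if (code != ReturnCode.SEEK_USING_HINT) {
                included.add(row);
            }
            if (code == ReturnCode.INCLUDE) {
                i++; // plain INCLUDE: move on to the very next row
                continue;
            }
            Integer hint = wanted.higher(row); // next wanted row, if any
            if (hint == null) {
                break; // nothing further to seek to
            }
            int pos = Arrays.binarySearch(sortedRows, hint);
            i = pos >= 0 ? pos : -pos - 1; // jump forward to the hinted row
        }
        return included;
    }

    // Returns {rows read with plain INCLUDE, rows read with the combined code}.
    static int[] demoRowsRead() {
        int[] rows = new int[100];
        for (int r = 0; r < rows.length; r++) {
            rows[r] = r;
        }
        TreeSet<Integer> wanted = new TreeSet<>(Arrays.asList(10, 50, 90));
        scan(rows, wanted, false);
        int plain = rowsRead;
        scan(rows, wanted, true);
        int combined = rowsRead;
        return new int[] {plain, combined};
    }

    public static void main(String[] args) {
        int[] reads = demoRowsRead();
        System.out.println("plain INCLUDE: " + reads[0] + " rows read, combined code: " + reads[1]);
    }
}
```

In this toy, every included row costs one additional read when only INCLUDE and SEEK_USING_HINT are available, which is the overhead the combined return code is meant to avoid.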





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List servers) API

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221556#comment-13221556
 ] 

Anoop Sam John commented on HBASE-5510:
---

Please see the patch.

> Change in LB.randomAssignment(List servers) API
> ---
>
> Key: HBASE-5510
> URL: https://issues.apache.org/jira/browse/HBASE-5510
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBase-5510.patch
>
>
>  The LB has a randomAssignment(List servers) API, which the AM uses to 
> assign a region from a down RS. [It is also used in other cases, such as a 
> call to the assign() API from a client.]
>  I feel it would be better to pass the HRegionInfo into this method as well. 
> When the LB chooses a server for a region because one RS is down, it would 
> be good for the LB to know which region it is selecting the server for.
> +Scenario+
>  When one RS goes down, we want its regions to move to other RSs, but with a 
> set of regions staying together. We have a custom load balancer, but with 
> the current LB interface this is not possible. An alternative is to allow a 
> random assignment of the regions when the RS goes down and later rebalance 
> the cluster to place the regions as needed. But this may assign a region 
> first to one RS and then move it again to another, and for some period my 
> business use case cannot be satisfied.
> Also, I have seen an issue in JIRA about making sure that the ROOT and META 
> regions always sit on specific RSs. With the current LB API this will not 
> be possible in the future.





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-03 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221541#comment-13221541
 ] 

Anoop Sam John commented on HBASE-2038:
---

@Lars

{quote}
Seeking is done one level (or two, actually) deeper.
Seeking is done in the StoreScanners, coprocessors see RegionScanners.

It is not entirely clear to me where to hook this up in that API.
{quote}

Yes, at the RegionScanner level we don't have seek() or reseek(). They are one 
level down, at the KeyValueHeap level.
Would it be correct to add seek() and reseek() behaviour at the RegionScanner 
level? [We would just need to delegate the seek() or reseek() calls to the 
KeyValueHeap object within the RegionScanner...]

If so, it would be very easy to reseek() to the needed row in the coprocessor's 
preScannerNext().
next() would then return the needed row.

What do you say? Correct me if my suggestion is wrong.
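
The delegation suggested above might look roughly like the following. It is a minimal sketch with toy classes: ToyKeyValueHeap and ToyRegionScanner are hypothetical stand-ins that hold plain row keys, and they do not reproduce the real RegionScanner or KeyValueHeap signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Toy stand-ins only; the real HBase classes work on KeyValues, not Strings.
class ReseekDelegationSketch {

    // Plays the role of the heap that actually knows how to seek.
    static final class ToyKeyValueHeap {
        private final TreeSet<String> rows;
        private String current;

        ToyKeyValueHeap(Collection<String> rows) {
            this.rows = new TreeSet<>(rows);
            this.current = this.rows.isEmpty() ? null : this.rows.first();
        }

        // Forward-only seek: moves to the first row >= the requested row.
        boolean reseek(String row) {
            if (current == null) {
                return false;
            }
            if (row.compareTo(current) > 0) {
                current = rows.ceiling(row);
            }
            return current != null;
        }

        // Returns the current row and advances, or null when exhausted.
        String next() {
            String row = current;
            if (row != null) {
                current = rows.higher(row);
            }
            return row;
        }
    }

    // Plays the role of the scanner handed to the coprocessor hook.
    static final class ToyRegionScanner {
        private final ToyKeyValueHeap storeHeap;

        ToyRegionScanner(ToyKeyValueHeap storeHeap) {
            this.storeHeap = storeHeap;
        }

        // The proposed addition: simply delegate to the heap.
        boolean reseek(String row) {
            return storeHeap.reseek(row);
        }

        String next() {
            return storeHeap.next();
        }
    }

    // A hook could reseek to the needed row, then let next() return it.
    static List<String> demoRowsAfterReseek() {
        ToyRegionScanner scanner = new ToyRegionScanner(
                new ToyKeyValueHeap(Arrays.asList("a", "b", "c", "d", "e")));
        List<String> read = new ArrayList<>();
        scanner.reseek("d");
        for (String row = scanner.next(); row != null; row = scanner.next()) {
            read.add(row);
        }
        return read;
    }

    public static void main(String[] args) {
        System.out.println(demoRowsAfterReseek());
    }
}
```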

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as 
> a good goalpost for coprocessor environment design -- there should be enough 
> of it so region level indexing can be reimplemented as a coprocessor without 
> any loss of functionality. 





[jira] [Commented] (HBASE-5510) Change in LB.randomAssignment(List servers) API

2012-03-02 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221537#comment-13221537
 ] 

Anoop Sam John commented on HBASE-5510:
---

@Ted
  Thanks for your reply. The R1-S1 mapping can be found using the start and 
end keys of both these regions. Our design is such that whenever we see S1 
we will be able to judge which R1 is its buddy. Am I clear? The issue now is 
that randomAssignment() does not reveal the region for which the selection is 
being made. If we can get this, we will be able to handle our business 
scenario using our LB..  :)
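
A rough sketch of what a region-aware randomAssignment() would allow. Everything here is hypothetical: ColocatingBalancerSketch and its Region class are toy stand-ins, not the HBase LoadBalancer or HRegionInfo types, and matching buddies by start key alone is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy balancer: not the HBase LoadBalancer interface.
class ColocatingBalancerSketch {

    // Minimal stand-in for the region descriptor passed to the balancer.
    static final class Region {
        final String table;
        final String startKey;

        Region(String table, String startKey) {
            this.table = table;
            this.startKey = startKey;
        }
    }

    // Last server chosen for each start key, shared by buddy regions.
    private final Map<String, String> serverByStartKey = new HashMap<>();
    private final Random random = new Random();

    // Because the region is passed in, the balancer can look up where its
    // buddy went and only falls back to a random pick when it has to.
    String randomAssignment(Region region, List<String> liveServers) {
        if (liveServers.isEmpty()) {
            throw new IllegalArgumentException("no live servers");
        }
        String previous = serverByStartKey.get(region.startKey);
        if (previous != null && liveServers.contains(previous)) {
            return previous;
        }
        String chosen = liveServers.get(random.nextInt(liveServers.size()));
        serverByStartKey.put(region.startKey, chosen);
        return chosen;
    }

    // Two regions with the same start key should land on the same server.
    static boolean demoBuddiesShareServer() {
        ColocatingBalancerSketch lb = new ColocatingBalancerSketch();
        List<String> live = Arrays.asList("rs1", "rs2", "rs3");
        String forData = lb.randomAssignment(new Region("data", "row100"), live);
        String forIndex = lb.randomAssignment(new Region("index", "row100"), live);
        return forData.equals(forIndex);
    }

    public static void main(String[] args) {
        System.out.println("buddies co-located: " + demoBuddiesShareServer());
    }
}
```

With only the server list as input, as in the current API, the balancer has no key to look the buddy up by, so the second call could not be steered.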

@Ram: Thanks for the explanation.

Thanks

> Change in LB.randomAssignment(List servers) API
> ---
>
> Key: HBASE-5510
> URL: https://issues.apache.org/jira/browse/HBASE-5510
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
>
>  The LB has a randomAssignment(List servers) API, which the AM uses to 
> assign a region from a down RS. [It is also used in other cases, such as a 
> call to the assign() API from a client.]
>  I feel it would be better to pass the HRegionInfo into this method as well. 
> When the LB chooses a server for a region because one RS is down, it would 
> be good for the LB to know which region it is selecting the server for.
> +Scenario+
>  When one RS goes down, we want its regions to move to other RSs, but with a 
> set of regions staying together. We have a custom load balancer, but with 
> the current LB interface this is not possible. An alternative is to allow a 
> random assignment of the regions when the RS goes down and later rebalance 
> the cluster to place the regions as needed. But this may assign a region 
> first to one RS and then move it again to another, and for some period my 
> business use case cannot be satisfied.
> Also, I have seen an issue in JIRA about making sure that the ROOT and META 
> regions always sit on specific RSs. With the current LB API this will not 
> be possible in the future.





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-03-02 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220981#comment-13220981
 ] 

Anoop Sam John commented on HBASE-2038:
---

Hi Lars,
{quote}It might be possible to provide a custom filter to do that.{quote}

- What we want from the filter is to include a row and then seek to the next 
row we are interested in. I can't see such a facility in our Filter right 
now. Correct me if I am wrong. Suppose we have already seeked to a row and it 
needs to be included in the result; the Filter should then return INCLUDE. 
Only when the next next() call happens can we return SEEK_USING_HINT. So one 
extra row read is needed. This might even cause one unwanted HFileBlock fetch 
(who knows).
Can we add reseek() at a higher level?
If you have a suggestion, please let me know.

Thanks
Anoop

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as 
> a good goalpost for coprocessor environment design -- there should be enough 
> of it so region level indexing can be reimplemented as a coprocessor without 
> any loss of functionality. 





[jira] [Commented] (HBASE-4491) HBase Locality Checker

2012-02-23 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214617#comment-13214617
 ] 

Anoop Sam John commented on HBASE-4491:
---

@Liyin: This looks very useful. Is there any update on this new feature?

> HBase Locality Checker
> --
>
> Key: HBASE-4491
> URL: https://issues.apache.org/jira/browse/HBASE-4491
> Project: HBase
>  Issue Type: New Feature
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> If we run the data node and the region server on the same physical machine, 
> the region server benefits if the store files for the regions it serves 
> have a local replica in the data node process.
> So for each region, there is a best-locality region server, which has the 
> most local blocks for that region.
> The HBase Locality Checker will show how many regions are running on their 
> best-locality region server. 
> The higher the number, the more performance benefit HBase can get from 
> data locality.
> There would also be a follow-up task to use this region locality information 
> for region assignment. The assignment manager will prefer to assign regions 
> to their best-locality region server.
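
The count the checker is described as reporting could be computed along these lines. This is a sketch under assumptions, not the proposed tool: LocalityCheckSketch, the map layout, and the tie handling (a hosting server that ties for the maximum counts as best) are all hypothetical choices.

```java
import java.util.HashMap;
import java.util.Map;

// Toy calculation only; block counts would really come from HDFS block locations.
class LocalityCheckSketch {

    // localBlocks: region -> (server -> number of that region's blocks stored locally)
    // hosting:     region -> server currently serving the region
    static int regionsOnBestServer(Map<String, Map<String, Integer>> localBlocks,
                                   Map<String, String> hosting) {
        int count = 0;
        for (Map.Entry<String, Map<String, Integer>> entry : localBlocks.entrySet()) {
            int best = 0;
            for (int blocks : entry.getValue().values()) {
                best = Math.max(best, blocks);
            }
            String host = hosting.get(entry.getKey());
            if (host == null) {
                continue; // region is not assigned anywhere
            }
            int hostBlocks = entry.getValue().getOrDefault(host, 0);
            if (hostBlocks == best) {
                count++; // hosted on a server with the most local blocks
            }
        }
        return count;
    }

    static int demo() {
        Map<String, Map<String, Integer>> localBlocks = new HashMap<>();
        localBlocks.put("r1", blocks("rsA", 10, "rsB", 2));
        localBlocks.put("r2", blocks("rsA", 1, "rsB", 8));
        localBlocks.put("r3", blocks("rsB", 5, "rsA", 0));

        Map<String, String> hosting = new HashMap<>();
        hosting.put("r1", "rsA"); // best server
        hosting.put("r2", "rsA"); // rsB would be better
        hosting.put("r3", "rsB"); // best server
        return regionsOnBestServer(localBlocks, hosting);
    }

    private static Map<String, Integer> blocks(String s1, int n1, String s2, int n2) {
        Map<String, Integer> m = new HashMap<>();
        m.put(s1, n1);
        m.put(s2, n2);
        return m;
    }

    public static void main(String[] args) {
        System.out.println("regions on best-locality server: " + demo());
    }
}
```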





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-02-06 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202057#comment-13202057
 ] 

Anoop Sam John commented on HBASE-2038:
---

Hi Alex,
Thanks for your reply... Yes, I had seen your earlier comment. I am checking 
the trunk coprocessor code for this work as of now...

What is your view on my first comment, that HRegionServer next(final long 
scannerId, int nbRows) calls the coprocessor preScannerNext() and passes it 
the RegionScanner? On this we cannot do a seek()..

Thanks
Anoop


> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as 
> a good goalpost for coprocessor environment design -- there should be enough 
> of it so region level indexing can be reimplemented as a coprocessor without 
> any loss of functionality. 





[jira] [Commented] (HBASE-2038) Coprocessors: Region level indexing

2012-02-06 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201205#comment-13201205
 ] 

Anoop Sam John commented on HBASE-2038:
---

Hi Lars,
  I am also trying to build a secondary index, and the IHBase concept looks 
good to me. But we need it moved to a coprocessor-based design so that the 
HBase kernel code need not differ for the secondary index. IHBase makes the 
scan go through all the regions (as you said), but it skips ahead and seeks 
to later positions in the heap, avoiding many possible data reads from HDFS, 
etc...
In the current coprocessor code, we call preScannerNext() from HRegionServer 
next(final long scannerId, int nbRows) and pass the RegionScanner to the 
coprocessor. But in the IHBase way, within the coprocessor we should be able 
to seek to the correct row, where the indexed column value equals our value. 
We cannot do this as of now, as RegionScanner has no seek(). 

Also, preScannerNext() is called once, before the actual next(final long 
scannerId, int nbRows) call happens on the region. Depending on the caching 
value at the client side, nbRows might be more than one. Suppose nbRows=2 and 
the region has 2 rows, one somewhere in the middle of an HFile and the other 
in another HFile. As per IHBase we should first seek to the position of the 
1st row and, after reading it, seek to the next position. With the current 
way of calling preScannerNext() this won't be possible. So I think we might 
need some change in this area?  
What do you say?

Meanwhile, what is your plan: to continue with the IHBase way of storing the 
index in memory for each region, or to change this?

> Coprocessors: Region level indexing
> ---
>
> Key: HBASE-2038
> URL: https://issues.apache.org/jira/browse/HBASE-2038
> Project: HBase
>  Issue Type: New Feature
>  Components: coprocessors
>Reporter: Andrew Purtell
>Priority: Minor
>
> HBASE-2037 is a good candidate to be done as a coprocessor. It also serves as 
> a good goalpost for coprocessor environment design -- there should be enough 
> of it so region level indexing can be reimplemented as a coprocessor without 
> any loss of functionality. 





[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2011-12-21 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174653#comment-13174653
 ] 

Anoop Sam John commented on HBASE-5088:
---

@Lars  I mean replacing the TreeMap in SoftValueSortedMap with 
ConcurrentSkipListMap.
Yes, we need SoftValueSortedMap, as it maintains the soft references...
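
To illustrate the idea with plain JDK classes: the sketch below writes to a ConcurrentSkipListMap from one thread and to a headMap() view of it from another. SkipListViewSketch is a hypothetical name, and the soft-reference handling of SoftValueSortedMap is deliberately left out, so this shows only the backing-map behaviour.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// JDK-only illustration; SoftValueSortedMap itself is not modelled here.
class SkipListViewSketch {

    // One thread writes through the map, another through a headMap view of it.
    // Returns the final size of the backing map.
    static int writeThroughMapAndView(int perThread) {
        ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
        // The view is backed by the same skip list; keys must sort below "z".
        ConcurrentNavigableMap<String, Integer> head = map.headMap("z");

        Thread direct = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                map.put("k" + i, i);
            }
        });
        Thread viaView = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                head.put("j" + i, i);
            }
        });
        direct.start();
        viaView.start();
        try {
            direct.join();
            viaView.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("entries: " + writeThroughMapAndView(1000));
    }
}
```

The same pattern against an unsynchronized TreeMap and its view has no such guarantee, because both objects mutate one tree without coordination.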



> A concurrency issue on SoftValueSortedMap
> -
>
> Key: HBASE-5088
> URL: https://issues.apache.org/jira/browse/HBASE-5088
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4, 0.94.0
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
>
> SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
> synchronized. If we use these methods to add/delete elements, it's ok.
> But HConnectionManager#getCachedLocation uses headMap to get a view of 
> SoftValueSortedMap#internalMap. Once we operate on this view map (like 
> add/delete) in other threads, a concurrency issue may occur.





[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2011-12-21 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174613#comment-13174613
 ] 

Anoop Sam John commented on HBASE-5088:
---

Can we use a thread-safe implementation of SortedMap, such as 
ConcurrentSkipListMap?


> A concurrency issue on SoftValueSortedMap
> -
>
> Key: HBASE-5088
> URL: https://issues.apache.org/jira/browse/HBASE-5088
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4, 0.94.0
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
>
> SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
> synchronized. If we use these methods to add/delete elements, it's ok.
> But HConnectionManager#getCachedLocation uses headMap to get a view of 
> SoftValueSortedMap#internalMap. Once we operate on this view map (like 
> add/delete) in other threads, a concurrency issue may occur.





[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2011-12-21 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174612#comment-13174612
 ] 

Anoop Sam John commented on HBASE-5088:
---

Good catch, Jieshan.

SoftValueSortedMap.headMap() returns a new SoftValueSortedMap object.
This new object's internalMap is a view of the original internalMap, which is 
referred to by the 1st SoftValueSortedMap. But the view always refers to and 
operates on the actual TreeMap data structure, which is not thread safe.
Now 2 threads can operate on the 2 SoftValueSortedMap objects concurrently, 
which results in concurrent operations on a single TreeMap object..
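
The view behaviour described above can be seen with a bare TreeMap. This is a single-threaded sketch; HeadMapViewSketch is a hypothetical name and the variable internalMap only mirrors the field name mentioned in the issue, it is not the real SoftValueSortedMap.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// JDK-only illustration of a headMap view sharing state with its backing map.
class HeadMapViewSketch {

    // Returns {view sees a write made through the original,
    //          original sees a removal made through the view}.
    static boolean[] demo() {
        TreeMap<String, String> internalMap = new TreeMap<>();
        internalMap.put("a", "1");
        internalMap.put("m", "2");

        // headMap does not copy: it is a live window onto the same tree.
        SortedMap<String, String> view = internalMap.headMap("n");

        internalMap.put("b", "3"); // write through the original map
        boolean viewSeesWrite = view.containsKey("b");

        view.remove("a"); // write through the view
        boolean originalSeesRemove = !internalMap.containsKey("a");

        return new boolean[] {viewSeesWrite, originalSeesRemove};
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("view sees write: " + result[0]
                + ", original sees removal: " + result[1]);
    }
}
```

Since both objects mutate the same tree, locking each wrapper on its own monitor does not serialize access to the TreeMap underneath.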

> A concurrency issue on SoftValueSortedMap
> -
>
> Key: HBASE-5088
> URL: https://issues.apache.org/jira/browse/HBASE-5088
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4, 0.94.0
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
>
> SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
> synchronized. If we use these methods to add/delete elements, it's ok.
> But HConnectionManager#getCachedLocation uses headMap to get a view of 
> SoftValueSortedMap#internalMap. Once we operate on this view map (like 
> add/delete) in other threads, a concurrency issue may occur.





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-14 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169264#comment-13169264
 ] 

Anoop Sam John commented on HBASE-5009:
---

When the max wait time elapses and we drop the split attempt, we just shut 
down the thread pool that runs the StoreFileSplitter tasks. This does not 
guarantee that the threads stop. Don't you think this could be an issue?

After the timeout the split log thread will start the rollback, but some of 
the threads it had started might still be alive and could do some work 
afterwards. 
Ram - Doesn't the split rollback need to ensure that these threads are 
stopped as well?
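
The concern can be reproduced with a plain java.util.concurrent pool. This is a sketch, not the HBase split code: SplitPoolShutdownSketch and the sleeping task are hypothetical, and stopping only works here because the task responds to interruption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// JDK-only illustration; the sleeping task stands in for a long-running splitter.
class SplitPoolShutdownSketch {

    // Interrupts running tasks and waits for the workers to actually finish.
    static boolean stopAndWait(ExecutorService pool, long timeoutMs) {
        pool.shutdownNow();
        try {
            return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns {task still pending after shutdown(), pool terminated after stopAndWait()}.
    static boolean[] demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> {
            try {
                Thread.sleep(60_000); // pretend to be slow file work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop promptly when asked
            }
        });

        // shutdown() only rejects new tasks; the submitted task keeps going.
        pool.shutdown();
        boolean aliveAfterShutdown = !pool.isTerminated();

        boolean terminated = stopAndWait(pool, 5_000);
        return new boolean[] {aliveAfterShutdown, terminated};
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("alive after shutdown(): " + result[0]
                + ", terminated after shutdownNow()+awaitTermination(): " + result[1]);
    }
}
```

A rollback that starts right after shutdown() alone can therefore overlap with tasks that are still writing; waiting on awaitTermination() before rolling back is one way to close that window.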


> Failure of creating split dir if it already exists prevents splits from 
> happening further
> -
>
> Key: HBASE-5009
> URL: https://issues.apache.org/jira/browse/HBASE-5009
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> The scenario is
> -> The split of a region takes a long time
> -> The deletion of the splitDir fails due to HDFS problems.
> -> Subsequent splits also fail after that.
> {code}
> private static void createSplitDir(final FileSystem fs, final Path splitdir)
>     throws IOException {
>   if (fs.exists(splitdir)) {
>     throw new IOException("Splitdir already exits? " + splitdir);
>   }
>   if (!fs.mkdirs(splitdir)) {
>     throw new IOException("Failed create of " + splitdir);
>   }
> }
> {code}
> Correct me if I am wrong. If it is an issue, can we change the behaviour of 
> throwing an exception?
> Please suggest.
