[jira] [Created] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-20 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-5849:


 Summary: On first cluster startup, RS aborts if root znode is not 
available
 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar


When launching a fresh new cluster, the master has to be started first, which 
might create race conditions when starting the master and region servers at the 
same time. 

Master startup code is something like this: 
 - establish zk connection
 - create root znodes in zk (/hbase)
 - create ephemeral node for master /hbase/master

Region server startup code is something like this: 
 - establish zk connection
 - check whether the root znode (/hbase) is there. If not, shut down. 
 - wait for the master to create the /hbase/master znode

So the problem is that on the very first launch of the cluster, the RS aborts at 
startup since the /hbase znode might not have been created yet (only the master 
creates it if needed). Since /hbase/ is not deleted on cluster shutdown, on 
subsequent cluster starts it does not matter in which order the servers are 
started. So this affects only first launches. 
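
For illustration, a minimal sketch (using the plain ZooKeeper client API, not 
the actual patch) of the behavior the fix aims for: instead of aborting when 
/hbase is missing, the region server blocks until the znode shows up. 

{code}
// Minimal sketch, not the actual patch: block until the root znode exists
// instead of aborting. Uses the plain ZooKeeper API; names are illustrative.
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class WaitForRootZnode {
  public static void waitForRootZnode(String quorum, String rootZnode)
      throws Exception {
    final CountDownLatch connected = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper(quorum, 30000, new Watcher() {
      public void process(WatchedEvent event) {
        if (event.getState() == Event.KeeperState.SyncConnected) {
          connected.countDown();
        }
      }
    });
    connected.await();
    // Poll until the master has created the root znode (e.g. /hbase).
    while (zk.exists(rootZnode, false) == null) {
      Thread.sleep(1000);   // back off and retry rather than aborting
    }
    zk.close();
  }
}
{code}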

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259927#comment-13259927
 ] 

Enis Soztutar commented on HBASE-5849:
--

Upon inspecting this further, it seems the patch for HBASE-4138 added the check 
for the base znode to the region server startup code. While it makes sense to 
check for znode.parent from the client side, we should not do that for the 
region server. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Attachment: HBASE-5849_v1.patch

Attaching a simple patch. Applies to trunk and the 0.92 and 0.94 branches. 

Tested this with a pseudo-distributed setup on my laptop, by first launching the 
region server and observing that it does actually wait for the master to boot 
up instead of aborting. I'll try to come up with a boot-order unit test 
shortly.
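
For context, a rough sketch of the shape such a boot-order test could take; the 
helper methods are hypothetical no-op placeholders, not existing HBase test 
utilities. 

{code}
// Rough sketch only: the helpers below are hypothetical placeholders, not real
// HBase test utilities. The point is the ordering: region server before master.
import org.junit.Test;

public class BootOrderSketch {
  @Test
  public void regionServerStartedBeforeMaster() throws Exception {
    startMiniZooKeeper();          // bring up ZK only; /hbase does not exist yet
    startRegionServer();           // RS should wait for /hbase rather than abort
    Thread.sleep(2000);            // give the RS time to reach its wait loop
    startMaster();                 // master creates /hbase and /hbase/master
    assertRegionServerIsOnline();  // RS registers with the master, no abort
  }

  // Placeholder no-ops standing in for whatever the real test ends up using.
  private void startMiniZooKeeper() {}
  private void startRegionServer() {}
  private void startMaster() {}
  private void assertRegionServerIsOnline() {}
}
{code}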

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Status: Open  (was: Patch Available)

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Attachment: HBASE-5849_v2.patch

Thanks Stack for taking a look at this. I have added a unit test for the 
cluster boot order. 

To answer your earlier comment, I think the region server should just keep 
waiting until there is an active master. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch, HBASE-5849_v2.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Status: Patch Available  (was: Open)

Rerunning hudson for patch v2. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch, HBASE-5849_v2.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260791#comment-13260791
 ] 

Enis Soztutar commented on HBASE-5849:
--

Interesting that Hudson did not report any test failures. Let me dig into 
this. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Commented] (HBASE-5342) Grant/Revoke global permissions

2012-04-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261149#comment-13261149
 ] 

Enis Soztutar commented on HBASE-5342:
--

@Matteo, I do not plan to work on this in the near future, so feel free to take 
a shot. As Gary mentioned, there is already infrastructure to manage and 
distribute ACL changes to region servers; I think we should just reuse it for 
this. For the hbase shell, we just need to make the table argument optional, 
and change the AccessControlProtocol.grant()/revoke() methods to accept 
Permission objects rather than TablePermission objects. 
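
As a rough sketch of the suggested API change (the method shapes below are 
illustrative assumptions, not the actual AccessControlProtocol signatures): 

{code}
// Illustrative sketch only: widen grant()/revoke() from table-scoped
// permissions to the more general Permission so that global grants become
// expressible. These signatures are assumptions, not the real API.
import java.io.IOException;
import org.apache.hadoop.hbase.security.access.Permission;

public interface GlobalGrantRevokeSketch {
  // before (conceptually): the methods carry a TablePermission, so a table is required
  // after (sketch): accept any Permission, including one with no table, i.e. global scope
  void grant(byte[] user, Permission permission) throws IOException;
  void revoke(byte[] user, Permission permission) throws IOException;
}
{code}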

 Grant/Revoke global permissions
 ---

 Key: HBASE-5342
 URL: https://issues.apache.org/jira/browse/HBASE-5342
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 HBASE-3025 introduced simple ACLs based on coprocessors. It defines 
 global/table/cf/cq level permissions. However, there is no way to 
 grant/revoke global level permissions, other than the hbase.superuser conf 
 setting. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Attachment: HBASE-5849_v4.patch
HBASE-5849_v4-0.92.patch
HBASE-5849_v4.patch

I have found 2 issues that caused timeouts in the 0.92 branch: 
1. The hbase dir was not set up to use the temp dir under target/, but used the 
default one under /tmp/hadoop-${username}, so running the test on 0.92 causes 
the RS to not come up if you have dirty data under /tmp/. 
2. Giving timeouts like @Test(timeout=xxx) causes the 0.92 master to not shut 
down properly. I could not inspect this further; there might be an issue with 
surefire. 

As a result, I updated the patch to first boot up a mini DFS and set up the 
hbase dir (see the sketch below). I also removed the timeouts (the test runner 
(maven) will time out instead if something goes wrong).
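
A hedged sketch of the setup change described above; the HBaseTestingUtility 
calls are written from memory and should be treated as assumptions rather than 
the actual patch: 

{code}
// Hedged sketch of the setup change, not the actual patch: boot a mini DFS and
// point the hbase root dir at it instead of the default /tmp/hadoop-${username}.
// The HBaseTestingUtility calls used here are assumptions about its API.
void setUpMiniDfsAndHBaseDir() throws Exception {
  HBaseTestingUtility testUtil = new HBaseTestingUtility();
  testUtil.startMiniDFSCluster(1);     // mini DFS backed by the build dir, not /tmp
  testUtil.startMiniZKCluster();
  // Assumed helper: creates the hbase root dir on the mini DFS and returns its path.
  Path hbaseRootDir = testUtil.createRootDir();
  // Make hbase.rootdir point at the mini DFS location (createRootDir() may
  // already register it; set explicitly here for clarity).
  testUtil.getConfiguration().set("hbase.rootdir", hbaseRootDir.toString());
}
{code}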

All my tests for trunk, 0.94, and 0.92 seem to pass. 

@Ted, @Stack, can you please try the patch to see whether you can replicate?

On an unrelated note, the ResourceChecker reports that some of the daemon 
threads (like LruBlockCache.EvictionThread) are not shut down properly (even 
when using MiniHBaseCluster and shutting it down properly). Any idea whether 
we should dig into that?

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
 HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
 HBASE-5849_v4.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Status: Patch Available  (was: Reopened)

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
 HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
 HBASE-5849_v4.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5849:
-

Attachment: HBASE-5849_v4.patch

Reattaching for Jenkins. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
 HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
 HBASE-5849_v4.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261855#comment-13261855
 ] 

Enis Soztutar commented on HBASE-4821:
--

DD and I also want to commit some resources to developing/maintaining/running 
such tests. We are also willing to allocate some cluster resources to running 
the tests for extended periods of time. 

@Mikhail, do you have anything planned yet? To go further with this, I think a 
short test design doc would be a great start, wdyt? 

@Keith, @Stack, do you think we should port goraci into hbase or bigtop? 

@Roman, I love the idea that bigtop provides services for deployment and 
running e2e (end-to-end) tests. But in my experience, maintaining the actual 
tests (code, logic, etc.) will be a lot easier if the code resides inside 
hbase. Does bigtop support that kind of use case?

 A fully automated comprehensive distributed integration test for HBase
 --

 Key: HBASE-4821
 URL: https://issues.apache.org/jira/browse/HBASE-4821
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Critical

 To properly verify that a particular version of HBase is good for production 
 deployment we need a better way to do real cluster testing after incremental 
 changes. Running unit tests is good, but we also need to deploy HBase to a 
 cluster, run integration tests, load tests, Thrift server tests, kill some 
 region servers, kill the master, and produce a report. All of this needs to 
 happen in 20-30 minutes with minimal manual intervention. I think this way we 
 can combine agile development with high stability of the codebase. I am 
 envisioning a high-level framework written in a scripting language (e.g. 
 Python) that would abstract external operations such as deploy to test 
 cluster, kill a particular server, run load test A, run load test B 
 (we already have a few kinds of load tests implemented in Java, and we could 
 write a Thrift load test in Python). This tool should also produce 
 intermediate output, allowing to catch problems early and restart the test.
 No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261973#comment-13261973
 ] 

Enis Soztutar commented on HBASE-4821:
--

Yeah, that makes sense. Agreed that we want to run HBase MR kinds of tests both 
as unit tests and as #2 tests at a larger scale. What I actually wanted to ask 
was whether bigtop already provides such an API, or whether we should develop 
one in bigtop. One other consideration is to abstract away the data for the 
tests: when run on a local cluster we want to finish in a reasonable time, but 
when run on a 5-node or a 100-node cluster, the tests should stress the cluster 
accordingly. 

 A fully automated comprehensive distributed integration test for HBase
 --

 Key: HBASE-4821
 URL: https://issues.apache.org/jira/browse/HBASE-4821
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Critical

 To properly verify that a particular version of HBase is good for production 
 deployment we need a better way to do real cluster testing after incremental 
 changes. Running unit tests is good, but we also need to deploy HBase to a 
 cluster, run integration tests, load tests, Thrift server tests, kill some 
 region servers, kill the master, and produce a report. All of this needs to 
 happen in 20-30 minutes with minimal manual intervention. I think this way we 
 can combine agile development with high stability of the codebase. I am 
 envisioning a high-level framework written in a scripting language (e.g. 
 Python) that would abstract external operations such as deploy to test 
 cluster, kill a particular server, run load test A, run load test B 
 (we already have a few kinds of load tests implemented in Java, and we could 
 write a Thrift load test in Python). This tool should also produce 
 intermediate output, allowing to catch problems early and restart the test.
 No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262131#comment-13262131
 ] 

Enis Soztutar commented on HBASE-4821:
--

Yep, I was referring to a shim layer plus utilities to run against a deployed 
or a local cluster. Let me check out what we have in bigtop. 

 A fully automated comprehensive distributed integration test for HBase
 --

 Key: HBASE-4821
 URL: https://issues.apache.org/jira/browse/HBASE-4821
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Critical

 To properly verify that a particular version of HBase is good for production 
 deployment we need a better way to do real cluster testing after incremental 
 changes. Running unit tests is good, but we also need to deploy HBase to a 
 cluster, run integration tests, load tests, Thrift server tests, kill some 
 region servers, kill the master, and produce a report. All of this needs to 
 happen in 20-30 minutes with minimal manual intervention. I think this way we 
 can combine agile development with high stability of the codebase. I am 
 envisioning a high-level framework written in a scripting language (e.g. 
 Python) that would abstract external operations such as deploy to test 
 cluster, kill a particular server, run load test A, run load test B 
 (we already have a few kinds of load tests implemented in Java, and we could 
 write a Thrift load test in Python). This tool should also produce 
 intermediate output, allowing to catch problems early and restart the test.
 No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262137#comment-13262137
 ] 

Enis Soztutar commented on HBASE-5849:
--

Thanks all for pursuing this. From the failed Hudson builds: 
https://builds.apache.org/job/HBase-TRUNK-security/183/
https://builds.apache.org/job/HBase-TRUNK/2811/testReport/
https://builds.apache.org/job/HBase-0.92/390/
https://builds.apache.org/job/HBase-0.94-security/21/
None of the tests seem related. 

@Stack, for EvictionThread, I guess since the git repo is falling behind, I 
might not have your recent changes (I'm too lazy to check out from svn). I also 
saw some other daemon threads (like a couple of IPC Client threads, etc.). Let 
me dig into that later and see if we can improve on that. I'll open another 
jira if I find anything interesting. 

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
 HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
 HBASE-5849_v4.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions when starting the master and region servers at 
 the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the /hbase/master znode
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster 
 shutdown, on subsequent cluster starts it does not matter in which order the 
 servers are started. So this affects only first launches. 





[jira] [Created] (HBASE-5888) Clover profile in build

2012-04-26 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-5888:


 Summary: Clover profile in build
 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar


Clover is disabled right now. I would like to add a profile that enables clover 
reports. We can also backport this to 0.92, and 0.94, since we are also 
interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-26 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Attachment: hbase-clover_v1.patch

Patch against trunk. I'll provide patches for the earlier branches once we are 
settled. 

Replicating the patch comment: 

Profile for running clover. You need to have a clover license under 
~/.clover.license for ${clover.version}, or you can provide the license with 
-Dmaven.clover.licenseLocation=/path/to/license. Committers can find the 
license under 
https://svn.apache.org/repos/private/committers/donated-licenses/clover/
Note that clover 2.6.3 does not run with maven 3, so you have to use maven2. 
The report will be generated under target/site/clover/index.html when you run
MAVEN_OPTS=-Xmx2048m mvn clean test -Pclover site

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-clover_v1.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Commented] (HBASE-5385) Delete table/column should delete stored permissions on -acl- table

2012-04-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13265212#comment-13265212
 ] 

Enis Soztutar commented on HBASE-5385:
--

Looks good. Can we add:
1. Audit logging via AccessController.AUDITLOG
2. On preCreateTable and preAddColumn, ensure that the acl table is empty for 
the table / column. We might still have residual acl entries if something goes 
wrong. If so, we should refuse to create the table by throwing a kind of access 
control exception (see the sketch below). 
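
For item 2, a rough sketch of what such a check could look like in the 
coprocessor hook; the hasResidualAcls() helper is hypothetical and the hook 
signature is an assumption: 

{code}
// Rough sketch only: refuse table creation if the acl table still holds entries
// for a previously deleted table of the same name. hasResidualAcls() is a
// hypothetical helper, and the hook signature shown is an assumption.
@Override
public void preCreateTable(ObserverContext<MasterCoprocessorEnvironment> ctx,
    HTableDescriptor desc, HRegionInfo[] regions) throws IOException {
  byte[] tableName = desc.getName();
  if (hasResidualAcls(tableName)) {   // hypothetical lookup against the -acl- table
    throw new AccessDeniedException(
        "Residual ACL entries exist for table " + Bytes.toString(tableName));
  }
}
{code}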

Andrew, any comments? 

 Delete table/column should delete stored permissions on -acl- table  
 -

 Key: HBASE-5385
 URL: https://issues.apache.org/jira/browse/HBASE-5385
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.94.0
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
 Attachments: HBASE-5385-v0.patch, HBASE-5385-v1.patch


 Deleting the table or a column does not cascade to the stored permissions at 
 the -acl- table. We should also remove those permissions, otherwise, it can 
 be a security leak, where freshly created tables contain permissions from 
 previous same-named tables. We might also want to ensure, upon table 
 creation, that no entries are already stored at the -acl- table. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Attachment: HBASE-5358_v2.patch

Updated the patch to ignore generated packages (thrift.generated, 
protobuf.generated), since they are skewing coverage results. 

I uploaded a sample report for 0.92 here:
http://people.apache.org/~enis/hbase-clover/

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5358_v2.patch, hbase-clover_v1.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Status: Patch Available  (was: Open)

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5358_v2.patch, hbase-clover_v1.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Status: Open  (was: Patch Available)

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-clover_v1.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Attachment: (was: HBASE-5358_v2.patch)

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-clover_v1.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Attachment: hbase-clover_v2.patch

Uploaded the wrong patch. This should be the one.

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-clover_v1.patch, hbase-clover_v2.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Updated] (HBASE-5888) Clover profile in build

2012-04-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5888:
-

Status: Patch Available  (was: Open)

 Clover profile in build
 ---

 Key: HBASE-5888
 URL: https://issues.apache.org/jira/browse/HBASE-5888
 Project: HBase
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-clover_v1.patch, hbase-clover_v2.patch


 Clover is disabled right now. I would like to add a profile that enables 
 clover reports. We can also backport this to 0.92, and 0.94, since we are 
 also interested in test coverage for those branches. 





[jira] [Created] (HBASE-5968) Proper html escaping for region names

2012-05-08 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-5968:


 Summary: Proper html escaping for region names
 Key: HBASE-5968
 URL: https://issues.apache.org/jira/browse/HBASE-5968
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar


I noticed that we are not doing html escaping for the rs/master web interfaces, 
so you can end up generating html like: 
{code}
<tr>
  <td>ci,,\xEEp/T\xBE\xC0,1336471826990.fc5a943e75ce8521b1ccdaf72d2c96c8.</td>
  <td>
    <a href="http://hrt24n06.cc1.ygridcore.net:60030/">hrt24n06.cc1.ygridcore.net:60030</a>
  </td>
  <td>,\xEEp/T\xBE\xC0</td>
  <td>-n\xA8\xE0\x15\xDD\x80!</td>
  <td>2966724</td>
</tr>
{code}

This obviously does not render properly. 

Also, my crazy theory is that it can be a security risk, since the region name 
is computed from table rows, which are most of the time user input. Thus, if 
the rows contain a <script> tag, an onload= handler, or similar, then that will 
be executed in the developer's browser, possibly with access to the dev 
environment. 
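
A minimal sketch of the kind of escaping this is about, assuming commons-lang's 
StringEscapeUtils is acceptable for it (the eventual fix may do this 
differently): 

{code}
// Minimal sketch, assuming commons-lang is on the classpath; the actual fix
// may escape differently. The point is to HTML-escape region names (derived
// from user row keys) before emitting them into the master/RS web pages.
import org.apache.commons.lang.StringEscapeUtils;

public class RegionNameEscaping {
  // Escapes <, >, &, and " so row-key bytes cannot inject markup or script.
  public static String toSafeHtml(String regionNameAsString) {
    return StringEscapeUtils.escapeHtml(regionNameAsString);
  }
}
{code}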






[jira] [Updated] (HBASE-5968) Proper html escaping for region names

2012-05-08 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5968:
-

Description: 
I noticed that we are not doing html escaping for the rs/master web interfaces, 
so you can end up generating html like: 
{code}
<tr>
  <td>ci,,\xEEp/T\xBE\xC0,1336471826990.fc5a943e75ce8521b1ccdaf72d2c96c8.</td>
  <td>
    <a href="hostname">hostname</a>
  </td>
  <td>,\xEEp/T\xBE\xC0</td>
  <td>-n\xA8\xE0\x15\xDD\x80!</td>
  <td>2966724</td>
</tr>
{code}

This obviously does not render properly. 

Also, my crazy theory is that it can be a security risk, since the region name 
is computed from table rows, which are most of the time user input. Thus, if 
the rows contain a <script> tag, an onload= handler, or similar, then that will 
be executed in the developer's browser, possibly with access to the dev 
environment. 


  was:
I noticed that we are not doing html escaping for the rs/master web interfaces, 
so you can end up generating html like: 
{code}
<tr>
  <td>ci,,\xEEp/T\xBE\xC0,1336471826990.fc5a943e75ce8521b1ccdaf72d2c96c8.</td>
  <td>
    <a href="http://hrt24n06.cc1.ygridcore.net:60030/">hrt24n06.cc1.ygridcore.net:60030</a>
  </td>
  <td>,\xEEp/T\xBE\xC0</td>
  <td>-n\xA8\xE0\x15\xDD\x80!</td>
  <td>2966724</td>
</tr>
{code}

This obviously does not render properly. 

Also, my crazy theory is that it can be a security risk, since the region name 
is computed from table rows, which are most of the time user input. Thus, if 
the rows contain a <script> tag, an onload= handler, or similar, then that will 
be executed in the developer's browser, possibly with access to the dev 
environment. 



 Proper html escaping for region names
 -

 Key: HBASE-5968
 URL: https://issues.apache.org/jira/browse/HBASE-5968
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 I noticed that we are not doing html escaping for the rs/master web 
 interfaces, so you can end up generating html like: 
 {code}
 <tr>
   <td>ci,,\xEEp/T\xBE\xC0,1336471826990.fc5a943e75ce8521b1ccdaf72d2c96c8.</td>
   <td>
     <a href="hostname">hostname</a>
   </td>
   <td>,\xEEp/T\xBE\xC0</td>
   <td>-n\xA8\xE0\x15\xDD\x80!</td>
   <td>2966724</td>
 </tr>
 {code}
 This obviously does not render properly. 
 Also, my crazy theory is that it can be a security risk, since the region 
 name is computed from table rows, which are most of the time user input. Thus, 
 if the rows contain a <script> tag, an onload= handler, or similar, then that 
 will be executed in the developer's browser, possibly with access to the dev 
 environment. 





[jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)

2012-05-08 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271036#comment-13271036
 ] 

Enis Soztutar commented on HBASE-5754:
--

In one of my 0.92.x tests on a 10 node cluster, 250M inserts, I did manage to 
get the verify to fail: 
{code}
12/05/08 11:11:18 INFO mapred.JobClient:   goraci.Verify$Counts
12/05/08 11:11:18 INFO mapred.JobClient: UNDEFINED=972506
12/05/08 11:11:18 INFO mapred.JobClient: REFERENCED=248051318
12/05/08 11:11:18 INFO mapred.JobClient: UNREFERENCED=972506
12/05/08 11:11:18 INFO mapred.JobClient:   Map-Reduce Framework
12/05/08 11:11:18 INFO mapred.JobClient: Map input records=249023824
{code}

Notice that the map input record count is about 1M less than 250M, which 
indicates that the input format did not provide all records in the table. The 
missing rows all belong to a single region. I reran the test after a couple of 
hours, and it passed. But the failed test created 244 maps instead of 246, 
which is the current region count, so I suspect there is something wrong in the 
split calculation or in the supposed transactional behavior of split/balance 
operations on the meta table. I am still inspecting the code and the logs, but 
any pointers are welcome. 

 data lost with gora continuous ingest test (goraci)
 ---

 Key: HBASE-5754
 URL: https://issues.apache.org/jira/browse/HBASE-5754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
 Environment: 10 node test cluster
Reporter: Eric Newton
Assignee: stack

 Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
 has both hbase and accumulo back-ends.
 I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
 verification failed because about 21K entries were missing.  The goraci 
 [README|https://github.com/keith-turner/goraci] explains the test, and how it 
 detects missing data.
 I re-ran the test with 100 million entries, and it verified successfully.  
 Both of the times I tested using a billion entries, the verification failed.
 If I run the verification step twice, the results are consistent, so the 
 problem is
 probably not on the verify step.
 Here's the versions of the various packages:
 ||package||version||
 |hadoop|0.20.205.0|
 |hbase|0.92.1|
 |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
 |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
 The change I made to goraci was to configure it for hbase and to allow it to 
 build properly.





[jira] [Created] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-10 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-5986:


 Summary: Clients can see holes in the META table when regions are 
being split
 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar


We found this issue when running large scale ingestion tests for HBASE-5754. 
The problem is that the .META. table updates are not atomic while splitting a 
region. In SplitTransaction, there is a time gap between marking the parent 
offline and adding the daughters to the META table. This can result in clients 
using MetaScanner, or HTable.getStartEndKeys() (used by the TableInputFormat), 
missing regions which have just been taken offline but whose daughters are not 
added yet. 

This is also related to HBASE-4335. 





[jira] [Updated] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-10 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5986:
-

Attachment: HBASE-5986-test_v1.patch

Attaching a unit test to illustrate the problem. The test fails for me: when 
splitting the first region, the client sees 0 regions for some time. 

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5986-test_v1.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in 
 clients using MetaScanner, or HTable.getStartEndKeys() (used by the 
 TableInputFormat), missing regions which have just been taken offline but 
 whose daughters are not added yet. 
 This is also related to HBASE-4335. 





[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-10 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13272918#comment-13272918
 ] 

Enis Soztutar commented on HBASE-5986:
--

Possible fixes I can think of: 
1. Keep MetaScanner/MetaReader as non-consistent (as it is), but allow for a 
consistent view when getting table regions. Since single-row puts are atomic, 
when the parent region is mutated to be offline, the HRIs for the daughters are 
added to the row. So on MetaScanner.allTableRegions and similar calls, we can 
keep track of daughter regions from split parents and return them to the 
client. 
2. Make MetaScanner consistent, in that whenever it sees a split parent, it 
blocks until the daughters are available. 
3. We have region-local transactions now, so if we ensure that the rows for the 
parent and daughters will be served from the same META region, then we can 
update all three rows atomically. Maybe we can come up with a META-specific 
split policy to ensure split regions' rows go to the same META region. 
Thoughts? 
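
As a side illustration (not part of the options above) of what a client-visible 
hole looks like: a small check over the region list returned by a meta scan. 

{code}
// Illustration only, not part of the proposed fixes: a "hole" is a spot where
// one region's end key does not match the next region's start key in the
// (sorted) list of regions a meta scan returned for a table.
boolean hasHole(List<HRegionInfo> sortedRegions) {
  for (int i = 0; i + 1 < sortedRegions.size(); i++) {
    byte[] endKey = sortedRegions.get(i).getEndKey();
    byte[] nextStartKey = sortedRegions.get(i + 1).getStartKey();
    if (!Bytes.equals(endKey, nextStartKey)) {
      return true;  // gap (or overlap) between consecutive regions
    }
  }
  return false;
}
{code}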

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5986-test_v1.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in 
 clients using MetaScanner, or HTable.getStartEndKeys() (used by the 
 TableInputFormat), missing regions which have just been taken offline but 
 whose daughters are not added yet. 
 This is also related to HBASE-4335. 





[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-10 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13272952#comment-13272952
 ] 

Enis Soztutar commented on HBASE-5986:
--

bq. I think we are assuming in many other places that META only has a single 
region.
The ROOT is -by-design- one region, but META is not, right?

bq. Is there another alternative, such as adding the daughter regions first, 
and then have HTable disentangle conflicts?
I have thought about this as well, but then there is a time window in which you 
have both the parent and daughter regions online, with the parent not yet 
marked as split. So the client again has to resolve the fact that the returned 
regions overlap. 


 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5986-test_v1.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in 
 clients using MetaScanner, or HTable.getStartEndKeys() (used by the 
 TableInputFormat), missing regions which have just been taken offline but 
 whose daughters are not added yet. 
 This is also related to HBASE-4335. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-10 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13272987#comment-13272987
 ] 

Enis Soztutar commented on HBASE-5986:
--

I have implemented approach 1 by adding split daughters to the returned map 
from MetaScanner.allTableRegions(). But then the problem is that we are 
returning regions which do not yet exist in the META table, so any subsequent 
getRegion call will fail. 

Thinking a bit more about 3, I think we already guarantee that the split parent 
and its daughters fall into the same META region. Let's say we have two regions, 
region1 and region2, with start keys start_key* and timestamps ts*, 
respectively. 

Before split:
{code} 
table start_key1 ts1 encoded_name1
table start_key2 ts2 encoded_name2
{code} 

Now, if we split region1, the daughters will be sorted after region1 and before 
region2:
{code} 
table start_key1 ts1 encoded_name1 offline split
table start_key1 ts3 encoded_name1
table mid_key1 ts3 encoded_name1
table start_key2 ts2 encoded_name2
{code} 

We know this since we have the invariants ts3 > ts1 
(SplitTransaction.getDaughterRegionIdTimestamp()) and start_key1 < mid_key1 < 
start_key2. Even if we have a region boundary between start_key1 and start_key2 
in the META table, the daughters will be co-located with the parent. The only 
exception is if, while the user table is being split, there is a concurrent 
split of the META table and the new region boundary is chosen to fall between 
the parent and the daughters. With some effort we could prevent this, but it 
seems very unlikely. 

So, if my analysis is correct, option 3 seems like the best choice, since it 
will not complicate the meta scan code. The problem is that there is no internal 
API to do multi-row transactions other than going through a coprocessor. Should 
we think of allowing that w/o coprocessors?

@Lars, does HRegion.mutateRowsWithLock() guarantee that a concurrent scanner 
won't see partial changes? 
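
For illustration, here is a minimal sketch of what option 3 could look like, 
assuming a region-local multi-row mutation API along the lines of 
mutateRowsWithLock(). parent/daughterA/daughterB are the HRegionInfos from the 
split, and serialize() / atomicallyMutateMetaRows() are placeholders, not 
existing methods: 
{code}
// Rough sketch only: apply the parent-offline update and the two daughter rows
// as one atomic batch against the META region hosting all three rows.
List<Put> metaUpdates = new ArrayList<Put>();

Put parentPut = new Put(parent.getRegionName());   // parent already marked offline+split
parentPut.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER, serialize(parent));
parentPut.add(HConstants.CATALOG_FAMILY, HConstants.SPLITA_QUALIFIER, serialize(daughterA));
parentPut.add(HConstants.CATALOG_FAMILY, HConstants.SPLITB_QUALIFIER, serialize(daughterB));
metaUpdates.add(parentPut);

for (HRegionInfo d : new HRegionInfo[] { daughterA, daughterB }) {
  Put p = new Put(d.getRegionName());
  p.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER, serialize(d));
  metaUpdates.add(p);
}

// All three rows live in the same META region (per the key-ordering argument
// above), so a region-local transaction can commit them together and a
// concurrent scanner sees either the pre-split or the post-split state,
// never a hole.
atomicallyMutateMetaRows(metaUpdates);   // placeholder for the missing internal API
{code}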

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5986-test_v1.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in clients 
 using MetaScanner, or HTable.getStartEndKeys (used by the TableInputFormat), 
 missing regions which have just been made offline but whose daughters are not 
 added yet. 
 This is also related to HBASE-4335. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6000) Cleanup where we keep .proto files

2012-05-15 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13276371#comment-13276371
 ] 

Enis Soztutar commented on HBASE-6000:
--

.proto files are source files, so +1 for putting them under src/main. 
bq. In my opinion it should not be necessary to have protoc installed to build 
HBase, just like it's not necessary to have the Thrift compiler available
+1 to that. I think we should make out-of-the-box compilation as easy as 
possible. If we commit the generated sources under src/, it should be ok. Also 
+1 on having 'generated' in the package name; we have some maven targets 
depending on that convention.

 Cleanup where we keep .proto files
 --

 Key: HBASE-6000
 URL: https://issues.apache.org/jira/browse/HBASE-6000
 Project: HBase
  Issue Type: Bug
Reporter: stack

 I see Andrew, for his pb work over in rest, has .proto files under 
 src/main/resources.  We should unify where these files live.  The recently 
 added .protos place them under src/main/protobuf.  It's confusing.
 The Thrift IDL files are here under resources too.
 Seems like we should move src/main/protobuf under src/resources to be 
 consistent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6025) Expose Hadoop Metrics through JSON Rest interface

2012-05-16 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13277279#comment-13277279
 ] 

Enis Soztutar commented on HBASE-6025:
--

How is this different from http://region-server:60030/jmx ? It works with 
hadoop-1.0.1+ (HBASE-5309). 

 Expose Hadoop Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-05-16 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13277481#comment-13277481
 ] 

Enis Soztutar commented on HBASE-6025:
--

Makes sense. Thanks for changing the issue title. Is the plan to add them to 
jmx, which in turn will make them available under /jmx? 

 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-05-17 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278011#comment-13278011
 ] 

Enis Soztutar commented on HBASE-6025:
--

bq. You saying that what comes out of jmx will show in a hadoop webapp servlet 
mounted at /jmx?
Yes, it comes from the Hadoop HttpServer class, which adds JMXJsonServlet to 
serve /jmx. That servlet finds all registered MBeans (including metrics) and 
exports them as JSON. 
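
For example (illustrative only; the host name and the default RS info port 
60030 are placeholders, and the optional qry filter depends on the 
JMXJsonServlet version bundled with the Hadoop jars): 
{code}
# Dump all registered MBeans, including the dynamic per-region metrics, as JSON.
curl http://region-server:60030/jmx

# Optionally narrow the output to one MBean domain, e.g. the hadoop metrics beans.
curl 'http://region-server:60030/jmx?qry=hadoop:*'
{code}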

bq. we ship 1.0.0 even in 0.94 which needs to be fixed
hadoop-1.0.3 was recently released with fixes for non-Oracle JVMs and other bug 
fixes. We can switch to that. 

bq. We should add a 'metrics' link along the top beside the log level, thread 
dump, etc. servlets
Agreed. I learned about this from the Hadoop folks. We should make it more 
visible.

bq. Oh, it looks like the jmx per-region stuff is showing under /jmx because 
Elliott already added this to 0.94 and trunk
In my near-trunk test, I saw the dynamic metrics JSON, but nothing reported 
underneath it (I did not spend much time on it). If it works for you, then what 
you outlined seems like a good plan. 


 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-05-17 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278019#comment-13278019
 ] 

Enis Soztutar commented on HBASE-6025:
--

I see. Then, +1 for HBASE-5802.
@Elliot, do you see the exception that Stack has pasted? 

 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6034) Upgrade Hadoop dependency for 0.92 branch

2012-05-17 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278103#comment-13278103
 ] 

Enis Soztutar commented on HBASE-6034:
--

Shall we do this for 92, 94 and trunk? 

 Upgrade Hadoop dependency for 0.92 branch
 -

 Key: HBASE-6034
 URL: https://issues.apache.org/jira/browse/HBASE-6034
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.2
Reporter: Andrew Purtell
Priority: Minor
 Attachments: 6034.092.txt, 6034.094.txt


 0.92 branch currently depends on Hadoop 1.0.0, but this has been moved to the 
 archive. The earliest release on www.apache.org/dist/ is 1.0.1. Consider 
 moving up?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-05-17 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6025:
-

Attachment: hbase-jmx.patch

Attaching a simple patch to add /jmx links to the RS/master web UIs. 

 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: hbase-jmx.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6009) Changes for HBASE-5209 are technically incompatible

2012-05-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279075#comment-13279075
 ] 

Enis Soztutar commented on HBASE-6009:
--

+1 for the release note on 0.92.1. 

 Changes for HBASE-5209 are technically incompatible
 ---

 Key: HBASE-6009
 URL: https://issues.apache.org/jira/browse/HBASE-6009
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: David S. Wang

 The additions to add backup masters to ClusterStatus are technically 
 incompatible between clients and servers.  Older clients will basically not 
 read the extra bits that the newer server pushes for the backup masters, thus 
 screwing up the serialization for the next blob in the pipe.
 For the Writable, we can add a total size field for ClusterStatus at the 
 beginning, or we can have start and end markers.  I can make a patch for 
 either approach; interested in whatever folks have to suggest.  Would be good 
 to get this in soon to limit the damage to 0.92.1 (don't know if we can get 
 this in in time for 0.94.0).
 Either change will make us forward-compatible starting with when the change 
 goes in, but will not fix the backwards incompatibility, which we will have 
 to mark with a release note as there have already been releases with this 
 change.
 Hopefully we can do this in a cleaner way when wire compat rolls around in 
 0.96.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-05-21 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6060:


 Summary: Regions's in OPENING state from failed regionservers 
takes a long time to recover
 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar


We have seen a pattern in tests where regions are stuck in OPENING state for a 
very long time when the region server that is opening the region fails. My 
understanding of the process: 
 
 - The master calls the RS to open the region. If the RS is offline, a new plan 
is generated (a new RS is chosen). RegionState is set to PENDING_OPEN (only in 
master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
HMaster.assign()
 - The RegionServer starts opening the region and changes the state in the 
znode, but that znode is not ephemeral (see ZkAssign)
 - The RS transitions the zk node from OFFLINE to OPENING. See 
OpenRegionHandler.process()
 - The RS then opens the region and changes the znode from OPENING to OPENED
 - When the RS is killed between the OPENING and OPENED states, zk shows 
OPENING, and the master just waits for the RS to change the region state; but 
since the RS is down, that won't happen. 
 - There is an AssignmentManager.TimeoutMonitor, which guards exactly against 
this kind of condition. It periodically checks (every 10 sec by default) the 
regions in transition to see whether they have timed out 
(hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
min, which explains what you and I are seeing. 
 - ServerShutdownHandler in the Master does not reassign regions in OPENING 
state, although it handles other states. 

Lowering that threshold in the configuration is one option, but I still think 
we can do better. 
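
For reference, lowering it would look something like the following in 
hbase-site.xml (illustration only; the 180000 ms value is arbitrary, and as 
noted above this only papers over the real problem): 
{code}
<property>
  <!-- Default is 30 min; this example drops it to 3 min. Value is in milliseconds. -->
  <name>hbase.master.assignment.timeoutmonitor.timeout</name>
  <value>180000</value>
</property>
{code}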

Will investigate more. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-05-21 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280469#comment-13280469
 ] 

Enis Soztutar commented on HBASE-6060:
--

Thanks Andrew for the pointer. Agreed that lowering the timeout can have deeper 
impacts. We should fix the issue properly instead. 

 Regions's in OPENING state from failed regionservers takes a long time to 
 recover
 -

 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 We have seen a pattern in tests where regions are stuck in OPENING state for 
 a very long time when the region server that is opening the region fails. 
 My understanding of the process: 
  - The master calls the RS to open the region. If the RS is offline, a new 
 plan is generated (a new RS is chosen). RegionState is set to PENDING_OPEN 
 (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
 HMaster.assign()
  - The RegionServer starts opening the region and changes the state in the 
 znode, but that znode is not ephemeral (see ZkAssign)
  - The RS transitions the zk node from OFFLINE to OPENING. See 
 OpenRegionHandler.process()
  - The RS then opens the region and changes the znode from OPENING to OPENED
  - When the RS is killed between the OPENING and OPENED states, zk shows 
 OPENING, and the master just waits for the RS to change the region state; but 
 since the RS is down, that won't happen. 
  - There is an AssignmentManager.TimeoutMonitor, which guards exactly against 
 this kind of condition. It periodically checks (every 10 sec by default) the 
 regions in transition to see whether they have timed out 
 (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
 min, which explains what you and I are seeing. 
  - ServerShutdownHandler in the Master does not reassign regions in OPENING 
 state, although it handles other states. 
 Lowering that threshold in the configuration is one option, but I still think 
 we can do better. 
 Will investigate more. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-25 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5986:
-

Attachment: HBASE-5986-0.94.patch
HBASE-5986-0.92.patch

Attaching patches for the 0.92 and 0.94 branches. They are direct ports of the 
v3 patch, but the 0.92 patch also includes the 
HRegionServer.getOnlineRegions(byte[] tableName) function, directly copied from 
0.94, since we need it. I discovered this while testing with 0.92, so I would 
like the fix to make it into that branch. 


One minor mishap on my part is that the v3 patch which went into trunk includes 
an unrelated change in RegionServerDynamicStatistics; the related issue is 
HBASE-6025. Although the change is trivial (changing 
RegionServerDynamicStatistics to extend the HBase-specific MetricsMBeanBase 
rather than the Hadoop-specific MetricsDynamicMBeanBase), we may want to note 
this, or revert that part. The backport patches do not include this change. 
Sorry for the trouble, guys. 

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, 
 HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in clients 
 using MetaScanner, or HTable.getStartEndKeys (used by the TableInputFormat), 
 missing regions which have just been made offline but whose daughters are not 
 added yet. 
 This is also related to HBASE-4335. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283735#comment-13283735
 ] 

Enis Soztutar commented on HBASE-5986:
--

@Ted, I did run TestEndToEndSplitTransaction, but not the whole suite. Let me 
do that. 

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, 
 HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in clients 
 using MetaScanner, or HTable.getStartEndKeys (used by the TableInputFormat), 
 missing regions which have just been made offline but whose daughters are not 
 added yet. 
 This is also related to HBASE-4335. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5986) Clients can see holes in the META table when regions are being split

2012-05-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283881#comment-13283881
 ] 

Enis Soztutar commented on HBASE-5986:
--

Here are the test results for 0.94: 
{code}
Tests run: 551, Failures: 0, Errors: 0, Skipped: 0
...
Tests run: 932, Failures: 1, Errors: 2, Skipped: 9

Failed tests:   
testShutdownSimpleFixup(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster):
 expected:<1> but was:<0>

Tests in error: 
  
testDelayedRpcImmediateReturnValue(org.apache.hadoop.hbase.ipc.TestDelayedRpc): 
Call to /127.0.0.1:53586 failed on socket timeout exception: 
java.net.SocketTimeoutException: 1000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/127.0.0.1:53623 remote=/127.0.0.1:53586]
  testLocalHBaseCluster(org.apache.hadoop.hbase.TestLocalHBaseCluster): Master 
not initialized after 200 seconds
{code}

I reran the tests locally with success, except for TestLocalHBaseCluster, but 
that one fails for me on 0.94 HEAD as well. 

For 0.92:
{code}

Results :

Failed tests:   
testMultipleResubmits(org.apache.hadoop.hbase.master.TestSplitLogManager)
  testcomputeHDFSBlocksDistribution(org.apache.hadoop.hbase.util.TestFSUtils)

Tests in error:
  testClusterRestart(org.apache.hadoop.hbase.master.TestRestartCluster): 
org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
  
testWholesomeSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): 
Failed delete of 
/homes/hortonde/enis/code/hbase-0.92/target/test-data/af023188-0b23-4f9d-a9bc-a074e94e57f8/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/table/7c59b6677ad46bf3f652a83de1e62bcb
  testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): 
Target HLog directory already exists: 
/homes/hortonde/enis/code/hbase-0.92/target/test-data/af023188-0b23-4f9d-a9bc-a074e94e57f8/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/logs
  testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction)
  loadTest[0](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): test 
timed out after 12 milliseconds
  loadTest[0](org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel): test 
timed out after 12 milliseconds

Tests run: 1135, Failures: 2, Errors: 6, Skipped: 8
{code} 

I also ran those failed tests locally with success. It seems we can go ahead 
with 0.92 and 0.94 if you don't have any concerns. 

 Clients can see holes in the META table when regions are being split
 

 Key: HBASE-5986
 URL: https://issues.apache.org/jira/browse/HBASE-5986
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 5986-v2.txt, HBASE-5986-0.92.patch, 
 HBASE-5986-0.94.patch, HBASE-5986-test_v1.patch, HBASE-5986_v3.patch


 We found this issue when running large scale ingestion tests for HBASE-5754. 
 The problem is that the .META. table updates are not atomic while splitting a 
 region. In SplitTransaction, there is a time gap between marking the parent 
 offline and adding the daughters to the META table. This can result in clients 
 using MetaScanner, or HTable.getStartEndKeys (used by the TableInputFormat), 
 missing regions which have just been made offline but whose daughters are not 
 added yet. 
 This is also related to HBASE-4335. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6135) Style the Web UI to use Twitter's Bootstrap.

2012-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285910#comment-13285910
 ] 

Enis Soztutar commented on HBASE-6135:
--

+1 to bootstrap.

 Style the Web UI to use Twitter's Bootstrap.
 

 Key: HBASE-6135
 URL: https://issues.apache.org/jira/browse/HBASE-6135
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
 Fix For: 0.96.0


 Our web ui has lagged a little bit behind.  While it's not a huge deal, it is 
 one of the first things that new people see.  As such styling it a little bit 
 better would put a good foot forward.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6096) AccessController v2

2012-05-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286125#comment-13286125
 ] 

Enis Soztutar commented on HBASE-6096:
--

bq. The superuser shortcut should be removed.
We need something like a superuser, so that if somehow there is a mix-up of 
grants, we can fix it. But as Andrew suggests, just using the service principal 
should be good enough. 

bq. Also we could drop the owner concept
+1. It is better to manage all permissions from one place.

 AccessController v2
 ---

 Key: HBASE-6096
 URL: https://issues.apache.org/jira/browse/HBASE-6096
 Project: HBase
  Issue Type: Umbrella
  Components: security
Affects Versions: 0.96.0, 0.94.1
Reporter: Andrew Purtell

 Umbrella issue for iteration on the initial AccessController drop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-05-31 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286795#comment-13286795
 ] 

Enis Soztutar commented on HBASE-6060:
--

@Ramkrishna, that is great. I have also noticed regions in CLOSING staying in 
RIT as well and, strangely enough, showing the master as their assigned server. 
Do you think that could be related? 

 Regions's in OPENING state from failed regionservers takes a long time to 
 recover
 -

 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 We have seen a pattern in tests where regions are stuck in OPENING state for 
 a very long time when the region server that is opening the region fails. 
 My understanding of the process: 
  - The master calls the RS to open the region. If the RS is offline, a new 
 plan is generated (a new RS is chosen). RegionState is set to PENDING_OPEN 
 (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
 HMaster.assign()
  - The RegionServer starts opening the region and changes the state in the 
 znode, but that znode is not ephemeral (see ZkAssign)
  - The RS transitions the zk node from OFFLINE to OPENING. See 
 OpenRegionHandler.process()
  - The RS then opens the region and changes the znode from OPENING to OPENED
  - When the RS is killed between the OPENING and OPENED states, zk shows 
 OPENING, and the master just waits for the RS to change the region state; but 
 since the RS is down, that won't happen. 
  - There is an AssignmentManager.TimeoutMonitor, which guards exactly against 
 this kind of condition. It periodically checks (every 10 sec by default) the 
 regions in transition to see whether they have timed out 
 (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
 min, which explains what you and I are seeing. 
  - ServerShutdownHandler in the Master does not reassign regions in OPENING 
 state, although it handles other states. 
 Lowering that threshold in the configuration is one option, but I still think 
 we can do better. 
 Will investigate more. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6152) Split abort is not handled properly

2012-06-01 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287838#comment-13287838
 ] 

Enis Soztutar commented on HBASE-6152:
--

I think the problem is that the master offlines the region at step 3; however, 
the parent region is recovered and onlined by the RS, so all subsequent region 
transitions for it fail on the master side. 

 Split abort is not handled properly
 ---

 Key: HBASE-6152
 URL: https://issues.apache.org/jira/browse/HBASE-6152
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Devaraj Das
Assignee: Devaraj Das

 I ran into this:
 1. RegionServer started to split a region(R), but the split was taking a long 
 time, and hence the split was aborted
 2. As part of cleanup, the RS deleted the ZK node that it created initially 
 for R
 3. The master (AssignmentManager) noticed the node deletion, and made R 
 offline
 4. The RS recovered from the failure, and at some point of time, tried to do 
 the split again.
 5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error 
 like - Received SPLIT for region R from server RS but it doesn't exist 
 anymore,..
 6. The RS apparently did the split successfully this time, but is stuck on 
 the master to delete the znode for the region. It kept on saying - 
 org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
 master to process the split for R and it was stuck there forever. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-04 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288875#comment-13288875
 ] 

Enis Soztutar commented on HBASE-6160:
--

One option for a fix is to ensure that the META entry for a region's own parent 
has already been deleted before we delete that region's entry, i.e. do the META 
entry deletions recursively, oldest ancestor first.

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-04 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288941#comment-13288941
 ] 

Enis Soztutar commented on HBASE-6160:
--

The exception: 
{code}
12/06/04 06:50:41 ERROR security.UserGroupInformation: 
PriviledgedActionException as: 
cause:org.apache.hadoop.hbase.client.RegionOfflineException: Split daughter 
region 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d.
 cannot be found in META.
Exception in thread main 
org.apache.hadoop.hbase.client.RegionOfflineException: Split daughter region 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d.
 cannot be found in META.
at 
org.apache.hadoop.hbase.client.MetaScanner$BlockingMetaScannerVisitor.processRow(MetaScanner.java:433)
at 
org.apache.hadoop.hbase.client.MetaScanner$TableMetaScannerVisitor.processRow(MetaScanner.java:490)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:227)
at 
org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:57)
at 
org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:136)
at 
org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:133)
at 
org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:361)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:108)
at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:86)
at 
org.apache.hadoop.hbase.client.MetaScanner.allTableRegions(MetaScanner.java:326)
at 
org.apache.hadoop.hbase.client.HTable.getRegionLocations(HTable.java:499)
at 
org.apache.hadoop.hbase.client.HTable.getStartEndKeys(HTable.java:452)
at 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:132)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)

{code}

So the region in question is 
{code}
50a4617eead34cad335a8dfa727d177d
{code}
and from the logs we see that {{25d9c4ff574a37bd95bf5e5be6d618dd}} is split 
into {{1dc74065583c67b3916c4ed158cb53fa}} and 
{{50a4617eead34cad335a8dfa727d177d}}

{code}
./hbase-hbase-regionserver-ip-10-226-65-102.log:2012-06-04 04:56:02,855 INFO 
org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, META updated, 
and report to master. 
Parent=TestLoadAndVerify_1338798130970,[\x02\x01\x00\x00\x00\x00\x00/71_0,1338799021182.25d9c4ff574a37bd95bf5e5be6d618dd.,
 new regions: 
TestLoadAndVerify_1338798130970,[\x02\x01\x00\x00\x00\x00\x00/71_0,1338800158687.1dc74065583c67b3916c4ed158cb53fa.,
 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d..
 Split took 4sec
{code}

After some time, {{50a4617eead34cad335a8dfa727d177d}} is further split into 
two: 

{code}
./hbase-hbase-regionserver-ip-10-226-65-102.log:2012-06-04 05:41:13,488 INFO 
org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, META updated, 
and report to master. Parent=TestLoadAndVerify_1
338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d.,
 new regions: 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338802866393.16288
65d7fa8e9eec3a7d8073465296e., 
TestLoadAndVerify_1338798130970,]y\x04\x00\x00\x00\x00\x00/47_0,1338802866393.413cafe6c61426e26254c197e8c0a6ba..
 Split took 7sec
{code}

Further time passes, and CatalogJanitor deletes the META entry for that region:
{code}
./hbase-hbase-master-ip-10-144-69-91.log:2012-06-04 05:47:16,688 DEBUG 
org.apache.hadoop.hbase.master.CatalogJanitor: Deleting region 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d.
 because daughter splits no longer hold references
./hbase-hbase-master-ip-10-144-69-91.log:2012-06-04 05:47:18,103 INFO 
org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughters references, 
qualifier=splitA and qualifier=splitB, from parent 
TestLoadAndVerify_1338798130970,\\xA2\x04\x00\x00\x00\x00\x00/48_0,1338800158687.50a4617eead34cad335a8dfa727d177d.
./hbase-hbase-master-ip-10-144-69-91.log:2012-06-04 05:47:18,103 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegion: DELETING region 
hdfs://ip-10-10-50-98.ec2.internal:8020/apps/hbase/data/TestLoadAndVerify_1338798130970/50a4617eead34cad335a8dfa727d177d
./hbase-hbase-master-ip-10-144-69-91.log:2012-06-04 05:47:18,145 INFO 
org.apache.hadoop.hbase.catalog.MetaEditor: Deleted region 

[jira] [Updated] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-04 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6160:
-

Attachment: HBASE-6160_v1.patch

Attaching a patch for trunk. 
- Changes CatalogJanitor to not delete split parents whose own parents are 
still in META (a rough sketch of the check is below). 
- Adds a test case.
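
Purely as an illustration of the intended ordering (not the actual patch), the 
extra guard in CatalogJanitor could look roughly like this; hasParentInMeta() is 
a hypothetical helper that checks whether the region this split parent itself 
descended from still has a row in .META.: 
{code}
// Rough sketch: never clean a split parent while its own parent row is still in
// META, so entries are removed oldest-ancestor first and clients never see a gap
// between a still-present grandparent and the daughters.
boolean cleanParent(HRegionInfo parent, Result rowContent) throws IOException {
  if (hasParentInMeta(parent)) {        // hypothetical META lookup
    // Skip for now; a later CatalogJanitor run retries once the ancestor is gone.
    return false;
  }
  // ... existing logic: verify the daughters no longer hold references,
  // then delete the parent's META row and its region directory ...
  return true;
}
{code}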

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6160_v1.patch


 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-04 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6160:
-

Status: Patch Available  (was: Open)

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.94.0, 0.92.2, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6160_v1.patch


 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289572#comment-13289572
 ] 

Enis Soztutar commented on HBASE-6160:
--

@Ramkrishna
Yes, ideally that is the case. But we may end up with this if, for example, the 
regions are non-uniform. I think we still have to prioritize the reference files 
in compaction, since they also prevent further splitting. 

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6160_v1.patch


 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-05 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6160:
-

Attachment: HBASE-6160_v2.patch

v2 patch addressing Ted's comments. 

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6160_v1.patch, HBASE-6160_v2.patch


 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6160) META entries from daughters can be deleted before parent entries

2012-06-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289618#comment-13289618
 ] 

Enis Soztutar commented on HBASE-6160:
--

Thanks Stack, you beat me to the 0.92 and 0.94 patches. 

 META entries from daughters can be deleted before parent entries
 

 Key: HBASE-6160
 URL: https://issues.apache.org/jira/browse/HBASE-6160
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.1

 Attachments: HBASE-6160_v1.patch, HBASE-6160_v2.patch, 
 HBASE-6160_v2.patch, HBASE-6160v2092.txt


 HBASE-5986 fixed an issue where the client sees the META entry for the 
 parent, but not the children. However, after the fix, we have seen the 
 following issue in tests: 
 Region A is split to -> B, C
 Region B is split to -> D, E
 After some time, the META entry for B is deleted since it is not needed 
 anymore, but the META entry for Region A stays in META (C still refers to it). 
 In this case, the client throws a RegionOfflineException for B. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6168) [replication] Add replication zookeeper state documentation to replication.html

2012-06-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290580#comment-13290580
 ] 

Enis Soztutar commented on HBASE-6168:
--

Great doc. Minor issues: 
 - The peer name does not have to be an integer, AFAIK. 
 - We could also cover the lock znodes used for RS failover. 

 [replication] Add replication zookeeper state documentation to 
 replication.html
 ---

 Key: HBASE-6168
 URL: https://issues.apache.org/jira/browse/HBASE-6168
 Project: HBase
  Issue Type: Improvement
  Components: documentation, replication
Affects Versions: 0.96.0
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-6168.patch, HBASE-6168v2.patch


 Add a detailed explanation about the zookeeper state that HBase replication 
 maintains.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5372) Table mutation operations should check table level rights, not global rights

2012-06-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reassigned HBASE-5372:


Assignee: Laxman  (was: Enis Soztutar)

Sure by all means go ahead, I'll assign the issue to you. 

 Table mutation operations should check table level rights, not global rights 
 -

 Key: HBASE-5372
 URL: https://issues.apache.org/jira/browse/HBASE-5372
 Project: HBase
  Issue Type: Sub-task
  Components: security
Reporter: Enis Soztutar
Assignee: Laxman

 getUserPermissions(tableName)/grant/revoke and drop/modify table operations 
 should not check for global CREATE/ADMIN rights, but table CREATE/ADMIN 
 rights. The reasoning is that if a user is able to admin or read from a 
 table, she should be able to read the table's permissions. We can choose 
 whether we want only READ or ADMIN permissions for getUserPermission(). Since 
 we check for global permissions first for table permissions, configuring 
 table access using global permissions will continue to work. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-06-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291488#comment-13291488
 ] 

Enis Soztutar commented on HBASE-6060:
--

This issue, and the other related issues Ram has recently fixed, make me very 
nervous about all the state combinations distributed between zk / meta / 
rs-memory and master-memory. After this is done, do you think we can come up 
with a simpler design? I do not have any particular idea in mind, so I'm just 
spitballing here. 

 Regions's in OPENING state from failed regionservers takes a long time to 
 recover
 -

 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1, 0.92.3

 Attachments: 6060-94-v3.patch, 6060-94-v4.patch, 6060-94-v4_1.patch, 
 6060-94-v4_1.patch, 6060-trunk.patch, 6060-trunk.patch, 6060-trunk_2.patch, 
 6060-trunk_3.patch, HBASE-6060-92.patch, HBASE-6060-94.patch


 We have seen a pattern in tests where regions are stuck in OPENING state for 
 a very long time when the region server that is opening the region fails. 
 My understanding of the process: 
  - The master calls the RS to open the region. If the RS is offline, a new 
 plan is generated (a new RS is chosen). RegionState is set to PENDING_OPEN 
 (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), 
 HMaster.assign()
  - The RegionServer starts opening the region and changes the state in the 
 znode, but that znode is not ephemeral (see ZkAssign)
  - The RS transitions the zk node from OFFLINE to OPENING. See 
 OpenRegionHandler.process()
  - The RS then opens the region and changes the znode from OPENING to OPENED
  - When the RS is killed between the OPENING and OPENED states, zk shows 
 OPENING, and the master just waits for the RS to change the region state; but 
 since the RS is down, that won't happen. 
  - There is an AssignmentManager.TimeoutMonitor, which guards exactly against 
 this kind of condition. It periodically checks (every 10 sec by default) the 
 regions in transition to see whether they have timed out 
 (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
 min, which explains what you and I are seeing. 
  - ServerShutdownHandler in the Master does not reassign regions in OPENING 
 state, although it handles other states. 
 Lowering that threshold in the configuration is one option, but I still think 
 we can do better. 
 Will investigate more. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6192) Document ACL matrix in the book

2012-06-08 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6192:


 Summary: Document ACL matrix in the book
 Key: HBASE-6192
 URL: https://issues.apache.org/jira/browse/HBASE-6192
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.96.0
Reporter: Enis Soztutar


We have an excellent matrix at 
https://issues.apache.org/jira/secure/attachment/12531252/Security-ACL%20Matrix.pdf
 for ACL. Once the changes are done, we can adapt that and put it in the book, 
also add some more documentation about the new authorization features. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state

2012-06-11 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292951#comment-13292951
 ] 

Enis Soztutar commented on HBASE-5947:
--

bq. No news on that... check for column qualifier require a deep scan or 
keeping ref-counted qualifiers somewhere.
For qualifiers, I think it is fine not to enforce that they exist, but we 
should check for the table / cf. For preCreateTable and postDelete, we have to 
do the scan on the ACL table, not on the actual table, no? 

 Check for valid user/table/family/qualifier and acl state
 -

 Key: HBASE-5947
 URL: https://issues.apache.org/jira/browse/HBASE-5947
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl

 HBase Shell grant/revoke doesn't check for a valid user or 
 table/family/qualifier, so you can end up having rights for something that 
 doesn't exist.
 We might also want to ensure, upon table/column creation, that no entries are 
 already stored in the ACL table. We might still have residual ACL entries if 
 something goes wrong in postDeleteTable() or postDeleteColumn().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state

2012-06-11 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292959#comment-13292959
 ] 

Enis Soztutar commented on HBASE-5947:
--

Are we sure we want to check for users? 

 Check for valid user/table/family/qualifier and acl state
 -

 Key: HBASE-5947
 URL: https://issues.apache.org/jira/browse/HBASE-5947
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl

 HBase Shell grant/revoke doesn't check for a valid user or 
 table/family/qualifier, so you can end up having rights for something that 
 doesn't exist.
 We might also want to ensure, upon table/column creation, that no entries are 
 already stored in the ACL table. We might still have residual ACL entries if 
 something goes wrong in postDeleteTable() or postDeleteColumn().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4391) Add ability to start RS as root and call mlockall

2012-06-11 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292975#comment-13292975
 ] 

Enis Soztutar commented on HBASE-4391:
--

I've seen something similar in the accumulo code base: 
http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/c%2B%2B/mlock/
http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/MLock.java?view=log

 Add ability to start RS as root and call mlockall
 -

 Key: HBASE-4391
 URL: https://issues.apache.org/jira/browse/HBASE-4391
 Project: HBase
  Issue Type: New Feature
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.96.0

 Attachments: HBASE-4391-v0.patch


 A common issue we've seen in practice is that users oversubscribe their 
 region servers with too many MR tasks, etc. As soon as the machine starts 
 swapping, the RS grinds to a halt, loses ZK session, aborts, etc.
 This can be combatted by starting the RS as root, calling mlockall(), and 
 then setuid down to the hbase user. We should not require this, but we should 
 provide it as an option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5947) Check for valid user/table/family/qualifier and acl state

2012-06-11 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293075#comment-13293075
 ] 

Enis Soztutar commented on HBASE-5947:
--

Then let's reduce the scope of this issue to: 
 - Check for table / cf existence in grant. Not sure about revoke, since we may 
end up in an inconsistent state between the ACL table and table metadata, so 
revoke can just remove whatever is available in the ACL table. 
 - Ensure that no table/cf/qualifier-level permissions are stored in the ACL 
table in preCreateTable (sketched below). 
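
For illustration only, here is a minimal sketch of that second check, assuming ACL 
entries are stored in the {{_acl_}} catalog table keyed by table name; the helper 
class and its wiring are hypothetical, not the actual AccessController code: 
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class AclPreCreateCheck {
  // Hypothetical helper: fail table creation if residual ACL entries exist.
  // Assumes permissions live in the "_acl_" table, one row per table name.
  public static void assertNoResidualAcls(Configuration conf, byte[] tableName)
      throws IOException {
    HTable acl = new HTable(conf, "_acl_");
    try {
      Result r = acl.get(new Get(tableName));
      if (!r.isEmpty()) {
        throw new IOException("Residual ACL entries found for table "
            + Bytes.toString(tableName) + "; clean them up before re-creating it");
      }
    } finally {
      acl.close();
    }
  }
}
{code}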

 Check for valid user/table/family/qualifier and acl state
 -

 Key: HBASE-5947
 URL: https://issues.apache.org/jira/browse/HBASE-5947
 Project: HBase
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: acl

 HBase Shell grant/revoke doesn't check for a valid user or 
 table/family/qualifier, so you can end up having rights for something that 
 doesn't exist.
 We might also want to ensure, upon table/column creation, that no entries are 
 already stored in the acl table. We might still have residual acl entries if 
 something goes wrong in postDeleteTable() or postDeleteColumn().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6201) HBase integration/system tests

2012-06-11 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6201:


 Summary: HBase integration/system tests
 Key: HBASE-6201
 URL: https://issues.apache.org/jira/browse/HBASE-6201
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar


Integration and general system tests have been discussed previously, and the 
conclusion is that we need to unify how we do release candidate testing 
(HBASE-6091).

In this issue, I would like to discuss and agree on a general plan, and open 
subtickets for execution so that we can carry out most of the tests in 
HBASE-6091 automatically. 

Initially, here is what I have in mind: 

1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 
(without any tests). This will allow integration tests to be run with
 {code}
  mvn verify
 {code}

2. Add ability to run all integration/system tests on a given cluster. Something 
like: 
 {code}
  mvn verify -Dconf=/etc/hbase/conf/
 {code}
should run the test suite on the given cluster. (Right now we can launch some 
of the tests (TestAcidGuarantees) from command line). Most of the system tests 
will be client side, and interface with the cluster through public APIs. We 
need a tool on top of MiniHBaseCluster or improve HBaseTestingUtility, so that 
tests can interface with the mini cluster or the actual cluster uniformly.

3. Port candidate unit tests to the integration tests module. Some of the 
candidates are: 
 - TestAcidGuarantees / TestAtomicOperation
 - TestRegionBalancing (HBASE-6053)
 - TestFullLogReconstruction
 - TestMasterFailover
 - TestImportExport
 - TestMultiVersions / TestKeepDeletes
 - TestFromClientSide
 - TestShell and src/test/ruby
 - TestRollingRestart
 - Test**OnCluster
 - Balancer tests

These tests should continue to be run as unit tests w/o any change in 
semantics. However, given an actual cluster, they should use that, instead of 
spinning a mini cluster.  

4. Add more tests, especially, long running ingestion tests (goraci, BigTop's 
TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests. 

All suggestions welcome. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6203) Create hbase-it

2012-06-12 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6203:
-

Attachment: HBASE-6203_v1.patch

Attaching a patch.

 Create hbase-it
 ---

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
 Attachments: HBASE-6203_v1.patch


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6203) Create hbase-it

2012-06-12 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6203:


 Summary: Create hbase-it
 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar


Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6203) Create hbase-it

2012-06-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293972#comment-13293972
 ] 

Enis Soztutar commented on HBASE-6203:
--

Some notes:
{code}
mvn verify 
{code}
runs the tests under hbase-it, named IntegrationTestXXX. Note that {{mvn test}} 
does not run these tests. 
You can run just the integration tests by cd'ing into the hbase-it module, or by using
{code}
mvn verify -Dskip-server-tests -Dskip-common-tests
{code}
You can also skip integration tests with {{-Dskip-integration-tests}}. Failsafe 
also honors {{-DskipTests}}. 
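
For illustration, here is roughly what such a test class looks like, assuming the 
failsafe configuration in the patch picks up classes by the {{IntegrationTest}} 
name prefix; the class below is a hypothetical placeholder, not part of the patch: 
{code}
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// The "IntegrationTest" name prefix is what failsafe matches under hbase-it;
// plain unit tests keep the "Test" prefix and stay with surefire/mvn test.
public class IntegrationTestExample {
  @Test
  public void testSomethingAgainstTheCluster() {
    // a real integration test would obtain a cluster handle and exercise it here
    assertTrue(true);
  }
}
{code}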

 Create hbase-it
 ---

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
 Attachments: HBASE-6203_v1.patch


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6201) HBase integration/system tests

2012-06-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293981#comment-13293981
 ] 

Enis Soztutar commented on HBASE-6201:
--

bq. Bigtop provides a framework for integration tests that is, essentially, 
'mvn verify'.
Thanks for bringing this up. I know that bigtop provides a test framework for 
integration tests. From my perspective, I see HBase and Bigtop sharing 
responsibility on the testing side; we can work to define best practices 
for this, and I would love to hear Bigtop's perspective as well. 

I completely agree that HBase code should not bother with deployments, cluster 
management services, smoke testing, or integration with other components 
(hive, pig, etc.). That kind of functionality belongs in BigTop or similar 
projects. 

However, some core testing functionality is better managed by the HBase 
project. Let's consider the TestMasterFailover test. Right now it is a unit 
test, testing the internal state transitions when the master fails. However, 
we can extend this test to run from the client side and see whether the 
transition is transparent when we kill the active master on an actual cluster. 
That kind of testing should be managed by HBase itself, because, although such 
tests would run from the client side, they are HBase-specific and better 
maintained by HBase devs. Also, I do not expect BigTop to host a large 
number of test cases for the whole stack (right now 8 projects). 

Having said that, in this issue we can come up with a way to interface with 
BigTop (and other projects, custom jenkins jobs, etc.) so that these tests can 
use the underlying deployment, server management, etc. services, and BigTop and 
others can just execute the HBase-internal integration tests on the cluster. A 
simple way to do this is for HBase to offer {{mvn verify}} to be consumed by 
BigTop, with those tests using HBase's own scripts (and SSH, etc.) for 
cluster/server management. Since BigTop configures the cluster so that these 
work, it should be ok.

 HBase integration/system tests
 --

 Key: HBASE-6201
 URL: https://issues.apache.org/jira/browse/HBASE-6201
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 Integration and general system tests have been discussed previously, and the 
 conclusion is that we need to unify how we do release candidate testing 
 (HBASE-6091).
 In this issue, I would like to discuss and agree on a general plan, and open 
 subtickets for execution so that we can carry out most of the tests in 
 HBASE-6091 automatically. 
 Initially, here is what I have in mind: 
 1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 
 (without any tests). This will allow integration tests to be run with
  {code}
   mvn verify
  {code}
 2. Add ability to run all integration/system tests on a given cluster. Something 
 like: 
  {code}
   mvn verify -Dconf=/etc/hbase/conf/
  {code}
 should run the test suite on the given cluster. (Right now we can launch some 
 of the tests (TestAcidGuarantees) from command line). Most of the system 
 tests will be client side, and interface with the cluster through public 
 APIs. We need a tool on top of MiniHBaseCluster or improve 
 HBaseTestingUtility, so that tests can interface with the mini cluster or the 
 actual cluster uniformly.
 3. Port candidate unit tests to the integration tests module. Some of the 
 candidates are: 
  - TestAcidGuarantees / TestAtomicOperation
  - TestRegionBalancing (HBASE-6053)
  - TestFullLogReconstruction
  - TestMasterFailover
  - TestImportExport
  - TestMultiVersions / TestKeepDeletes
  - TestFromClientSide
  - TestShell and src/test/ruby
  - TestRollingRestart
  - Test**OnCluster
  - Balancer tests
 These tests should continue to be run as unit tests w/o any change in 
 semantics. However, given an actual cluster, they should use that, instead of 
 spinning a mini cluster.  
 4. Add more tests, especially, long running ingestion tests (goraci, BigTop's 
 TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests. 
 All suggestions welcome. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6053) Enhance TestRegionRebalancing test to be a system test

2012-06-12 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6053:
-

Issue Type: Sub-task  (was: Bug)
Parent: HBASE-6201

 Enhance TestRegionRebalancing test to be a system test
 --

 Key: HBASE-6053
 URL: https://issues.apache.org/jira/browse/HBASE-6053
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Minor
 Attachments: 6053-1.patch, regionRebalancingSystemTest.txt


 TestRegionRebalancing can be converted to be a system test

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6053) Enhance TestRegionRebalancing test to be a system test

2012-06-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294010#comment-13294010
 ] 

Enis Soztutar commented on HBASE-6053:
--

TestRegionRebalancing assumes that there is 1 RS available, and adds other RSs 
afterwards. What happens when we run this on a 10/100-node cluster? We can have 
more RSs than initial regions. Should we also generalize the testing condition? 
Or will the test shut down every RS except for 1, and restart them afterwards? 

We can remove RandomKiller; it is not used for now. 
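
To make the "generalize the testing condition" option concrete, here is a rough 
sketch of a balance check that works for any cluster size; the helper is 
hypothetical, not the existing test code: 
{code}
import java.util.List;

public class BalanceCheck {
  // Assert that every region server holds a region count within +/- slop of the
  // cluster-wide average, instead of assuming the test started with a single RS.
  public static void assertBalanced(List<Integer> regionsPerServer, double slop) {
    int total = 0;
    for (int count : regionsPerServer) {
      total += count;
    }
    double avg = (double) total / regionsPerServer.size();
    int min = (int) Math.floor(avg * (1 - slop));
    int max = (int) Math.ceil(avg * (1 + slop));
    for (int count : regionsPerServer) {
      if (count < min || count > max) {
        throw new AssertionError("Server has " + count + " regions, expected ["
            + min + ", " + max + "]");
      }
    }
  }
}
{code}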

 Enhance TestRegionRebalancing test to be a system test
 --

 Key: HBASE-6053
 URL: https://issues.apache.org/jira/browse/HBASE-6053
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Minor
 Attachments: 6053-1.patch, regionRebalancingSystemTest.txt


 TestRegionRebalancing can be converted to be a system test

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6201) HBase integration/system tests

2012-06-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396326#comment-13396326
 ] 

Enis Soztutar commented on HBASE-6201:
--

I think your categorization and my comments above are saying the same thing, 
no confusion there. This umbrella issue is all about maintaining the #2 kind of 
tests inside HBase. Now, the problem is how best to interface between HBase and 
Bigtop.

My proposal is that hbase-it depends on itest-common, and uses it to interact 
with the servers. My understanding is that, even if you are not deploying the 
cluster with bigtop, as long as the /etc/init.d/ scripts are there, you should 
be fine. At this point, we only need start/stop-daemon kind of functionality, 
assuming the cluster is already deployed.

On the other side, if we provide an {{mvn verify}} in the hbase-it module to run 
the tests on the actual cluster, I assume BigTop can leverage it to carry out the 
tests. 

For refactoring, once the module and other bits are ready, we can move select 
tests from Bigtop to HBase. I'll open a subtask for that. 

 HBase integration/system tests
 --

 Key: HBASE-6201
 URL: https://issues.apache.org/jira/browse/HBASE-6201
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 Integration and general system tests have been discussed previously, and the 
 conclusion is that we need to unify how we do release candidate testing 
 (HBASE-6091).
 In this issue, I would like to discuss and agree on a general plan, and open 
 subtickets for execution so that we can carry out most of the tests in 
 HBASE-6091 automatically. 
 Initially, here is what I have in mind: 
 1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 
 (without any tests). This will allow integration tests to be run with
  {code}
   mvn verify
  {code}
 2. Add ability to run all integration/system tests on a given cluster. Something 
 like: 
  {code}
   mvn verify -Dconf=/etc/hbase/conf/
  {code}
 should run the test suite on the given cluster. (Right now we can launch some 
 of the tests (TestAcidGuarantees) from command line). Most of the system 
 tests will be client side, and interface with the cluster through public 
 APIs. We need a tool on top of MiniHBaseCluster or improve 
 HBaseTestingUtility, so that tests can interface with the mini cluster or the 
 actual cluster uniformly.
 3. Port candidate unit tests to the integration tests module. Some of the 
 candidates are: 
  - TestAcidGuarantees / TestAtomicOperation
  - TestRegionBalancing (HBASE-6053)
  - TestFullLogReconstruction
  - TestMasterFailover
  - TestImportExport
  - TestMultiVersions / TestKeepDeletes
  - TestFromClientSide
  - TestShell and src/test/ruby
  - TestRollingRestart
  - Test**OnCluster
  - Balancer tests
 These tests should continue to be run as unit tests w/o any change in 
 semantics. However, given an actual cluster, they should use that, instead of 
 spinning a mini cluster.  
 4. Add more tests, especially, long running ingestion tests (goraci, BigTop's 
 TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests. 
 All suggestions welcome. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6201) HBase integration/system tests

2012-06-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396347#comment-13396347
 ] 

Enis Soztutar commented on HBASE-6201:
--

bq. Are you saying that you would like the tests themeselves to get involved in 
the lifecycle of each service? Like bringing them up and down, etc?
Yes. In their current state, most of our unit tests that are candidates to be 
upgraded to system tests do start a mini-cluster of n nodes, load some 
data, kill a few nodes, verify, etc. We are converting/reimplementing them to 
do the same things on the actual cluster. A particular test case, for example, 
starts 4 region servers, puts some data, kills 1 RS, checks whether the regions 
are balanced, kills one more, checks again, etc. 

Some basic functionality we can use from itest is: 
 - Starting / stopping / sending a signal to daemons (start a region server on 
host1, kill the master on host2, etc.), for both HBase and Hadoop processes. 
 - Basic cluster/node discovery (give me the nodes running hmaster)
 - Run this command on host3 (SSH)
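
To make that concrete, the three bullets roughly translate into an interface like 
the following; this is an illustrative sketch, and the names and signatures are 
assumptions rather than the itest or HBase API: 
{code}
import java.io.IOException;
import java.util.List;

// Illustrative only: the minimal cluster-management surface the tests need.
public interface ClusterManager {
  enum ServiceType { HBASE_MASTER, HBASE_REGIONSERVER, HADOOP_NAMENODE, HADOOP_DATANODE }

  // Start / stop / signal a daemon on a given host, for HBase and Hadoop processes.
  void start(ServiceType service, String hostname) throws IOException;
  void stop(ServiceType service, String hostname) throws IOException;
  void signal(ServiceType service, String signal, String hostname) throws IOException;

  // Basic discovery: which hosts run a given service (e.g. the nodes running hmaster).
  List<String> getHostsFor(ServiceType service) throws IOException;

  // Run an arbitrary command on a host, e.g. over SSH.
  String exec(String hostname, String... command) throws IOException;
}
{code}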

 HBase integration/system tests
 --

 Key: HBASE-6201
 URL: https://issues.apache.org/jira/browse/HBASE-6201
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 Integration and general system tests have been discussed previously, and the 
 conclusion is that we need to unify how we do release candidate testing 
 (HBASE-6091).
 In this issue, I would like to discuss and agree on a general plan, and open 
 subtickets for execution so that we can carry out most of the tests in 
 HBASE-6091 automatically. 
 Initially, here is what I have in mind: 
 1. Create hbase-it (or hbase-tests) containing a forward port of HBASE-4454 
 (without any tests). This will allow integration tests to be run with
  {code}
   mvn verify
  {code}
 2. Add ability to run all integration/system tests on a given cluster. Something 
 like: 
  {code}
   mvn verify -Dconf=/etc/hbase/conf/
  {code}
 should run the test suite on the given cluster. (Right now we can launch some 
 of the tests (TestAcidGuarantees) from command line). Most of the system 
 tests will be client side, and interface with the cluster through public 
 APIs. We need a tool on top of MiniHBaseCluster or improve 
 HBaseTestingUtility, so that tests can interface with the mini cluster or the 
 actual cluster uniformly.
 3. Port candidate unit tests to the integration tests module. Some of the 
 candidates are: 
  - TestAcidGuarantees / TestAtomicOperation
  - TestRegionBalancing (HBASE-6053)
  - TestFullLogReconstruction
  - TestMasterFailover
  - TestImportExport
  - TestMultiVersions / TestKeepDeletes
  - TestFromClientSide
  - TestShell and src/test/ruby
  - TestRollingRestart
  - Test**OnCluster
  - Balancer tests
 These tests should continue to be run as unit tests w/o any change in 
 semantics. However, given an actual cluster, they should use that, instead of 
 spinning a mini cluster.  
 4. Add more tests, especially, long running ingestion tests (goraci, BigTop's 
 TestLoadAndVerify, LoadTestTool), and chaos monkey style fault tests. 
 All suggestions welcome. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-6203) Create hbase-it

2012-06-19 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reassigned HBASE-6203:


Assignee: Enis Soztutar

 Create hbase-it
 ---

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6203_v1.patch


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6053) Enhance TestRegionRebalancing test to be a system test

2012-06-19 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13397187#comment-13397187
 ] 

Enis Soztutar commented on HBASE-6053:
--

After some discussions, we realized that the patch is too big to handle. I've 
opened HBASE-6241 to track the 
HBaseCluster/MiniHBaseCluster/RealHBaseCluster-related changes. In this issue, 
we can track the TestRegionRebalancing-specific changes. Obviously this issue 
will depend on the new one. 

 Enhance TestRegionRebalancing test to be a system test
 --

 Key: HBASE-6053
 URL: https://issues.apache.org/jira/browse/HBASE-6053
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Minor
 Attachments: 6053-1.patch, regionRebalancingSystemTest.txt


 TestRegionRebalancing can be converted to be a system test

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time

2012-06-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400756#comment-13400756
 ] 

Enis Soztutar commented on HBASE-6205:
--

+1 on using the hdfs trash; it has all the properties we need (configurable, 
easy to use, and it works). We just need a way to reconstruct the table.

 Support an option to keep data of dropped table for some time
 -

 Key: HBASE-6205
 URL: https://issues.apache.org/jira/browse/HBASE-6205
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0, 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6205.patch, HBASE-6205v2.patch, 
 HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch


 A user may drop a table accidentally because of erroneous code or other 
 uncertain reasons.
 Unfortunately, it happened in our environment because one user made a mistake 
 between the production cluster and the testing cluster.
 So, I just give a suggestion: do we need to support an option to keep the data 
 of a dropped table for some time, e.g. 1 day?
 In the patch:
 We make a new dir named .trashtables in the root dir.
 In DeleteTableHandler, we move files in the dropped table's dir to the trash 
 table dir instead of deleting them directly.
 And we create a new class TrashCleaner, which cleans up dropped tables once 
 they time out, using a periodic check.
 The default keep time for dropped tables is 1 day, and the check period is 1 hour.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time

2012-06-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400946#comment-13400946
 ] 

Enis Soztutar commented on HBASE-6205:
--

After an offline conversation with Ted and Jitendra, it seems that the hdfs trash 
works only from the shell. One other concern is that trash is not exposed as a 
hadoop FileSystem feature, so we would have to use the shell-equivalent commands 
to accomplish this, and it would work only on hdfs, not on other file systems. 

The question of whether to implement an hbase-trash boils down to whether we 
want this to work with file systems other than hdfs, and whether we want more 
control over the retention policy.
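
For what it's worth, a file-system-agnostic variant is easy to sketch on top of 
the generic Hadoop FileSystem API; this is a minimal sketch only, and the 
directory layout and names are illustrative rather than taken from the patch: 
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TableTrash {
  // Move a dropped table's directory under a trash dir instead of deleting it.
  // Uses only the generic FileSystem API, so it is not tied to hdfs.
  public static void moveToTrash(Configuration conf, Path rootDir, String tableName)
      throws IOException {
    FileSystem fs = rootDir.getFileSystem(conf);
    Path tableDir = new Path(rootDir, tableName);
    Path trashDir = new Path(rootDir, ".trashtables");
    fs.mkdirs(trashDir);
    // Suffix with a timestamp so repeated create/drop cycles do not collide.
    Path target = new Path(trashDir, tableName + "." + System.currentTimeMillis());
    if (!fs.rename(tableDir, target)) {
      throw new IOException("Failed to move " + tableDir + " to trash at " + target);
    }
  }
}
{code}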

 Support an option to keep data of dropped table for some time
 -

 Key: HBASE-6205
 URL: https://issues.apache.org/jira/browse/HBASE-6205
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0, 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6205.patch, HBASE-6205v2.patch, 
 HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch


 A user may drop a table accidentally because of erroneous code or other 
 uncertain reasons.
 Unfortunately, it happened in our environment because one user made a mistake 
 between the production cluster and the testing cluster.
 So, I just give a suggestion: do we need to support an option to keep the data 
 of a dropped table for some time, e.g. 1 day?
 In the patch:
 We make a new dir named .trashtables in the root dir.
 In DeleteTableHandler, we move files in the dropped table's dir to the trash 
 table dir instead of deleting them directly.
 And we create a new class TrashCleaner, which cleans up dropped tables once 
 they time out, using a periodic check.
 The default keep time for dropped tables is 1 day, and the check period is 1 hour.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests

2012-06-26 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6241:
-

Attachment: HBASE-6241_v0.2.patch

Attaching a patch for an early view. I am still polishing stuff, but the bulk of 
the patch is pretty much done. I'll upload the candidate-for-review version once 
it is done. This is based on the patch for HBASE-6053, but does not include the 
TestRegionRebalancing changes. It requires HBASE-6201.

Some high-level notes on the patch: 
 - Uses the hbase-it module, and adds a new test there called 
IntegrationTestDataIngestWithChaosMonkey. This class runs LoadTestTool with a 
chaos monkey 
(http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html). 
The chaos monkey is very simple right now: it just selects a random RS, kills 
it, and restarts it. 
 - Introduces HBaseCluster and RealHBaseCluster, and changes MiniHBaseCluster to 
extend HBaseCluster. 
 - Adds a ClusterManager interface and a default HBaseClusterManager based on 
the HBase shell scripts. These are internal classes and tests do not directly 
refer to them, so we can improve on them, and maybe add another implementation 
when BIGTOP-635 is done.
 - I've tested the patch on a mini cluster as well as an 8-node cluster.
 - Adds an IntegrationTestsDriver class as a driver for running integration 
tests from the command line. You can do bin/hbase --config hbase_conf_dir 
o.a.h.h.ITD to run all the integration tests on a real cluster. mvn verify 
runs them on a mini cluster. I'll open another issue for mvn verify on real 
clusters.  
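
To give a feel for the abstraction (an illustrative sketch only, not the classes 
in the patch): tests program against a shared cluster handle, and only the wiring 
decides whether it is backed by the mini cluster or a real one. 
{code}
import java.io.IOException;

// Illustrative sketch of the shared surface system tests would code against.
// MiniHBaseCluster and RealHBaseCluster would both provide these operations;
// the method names here are assumptions, not the committed API.
public abstract class HBaseClusterSketch {
  public abstract void startRegionServer(String hostname) throws IOException;
  public abstract void killRegionServer(String hostname) throws IOException;
  public abstract void waitForRegionServerToStop(String hostname, long timeoutMs)
      throws IOException;
  // Tests can branch on this to skip steps that only make sense on one backend.
  public abstract boolean isDistributedCluster();
}
{code}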

 HBaseCluster interface for interacting with the cluster from system tests 
 --

 Key: HBASE-6241
 URL: https://issues.apache.org/jira/browse/HBASE-6241
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6241_v0.2.patch


 We need to abstract away the cluster interactions for system tests running on 
 actual clusters. 
 MiniHBaseCluster and RealHBaseCluster should both implement this interface, 
 and system tests should work with both.
 I'll split Devaraj's patch in HBASE-6053 for the initial version. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests

2012-06-26 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401847#comment-13401847
 ] 

Enis Soztutar commented on HBASE-6241:
--

bq. This JIRA is a sub-task of HBASE-6201 which doesn't have patch attached.
bq. Can you clarify the above ?

Sorry, it should be HBASE-6203. 

 HBaseCluster interface for interacting with the cluster from system tests 
 --

 Key: HBASE-6241
 URL: https://issues.apache.org/jira/browse/HBASE-6241
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6241_v0.2.patch


 We need to abstract away the cluster interactions for system tests running on 
 actual clusters. 
 MiniHBaseCluster and RealHBaseCluster should both implement this interface, 
 and system tests should work with both.
 I'll split Devaraj's patch in HBASE-6053 for the initial version. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6274) Proto files should be in the same place

2012-06-26 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401854#comment-13401854
 ] 

Enis Soztutar commented on HBASE-6274:
--

Jimmy, this looks like a duplicate of HBASE-6000? 

 Proto files should be in the same place
 ---

 Key: HBASE-6274
 URL: https://issues.apache.org/jira/browse/HBASE-6274
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Priority: Trivial
 Fix For: 0.96.0


 Currently, proto files are under hbase-server/src/main/protobuf and 
 hbase-server/src/protobuf.  It's better to put them together.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5612) Data types for HBase values

2012-06-26 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401899#comment-13401899
 ] 

Enis Soztutar commented on HBASE-5612:
--

At the recent HBase hackathon and the BOF sessions, we had some discussions 
about adding some kind of schemas/data types to HBase, and Ian gave a short 
talk about it. Beyond the use cases for this jira, having optional 
schema/data types has the following advantages:
 - HBase internals can make use of data types (block-level encoding, 
comparators for sub-fields in keys, etc.)
 - The HBase shell can make use of the data types and display the data correctly
 - Hive/Pig can better map their own data types to HBase types, and their 
schemas to the HBase schema, instead of managing it themselves.
 - Client-written coprocessors or system-level coprocessors can do data 
validation according to the schema and data types.

So, what I am trying to say is that we can start to think of a bigger picture 
for data types, rather than doing something only for compression/block 
encoding. WDYT? 
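
As one possible shape (purely hypothetical at this point, not an existing HBase 
interface), a pluggable data type could be little more than an encoder/decoder 
plus a comparator that the internals, the shell, and coprocessors could all share: 
{code}
import java.util.Comparator;

// Hypothetical sketch: what a pluggable value type could expose so that block
// encoders, the shell, and coprocessor-based validation can all reuse it.
public interface DataType<T> extends Comparator<byte[]> {
  byte[] encode(T value);
  T decode(byte[] bytes);
  // Human-readable form for the shell.
  String toDisplayString(byte[] bytes);
  // Optional validation hook for coprocessors.
  boolean isValid(byte[] bytes);
}
{code}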

 Data types for HBase values
 ---

 Key: HBASE-5612
 URL: https://issues.apache.org/jira/browse/HBASE-5612
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin

 In many real-life applications all values in a certain column family are of a 
 certain data type, e.g. 64-bit integer. We could specify that in the column 
 descriptor and enable data type-specific compression such as variable-length 
 integer encoding.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6205) Support an option to keep data of dropped table for some time

2012-06-27 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402436#comment-13402436
 ] 

Enis Soztutar commented on HBASE-6205:
--

Considering this, HBASE-5547, and snapshots, it seems that we can decouple file 
management and region-file association. We can build a very lightweight file 
manager and remove all file deletion code from the RS code. 

As in the Bigtable design, we can keep the current hfiles (and WALs) of the 
regions in META, and when the RS flushes or rolls the log, it adds the file 
reference to META. Then for a snapshot or a backup, we just need a point-in-time 
snapshot of the META table. A master thread can periodically scan META, the META 
snapshots, and the hdfs directories, and delete the files with 0 references 
based on a policy. And deleting a table would just take a META snapshot for the 
table and delete the META entries afterwards. That META snapshot would be kept 
for a while (similar to the normal snapshot retention). WDYT, how crazy is 
this? 
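
Very roughly, the cleanup side of that idea would be a periodic master chore that 
diffs what META (and the retained META snapshots) still reference against what 
sits on disk; a hand-wavy sketch under those assumptions, with the 
reference-collection step left abstract since nothing like it exists yet: 
{code}
import java.io.IOException;
import java.util.Set;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public abstract class UnreferencedFileCleaner {
  // Hypothetical: gather every hfile/WAL still referenced by META or by a
  // retained META snapshot.
  protected abstract Set<Path> collectReferencedFiles() throws IOException;

  // Delete files under a store directory that nothing references and that are
  // older than a retention period (the policy knob is illustrative only).
  public void cleanup(FileSystem fs, Path storeDir, long retentionMs) throws IOException {
    Set<Path> referenced = collectReferencedFiles();
    long cutoff = System.currentTimeMillis() - retentionMs;
    for (FileStatus file : fs.listStatus(storeDir)) {
      if (!referenced.contains(file.getPath()) && file.getModificationTime() < cutoff) {
        fs.delete(file.getPath(), false);
      }
    }
  }
}
{code}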

 Support an option to keep data of dropped table for some time
 -

 Key: HBASE-6205
 URL: https://issues.apache.org/jira/browse/HBASE-6205
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0, 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6205.patch, HBASE-6205v2.patch, 
 HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch


 A user may drop a table accidentally because of erroneous code or other 
 uncertain reasons.
 Unfortunately, it happened in our environment because one user made a mistake 
 between the production cluster and the testing cluster.
 So, I just give a suggestion: do we need to support an option to keep the data 
 of a dropped table for some time, e.g. 1 day?
 In the patch:
 We make a new dir named .trashtables in the root dir.
 In DeleteTableHandler, we move files in the dropped table's dir to the trash 
 table dir instead of deleting them directly.
 And we create a new class TrashCleaner, which cleans up dropped tables once 
 they time out, using a periodic check.
 The default keep time for dropped tables is 1 day, and the check period is 1 hour.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6203) Create hbase-it

2012-06-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403361#comment-13403361
 ] 

Enis Soztutar commented on HBASE-6203:
--

How about waiting for HBASE-6241 and committing this and that consecutively? 

 Create hbase-it
 ---

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6203_v1.patch


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6203) Create hbase-it

2012-06-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403647#comment-13403647
 ] 

Enis Soztutar commented on HBASE-6203:
--

HBASE-6241 is currently a large patch, and I don't want to add more complexity 
to it. Let's keep the patches separate. For convenience though, I'll upload 
both merged and unmerged patches at HBASE-6241. 

 Create hbase-it
 ---

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6203_v1.patch


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests

2012-06-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403661#comment-13403661
 ] 

Enis Soztutar commented on HBASE-6241:
--

@Ted, thanks for the comments. I've addressed most of them.
I've uploaded an updated version of the patch: 
https://reviews.apache.org/r/5653/. I guess RB still does not post to jira. 

 HBaseCluster interface for interacting with the cluster from system tests 
 --

 Key: HBASE-6241
 URL: https://issues.apache.org/jira/browse/HBASE-6241
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6241_v0.2.patch


 We need to abstract away the cluster interactions for system tests running on 
 actual clusters. 
 MiniHBaseCluster and RealHBaseCluster should both implement this interface, 
 and system tests should work with both.
 I'll split Devaraj's patch in HBASE-6053 for the initial version. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6203) Create hbase-it module

2012-06-29 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404170#comment-13404170
 ] 

Enis Soztutar commented on HBASE-6203:
--

@Jesse,
Agreed. We now have a switch for every module, but -Donly-integration-tests 
might not be necessary since you can do {{cd hbase-it; mvn verify}}. With maven 
modules, I think the standard way to execute only one module is to cd into that 
module and run the build there. I'll add that to the doc. Do you think that 
would be enough?

@Stack 
Thanks a lot for the docs, I totally missed that. 
failsafe is basically a fork of surefire, which only adds pre- and 
post-integration-test targets. We are not using them right now, but we could 
make use of those targets, for example for recovering the cluster after the 
test. I totally share your doubts about this failsafe/surefire thing, but I 
do not know of any better solution. Suggestions welcome :) 

mvn verify now executes the {{IntegrationTestXXX}} classes as unit tests. There 
is also an IntegrationTestsDriver class in the patch at HBASE-6241, which is 
executed by: 
{code}
bin/hbase --config hbase_conf_dir org.apache.hadoop.hbase.IntegrationTestsDriver
{code}

I'll open another subtask of HBASE-6201 for making {{mvn verify}} work 
with real clusters. I have checked how bigtop does it, and it seems they have: 
{code}
 bigtop-tests/test-artifacts/  -- contains actual test code
 bigtop-tests/test-execution/  -- contains code + configuration for executing 
the tests
{code}

In particular, if you look into 
{{bigtop-tests/test-execution/smokes/hbase/pom.xml}}, it passes HBASE_HOME, 
HBASE_CONF_DIR, etc. from the environment to failsafe. It works for bigtop, so I 
think we can make it work for our cases as well.  


 Create hbase-it module
 --

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: HBASE-6203_v1.patch, it-doc.txt


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests

2012-06-29 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404174#comment-13404174
 ] 

Enis Soztutar commented on HBASE-6241:
--

@Ted 
Thanks for trying it out. Did you run mvn verify at the top level or cd'ing 
into hbase-it. hbase-it depends on hbase-server, so it fetches hbase-common and 
other jars transitively, but you might have to do mvn install -DskipTests 
first. 

 HBaseCluster interface for interacting with the cluster from system tests 
 --

 Key: HBASE-6241
 URL: https://issues.apache.org/jira/browse/HBASE-6241
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6241_v0.2.patch, HBASE-6241_v1.patch


 We need to abstract away the cluster interactions for system tests running on 
 actual clusters. 
 MiniHBaseCluster and RealHBaseCluster should both implement this interface, 
 and system tests should work with both.
 I'll split Devaraj's patch in HBASE-6053 for the initial version. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6203) Create hbase-it module

2012-06-29 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404331#comment-13404331
 ] 

Enis Soztutar commented on HBASE-6203:
--

bq. which is kind of a pain
such is maven :)
My only concern with the flag is that, to implement it, every module has to 
know about and honor the only-integration-tests parameter, which does not seem 
clean to me. I am not a maven guy; if you have a suggestion, I'll be more than 
happy to try it out. Can we instruct the reactor to only run test-compile for 
everything, but test just for hbase-it?  

 Create hbase-it module
 --

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: HBASE-6203_v1.patch, it-doc.txt


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-6302) Document how to run integration tests

2012-07-09 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reassigned HBASE-6302:


Assignee: Enis Soztutar

 Document how to run integration tests
 -

 Key: HBASE-6302
 URL: https://issues.apache.org/jira/browse/HBASE-6302
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: stack
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.96.0


 HBASE-6203 has attached the old IT doc with some mods.  When we figure how 
 ITs are to be run, update it and apply the documentation under this issue.  
 Making a blocker against 0.96.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)

2012-07-17 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416003#comment-13416003
 ] 

Enis Soztutar commented on HBASE-5754:
--

@Lars, 
We have been running this for a while as nightlies, and apart from the reported 
HBASE-5986, HBASE-6060, and HBASE-6160, we did not run into more issues. All of 
them can be considered META issues w/o actual data loss. Let's see what Eric 
would say. 

 data lost with gora continuous ingest test (goraci)
 ---

 Key: HBASE-5754
 URL: https://issues.apache.org/jira/browse/HBASE-5754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
 Environment: 10 node test cluster
Reporter: Eric Newton
Assignee: stack

 Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
 has both hbase and accumulo back-ends.
 I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
 verification failed because about 21K entries were missing.  The goraci 
 [README|https://github.com/keith-turner/goraci] explains the test, and how it 
 detects missing data.
 I re-ran the test with 100 million entries, and it verified successfully.  
 Both of the times I tested using a billion entries, the verification failed.
 If I run the verification step twice, the results are consistent, so the 
 problem is
 probably not on the verify step.
 Here's the versions of the various packages:
 ||package||version||
 |hadoop|0.20.205.0|
 |hbase|0.92.1|
 |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
 |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
 The change I made to goraci was to configure it for hbase and to allow it to 
 build properly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6400:


 Summary: Add getMasterAdmin() and getMasterMonitor() to HConnection
 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar


HConnection used to have getMasterInterface(), but after HBASE-6039 it has been 
removed. I think we need to expose HConnection.getMasterAdmin() and 
getMasterMonitor(), a la HConnection.getAdmin() and getClient(). 

HConnectionImplementation has getKeepAliveMasterAdmin(), but I see no reason to 
leak keep-alive classes to upper layers.
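
In other words, the proposal is just two more accessors on HConnection, mirroring 
getAdmin() and getClient(); a sketch of the shape, where the protocol interface 
names are stand-ins assumed from the HBASE-6039 split rather than verified 
against trunk: 
{code}
import java.io.IOException;

// Illustrative only: the two accessors proposed here. The nested protocol
// interfaces are placeholders for whatever HBASE-6039 actually introduced.
public interface HConnectionAdditions {
  interface MasterAdminProtocol { /* ddl, balancer, etc. */ }
  interface MasterMonitorProtocol { /* cluster status, etc. */ }

  MasterAdminProtocol getMasterAdmin() throws IOException;
  MasterMonitorProtocol getMasterMonitor() throws IOException;
}
{code}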

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6400:
-

Attachment: HBASE-6400_v1.patch

Attaching a very simple patch that does the task. 

BTW, I did need these for the patch at HBASE-6241. 

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6400_v1.patch


 HConnection used to have getMasterInterface(), but after HBASE-6039 it has 
 been removed. I think we need to expose HConnection.getMasterAdmin() and 
 getMasterMonitor() a la HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason 
 to leak keep alive classes to upper layers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)

2012-07-17 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416243#comment-13416243
 ] 

Enis Soztutar commented on HBASE-5754:
--

Just FYI, I've been working on adding long-running ingestion tests that 
randomly kill servers, and other types of integration tests, over 
at HBASE-6241 and HBASE-6201. Feel free to chime in. 

 data lost with gora continuous ingest test (goraci)
 ---

 Key: HBASE-5754
 URL: https://issues.apache.org/jira/browse/HBASE-5754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
 Environment: 10 node test cluster
Reporter: Eric Newton
Assignee: stack

 Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
 has both hbase and accumulo back-ends.
 I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
 verification failed because about 21K entries were missing.  The goraci 
 [README|https://github.com/keith-turner/goraci] explains the test, and how it 
 detects missing data.
 I re-ran the test with 100 million entries, and it verified successfully.  
 Both of the times I tested using a billion entries, the verification failed.
 If I run the verification step twice, the results are consistent, so the 
 problem is
 probably not on the verify step.
 Here's the versions of the various packages:
 ||package||version||
 |hadoop|0.20.205.0|
 |hbase|0.92.1|
 |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
 |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
 The change I made to goraci was to configure it for hbase and to allow it to 
 build properly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests

2012-07-19 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13418668#comment-13418668
 ] 

Enis Soztutar commented on HBASE-6241:
--

Thanks Stack for the review. I put up an updated patch at RB (after some sweet 
time-off :) ).

 HBaseCluster interface for interacting with the cluster from system tests 
 --

 Key: HBASE-6241
 URL: https://issues.apache.org/jira/browse/HBASE-6241
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6241_v0.2.patch, HBASE-6241_v1.patch


 We need to abstract away the cluster interactions for system tests running on 
 actual clusters. 
 MiniHBaseCluster and RealHBaseCluster should both implement this interface, 
 and system tests should work with both.
 I'll split Devaraj's patch in HBASE-6053 for the initial version. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6203) Create hbase-it module

2012-07-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6203:
-

  Resolution: Fixed
Release Note: Adds a new module hbase-it, which contains integration and 
system tests
  Status: Resolved  (was: Patch Available)

Resolving this issue as it has been committed

 Create hbase-it module
 --

 Key: HBASE-6203
 URL: https://issues.apache.org/jira/browse/HBASE-6203
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: HBASE-6203_v1.patch, it-doc.txt


 Create hbase-it, as per parent issue, and re-introduce HBASE-4454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6400:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolving this, since it is committed. 

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 6400-v2.patch, HBASE-6400_v1.patch


 HConnection used to have getMaster(), which returned HMasterInterface, but 
 after HBASE-6039 it has been removed. I think we need to expose 
 HConnection.getMasterAdmin() and getMasterMonitor(), a la 
 HConnection.getAdmin() and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin(), but I see no reason 
 to leak keep-alive classes to upper layers.
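 Purely as a sketch of the proposed shape (the committed signatures and return 
 types may differ; the *Sketch types are placeholders for the master-facing 
 service stubs):
 {code}
 // Sketch of the proposed additions; not the actual HConnection source.
 import java.io.IOException;

 interface MasterAdminSketch { /* administrative calls: create/enable/disable table, ... */ }

 interface MasterMonitorSketch { /* monitoring calls: cluster status, ... */ }

 public interface ConnectionSketch {
   // Analogous to getAdmin()/getClient(), but targeting the master instead of
   // a region server, and without exposing the keep-alive wrapper classes.
   MasterAdminSketch getMasterAdmin() throws IOException;

   MasterMonitorSketch getMasterMonitor() throws IOException;
 }
 {code}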

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-6462) TestAcidGuarantees failed on trunk

2012-07-26 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6462:


 Summary: TestAcidGuarantees failed on trunk
 Key: HBASE-6462
 URL: https://issues.apache.org/jira/browse/HBASE-6462
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Enis Soztutar


I've seen TestAcidGuarantees fail with: 

{code}
testGetAtomicity(org.apache.hadoop.hbase.IntegrationTestAcidGuaranteesWithChaosMonkey)
  Time elapsed: 42.611 sec   ERROR!
java.lang.RuntimeException: Deferred
  at 
org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:76)
  at 
org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:103)
  at 
org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:298)
  at 
org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:248)
  at 
org.apache.hadoop.hbase.IntegrationTestAcidGuaranteesWithChaosMonkey.testGetAtomicity(IntegrationTestAcidGuaranteesWithChaosMonkey.java:58)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
  at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
  at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
  at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
  at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:18)
  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
  at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
  at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
  at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
  at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:234)
  at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133)
  at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188)
  at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166)
  at 
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86)
  at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101)
  at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
Caused by: java.lang.RuntimeException: Failed after 
147200!Expected=\x1BT\xC0i\x0CW\x9B\x108\xA0Got:
test_row_0/A:col0/1343328428704/Put/vlen=10/ts=0 val= 
\x1BT\xC0i\x0CW\x9B\x108\xA0
test_row_0/A:col1/1343328428704/Put/vlen=10/ts=0 val= 
\x1BT\xC0i\x0CW\x9B\x108\xA0
test_row_0/A:col10/1343328428704/Put/vlen=10/ts=0 val= 
\x1BT\xC0i\x0CW\x9B\x108\xA0
...
test_row_0/B:col0/1343328425510/Put/vlen=10/ts=0 val= 
4G\xE1T\x1B\xFDa\x98\xAC\xB6
test_row_0/B:col1/1343328425510/Put/vlen=10/ts=0 val= 
4G\xE1T\x1B\xFDa\x98\xAC\xB6
test_row_0/B:col10/1343328425510/Put/vlen=10/ts=0 val= 
...
test_row_0/C:col0/1343328425510/Put/vlen=10/ts=0 val= 
4G\xE1T\x1B\xFDa\x98\xAC\xB6
test_row_0/C:col1/1343328425510/Put/vlen=10/ts=0 val= 
4G\xE1T\x1B\xFDa\x98\xAC\xB6
test_row_0/C:col10/1343328425510/Put/vlen=10/ts=0 val= 
{code}

Might be related to HBASE-2856, but I haven't had the time to check the root 
cause. The flusher thread was enabled during the run. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6462) TestAcidGuarantees failed on trunk

2012-07-26 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423489#comment-13423489
 ] 

Enis Soztutar commented on HBASE-6462:
--

Another stack trace I just got shows the third CF with different values:
{code}
Caused by: java.lang.RuntimeException: Failed after 
98200!Expected=\xFF[\x0B_\xAF\xCAQJ\xBDKGot:
test_row_2/A:col0/1343337923715/Put/vlen=10/ts=0 val= \xFF[\x0B_\xAF\xCAQJ\xBDK
test_row_2/A:col1/1343337923715/Put/vlen=10/ts=0 val= \xFF[\x0B_\xAF\xCAQJ\xBDK
..
test_row_2/B:col8/1343337923715/Put/vlen=10/ts=0 val= \xFF[\x0B_\xAF\xCAQJ\xBDK
test_row_2/B:col9/1343337923715/Put/vlen=10/ts=0 val= \xFF[\x0B_\xAF\xCAQJ\xBDK
..
test_row_2/C:col0/1343337921472/Put/vlen=10/ts=0 val= 
\xEA\xD0\x15\xFB\xC0\xE7\xE3\xA0\xDB^
test_row_2/C:col1/1343337921472/Put/vlen=10/ts=0 val= 
\xEA\xD0\x15\xFB\xC0\xE7\xE3\xA0\xDB^
..
{code}



 TestAcidGuarantees failed on trunk
 --

 Key: HBASE-6462
 URL: https://issues.apache.org/jira/browse/HBASE-6462
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Enis Soztutar

 I've seen TestAcidGuarantees fail with: 
 {code}
 testGetAtomicity(org.apache.hadoop.hbase.IntegrationTestAcidGuaranteesWithChaosMonkey)
   Time elapsed: 42.611 sec   ERROR!
 java.lang.RuntimeException: Deferred
   at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:76)
   at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:103)
   at 
 org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:298)
   at 
 org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:248)
   at 
 org.apache.hadoop.hbase.IntegrationTestAcidGuaranteesWithChaosMonkey.testGetAtomicity(IntegrationTestAcidGuaranteesWithChaosMonkey.java:58)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
   at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
   at org.junit.rules.RunRules.evaluate(RunRules.java:18)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:234)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101)
   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
 Caused by: java.lang.RuntimeException: Failed after 
 147200!Expected=\x1BT\xC0i\x0CW\x9B\x108\xA0Got:
 test_row_0/A:col0/1343328428704/Put/vlen=10/ts=0 val= 
 \x1BT\xC0i\x0CW\x9B\x108\xA0
 test_row_0/A:col1/1343328428704/Put/vlen=10/ts=0 val= 
 \x1BT\xC0i\x0CW\x9B\x108\xA0
 

[jira] [Created] (HBASE-6469) Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restarted

2012-07-27 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-6469:


 Summary: Failure on enable/disable table will cause table state in 
zk to be left as enabling/disabling until master is restarted
 Key: HBASE-6469
 URL: https://issues.apache.org/jira/browse/HBASE-6469
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.94.2
Reporter: Enis Soztutar
Assignee: Enis Soztutar


In Enable/DisableTableHandler code, if something goes wrong in handling, the 
table state in zk is left as ENABLING / DISABLING. After that we cannot force 
any more action from the API or CLI, and the only recovery path is restarting 
the master. 

{code}
if (done) {
  // Flip the table to enabled.
  this.assignmentManager.getZKTable().setEnabledTable(
    this.tableNameStr);
  LOG.info("Table '" + this.tableNameStr
      + "' was successfully enabled. Status: done=" + done);
} else {
  LOG.warn("Table '" + this.tableNameStr
      + "' wasn't successfully enabled. Status: done=" + done);
}
{code}

Here, if done is false, the table state is not changed. There is also no way to 
set skipTableStateCheck from cli / api. 

We have run into this issue a couple of times before. 
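One way the handler could avoid leaving the znode stuck, sketched against the 
snippet above (this is only an illustration; the fix actually committed for 
this issue may take a different approach, such as a CLI/API override): 

{code}
// Illustrative only: on failure, roll the zk table state back instead of
// leaving ENABLING behind, so later enable/disable requests are not rejected.
if (done) {
  this.assignmentManager.getZKTable().setEnabledTable(this.tableNameStr);
  LOG.info("Table '" + this.tableNameStr
      + "' was successfully enabled. Status: done=" + done);
} else {
  LOG.warn("Table '" + this.tableNameStr
      + "' wasn't successfully enabled. Status: done=" + done);
  // Assumed rollback path: mark the table DISABLED again so the stale
  // ENABLING marker does not block a retry.
  this.assignmentManager.getZKTable().setDisabledTable(this.tableNameStr);
}
{code}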

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6302) Document how to run integration tests

2012-07-30 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6302:
-

Issue Type: Sub-task  (was: Bug)
Parent: HBASE-6201

 Document how to run integration tests
 -

 Key: HBASE-6302
 URL: https://issues.apache.org/jira/browse/HBASE-6302
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: stack
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.96.0


 HBASE-6203 has attached the old IT doc with some mods.  When we figure out 
 how ITs are to be run, update it and apply the documentation under this 
 issue.  Making this a blocker against 0.96.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



