[jira] [Created] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting

2012-09-25 Thread chunhui shen (JIRA)
chunhui shen created HBASE-6877:
---

 Summary: Coprocessor exec result is incorrect when region is in 
splitting 
 Key: HBASE-6877
 URL: https://issues.apache.org/jira/browse/HBASE-6877
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical


When we execute a coprocessor, we first call HTable#getStartKeysInRange to
get the keys on which to execute it. If some regions are split before the
execCoprocessor RPC, those keys are stale and the result we get is incomplete.

For example:
the parent region is split into daughter region A and daughter region B;
we execute the coprocessor on the parent region, but the result contains data
from only daughter region A or daughter region B.
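
The race described above can be reproduced in a toy model. What follows is a minimal sketch, not HBase code: the class StaleStartKeysDemo, its maps, and execOnRegionContaining are hypothetical stand-ins for the client-side key lookup and the exec RPC, and row keys are plain strings rather than byte arrays. It assumes the call is routed to whichever region currently contains the start key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the race: region boundaries change between key lookup and execution. */
class StaleStartKeysDemo {
    /** Live region layout: start key -> end key (exclusive); "" as end key means end of table. */
    static final TreeMap<String, String> regions = new TreeMap<>();
    /** Table rows: row key -> value. */
    static final TreeMap<String, Integer> rows = new TreeMap<>();

    /** Stand-in for the start-key lookup: a snapshot of the current region start keys. */
    static List<String> startKeys() {
        return new ArrayList<>(regions.keySet());
    }

    /** Stand-in for a split: the region starting at startKey is cut at splitKey. */
    static void split(String startKey, String splitKey) {
        String endKey = regions.get(startKey);
        regions.put(startKey, splitKey);
        regions.put(splitKey, endKey);
    }

    /** Stand-in for the exec RPC: sums the rows of whichever region currently contains key. */
    static int execOnRegionContaining(String key) {
        Map.Entry<String, String> region = regions.floorEntry(key);
        String end = region.getValue();
        int sum = 0;
        for (Map.Entry<String, Integer> row : rows.tailMap(region.getKey(), true).entrySet()) {
            if (!end.isEmpty() && row.getKey().compareTo(end) >= 0) break;
            sum += row.getValue();
        }
        return sum;
    }

    /** Runs the scenario and returns the aggregate the client ends up with. */
    static int runScenario(boolean splitInBetween) {
        regions.clear();
        rows.clear();
        regions.put("", "");             // one parent region covering the whole table
        rows.put("a", 1);
        rows.put("m", 2);
        rows.put("z", 4);
        List<String> keys = startKeys(); // the client computes the keys first
        if (splitInBetween) {
            split("", "m");              // parent becomes daughter A ["", "m") and B ["m", "")
        }
        int total = 0;
        for (String key : keys) {
            total += execOnRegionContaining(key);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("without split: " + runScenario(false));
        System.out.println("with split:    " + runScenario(true));
    }
}
```

Under these assumptions the run without a split sums all three rows (7), while the run with a split in between only sees the row in daughter A (1).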




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting

2012-09-25 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6877:


Attachment: HBASE-6877.patch

The patch includes a test case that demonstrates this bug.

 Coprocessor exec result is incorrect when region is in splitting 
 -

 Key: HBASE-6877
 URL: https://issues.apache.org/jira/browse/HBASE-6877
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Attachments: HBASE-6877.patch





[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462475#comment-13462475
 ] 

chunhui shen commented on HBASE-6870:
-

The coprocessor exec result is incorrect if the cached region location is
wrong; see HBASE-6877.

 HTable#coprocessorExec always scan the whole table 
 ---

 Key: HBASE-6870
 URL: https://issues.apache.org/jira/browse/HBASE-6870
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
 HBASE-6870v2.patch, HBASE-6870v3.patch


 In the current logic, HTable#coprocessorExec always scans the whole table.
 This is inefficient and, under heavy coprocessorExec load, affects the
 region server carrying .META.
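
The alternative to scanning everything is to look up only the regions that overlap the requested row range. The sketch below is an illustration under stated assumptions, not the attached patch: RangeStartKeys and startKeysInRange are hypothetical names, keys are strings instead of byte arrays, the first region is assumed to start at "", and startRow is assumed to sort at or before endRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy illustration: pick only the regions overlapping [startRow, endRow]. */
class RangeStartKeys {
    /**
     * Returns the start keys of the regions that overlap [startRow, endRow].
     * regionStartKeys holds every region start key in sorted order.
     */
    static List<String> startKeysInRange(TreeSet<String> regionStartKeys,
                                         String startRow, String endRow) {
        // The region holding startRow has the greatest start key <= startRow.
        String first = regionStartKeys.floor(startRow);
        // Every later region whose start key is <= endRow also overlaps the range.
        return new ArrayList<>(regionStartKeys.subSet(first, true, endRow, true));
    }

    public static void main(String[] args) {
        TreeSet<String> starts = new TreeSet<>(Arrays.asList("", "g", "n", "t"));
        // Rows "h".."p" live in the regions starting at "g" and "n" only.
        System.out.println(startKeysInRange(starts, "h", "p"));
    }
}
```

Both lookups are logarithmic in the number of regions, so the cost depends on how many regions the range touches rather than on the size of the table.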



[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462479#comment-13462479
 ] 

Andrew Purtell commented on HBASE-6870:
---

Thanks [~zjushch].

 HTable#coprocessorExec always scan the whole table 
 ---

 Key: HBASE-6870
 URL: https://issues.apache.org/jira/browse/HBASE-6870
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
 HBASE-6870v2.patch, HBASE-6870v3.patch





[jira] [Commented] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting

2012-09-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462482#comment-13462482
 ] 

Andrew Purtell commented on HBASE-6877:
---

Reissuing requests to the other daughter when a split is detected makes sense.
One minor issue with the patch: dropping "actual" from method names and
variables would make the result read better.
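
The reissue idea can be sketched as a loop that keeps calling until the expected key range is covered. This is a toy model and not the attached patch: ReissueOnSplit, Reply, exec, and execCovering are hypothetical names, and it assumes each reply reports the end key of the region that actually served the call, which is how the client here notices that a split happened.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model: keep reissuing the call until the expected key range is covered. */
class ReissueOnSplit {
    /** Partial result plus the end key of the region that actually served the call. */
    static final class Reply {
        final int value;
        final String servedEndKey; // "" means end of table
        Reply(int value, String servedEndKey) {
            this.value = value;
            this.servedEndKey = servedEndKey;
        }
    }

    final TreeMap<String, String> regions = new TreeMap<>(); // start key -> end key
    final TreeMap<String, Integer> rows = new TreeMap<>();   // row key -> value

    /** Hypothetical exec call: served by the region currently containing key. */
    Reply exec(String key) {
        Map.Entry<String, String> region = regions.floorEntry(key);
        String end = region.getValue();
        int sum = 0;
        for (Map.Entry<String, Integer> row : rows.tailMap(region.getKey(), true).entrySet()) {
            if (!end.isEmpty() && row.getKey().compareTo(end) >= 0) break;
            sum += row.getValue();
        }
        return new Reply(sum, end);
    }

    /** Covers [startKey, expectedEndKey), reissuing whenever a reply stops short. */
    int execCovering(String startKey, String expectedEndKey) {
        int total = 0;
        String next = startKey;
        while (true) {
            Reply reply = exec(next);
            total += reply.value;
            boolean tableEnd = reply.servedEndKey.isEmpty();
            boolean rangeEnd = !expectedEndKey.isEmpty()
                    && reply.servedEndKey.compareTo(expectedEndKey) >= 0;
            if (tableEnd || rangeEnd) return total;
            next = reply.servedEndKey; // split detected: continue with the next daughter
        }
    }

    /** The client cached the parent ["", ""), but the table has already split at "m". */
    static int demo() {
        ReissueOnSplit t = new ReissueOnSplit();
        t.regions.put("", "m");
        t.regions.put("m", "");
        t.rows.put("a", 1);
        t.rows.put("m", 2);
        t.rows.put("z", 4);
        return t.execCovering("", "");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo the first reply covers only daughter A, so the loop reissues from "m" and the total includes the rows of both daughters.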

 Coprocessor exec result is incorrect when region is in splitting 
 -

 Key: HBASE-6877
 URL: https://issues.apache.org/jira/browse/HBASE-6877
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Attachments: HBASE-6877.patch





[jira] [Commented] (HBASE-6875) Remove commons-httpclient, -component, and up versions on other jars (remove unused repository)

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462495#comment-13462495
 ] 

Hadoop QA commented on HBASE-6875:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546445/pom.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified
tests. Please justify why no new tests are needed for this patch. Also please
list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestHCM

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2928//console

This message is automatically generated.

 Remove commons-httpclient, -component, and up versions on other jars (remove 
 unused repository)
 ---

 Key: HBASE-6875
 URL: https://issues.apache.org/jira/browse/HBASE-6875
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.96.0
Reporter: stack
Assignee: stack
 Attachments: pom.txt






[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462526#comment-13462526
 ] 

nkeywal commented on HBASE-6702:


bq. What is this change?
I've changed the interface of the resource checker but have not yet removed
ResourceCheckerJUnitRule, so I've just commented out the removed methods.

bq. Whats this mean 'migrate the localTests to a newer version of surefire'?
The log lines don't show up with surefire 2.10. They do with my patched
version, but the localTests profile uses 2.10. It's historical: I did it this
way because we use neither categories nor parallelization for localTests.

The v2 should be ready for commit and will include your comments. Thanks for
the review!

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch


 This was based on some discussion from HBASE-6234.
 The ResourceChecker was added by N. Keywal to help resolve some hadoop qa
 issues, but has not been widely used since. Further, with modularization we
 have had to drop the ResourceChecker from the tests that are moved into the
 hbase-common module, because bringing the ResourceChecker up to hbase-common
 would involve bringing all its dependencies (which are quite far-reaching).
 The question then is, what should we do with it? Get rid of it? Refactor and
 reuse?



[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase

2012-09-25 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462584#comment-13462584
 ] 

Luke Lu commented on HBASE-5954:


Hi Lars,

We just noticed that HDFS-744 did not implement the correct hsync semantics 
(mostly due to HDFS-265) so that the hsync is slower AND (arguably) less 
durable than hflush in Hadoop 1.x.

 Allow proper fsync support for HBase
 

 Key: HBASE-5954
 URL: https://issues.apache.org/jira/browse/HBASE-5954
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 5954-trunk-hdfs-trunk.txt, 5954-trunk-hdfs-trunk-v2.txt, 
 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 
 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, hbase-hdfs-744.txt






[jira] [Updated] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails

2012-09-25 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-5835:
-

Status: Patch Available  (was: Open)

Seems I forgot to click "Submit Patch"...

 [hbck] Catch and handle NotServingRegionException when close region attempt 
 fails
 -

 Key: HBASE-5835
 URL: https://issues.apache.org/jira/browse/HBASE-5835
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.0, 0.90.7, 0.92.2, 0.96.0
Reporter: Jonathan Hsieh
 Attachments: HBASE-5835.patch


 Currently, if hbck attempts to close a region and catches a 
 NotServingRegionException, hbck may hang outputting a stack trace.  Since the 
 goal is to close the region at a particular server, and since it is not 
 serving the region, the region is closed, and we should just warn and eat 
 this exception.
 {code}
 Exception in thread main org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.NotServingRegionException: Received close for regionid but we are not serving it
     at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2162)
     at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
     at $Proxy5.closeRegion(Unknown Source)
     at org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:165)
     at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1185)
     at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1302)
     at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1065)
     at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:351)
     at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:370)
     at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3001)
 {code}
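
The "warn and eat" behavior asked for above might look roughly like the following. This is a sketch under assumptions, not the attached patch: CloseRegionSketch, RegionCloser, and closeRegionQuietly are hypothetical names, and the nested NotServingRegionException is a local stand-in for the HBase class. In the trace above the exception arrives wrapped in a RemoteException, so real code would have to unwrap it first; the sketch skips that step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Sketch: treat "not serving this region" as success when the goal is to close it. */
class CloseRegionSketch {
    /** Local stand-in for org.apache.hadoop.hbase.NotServingRegionException. */
    static class NotServingRegionException extends IOException {
        NotServingRegionException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical stand-in for the region server's close call. */
    interface RegionCloser {
        void closeRegion(String regionName) throws IOException;
    }

    /**
     * Returns true if the server closed the region now,
     * false if the server was already not serving it.
     */
    static boolean closeRegionQuietly(RegionCloser server, String regionName) {
        try {
            server.closeRegion(regionName);
            return true;
        } catch (NotServingRegionException e) {
            // The region is not open on this server, which is the state we wanted: warn and go on.
            System.err.println("WARN: " + regionName
                    + " is not served here, treating it as closed: " + e.getMessage());
            return false;
        } catch (IOException e) {
            // Any other I/O failure still propagates (unchecked here to keep the sketch short).
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the one exception type is swallowed, so genuine connection or server failures still surface to the caller.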



[jira] [Commented] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462665#comment-13462665
 ] 

Hadoop QA commented on HBASE-5835:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12542730/HBASE-5835.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified
tests. Please justify why no new tests are needed for this patch. Also please
list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//console

This message is automatically generated.

 [hbck] Catch and handle NotServingRegionException when close region attempt 
 fails
 -

 Key: HBASE-5835
 URL: https://issues.apache.org/jira/browse/HBASE-5835
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
 Attachments: HBASE-5835.patch





[jira] [Updated] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6702:
---

Attachment: 6702.v4.patch

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch





[jira] [Updated] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6702:
---

Status: Patch Available  (was: Open)

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch





[jira] [Commented] (HBASE-6737) NullPointerException at regionserver.wal.SequenceFileLogWriter.append

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462701#comment-13462701
 ] 

nkeywal commented on HBASE-6737:


Stack 1: It seems to be an expected case, from the code:
{code}
  @Override
  public void append(HLog.Entry entry) throws IOException {
    entry.setCompressionContext(compressionContext);
    try {
      this.writer.append(entry.getKey(), entry.getEdit());
    } catch (NullPointerException npe) {
      // Concurrent close...
      throw new IOException(npe);
    }
  }
{code}
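
The effect of that catch block on callers can be shown with a self-contained toy. The sketch below is not HBase code: NpeToIoException and its Delegate are hypothetical, and the delegate simply nulls its internal buffer on close to mimic a writer that fails with a NullPointerException once closed. The wrapping itself is the same as in the excerpt.

```java
import java.io.IOException;

/** Toy version of the pattern: an NPE caused by a close is rethrown as an IOException. */
class NpeToIoException {
    /** Toy delegate whose internals are nulled out by close(). */
    static final class Delegate {
        private StringBuilder out = new StringBuilder();

        void append(String s) {
            out.append(s); // throws NullPointerException once out has been set to null
        }

        void close() {
            out = null;
        }
    }

    private final Delegate writer = new Delegate();

    void append(String entry) throws IOException {
        try {
            writer.append(entry);
        } catch (NullPointerException npe) {
            // Surface the close as an I/O failure and keep the NPE as the cause.
            throw new IOException(npe);
        }
    }

    void close() {
        writer.close();
    }

    /** Appends after close and reports the simple class name of the resulting cause. */
    static String demo() {
        NpeToIoException w = new NpeToIoException();
        w.close();
        try {
            w.append("edit");
            return "no error";
        } catch (IOException e) {
            return e.getCause().getClass().getSimpleName();
        }
    }
}
```

A caller therefore sees an IOException whose cause is the NullPointerException, which is the shape of the first exception quoted in the issue description.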


 NullPointerException at regionserver.wal.SequenceFileLogWriter.append
 -

 Key: HBASE-6737
 URL: https://issues.apache.org/jira/browse/HBASE-6737
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Critical

 Real cluster, scenario in HBASE-5843.
 There are two exceptions; I am filing a single JIRA for both of them.
 2012-09-04 18:14:49,264 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while writing log entry to log
 java.io.IOException: java.lang.NullPointerException
   at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:229)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:949)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
   at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
   at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
   at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:226)
   ... 3 more
 2012-09-04 18:15:52,546 ERROR org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Error in log splitting write thread
 java.lang.reflect.UndeclaredThrowableException
   at $Proxy7.getFileInfo(Unknown Source)
   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:875)
   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:559)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWAP(HLogSplitter.java:974)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$800(HLogSplitter.java:82)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.getWriterAndPath(HLogSplitter.java:1309)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:942)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
   at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:261)
   ... 11 more
 Caused by: java.io.IOException: Call to BOX1/192.168.15.5:9000 failed on local exception: java.nio.channels.ClosedByInterruptException
   at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
   at org.apache.hadoop.ipc.Client.call(Client.java:1075)
   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
   at $Proxy7.getFileInfo(Unknown Source)
   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
   at $Proxy7.getFileInfo(Unknown Source)
   ... 15 more
 Caused by: java.nio.channels.ClosedByInterruptException
   at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
   at 
 

[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462724#comment-13462724
 ] 

Hadoop QA commented on HBASE-6702:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546497/6702.v4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 858 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//console

This message is automatically generated.

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch





[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462726#comment-13462726
 ] 

nkeywal commented on HBASE-6702:


Seems ok...

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch





[jira] [Commented] (HBASE-6309) [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462745#comment-13462745
 ] 

nkeywal commented on HBASE-6309:


I was having a look at this. Could we have the log archiving done by the
regionserver instead of the master? This would lower the work done in the event
thread. The only remaining work would be renaming the region log dir at
the end.

I see one impact: if the same log were processed simultaneously by multiple
region servers, this archiving could occur in parallel on two different region
servers. Manageable, I think...

 [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager
 

 Key: HBASE-6309
 URL: https://issues.apache.org/jira/browse/HBASE-6309
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.96.0


 We found this issue during the leap second cataclysm which prompted a 
 distributed splitting of all our logs.
 I saw that none of the RS were splitting after some time while the master was 
 showing that it wasn't even 30% done. jstack'ing I saw this:
 {noformat}
 "main-EventThread" daemon prio=10 tid=0x7f6ce46d8800 nid=0x5376 in Object.wait() [0x7f6ce2ecb000]
    java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at java.lang.Object.wait(Object.java:485)
  at org.apache.hadoop.ipc.Client.call(Client.java:1093)
  - locked <0x0005fdd661a0> (a org.apache.hadoop.ipc.Client$Call)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
  at $Proxy9.rename(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
  at $Proxy9.rename(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:759)
  at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:253)
  at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:553)
  at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:519)
  at org.apache.hadoop.hbase.master.SplitLogManager$1.finish(SplitLogManager.java:138)
  at org.apache.hadoop.hbase.master.SplitLogManager.getDataSetWatchSuccess(SplitLogManager.java:431)
  at org.apache.hadoop.hbase.master.SplitLogManager.access$1200(SplitLogManager.java:95)
  at org.apache.hadoop.hbase.master.SplitLogManager$GetDataAsyncCallback.processResult(SplitLogManager.java:1011)
  at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:571)
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
 {noformat}
 We are effectively bottlenecking on doing NN operations and whatever else is 
 happening in GetDataAsyncCallback. It was so bad that on our 100 offline 
 cluster it took a few hours for the master to process all the incoming ZK 
 events while the actual splitting took a fraction of that time.
 I'm marking this as critical and against 0.96 but depending on how involved 
 the fix is we might want to backport.
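
 To make the idea in the title concrete, here is a minimal sketch of handing 
 blocking work from a single callback thread to a worker pool. It is not the 
 actual patch: the class OffloadSketch, the method names and the pool size are 
 all hypothetical, and the blocking NameNode rename is replaced by a stand-in 
 that only records which thread ran it.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from SplitLogManager.
final class OffloadSketch {
  // Pool size is an assumption chosen only for the example.
  private final ExecutorService finisherPool = Executors.newFixedThreadPool(2);
  private final List<String> finishedOnThread = new CopyOnWriteArrayList<>();

  /** Called on the event thread: only enqueues the work and returns at once. */
  void onTaskDone(String path) {
    finisherPool.execute(() -> finishTask(path));
  }

  /** Stand-in for the blocking rename/archive call; records the executing thread. */
  private void finishTask(String path) {
    finishedOnThread.add(Thread.currentThread().getName());
  }

  /** Waits for queued work to complete and returns the recorded thread names. */
  List<String> shutdownAndReport() {
    finisherPool.shutdown();
    try {
      finisherPool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return finishedOnThread;
  }

  public static void main(String[] args) {
    OffloadSketch sketch = new OffloadSketch();
    sketch.onTaskDone("/hbase/splitlog/example-task");
    System.out.println("finished on: " + sketch.shutdownAndReport());
  }
}
```

 With this shape the callback thread stays free to process further events 
 while the slow calls queue up on the pool instead.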



[jira] [Created] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving

2012-09-25 Thread nkeywal (JIRA)
nkeywal created HBASE-6878:
--

 Summary: DistributerLogSplit can fail to resubmit a task done if 
there is an exception during the log archiving
 Key: HBASE-6878
 URL: https://issues.apache.org/jira/browse/HBASE-6878
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: nkeywal
Priority: Minor






[jira] [Updated] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6878:
---

Description: 
The code in SplitLogManager#getDataSetWatchSuccess is:
{code}
if (slt.isDone()) {
  LOG.info("task " + path + " entered state: " + slt.toString());
  if (taskFinisher != null && !ZKSplitLog.isRescanNode(watcher, path)) {
    if (taskFinisher.finish(slt.getServerName(),
        ZKSplitLog.getFileName(path)) == Status.DONE) {
      setDone(path, SUCCESS);
    } else {
      resubmitOrFail(path, CHECK);
    }
  } else {
    setDone(path, SUCCESS);
  }
}
{code}

  resubmitOrFail(path, CHECK);

should be 
  resubmitOrFail(path, FORCE);

Without it, the task won't be resubmitted if the delay is not reached, and the 
task will be marked as failed.
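
The difference between the two directives can be modelled in a few lines. This 
is a simplified, hypothetical model of the behaviour described above, not the 
real SplitLogManager code: the class ResubmitSketch, the method wouldResubmit 
and the timeout value are assumptions made for illustration.

```java
// Hypothetical model of CHECK vs FORCE; not the real SplitLogManager API.
final class ResubmitSketch {
  enum Directive { CHECK, FORCE }

  /**
   * Returns true when the task would be resubmitted, false when it would be
   * marked as failed instead.
   */
  static boolean wouldResubmit(Directive directive, long msSinceLastUpdate, long timeoutMs) {
    if (directive == Directive.FORCE) {
      return true; // FORCE ignores the delay
    }
    // CHECK resubmits only once the delay has been reached
    return msSinceLastUpdate >= timeoutMs;
  }

  public static void main(String[] args) {
    long timeoutMs = 25_000L; // assumed value, for illustration only
    System.out.println("CHECK after 1s: " + wouldResubmit(Directive.CHECK, 1_000L, timeoutMs));
    System.out.println("FORCE after 1s: " + wouldResubmit(Directive.FORCE, 1_000L, timeoutMs));
  }
}
```

In this model, a task whose finisher fails shortly after completion is failed 
under CHECK and retried under FORCE, which is the behaviour change requested.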



 DistributerLogSplit can fail to resubmit a task done if there is an exception 
 during the log archiving
 --

 Key: HBASE-6878
 URL: https://issues.apache.org/jira/browse/HBASE-6878
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: nkeywal
Priority: Minor

 The code in SplitLogManager#getDataSetWatchSuccess is:
 {code}
 if (slt.isDone()) {
   LOG.info("task " + path + " entered state: " + slt.toString());
   if (taskFinisher != null && !ZKSplitLog.isRescanNode(watcher, path)) {
     if (taskFinisher.finish(slt.getServerName(),
         ZKSplitLog.getFileName(path)) == Status.DONE) {
       setDone(path, SUCCESS);
     } else {
       resubmitOrFail(path, CHECK);
     }
   } else {
     setDone(path, SUCCESS);
   }
 }
 {code}
   resubmitOrFail(path, CHECK);
 should be 
   resubmitOrFail(path, FORCE);
 Without it, the task won't be resubmitted if the delay is not reached, and 
 the task will be marked as failed.



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462757#comment-13462757
 ] 

nkeywal commented on HBASE-4955:


Monthly update...
Surefire: the regression on elapsed time is fixed in 2.12.4 (not tested). Still 
waiting for #800. Maybe it will make it into 2.13. No date.
JUnit: no life there. Still, a release this quarter is likely...



 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 We currently use private versions of Surefire & JUnit since HBASE-4763.
 This JIRA tracks what we need to move to official versions.
 Surefire 2.11 is just out, but, after some tests, it does not contain all 
 that we need.
 JUnit. Could be for JUnit 4.11. Issue to monitor:
 https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
 feedback for an integration on trunk
 Surefire: Could be for Surefire 2.12. Issues to monitor are:
 329 (category support): fixed, we use the official implementation from the 
 trunk
 786 (@Category with forkMode=always): fixed, we use the official 
 implementation from the trunk
 791 (incorrect elapsed time on test failure): fixed, we use the official 
 implementation from the trunk
 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed on 
 our version.
 760 (does not take into account the test method): fixed in trunk, not fixed 
 in our version
 798 (print immediately the test class name): not fixed in trunk, not fixed in 
 our version
 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
 not fixed in our version
 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
 fixed on our version
 800 & 793 are the most important to monitor; they are the only ones that are 
 fixed in our version but not on trunk.



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread Cody Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462851#comment-13462851
 ] 

Cody Marcel commented on HBASE-5961:


Nice!

 New standard HBase code formatter
 -

 Key: HBASE-5961
 URL: https://issues.apache.org/jira/browse/HBASE-5961
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Attachments: HBase-Formmatter.xml


 There is currently no good way of passing out the formatter that is currently 
 the 'standard' in HBase. The standard Apache formatter is actually not very 
 close to what we are considering 'good'/'pretty' code. Further, it's not 
 trivial to get a good formatter set up.
 Proposing two things: 
 1) Adding a formatter to the dev tools and calling out the formatter usage 
 in the docs
 2) Moving to a 'better' formatter that is not the standard Apache formatter.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Status: Patch Available  (was: Open)

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.1, 0.94.0
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.3, 0.96.0

 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Commented] (HBASE-6736) Distributed Split: a split tasks can be mark as DONE but keep unassigned

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462897#comment-13462897
 ] 

nkeywal commented on HBASE-6736:


There are multiple synchronization issues. 

One of them is 
{code}
@Override
protected void chore() {
  // [...]
  for (Map.Entry<String, Task> e : tasks.entrySet()) {
{code}

As we're iterating over a set that can be modified we can have reliability 
issues, cf. javadoc: If the map is modified while an iteration over the set is 
in progress (except through the iterator's own remove operation, or through the 
setValue operation on a map entry returned by the iterator) the results of the 
iteration are undefined.
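
A small, self-contained illustration of the hazard, under stated assumptions: 
it does not use SplitLogManager at all, the class IterationSketch and its 
methods are hypothetical, and the modification happens on the iterating thread 
itself so that the outcome is deterministic. A plain HashMap fails fast when a 
key is added mid-iteration, whereas a ConcurrentHashMap iterator is weakly 
consistent and completes (it may or may not observe the added entry).

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demonstration class; not taken from HBase.
final class IterationSketch {
  /** Fills the map with three task entries and returns it. */
  static Map<String, Integer> seed(Map<String, Integer> tasks) {
    tasks.put("task-1", 1);
    tasks.put("task-2", 2);
    tasks.put("task-3", 3);
    return tasks;
  }

  /** Adds a key while iterating; returns true if the loop finished without an exception. */
  static boolean iterateWhileAdding(Map<String, Integer> tasks) {
    try {
      for (Map.Entry<String, Integer> e : tasks.entrySet()) {
        tasks.put("added-during-iteration", 0); // structural change mid-iteration
      }
      return true;
    } catch (ConcurrentModificationException ex) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("HashMap completed: "
        + iterateWhileAdding(seed(new HashMap<>())));
    System.out.println("ConcurrentHashMap completed: "
        + iterateWhileAdding(seed(new ConcurrentHashMap<>())));
  }
}
```

Even when the iteration completes, code that relies on seeing every entry 
exactly once still needs its own synchronization or a snapshot of the entries.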



 Distributed Split: a split tasks can be mark as DONE but keep unassigned
 

 Key: HBASE-6736
 URL: https://issues.apache.org/jira/browse/HBASE-6736
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal

 Real cluster, scenario mentioned on HBASE-5843.
 Got it once out of 5 tests on 0.96
 Didn't get it on 0.94 after 3 tests.
 It seems we have a race condition on split logs: the task was nearly 
 simultaneously marked as done and resubmitted. Then it remained in the 
 unassigned state.
 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 total tasks = 1 unassigned = 0
 2012-09-04 17:27:06,237 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 resubmitted 1 out of 1 tasks
 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 task not yet acquired 
 /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346772046399-splitting%2FBOX0%252C60020%252C1346772046399.1346772046609
  ver = 7
 2012-09-04 17:27:06,314 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 task /hbase/splitlog/RESCAN02 entered state: DONE 
 BOX1,6,1346771990737
 2012-09-04 17:27:06,337 DEBUG 
 org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
 /hbase/splitlog/RESCAN02
 2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 deleted task without in memory state /hbase/splitlog/RESCAN02
 2012-09-04 17:27:07,226 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 total tasks = 1 unassigned = 1



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462923#comment-13462923
 ] 

Hadoop QA commented on HBASE-6868:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546437/6868-0.96-v3.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//console

This message is automatically generated.

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.3, 0.96.0

 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462925#comment-13462925
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I looked through the run, nothing stuck out... All the tests passed.

I'll do some manual testing today and then commit.

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.3, 0.96.0

 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Status: Open  (was: Patch Available)

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.1, 0.94.0
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.3, 0.96.0

 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Attachment: 6868-0.94.txt

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463017#comment-13463017
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I manually did these tests (0.94 patch):
* started HBase with HBase checksums off, inserted some data, flushed, 
compacted, scanned
* restarted HBase with HBase checksums on, inserted some more data, 
flush/compacted, scanned
* restarted HBase again with HBase checksums off, inserted some more data, 
flush/compacted, scanned

Checked the logs for anything weird. Looks good. Going to commit to 0.94 and 
0.96.

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Resolved] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6868.
--

  Resolution: Fixed
Assignee: Lars Hofhansl
Hadoop Flags: Reviewed

Committed to 0.94 and 0.96.

 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 The HFile contains checksums to decrease the IOPS, so when HBase reads an 
 HFile, it doesn't need to read the checksum from the HDFS meta file. But the 
 HLog file of HBase doesn't contain a checksum, so when HBase reads the HLog, 
 it must read the checksum from the HDFS meta file. We could add 
 setSkipChecksum per file to HDFS, or we could write checksums into the WAL if 
 this skip-checksum facility is enabled.



[jira] [Updated] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6851:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Race condition in TableAuthManager.updateGlobalCache()
 --

 Key: HBASE-6851
 URL: https://issues.apache.org/jira/browse/HBASE-6851
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.1, 0.96.0
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Critical
 Fix For: 0.94.2, 0.96.0

 Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch


 When new global permissions are assigned, there is a race condition, during 
 which further authorization checks relying on global permissions may fail.
 In TableAuthManager.updateGlobalCache(), we have:
 {code:java}
 USER_CACHE.clear();
 GROUP_CACHE.clear();
 try {
   initGlobal(conf);
 } catch (IOException e) {
   // Never happens
   LOG.error("Error occured while updating the user cache", e);
 }
 for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
   if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
     GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
         new Permission(entry.getValue().getActions()));
   } else {
     USER_CACHE.put(entry.getKey(),
         new Permission(entry.getValue().getActions()));
   }
 }
 {code}
 If authorization checks come in following the .clear() but before 
 repopulating, they will fail.
 We should have some synchronization here to serialize multiple updates and 
 use a COW type rebuild and reassign of the new maps.
 This particular issue crept in with the fix in HBASE-6157, so I'm flagging 
 for 0.94 and 0.96.
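
 The suggested "COW type rebuild and reassign" could look roughly like the 
 sketch below. It is only an illustration under assumptions, not the attached 
 patch: the class CowCacheSketch is hypothetical, and permissions are reduced 
 to plain strings such as "RW" so that the example is self-contained.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a copy-on-write cache; not TableAuthManager itself.
final class CowCacheSketch {
  // Readers see either the old complete map or the new complete map,
  // never a cleared or half-populated one.
  private volatile Map<String, String> userCache = Collections.emptyMap();

  /** Serializes updates; builds the replacement off to the side, then publishes it. */
  synchronized void update(Map<String, String> newPerms) {
    Map<String, String> rebuilt = new HashMap<>(newPerms);
    userCache = Collections.unmodifiableMap(rebuilt); // single reference assignment
  }

  /** Lock-free read against whichever map is currently published. */
  boolean authorize(String user, String action) {
    String granted = userCache.get(user);
    return granted != null && granted.contains(action);
  }

  public static void main(String[] args) {
    CowCacheSketch cache = new CowCacheSketch();
    Map<String, String> perms = new HashMap<>();
    perms.put("alice", "RW");
    cache.update(perms);
    System.out.println("alice may write: " + cache.authorize("alice", "W"));
  }
}
```

 Because the old map stays published until the new one is complete, checks 
 that arrive during an update are answered from the previous permissions 
 instead of failing.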



[jira] [Updated] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6784:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 TestCoprocessorScanPolicy is sometimes flaky when run locally
 -

 Key: HBASE-6784
 URL: https://issues.apache.org/jira/browse/HBASE-6784
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.2, 0.96.0

 Attachments: 6784.txt


 The problem is not seen in the Jenkins build.  
 When we run TestCoprocessorScanPolicy.testBaseCases locally or in our 
 internal Jenkins, we tend to get random failures. The reason is that the 2 
 puts that we do here sometimes get the same timestamp. This leads to 
 improper scan results, as the version check tends to skip one of the rows on 
 seeing that the timestamps are the same. Marking this as minor. As we are 
 trying to solve testcase-related failures, just raising this in case we need 
 to resolve this also.
 For example,
 both the puts are getting the time
 {code}
 time 1347635287360
 time 1347635287360
 {code}
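
 A toy model of why identical timestamps matter, offered as an assumption-laden 
 sketch rather than HBase code: versions of one cell are modelled as a TreeMap 
 keyed by timestamp, and the class TimestampSketch is hypothetical. Two writes 
 with the same timestamp collapse into one version, while explicitly distinct 
 timestamps keep both.

```java
import java.util.TreeMap;

// Hypothetical model of cell versions keyed by timestamp; not HBase code.
final class TimestampSketch {
  /** Returns how many versions survive after two writes with the given timestamps. */
  static int versionsAfterTwoPuts(long firstTs, long secondTs) {
    TreeMap<Long, String> versions = new TreeMap<>();
    versions.put(firstTs, "value-1");
    versions.put(secondTs, "value-2"); // same key overwrites the first write
    return versions.size();
  }

  public static void main(String[] args) {
    long now = 1347635287360L; // timestamp taken from the report above
    System.out.println("same timestamp: " + versionsAfterTwoPuts(now, now));
    System.out.println("distinct timestamps: " + versionsAfterTwoPuts(now, now + 1));
  }
}
```

 In a test, passing explicit, increasing timestamps to the two writes is one 
 way to avoid depending on the clock advancing between them.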



[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463031#comment-13463031
 ] 

Himanshu Vashishtha commented on HBASE-6870:


Looked at the patch:

Can you make these two if statements inline?
{code}
+if (Bytes.compareTo(start, startKeys[i]) >= 0) {
+  if (Bytes.equals(endKeys[i], HConstants.EMPTY_END_ROW)
+      || Bytes.compareTo(start, endKeys[i]) < 0) {
+    rangeKeys.add(start);
+  }
{code}

Can it be private?
{code}
+  public LinkedHashMap<byte[], HRegionLocation> getKeysToRegionsInRange(
{code}

Re: Andrew's concern regarding cache use: will 6877 take care of region moves 
too? The cache may become stale for reasons other than splits. Will look at 
6877.

 HTable#coprocessorExec always scan the whole table 
 ---

 Key: HBASE-6870
 URL: https://issues.apache.org/jira/browse/HBASE-6870
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
 HBASE-6870v2.patch, HBASE-6870v3.patch


 In the current logic, HTable#coprocessorExec always scans the whole table; its 
 efficiency is low, and it will affect the Regionserver carrying .META. under 
 large coprocessorExec requests



[jira] [Commented] (HBASE-6854) Deletion of SPLITTING node on split rollback should clear the region from RIT

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463037#comment-13463037
 ] 

ramkrishna.s.vasudevan commented on HBASE-6854:
---

I found that the testcase added with this is sometimes failing.  Seems there is 
something in the AM and the way the watcher is set.
I will debug it and then commit the patch though it is only a testcase change.

 Deletion of SPLITTING node on split rollback should clear the region from RIT
 -

 Key: HBASE-6854
 URL: https://issues.apache.org/jira/browse/HBASE-6854
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.94.3

 Attachments: HBASE-6854.patch


 If a failure happens in split before OFFLINING_PARENT, we tend to roll back 
 the split, including deleting the znodes created.
 On deletion of the RS_ZK_SPLITTING node we are getting a callback but not 
 removing the region from RIT. We need to remove it from RIT; anyway, SSH logic 
 is well guarded in case the delete event comes due to an RS-down scenario.



[jira] [Commented] (HBASE-6853) IllegalArgument Exception is thrown when an empty region is spliitted.

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463039#comment-13463039
 ] 

ramkrishna.s.vasudevan commented on HBASE-6853:
---

@Stack
Can we commit patch 1?

 IllegalArgument Exception is thrown when an empty region is spliitted.
 --

 Key: HBASE-6853
 URL: https://issues.apache.org/jira/browse/HBASE-6853
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.1
Reporter: ramkrishna.s.vasudevan
 Attachments: HBASE-6853_2_splitsuccess.patch, 
 HBASE-6853_splitfailure.patch


 This is w.r.t. a mail sent to the dev mailing list.
 Empty region split should be handled gracefully.  Either we should not allow 
 the split to happen if we know that the region is empty, or we should allow 
 the split to happen by setting the number of threads for the thread pool executor 
 to 1.
 {code}
 int nbFiles = hstoreFilesToSplit.size();
 ThreadFactoryBuilder builder = new ThreadFactoryBuilder();
 builder.setNameFormat("StoreFileSplitter-%1$d");
 ThreadFactory factory = builder.build();
 // newFixedThreadPool throws IllegalArgumentException when nbFiles is 0
 ThreadPoolExecutor threadPool =
   (ThreadPoolExecutor) Executors.newFixedThreadPool(nbFiles, factory);
 List<Future<Void>> futures = new ArrayList<Future<Void>>(nbFiles);
 {code}
 Here nbFiles needs to be a positive, non-zero value.
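
 A minimal sketch of the second option, assuming the only change is to clamp 
 the pool size to at least 1. SplitPoolSizing, poolSize and splitAll are 
 hypothetical names used for illustration; this is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of HBase: illustrates guarding the pool size.
final class SplitPoolSizing {
  private SplitPoolSizing() {}

  // Executors.newFixedThreadPool(n) throws IllegalArgumentException for n <= 0,
  // so clamp the size to at least one thread.
  static int poolSize(int nbFiles) {
    return Math.max(1, nbFiles);
  }

  // Runs one placeholder task per store file and returns how many completed.
  static int splitAll(int nbFiles) {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize(nbFiles));
    try {
      List<Future<?>> futures = new ArrayList<Future<?>>(nbFiles);
      for (int i = 0; i < nbFiles; i++) {
        futures.add(pool.submit(new Runnable() {
          @Override
          public void run() {
            // a real implementation would split one store file here
          }
        }));
      }
      int done = 0;
      for (Future<?> f : futures) {
        f.get(); // propagate any task failure
        done++;
      }
      return done;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

 With an empty region, splitAll(0) creates a one-thread pool, submits nothing 
 and returns 0 instead of throwing. The first option (refusing to split an 
 empty region) would avoid creating the pool at all.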
  



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463047#comment-13463047
 ] 

stack commented on HBASE-5961:
--

I committed this formatter under dev-support and added the how-to-install doc 
from HBASE-3678.

 New standard HBase code formatter
 -

 Key: HBASE-5961
 URL: https://issues.apache.org/jira/browse/HBASE-5961
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBase-Formmatter.xml


 There is currently no good way of distributing the formatter that is currently the 
 'standard' in HBase. The standard Apache formatter is actually not very close 
 to what we consider 'good'/'pretty' code. Further, it's not trivial to 
 set up a good formatter.
 Proposing two things: 
 1) Add a formatter to the dev tools and call out the formatter usage 
 in the docs
 2) Move to a 'better' formatter that is not the standard Apache formatter.



[jira] [Resolved] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5961.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed to trunk. Thanks for the patch Jesse.

 New standard HBase code formatter
 -

 Key: HBASE-5961
 URL: https://issues.apache.org/jira/browse/HBASE-5961
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBase-Formmatter.xml


 There is currently no good way of distributing the formatter that is currently the 
 'standard' in HBase. The standard Apache formatter is actually not very close 
 to what we consider 'good'/'pretty' code. Further, it's not trivial to 
 set up a good formatter.
 Proposing two things: 
 1) Add a formatter to the dev tools and call out the formatter usage 
 in the docs
 2) Move to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463063#comment-13463063
 ] 

Jesse Yates commented on HBASE-6702:


Good stuff keywal! Just a couple of comments:
{code}
+  <artifactId>hbase-common</artifactId>
+  <version>${project.version}</version>
+  <type>test-jar</type>
+  <scope>test</scope>
+</dependency>
+<dependency>
{code}

To keep things DRY, the above should go into hbase/pom.xml's dependencyManagement 
section, and then the child projects should just use:
{code}
+  <artifactId>hbase-common</artifactId>
+  <type>test-jar</type>
+</dependency>
+<dependency>
{code}

Also, any chance for some javadocs on things like:
{code}
+  public ResourceChecker(String tagLine) {
+this.tagLine = tagLine;
+  }
{code}

Otherwise, this is a really sweet add.


 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch


 This was based on some discussion from HBASE-6234.
 The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
 issues, but has not been widely utilized since. Further, with modularization we 
 have had to drop the ResourceChecker from the tests that are moved into the 
 hbase-common module, because bringing the ResourceChecker up to hbase-common 
 would involve bringing all its dependencies (which are quite far reaching).
 The question then is, what should we do with it? Get rid of it? Refactor and 
 reuse? 



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463064#comment-13463064
 ] 

Jesse Yates commented on HBASE-6637:


As mentioned, failing tests passed locally...

 Move DaemonThreadFactory into Threads and Threads to hbase-common
 -

 Key: HBASE-6637
 URL: https://issues.apache.org/jira/browse/HBASE-6637
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
 hbase-6637-v0.patch, hbase-6637-v2.patch






[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463079#comment-13463079
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
 HFile it does not need to read the checksum from the HDFS meta file.  But 
 HLog files do not contain checksums, so when HBase reads an HLog it 
 must read the checksum from the HDFS meta file.  We could add a per-file 
 setSkipChecksum to HDFS, or we could write checksums into the WAL if this 
 skip-checksum facility is enabled.
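
 To illustrate the second idea (checksums carried inside the log itself), here 
 is a minimal sketch that prefixes each record with a CRC32 of its payload. 
 ChecksummedRecord and its layout (8-byte checksum, then payload) are 
 assumptions made for this example; they are not the HLog format or the 
 committed fix.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical record framing, for illustration only.
final class ChecksummedRecord {
  private ChecksummedRecord() {}

  private static long crcOf(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return crc.getValue();
  }

  // Layout: [8-byte CRC32 value][payload bytes]
  static byte[] encode(byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(8 + payload.length);
    buf.putLong(crcOf(payload));
    buf.put(payload);
    return buf.array();
  }

  // Verifies the stored checksum and returns the payload, or throws on mismatch.
  static byte[] decode(byte[] record) {
    ByteBuffer buf = ByteBuffer.wrap(record);
    long expected = buf.getLong();
    byte[] payload = new byte[buf.remaining()];
    buf.get(payload);
    if (crcOf(payload) != expected) {
      throw new IllegalStateException("checksum mismatch");
    }
    return payload;
  }
}
```

 A reader that verifies records this way would not depend on the filesystem's 
 own checksum for them, which is the property the description asks for.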



[jira] [Commented] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463080#comment-13463080
 ] 

Hudson commented on HBASE-6851:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6851  Fix race condition in TableAuthManager.updateGlobalCache() 
(Revision 1388898)

 Result = SUCCESS
garyh : 
Files : 
* 
/hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java
* 
/hbase/branches/0.94/security/src/test/java/org/apache/hadoop/hbase/security/access/TestTablePermissions.java


 Race condition in TableAuthManager.updateGlobalCache()
 --

 Key: HBASE-6851
 URL: https://issues.apache.org/jira/browse/HBASE-6851
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.1, 0.96.0
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Critical
 Fix For: 0.94.2, 0.96.0

 Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch


 When new global permissions are assigned, there is a race condition, during 
 which further authorization checks relying on global permissions may fail.
 In TableAuthManager.updateGlobalCache(), we have:
 {code:java}
 USER_CACHE.clear();
 GROUP_CACHE.clear();
 try {
   initGlobal(conf);
 } catch (IOException e) {
   // Never happens
   LOG.error("Error occured while updating the user cache", e);
 }
 for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
   if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
     GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
         new Permission(entry.getValue().getActions()));
   } else {
     USER_CACHE.put(entry.getKey(),
         new Permission(entry.getValue().getActions()));
   }
 }
 {code}
 If authorization checks come in following the .clear() but before 
 repopulating, they will fail.
 We should have some synchronization here to serialize multiple updates, and 
 use a copy-on-write (COW) style rebuild followed by a reassignment of the new maps.
 This particular issue crept in with the fix in HBASE-6157, so I'm flagging 
 for 0.94 and 0.96.
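
 A minimal sketch of the copy-on-write idea described above, assuming 
 permissions are plain strings of action letters. GlobalPermCache, Snapshot, 
 userAllowed and groupAllowed are hypothetical names, and treating a leading 
 '@' as the group marker is an assumption of this example; this is not the 
 TableAuthManager code or the attached patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the global permission cache; not TableAuthManager itself.
final class GlobalPermCache {
  // Immutable pair of maps, published together so readers never see one without the other.
  private static final class Snapshot {
    final Map<String, String> users;
    final Map<String, String> groups;

    Snapshot(Map<String, String> users, Map<String, String> groups) {
      this.users = Collections.unmodifiableMap(users);
      this.groups = Collections.unmodifiableMap(groups);
    }
  }

  private volatile Snapshot snapshot =
      new Snapshot(new HashMap<String, String>(), new HashMap<String, String>());

  // synchronized serializes concurrent updates; readers never take this lock.
  synchronized void update(Map<String, String> perms) {
    Map<String, String> users = new HashMap<String, String>();
    Map<String, String> groups = new HashMap<String, String>();
    for (Map.Entry<String, String> e : perms.entrySet()) {
      // Assumption for this sketch: a leading '@' marks a group principal.
      if (e.getKey().startsWith("@")) {
        groups.put(e.getKey().substring(1), e.getValue());
      } else {
        users.put(e.getKey(), e.getValue());
      }
    }
    snapshot = new Snapshot(users, groups); // one volatile write swaps both maps
  }

  boolean userAllowed(String user, char action) {
    String actions = snapshot.users.get(user);
    return actions != null && actions.indexOf(action) >= 0;
  }

  boolean groupAllowed(String group, char action) {
    String actions = snapshot.groups.get(group);
    return actions != null && actions.indexOf(action) >= 0;
  }
}
```

 Because the old snapshot stays in place until the new one is fully built, a 
 check that arrives during an update sees either the old permissions or the 
 new ones, never an empty cache.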



[jira] [Commented] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463081#comment-13463081
 ] 

Hudson commented on HBASE-6784:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6784 TestCoprocessorScanPolicy is sometimes flaky when run locally 
(Revision 1389619)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestCoprocessorScanPolicy.java


 TestCoprocessorScanPolicy is sometimes flaky when run locally
 -

 Key: HBASE-6784
 URL: https://issues.apache.org/jira/browse/HBASE-6784
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.2, 0.96.0

 Attachments: 6784.txt


 The problem is not seen in the Jenkins build.  
 When we run TestCoprocessorScanPolicy.testBaseCases locally or in our 
 internal Jenkins, we tend to get random failures.  The reason is that the 2 puts 
 we do here sometimes get the same timestamp.  This leads to 
 improper scan results, as the version check tends to skip one of the rows 
 when it sees the same timestamp. Marking this as minor.  As we are trying to 
 solve test-case-related failures, I am raising this in case we need to resolve 
 it as well.
 For example,
 both puts get the time
 {code}
 time 1347635287360
 time 1347635287360
 {code}
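
 One way a test could avoid the collision is to hand out its own timestamps. 
 The sketch below is a hypothetical helper (MonotonicClock is not an HBase 
 class, and this is not necessarily what the attached 6784.txt does); it 
 returns strictly increasing values even when called twice within the same 
 millisecond.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical test helper: strictly increasing timestamps, so two
// back-to-back puts can never share one.
final class MonotonicClock {
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  long next() {
    while (true) {
      long prev = last.get();
      // Use wall-clock time when it has advanced, otherwise bump by one.
      long now = Math.max(System.currentTimeMillis(), prev + 1);
      if (last.compareAndSet(prev, now)) {
        return now;
      }
    }
  }
}
```

 A test would then set the returned value explicitly on each put instead of 
 relying on the server-assigned time.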



[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6637:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Move DaemonThreadFactory into Threads and Threads to hbase-common
 -

 Key: HBASE-6637
 URL: https://issues.apache.org/jira/browse/HBASE-6637
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
 hbase-6637-v0.patch, hbase-6637-v2.patch






[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6637:
-


Committed to 0.96 (for the new files first, added those in a 2nd commit).

 Move DaemonThreadFactory into Threads and Threads to hbase-common
 -

 Key: HBASE-6637
 URL: https://issues.apache.org/jira/browse/HBASE-6637
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
 hbase-6637-v0.patch, hbase-6637-v2.patch






[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463090#comment-13463090
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94 #488 (See 
[https://builds.apache.org/job/HBase-0.94/488/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
 HFile it does not need to read the checksum from the HDFS meta file.  But 
 HLog files do not contain checksums, so when HBase reads an HLog it 
 must read the checksum from the HDFS meta file.  We could add a per-file 
 setSkipChecksum to HDFS, or we could write checksums into the WAL if this 
 skip-checksum facility is enabled.



[jira] [Assigned] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates reassigned HBASE-6879:
--

Assignee: Jesse Yates

 Add HBase Code Template
 ---

 Key: HBASE-6879
 URL: https://issues.apache.org/jira/browse/HBASE-6879
 Project: HBase
  Issue Type: Bug
  Components: build, documentation
Reporter: Jesse Yates
Assignee: Jesse Yates

 Add a standard code template to go along with the code formatter for HBase. 
 This helps make sure people have the correct license and general commenting 
 for auto-generated elements.



[jira] [Created] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)
Jesse Yates created HBASE-6879:
--

 Summary: Add HBase Code Template
 Key: HBASE-6879
 URL: https://issues.apache.org/jira/browse/HBASE-6879
 Project: HBase
  Issue Type: Bug
  Components: build, documentation
Reporter: Jesse Yates


Add a standard code template to go along with the code formatter for HBase. 
This helps make sure people have the correct license and general commenting for 
auto-generated elements.



[jira] [Updated] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6879:
---

Attachment: HBase Code Template.xml

Attaching template to go into hbase/dev-support. Easier to see this way than as 
an actual patch.

 Add HBase Code Template
 ---

 Key: HBASE-6879
 URL: https://issues.apache.org/jira/browse/HBASE-6879
 Project: HBase
  Issue Type: Bug
  Components: build, documentation
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: HBase Code Template.xml


 Add a standard code template to go along with the code formatter for HBase. 
 This helps make sure people have the correct license and general commenting 
 for auto-generated elements.



[jira] [Commented] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463098#comment-13463098
 ] 

Jesse Yates commented on HBASE-6879:


[~saint@gmail.com] here's a stab at a code template to go with the 
formatter from HBASE-5961

 Add HBase Code Template
 ---

 Key: HBASE-6879
 URL: https://issues.apache.org/jira/browse/HBASE-6879
 Project: HBase
  Issue Type: Bug
  Components: build, documentation
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: HBase Code Template.xml


 Add a standard code template to go along with the code formatter for HBase. 
 This helps make sure people have the correct license and general commenting 
 for auto-generated elements.



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463103#comment-13463103
 ] 

Hudson commented on HBASE-6637:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6637 Argghh... Missed deleted files too (Revision 1390040)
HBASE-6637 Missed new files (Revision 1390035)
HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common 
(Jesse Yates) (Revision 1390034)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


 Move DaemonThreadFactory into Threads and Threads to hbase-common
 -

 Key: HBASE-6637
 URL: https://issues.apache.org/jira/browse/HBASE-6637
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
 hbase-6637-v0.patch, hbase-6637-v2.patch






[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463104#comment-13463104
 ] 

Hudson commented on HBASE-3678:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Add Eclipse-based Apache Formatter to HBase Wiki
 

 Key: HBASE-3678
 URL: https://issues.apache.org/jira/browse/HBASE-3678
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial
 Fix For: 0.92.0

 Attachments: eclipse_formatter_apache.xml


 Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell 
 the user to follow Sun's code conventions and then add a couple things.  For 
 lazy people like myself, it would be much easier to just tell us to import an 
 Apache formatter into your Eclipse project and not worry about it.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463105#comment-13463105
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
 HFile it does not need to read the checksum from the HDFS meta file.  But 
 HLog files do not contain checksums, so when HBase reads an HLog it 
 must read the checksum from the HDFS meta file.  We could add a per-file 
 setSkipChecksum to HDFS, or we could write checksums into the WAL if this 
 skip-checksum facility is enabled.



[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463106#comment-13463106
 ] 

Hudson commented on HBASE-5691:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Importtsv stops the webservice from which it is evoked
 --

 Key: HBASE-5691
 URL: https://issues.apache.org/jira/browse/HBASE-5691
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: debarshi basak
Priority: Minor

 I was trying to run importtsv from a servlet. Every time, after the job 
 completed, the Tomcat server was shut down.



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463109#comment-13463109
 ] 

Jesse Yates commented on HBASE-5961:


hmmm, looks like we might need to add this to the rat excludes file too.

 New standard HBase code formatter
 -

 Key: HBASE-5961
 URL: https://issues.apache.org/jira/browse/HBASE-5961
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBase-Formmatter.xml


 There is currently no good way of distributing the formatter that is currently the 
 'standard' in HBase. The standard Apache formatter is actually not very close 
 to what we consider 'good'/'pretty' code. Further, it's not trivial to 
 set up a good formatter.
 Proposing two things: 
 1) Add a formatter to the dev tools and call out the formatter usage 
 in the docs
 2) Move to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463115#comment-13463115
 ] 

Lars Hofhansl commented on HBASE-6401:
--

Hadoop-2 has other issues, though (see last few comments on HDFS-744).

 HBase may lose edits after a crash if used with HDFS 1.0.3 or older
 ---

 Key: HBASE-6401
 URL: https://issues.apache.org/jira/browse/HBASE-6401
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Attachments: TestReadAppendWithDeadDN.java


 This comes from an HDFS bug, fixed in some HDFS versions. I haven't found the 
 HDFS JIRA for this.
 Context: the HBase Write Ahead Log feature. It uses HDFS append. If the 
 node crashes, the file that was being written is read by other processes to replay 
 the edits.
 - So we have in HDFS one (dead) process writing with another process reading.
 - But, despite the call to syncFs, we don't always see the data when we have 
 a dead node. It seems to be because the call in DFSClient#updateBlockInfo 
 ignores the IPC errors and sets the length to 0.
 - So we may miss all the writes to the last block if we try to connect to the 
 dead DN.
 hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
 hdfs branch-2 or trunk: we should not have the issue (but not tested)
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
 The attached test will fail ~50% of the time.
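
 The failure mode described above (an error from one datanode being turned 
 into a length of 0) can be contrasted with a lookup that tries the remaining 
 replicas and fails loudly if none answers. This is only an illustrative 
 sketch; LastBlockLength and Replica are hypothetical names, and this is 
 neither the DFSClient code nor the HDFS fix.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch: ask each replica for the visible length of the last block.
final class LastBlockLength {
  private LastBlockLength() {}

  interface Replica {
    long visibleLength() throws IOException;
  }

  // Returns the first length that a replica reports. An error from one replica
  // is never converted into a length of 0; if every replica fails, the caller
  // gets an exception instead of silently losing the tail of the file.
  static long visibleLength(List<Replica> replicas) {
    IOException last = null;
    for (Replica replica : replicas) {
      try {
        return replica.visibleLength();
      } catch (IOException e) {
        last = e; // remember the failure and try the next replica
      }
    }
    throw new IllegalStateException("no replica reported a length", last);
  }
}
```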



[jira] [Created] (HBASE-6880) Failure in assigning root causes system hang

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6880:
--

 Summary: Failure in assigning root causes system hang
 Key: HBASE-6880
 URL: https://issues.apache.org/jira/browse/HBASE-6880
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang


While looking into a TestReplication failure, I found that assignRoot can 
sometimes fail, for example when the RS is not serving traffic yet.  In this case, the 
master keeps waiting for root to become available, which may never happen.
 
We need to gracefully terminate the master if root is not assigned properly.
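
A minimal sketch of a bounded wait, assuming the master can be told when root 
is assigned. RootAssignmentWaiter, markAssigned and awaitRoot are hypothetical 
names for illustration; this is not HMaster code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: wait for an assignment signal with a deadline.
final class RootAssignmentWaiter {
  private final CountDownLatch rootAssigned = new CountDownLatch(1);

  // Called once root has been assigned successfully.
  void markAssigned() {
    rootAssigned.countDown();
  }

  // Returns true if root was assigned within the timeout, false otherwise,
  // so the caller can abort instead of blocking forever.
  boolean awaitRoot(long timeoutMs) {
    try {
      return rootAssigned.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A caller that gets false back can log the failure and shut down cleanly rather 
than hang.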



[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463144#comment-13463144
 ] 

nkeywal commented on HBASE-6401:


HDFS-3701 has just been fixed, so we may have a reasonable hdfs 1.1 version as 
HDFS-3703 made it as well. We need HDFS-3912 to be complete from a failure 
management point of view. Then there is the question of durability...

 HBase may lose edits after a crash if used with HDFS 1.0.3 or older
 ---

 Key: HBASE-6401
 URL: https://issues.apache.org/jira/browse/HBASE-6401
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Attachments: TestReadAppendWithDeadDN.java


 This comes from a hdfs bug, fixed in some hdfs versions. I haven't found the 
 hdfs jira for this.
 Context: HBase Write Ahead Log features. This is using hdfs append. If the 
 node crashes, the file that was written is read by other processes to replay 
 the action.
 - So we have in hdfs one (dead) process writing with another process reading.
 - But despite the call to syncFs, we don't always see the data when we have 
 a dead node. This seems to be because the call in DFSClient#updateBlockInfo 
 ignores the IPC errors and sets the length to 0.
 - So we may miss all the writes to the last block if we try to connect to the 
 dead DN.
 hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
 hdfs branch-2 or trunk: we should not have the issue (but not tested)
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
 The attached test will fail ~50% of the time.
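
The failure mode described above (an IPC error being read as "the last block has length 0") can be illustrated with a small sketch. This is not the DFSClient code: LastBlockLength, Replica and resolve are hypothetical names, and the sketch only assumes that the reader can ask each replica in turn for the visible length of the last block.

```java
import java.io.IOException;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical sketch: never turn an IPC failure into "length 0". */
final class LastBlockLength {
  /** Stand-in for asking one datanode for the visible length of the last block. */
  interface Replica {
    long visibleLength() throws IOException;
  }

  /**
   * Asks each replica in order. An IOException (for example a dead datanode)
   * moves on to the next replica. If no replica answers, the result is empty,
   * which is distinct from a real length of 0.
   */
  static OptionalLong resolve(List<Replica> replicas) {
    for (Replica replica : replicas) {
      try {
        return OptionalLong.of(replica.visibleLength());
      } catch (IOException e) {
        // unreachable replica: try the next one instead of assuming length 0
      }
    }
    return OptionalLong.empty();
  }
}
```

Returning an empty result lets the caller decide between retrying and failing, instead of silently dropping the writes in the last block.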



[jira] [Created] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6881:
--

 Summary: All regionservers are marked offline even there is still 
one up
 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


This is an issue caused by HBASE-6438:

{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, plan.getDestination() could be up actually.
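
A possible guard is sketched below. It is an illustration only, under the assumption that the second argument passed to getRegionPlan is a server to exclude, so that a null plan says nothing about the original destination. OfflineCheck, shouldMarkAllOffline and the use of plain strings for server names are hypothetical, not the AssignmentManager API.

```java
import java.util.Set;

/** Hypothetical sketch: only flag "all region servers offline" when that is really true. */
final class OfflineCheck {
  /**
   * @param newDestination      destination of the forced new plan, or null if none was found
   * @param originalDestination destination of the original plan (excluded while replanning)
   * @param onlineServers       servers currently known to be online
   */
  static boolean shouldMarkAllOffline(String newDestination,
                                      String originalDestination,
                                      Set<String> onlineServers) {
    if (newDestination != null) {
      return false; // a plan exists, nothing to flag
    }
    // A null plan only shows that no *other* server is available,
    // so check the original destination before giving up.
    return originalDestination == null || !onlineServers.contains(originalDestination);
  }
}
```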





[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Description: 
{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, plan.getDestination() could be up actually.



  was:
This is an issue caused by HBASE-6438:

{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, plan.getDestination() could be up actually.




 All regionservers are marked offline even there is still one up
 ---

 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

 {noformat}
 +RegionPlan newPlan = plan;
 +if (!regionAlreadyInTransitionException) {
 +  // Force a new plan and reassign. Will return null if no servers.
 +  newPlan = getRegionPlan(state, plan.getDestination(), true);
 +}
 +if (newPlan == null) {
this.timeoutMonitor.setAllRegionServersOffline(true);
LOG.warn("Unable to find a viable location to assign region " +
  state.getRegion().getRegionNameAsString());
 {noformat}
 Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463162#comment-13463162
 ] 

Jimmy Xiang commented on HBASE-6881:


Actually, this is NOT an issue caused by HBASE-6438; it is an existing issue. 
I have fixed the description accordingly.

During unit tests there may be just one region server, so this can lead to 
HBASE-6880 and hanging tests.

 All regionservers are marked offline even there is still one up
 ---

 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

 {noformat}
 +RegionPlan newPlan = plan;
 +if (!regionAlreadyInTransitionException) {
 +  // Force a new plan and reassign. Will return null if no servers.
 +  newPlan = getRegionPlan(state, plan.getDestination(), true);
 +}
 +if (newPlan == null) {
this.timeoutMonitor.setAllRegionServersOffline(true);
LOG.warn("Unable to find a viable location to assign region " +
  state.getRegion().getRegionNameAsString());
 {noformat}
 Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Created] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Mikhail Bautin (JIRA)
Mikhail Bautin created HBASE-6882:
-

 Summary: Thrift IOError should include exception class
 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin


Return exception class as part of IOError thrown from the Thrift proxy or the 
embedded Thrift server in the regionserver.
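
A minimal sketch of the idea follows. The IOError class here is a hand-written stand-in for the Thrift-generated struct, and the exceptionClass field name and the convert helper are assumptions made for illustration, not what the patch necessarily uses.

```java
/** Hypothetical sketch: carry the server-side exception class in the error sent to clients. */
final class IOErrorSketch {
  /** Stand-in for the Thrift-generated IOError struct. */
  static final class IOError extends Exception {
    final String exceptionClass; // assumed extra field holding the fully qualified class name

    IOError(String message, String exceptionClass) {
      super(message);
      this.exceptionClass = exceptionClass;
    }
  }

  /** Wraps a server-side exception, preserving its class name for the client. */
  static IOError convert(Throwable cause) {
    return new IOError(String.valueOf(cause.getMessage()), cause.getClass().getName());
  }
}
```

With the class name available, a client in another language can branch on the error type without parsing the message text.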



[jira] [Updated] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6882:
---

Attachment: D5679.1.patch

mbautin requested code review of [jira] [HBASE-6882] [89-fb] Thrift IOError 
should include exception class.
Reviewers: Liyin, Karthik, aaiyer, chip, JIRA

  Return exception class as part of IOError thrown from the Thrift proxy or the 
embedded Thrift server in the regionserver.

TEST PLAN
  Unit tests
  Test through C++ HBase client

REVISION DETAIL
  https://reviews.facebook.net/D5679

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/RegionException.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/main/java/org/apache/hadoop/hbase/thrift/generated/IOError.java
  src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin


 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Attachment: trunk-6881.patch

 All regionservers are marked offline even there is still one up
 ---

 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-6881.patch


 {noformat}
 +RegionPlan newPlan = plan;
 +if (!regionAlreadyInTransitionException) {
 +  // Force a new plan and reassign. Will return null if no servers.
 +  newPlan = getRegionPlan(state, plan.getDestination(), true);
 +}
 +if (newPlan == null) {
this.timeoutMonitor.setAllRegionServersOffline(true);
LOG.warn("Unable to find a viable location to assign region " +
  state.getRegion().getRegionNameAsString());
 {noformat}
 Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Status: Patch Available  (was: Open)

 All regionservers are marked offline even there is still one up
 ---

 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-6881.patch


 {noformat}
 +RegionPlan newPlan = plan;
 +if (!regionAlreadyInTransitionException) {
 +  // Force a new plan and reassign. Will return null if no servers.
 +  newPlan = getRegionPlan(state, plan.getDestination(), true);
 +}
 +if (newPlan == null) {
this.timeoutMonitor.setAllRegionServersOffline(true);
LOG.warn("Unable to find a viable location to assign region " +
  state.getRegion().getRegionNameAsString());
 {noformat}
 Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463218#comment-13463218
 ] 

Mikhail Bautin commented on HBASE-6882:
---

Phabricator diff for 0.89-fb: https://reviews.facebook.net/D5679


 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Updated] (HBASE-5456) Introduce PowerMock into our unit tests to reduce unnecessary method exposure

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5456:
---

Attachment: hbase-5456-v0.patch

Attaching patch to add jmockit and powermock to the test dependencies.

For more discussion and examples of why it's the right way to go, see 
http://search-hadoop.com/m/HbsjjRSKLc2

 Introduce PowerMock into our unit tests to reduce unnecessary method exposure
 -

 Key: HBASE-5456
 URL: https://issues.apache.org/jira/browse/HBASE-5456
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
 Attachments: hbase-5456-v0.patch


 We should introduce PowerMock into our unit tests so that we don't have to 
 expose methods intended to be used by unit tests.
 Here was Benoit's reply to a user of asynchbase about testability:
 OpenTSDB has unit tests that are mocking out HBaseClient just fine
 [1].  You can mock out pretty much anything on the JVM: final,
 private, JDK stuff, etc.  All you need is the right tools.  I've been
 very happy with PowerMock.  It supports Mockito and EasyMock.
 I've never been keen on mutilating public interfaces for the sake of
 testing.  With tools like PowerMock, we can keep the public APIs tidy
 while mocking and overriding anything, even in the most private guts
 of the classes.
  [1] 
 https://github.com/stumbleupon/opentsdb/blob/master/src/uid/TestUniqueId.java#L66



[jira] [Created] (HBASE-6883) CleanerChore treats .archive as a table and throws TableInfoMissingException

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6883:
--

 Summary: CleanerChore treats .archive as a table and throws 
TableInfoMissingException
 Key: HBASE-6883
 URL: https://issues.apache.org/jira/browse/HBASE-6883
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang


{noformat}
2012-09-25 14:52:21,902 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor. Current table name = .archive
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://c0322.hal.cloudera.com:56020/hbase/.archive
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:417)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:408)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:170)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:201)
    at org.apache.hadoop.hbase.master.HMaster.getTableDescriptors(HMaster.java:2205)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:357)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
{noformat}
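
The trace suggests that every directory under the HBase root is being treated as a table. One possible fix is to skip known non-table directories before looking for a .tableinfo file. The sketch below is illustrative only: TableDirFilter, NON_TABLE_DIRS and tableDirs are hypothetical names, and the exclusion set lists only .archive because that is the one directory named in this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: filter non-table directories out of the root listing. */
final class TableDirFilter {
  // Assumption: only ".archive" is known from this report to be a non-table directory.
  static final Set<String> NON_TABLE_DIRS = new HashSet<>(Arrays.asList(".archive"));

  /** Returns the entries that should be treated as table directories, in order. */
  static List<String> tableDirs(List<String> rootEntries) {
    List<String> tables = new ArrayList<>();
    for (String name : rootEntries) {
      if (!NON_TABLE_DIRS.contains(name)) {
        tables.add(name);
      }
    }
    return tables;
  }
}
```

An explicit exclusion set is used rather than a "starts with a dot" rule, since a name-prefix rule could also hide directories that really are tables.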



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463294#comment-13463294
 ] 

Phabricator commented on HBASE-6882:


Liyin has accepted the revision [jira] [HBASE-6882] [89-fb] Thrift IOError 
should include exception class.

  LGTM !

REVISION DETAIL
  https://reviews.facebook.net/D5679

BRANCH
  ioerror_class_name

To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin


 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463307#comment-13463307
 ] 

Hadoop QA commented on HBASE-6881:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546581/trunk-6881.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//console

This message is automatically generated.

 All regionservers are marked offline even there is still one up
 ---

 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-6881.patch


 {noformat}
 +RegionPlan newPlan = plan;
 +if (!regionAlreadyInTransitionException) {
 +  // Force a new plan and reassign. Will return null if no servers.
 +  newPlan = getRegionPlan(state, plan.getDestination(), true);
 +}
 +if (newPlan == null) {
this.timeoutMonitor.setAllRegionServersOffline(true);
LOG.warn("Unable to find a viable location to assign region " +
  state.getRegion().getRegionNameAsString());
 {noformat}
 Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Commented] (HBASE-6424) TestReplication frequently hangs

2012-09-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463308#comment-13463308
 ] 

Jimmy Xiang commented on HBASE-6424:


May relate to HBASE-6880

 TestReplication frequently hangs
 

 Key: HBASE-6424
 URL: https://issues.apache.org/jira/browse/HBASE-6424
 Project: HBase
  Issue Type: Bug
  Components: Replication, test
Affects Versions: 0.94.0
Reporter: Andrew Purtell
 Attachments: testReplication.jstack


 TestReplication frequently hangs. Separated out from HBASE-6406.



[jira] [Updated] (HBASE-6572) Tiered HFile storage

2012-09-25 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6572:
--

Description: 
Consider how we might enable tiered HFile storage. If HDFS has the capability, 
we could create certain files on solid state devices where they might be 
frequently accessed, especially for random reads; and others (and by default) 
on spinning media as before. We could support the move of frequently read 
HFiles from spinning media to solid state. We already have CF statistics for 
this, would only need to add requisite admin interface; could even consider an 
autotiering option. 

Dhruba Borthakur did some early work in this area and wrote up his findings: 
http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . It 
is important to note the findings but I suggest most of the recommendations are 
out of scope of this JIRA. This JIRA seeks to find an initial use case that 
produces a reasonable benefit, and serves as a testbed for further 
improvements. If I may paraphrase Dhruba's findings (any misstatements and 
errors are mine): First, the DFSClient code paths introduce significant 
latency, so the HDFS client (and presumably the DataNode, as the next 
bottleneck) will need significant work to knock that down. Need to investigate 
optimized (perhaps read-only) DFS clients, server side read and caching 
strategies. Second, RegionServers are heavily threaded and this imposes a lot 
of monitor contention and context switching cost. Need to investigate reducing 
the number of threads in a RegionServer, nonblocking IO and RPC.

  was:Consider how we might enable tiered HFile storage. If HDFS has the 
capability, we could create certain files on solid state devices where they 
might be frequently accessed, especially for random reads; and others (and by 
default) on spinning media as before. We could support the move of frequently 
read HFiles from spinning media to solid state. We already have CF statistics 
for this, would only need to add requisite admin interface; could even consider 
an autotiering option. 


 Tiered HFile storage
 

 Key: HBASE-6572
 URL: https://issues.apache.org/jira/browse/HBASE-6572
 Project: HBase
  Issue Type: Brainstorming
Reporter: Andrew Purtell
Assignee: Andrew Purtell

 Consider how we might enable tiered HFile storage. If HDFS has the 
 capability, we could create certain files on solid state devices where they 
 might be frequently accessed, especially for random reads; and others (and by 
 default) on spinning media as before. We could support the move of frequently 
 read HFiles from spinning media to solid state. We already have CF statistics 
 for this, would only need to add requisite admin interface; could even 
 consider an autotiering option. 
 Dhruba Borthakur did some early work in this area and wrote up his findings: 
 http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . 
 It is important to note the findings but I suggest most of the 
 recommendations are out of scope of this JIRA. This JIRA seeks to find an 
 initial use case that produces a reasonable benefit, and serves as a testbed 
 for further improvements. If I may paraphrase Dhruba's findings (any 
 misstatements and errors are mine): First, the DFSClient code paths introduce 
 significant latency, so the HDFS client (and presumably the DataNode, as the 
 next bottleneck) will need significant work to knock that down. Need to 
 investigate optimized (perhaps read-only) DFS clients, server side read and 
 caching strategies. Second, RegionServers are heavily threaded and this 
 imposes a lot of monitor contention and context switching cost. Need to 
 investigate reducing the number of threads in a RegionServer, nonblocking IO 
 and RPC.



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463344#comment-13463344
 ] 

Hudson commented on HBASE-6637:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-6637 Argghh... Missed deleted files too (Revision 1390040)
HBASE-6637 Missed new files (Revision 1390035)
HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common 
(Jesse Yates) (Revision 1390034)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


 Move DaemonThreadFactory into Threads and Threads to hbase-common
 -

 Key: HBASE-6637
 URL: https://issues.apache.org/jira/browse/HBASE-6637
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0

 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
 hbase-6637-v0.patch, hbase-6637-v2.patch






[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463345#comment-13463345
 ] 

Hudson commented on HBASE-3678:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Add Eclipse-based Apache Formatter to HBase Wiki
 

 Key: HBASE-3678
 URL: https://issues.apache.org/jira/browse/HBASE-3678
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial
 Fix For: 0.92.0

 Attachments: eclipse_formatter_apache.xml


 Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell 
 the user to follow Sun's code conventions and then add a couple things.  For 
 lazy people like myself, it would be much easier to just tell us to import an 
 Apache formatter into your Eclipse project and not worry about it.



[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463347#comment-13463347
 ] 

Hudson commented on HBASE-5691:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


 Importtsv stops the webservice from which it is evoked
 --

 Key: HBASE-5691
 URL: https://issues.apache.org/jira/browse/HBASE-5691
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: debarshi basak
Priority: Minor

 I was trying to run importtsv from a servlet. Every time the job completed, 
 the Tomcat server was shut down.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463346#comment-13463346
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 Skip checksum is broke; are we double-checksumming by default?
 --

 Key: HBASE-6868
 URL: https://issues.apache.org/jira/browse/HBASE-6868
 Project: HBase
  Issue Type: Bug
  Components: HFile, wal
Affects Versions: 0.94.0, 0.94.1
Reporter: LiuLei
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.2, 0.96.0

 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
 6868-0.96-v3.txt


 The HFile contains checksums in order to decrease IOPS, so when HBase reads an 
 HFile it doesn't need to read the checksum from the HDFS meta file. But the HLog 
 file of HBase doesn't contain a checksum, so when HBase reads the HLog it 
 must read the checksum from the HDFS meta file. We could add setSkipChecksum per 
 file to HDFS, or we could write checksums into the WAL if this skip-checksum 
 facility is enabled.



[jira] [Updated] (HBASE-6353) Snapshots shell

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6353:
---

Issue Type: Sub-task  (was: New Feature)
Parent: HBASE-6055

 Snapshots shell
 ---

 Key: HBASE-6353
 URL: https://issues.apache.org/jira/browse/HBASE-6353
 Project: HBase
  Issue Type: Sub-task
  Components: shell
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Attachments: HBASE-6353-v0.patch


 h6. hbase shell with snapshot commands
 * snapshot <snapshot name> <table name>
 ** Take a snapshot of the specified table with the specified name
 * restore_snapshot <snapshot name>
 ** Restore specified snapshot on the original table
 * mount_snapshot <snapshot name> <table name> [readonly]
 ** Load the snapshot data as specified table (optional readonly flag)
 * list_snapshots [filter]
 ** Show a list of snapshots
 * delete_snapshot <snapshot name>
 ** Remove a specified snapshot
 h6. Restore Table
 Given a snapshot name, restore overrides the original table with the snapshot 
 content.
 Before restoring, a new snapshot of the table is taken, just to avoid bad 
 situations.
 (If the table is not disabled we can keep serving reads)
 This allows a full and quick rollback to a previous snapshot.
 h6. Mount Table (Aka Clone Table)
 Given a snapshot name a new table is created with the content of the 
 specified snapshot.
 This operation allows:
  * To have an old version of the table in parallel with the current one.
  ** Look at snapshot side-by-side with the current before making the 
 decision whether to roll back or not
  * To Restore only individual items (only some small range of data was lost 
 from current)
  ** MR job that scans the cloned table and updates the data in the original 
 one. (Partial restore of the data)
  * if the table is not marked as read-only
  ** To Add/Remove data from this table without affecting the original one or 
 the snapshot.
 h6. Open points
  * Add snapshot type option on take snapshot command (global, timestamp)?
  * Keep separate the restore from mount?



[jira] [Updated] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-09-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6025:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk.  Thanks for the patch Elliott.

 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, 
 HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, 
 hbase-jmx.patch, hbase-jmx.patch






[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-09-25 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463401#comment-13463401
 ] 

Jean-Daniel Cryans commented on HBASE-5844:
---

One thing that worries me about this patch is the situation where the pid file is 
gone and someone tries to start the region server. It happened to me a bunch of 
times. I tried it with your patch and since it removes the ephemeral znode it 
_kills_ the region server that's already running and doesn't start a new one 
because the ports are already occupied.

I'm not sure if this is related to this patch, but we're now missing info when 
using the scripts. We used to have:

{noformat}
su-jdcryans-2:0.94 jdcryans$ ./bin/start-hbase.sh 
localhost: starting zookeeper, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-zookeeper-h-25-185.sfo.stumble.net.out
starting master, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-master-h-25-185.sfo.stumble.net.out
localhost: starting regionserver, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-regionserver-h-25-185.sfo.stumble.net.out
{noformat}

Now we have:

{noformat}
su-jdcryans-2:trunk-commit jdcryans$ ./bin/start-hbase.sh 

su-jdcryans-2:trunk-commit jdcryans$ 
{noformat}

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the recovery 
 starts immediately.



[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463409#comment-13463409
 ] 

stack commented on HBASE-5844:
--

Looking at this w/ j-d, now we no longer do nohup so the parent process can 
stick around to watch out for the server crash. This makes it so there are now 
two hbase processes listed per launched daemon.  This is kinda ugly.

When we have this bash script watching the running java process we verge into 
the territory normally occupied by babysitters like supervise.   Our parent 
bash script will always be less than a real babysitter -- supervise, god, etc. 
-- so maybe we should just have this kill znode as an optional script w/ 
prescription for how to set it up -- e.g. run znode remover on daemon crash 
before starting new one (if we want supervise to start a new one).

I'm thinking we should back this out since there are open questions still.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the recovery 
 starts immediately.



[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463433#comment-13463433
 ] 

Jesse Yates commented on HBASE-6055:


I was going through the offline snapshot code 
(https://github.com/jyates/hbase/tree/offline-snapshots) and noticed that 
apparently I wrote the following:
{code}
Path editsdir = 
HLog.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir,regionInfo.getEncodedName()));
WALReferenceTask op = new WALReferenceTask(snapshot, this.monitor, editsdir, 
conf, fs, disabledTableSnapshot);
{code}

For referencing the current hfiles for a disabled table, this makes no sense. 
However, it got me thinking about dealing with recovered edits for a table. 
Even if a table is disabled, it may have recovered edits that haven't been 
applied to the table (a RS comes up, splits the logs, but then dies again 
before replaying the split log). 

If I'm reading the log-splitting code correctly, I think it archives the 
original HLog after splitting, but not before the edits are applied to the 
region. This would mean we also need to reference the recovered.edits directory 
under each region, if we keep the current implementation...right?

I was thinking that instead we can keep the hlogs around in the .logs 
directory until the recovered.edits files for that log file have been replayed. 
This way we can avoid another task for snapshotting (referencing all the 
recovered edits) and keep everything fairly simple. There would need to 
be some extra work to keep track of the source hlog - either an 'info' file for 
the source hlog that lists the written recovered.edits files or special naming 
of the recovered.edits files that point back to the source file. 

Thoughts?

 Snapshots in HBase 0.96
 ---

 Key: HBASE-6055
 URL: https://issues.apache.org/jira/browse/HBASE-6055
 Project: HBase
  Issue Type: New Feature
  Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055, 0.96.0

 Attachments: Snapshots in HBase.docx


 Continuation of HBASE-50 for the current trunk. Since the implementation has 
 drastically changed, opening as a new ticket.



[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463466#comment-13463466
 ] 

chunhui shen commented on HBASE-6870:
-

[~v.himanshu]
These two if statements were not introduced by this patch, so I just kept the previous code.

{code}
public LinkedHashMap<byte[], HRegionLocation> getKeysToRegionsInRange(
{code}

Yes, it could be private.

Thanks for the review.

I will rework the patch to address the other comments later.

 HTable#coprocessorExec always scan the whole table 
 ---

 Key: HBASE-6870
 URL: https://issues.apache.org/jira/browse/HBASE-6870
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
 HBASE-6870v2.patch, HBASE-6870v3.patch


 In the current logic, HTable#coprocessorExec always scans the whole table. This 
 is inefficient and will affect the RegionServer carrying .META. under 
 a large number of coprocessorExec requests.
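
The idea of touching only the regions that overlap the requested key range, rather than every region of the table, can be sketched as follows. This is a minimal illustration and not the code in the attached patch: the class RegionRangeSketch and the method regionsInRange are hypothetical, and String keys are assumed for readability where HBase uses byte[].

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical sketch; HBase itself keys regions by byte[] rather than String.
final class RegionRangeSketch {

    // regions maps each region's start key to a region name; the first region starts at "".
    static List<String> regionsInRange(NavigableMap<String, String> regions,
                                       String startKey, String endKey) {
        if (regions.isEmpty()) {
            return Collections.emptyList();
        }
        // The region holding startKey has the greatest start key <= startKey.
        String first = regions.floorKey(startKey);
        if (first == null) {
            first = regions.firstKey();
        }
        if (first.compareTo(endKey) > 0) {
            return Collections.emptyList();
        }
        // Only regions whose start key lies in [first, endKey] can overlap the range.
        return new ArrayList<>(regions.subMap(first, true, endKey, true).values());
    }
}
```

With a sorted map of start keys, the lookup cost depends on the number of regions in the range and not on the size of the table.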



[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463471#comment-13463471
 ] 

Hudson commented on HBASE-6025:
---

Integrated in HBase-TRUNK #3379 (See 
[https://builds.apache.org/job/HBase-TRUNK/3379/])
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; 
REAPPLY (Revision 1390240)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; REVERT -- 
OVERCOMMIT (Revision 1390239)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface (Revision 
1390238)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* 
/hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp

stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* 
/hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb

stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* 
/hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb


 Expose Hadoop Dynamic Metrics through JSON Rest interface
 -

 Key: HBASE-6025
 URL: https://issues.apache.org/jira/browse/HBASE-6025
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, 
 HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, 
 hbase-jmx.patch, hbase-jmx.patch






[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463494#comment-13463494
 ] 

Devaraj Das commented on HBASE-6679:


Okay, did some digging into the logs (that were attached in the jira earlier) 
and the code. Doesn't seem like a race between compaction and split (apologies 
for the confusion I might have created). The two are sequential (at the end of 
a compaction, a split is requested). But I'll note that the split happens in 
a separate thread.

The problem is that the daughter tries to open a reader to a file that doesn't 
exist. 
{noformat}
java.io.IOException: Failed 
ip-10-4-197-133.ec2.internal,60020,1346119706203-daughterOpener=4efb1c92918bbf3c54d0ead3345bb735
at 
org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:368)
at 
org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:456)
at 
org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: 
/apps/hbase/data/TestLoadAndVerify_1346120615716/5689a8785bbc9a8aa8e526cd7ef1542a/f1/5a55df83829f401993d95ecf2e539ba1
{noformat}

The method SplitTransaction.createDaughters creates the reference files (via a 
call to the method SplitTransaction.splitStoreFiles) that the daughter then 
tries to open. The list of files to create references to is the set of entries 
in the storeFiles field in Store.java (obtained via the call to 
this.parent.close in createDaughters). The storeFiles is last updated (in the 
thread doing the compaction) in the method Store.completeCompaction.

My suspicion is that the problem is due to the fact that accesses to storeFiles 
are not synchronized, and the field is not volatile either. This leads to 
inconsistencies between the compaction thread and the split thread, and the split 
thread doesn't see the last updated value of the field.

If the above theory is right (and it is only a theory), then the solution 
could be to make the storeFiles field volatile.

Thoughts?
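
The visibility argument above can be illustrated with a minimal sketch. It assumes a simplified stand-in for Store: the class StoreSketch and its methods are hypothetical and are not the HBase code. The writer builds a new immutable list and publishes it with a single write to a volatile field, so a thread that reads the field afterwards is guaranteed to see the updated list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Store; not the HBase class.
final class StoreSketch {

    // volatile: a write by one thread happens-before later reads by other threads.
    private volatile List<String> storeFiles = Collections.emptyList();

    // Writers are serialized so the read-modify-write below is not interleaved.
    synchronized void addFile(String file) {
        List<String> next = new ArrayList<>(storeFiles);
        next.add(file);
        storeFiles = Collections.unmodifiableList(next);
    }

    // Stands in for the compaction thread: swap the compacted files for the result.
    synchronized void completeCompaction(List<String> compacted, String result) {
        List<String> next = new ArrayList<>(storeFiles);
        next.removeAll(compacted);
        next.add(result);
        // A single volatile write publishes the new immutable list.
        storeFiles = Collections.unmodifiableList(next);
    }

    // Stands in for the split thread: no lock needed, the volatile read is enough.
    List<String> getStoreFiles() {
        return storeFiles;
    }
}
```

Whether this matches the actual failure depends on the theory holding; the sketch only shows what making the field volatile would guarantee.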

 RegionServer aborts due to race between compaction and split
 

 Key: HBASE-6679
 URL: https://issues.apache.org/jira/browse/HBASE-6679
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: rs-crash-parallel-compact-split.log


 In our nightlies, we have seen RS aborts due to compaction and split racing. 
 Original parent file gets deleted after the compaction, and hence, the 
 daughters don't find the parent data file. The RS kills itself when this 
 happens. Will attach a snippet of the relevant RS logs.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463503#comment-13463503
 ] 

liang xie commented on HBASE-6882:
--

Hi Mikhail, it seems the attached file is not for the current community TRUNK 
version, since I saw:
{code:title=Hbase.thrift|borderStyle=solid}
 exception IOError {
   1: string message,
-  2: i64 backoffTimeMillis
+  2: i64 backoffTimeMillis,
+  3: string exceptionClass
 }
{code} 

There is no backoffTimeMillis parameter in struct IOError in the current trunk code.

Another thing: do we encourage using thrift2 more than thrift right now? 
If so, maybe changing thrift2's TIOError would be a good idea?

 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.
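
What carrying the exception class could look like on the server side is sketched below. This is an illustration under stated assumptions: IOErrorSketch is a hypothetical stand-in for the Thrift-generated IOError (the field name exceptionClass follows the diff quoted above), and the factory method from is not part of any HBase or Thrift API.

```java
import java.io.IOException;

// Hypothetical stand-in for the Thrift-generated IOError; not generated code.
final class IOErrorSketch extends Exception {

    // Fully qualified class name of the original server-side exception.
    final String exceptionClass;

    IOErrorSketch(String message, String exceptionClass) {
        super(message);
        this.exceptionClass = exceptionClass;
    }

    // Copies the message and records the concrete class of the caught exception.
    static IOErrorSketch from(IOException cause) {
        return new IOErrorSketch(cause.getMessage(), cause.getClass().getName());
    }
}
```

A client could then branch on exceptionClass instead of parsing the message text.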



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463507#comment-13463507
 ] 

stack commented on HBASE-6882:
--

@Liang thrift2 tries to make the thrift apis more aligned w/ current trunk.  
thrift1 has most usage and hence more trust.  What is lacking is an owner for 
either package.   Without this folks show up and fix their particular issue in 
whatever package they are using and then move on.  Would be grand if someone 
could drive thrift2 so it had all of thrift1 and was better aligned w/ the 
native apis.

 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463511#comment-13463511
 ] 

liang xie commented on HBASE-6882:
--

Got it, [~saint@gmail.com]
I'd like to have a try:)

 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463514#comment-13463514
 ] 

stack commented on HBASE-6882:
--

@liang That'd be great. Would suggest first a survey of thrift1 and thrift2.  
Figure what the difference is.  Do you want to have the two packages achieve 
parity?  Or do you want to add what is in thrift2 to thrift1 and keep up 
thrift1?  The examples package has stuff to exercise the thrift stuff.  A few 
more unit tests would probably not go amiss.  Good on you.

 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2012-09-25 Thread Suraj Varma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463518#comment-13463518
 ] 

Suraj Varma commented on HBASE-4565:


This is no longer an issue on trunk, it appears. The build script 
modularization changes have completely done away with the copynativelibs.sh 
which caused the original issue. I am able to build from trunk successfully via 
cygwin now.

 Maven HBase build broken on cygwin with copynativelib.sh call.
 --

 Key: HBASE-4565
 URL: https://issues.apache.org/jira/browse/HBASE-4565
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
 Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
  Labels: build, maven
 Fix For: 0.96.0

 Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, 
 HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch


 This is broken in both 0.92 as well as trunk pom.xml
 Here's a sample maven log snippet from trunk (from Mayuresh on user mailing 
 list)
 [INFO] [antrun:run {execution: package}]
 [INFO] Executing tasks
 main:
[mkdir] Created dir: 
 D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
 [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: 
 No such file or directory
 [exec] tar (child): Cannot connect to D: resolve failed
 [INFO] 
 
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] An Ant BuildException has occured: exec returned: 3328
 There are two issues: 
 1) The ant run task below doesn't resolve the windows file separator returned 
 by the project.build.directory - this causes the above resolve failed.
 <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
 <echo file="${project.build.directory}/copynativelibs.sh">
 if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then
 2) The tar argument value below also has a similar issue in that the path arg 
 doesn't resolve right.
 <!-- Using Unix tar to preserve symlinks -->
 <exec executable="tar" failonerror="yes" 
 dir="${project.build.directory}/${project.artifactId}-${project.version}">
 <arg value="czf"/>
 <arg 
 value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
 <arg value="."/>
 </exec>
 In both cases, the fix would probably be to use a cross-platform way to 
 handle the directory locations. 
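
One way to picture a cross-platform treatment of the directory location is a small normalization step before the path reaches the shell or tar. The sketch below is an assumption-laden illustration, not the fix that was applied: the class CygwinPathSketch and the method toCygwinPath are hypothetical, and the /cygdrive prefix is assumed to be the Cygwin mount point, as in the tar argument quoted above.

```java
// Hypothetical helper; not part of the HBase build.
final class CygwinPathSketch {

    // Turns a Windows path such as D:\workspace\x into /cygdrive/d/workspace/x.
    // Paths without a drive letter only get their separators normalized.
    static String toCygwinPath(String path) {
        String normalized = path.replace('\\', '/');
        boolean hasDrive = normalized.length() >= 2
                && Character.isLetter(normalized.charAt(0))
                && normalized.charAt(1) == ':';
        if (hasDrive) {
            char drive = Character.toLowerCase(normalized.charAt(0));
            return "/cygdrive/" + drive + normalized.substring(2);
        }
        return normalized;
    }
}
```

Normalizing the separators first avoids the backslashes being swallowed by the shell, which is what produced the D:workspacemkshirsahbase-trunktarget path in the log above.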



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463523#comment-13463523
 ] 

stack commented on HBASE-6679:
--

For sure the region was not doubly-assigned? Split of the region happened on 
one server but on another server, the same region was being compacted?  You'd 
need the master logs to figure out whether it was a dbl-assign.

Storefiles are an ImmutableList.

Can you figure a place where we'd be running compactions on a region concurrent 
w/ our splitting it?  Compacting we take out write lock.  Doesn't look like any 
locks while SplitTransaction is running (closing parent, it'll need write 
lock... that's after daughters open though).

 RegionServer aborts due to race between compaction and split
 

 Key: HBASE-6679
 URL: https://issues.apache.org/jira/browse/HBASE-6679
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: rs-crash-parallel-compact-split.log


 In our nightlies, we have seen RS aborts due to compaction and split racing. 
 Original parent file gets deleted after the compaction, and hence, the 
 daughters don't find the parent data file. The RS kills itself when this 
 happens. Will attach a snippet of the relevant RS logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463525#comment-13463525
 ] 

stack commented on HBASE-6882:
--

@Liang ... or just pick up any outstanding thrift issues and take a look at 
resolving them?

 Thrift IOError should include exception class
 -

 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D5679.1.patch


 Return exception class as part of IOError thrown from the Thrift proxy or the 
 embedded Thrift server in the regionserver.



[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463528#comment-13463528
 ] 

stack commented on HBASE-6702:
--

+1 on commit after addressing Jesse's comments.  The rest of the conversion work 
would be done in another issue?  Good stuff N.

 ResourceChecker refinement
 --

 Key: HBASE-6702
 URL: https://issues.apache.org/jira/browse/HBASE-6702
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6702.v1.patch, 6702.v4.patch


 This was based on some discussion from HBASE-6234.
 The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
 issues, but has since not been widely utilized. Further, with modularization we 
 have had to drop the ResourceChecker from the tests that are moved into the 
 hbase-common module because bringing the ResourceChecker up to hbase-common 
 would involve bringing all its dependencies (which are quite far reaching).
 The question then is, what should we do with it? Get rid of it? Refactor and 
 reuse? 



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463532#comment-13463532
 ] 

liang xie commented on HBASE-6882:
--

Thanks [~saint@gmail.com] for the guidance! My plan is to resolve some 
outstanding thrift-related issues first. Once I know more of the details, 
I may have a better feel for how to fuse thrift and thrift2. 
Don't worry, I'll send a design note before making any big change :)



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463535#comment-13463535
 ] 

ramkrishna.s.vasudevan commented on HBASE-6679:
---

@Deva
I am not able to tell clearly what the problem is.  I too went through those logs 
and found that the region 5689a8785bbc9a8aa8e526cd7ef1542a has completed the 
compaction.

{code}
2012-08-28 06:15:34,107 INFO 
org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed 
compaction: 
regionName=TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.,
 storeName=f1, fileCount=3, fileSize=27.3m, priority=3, time=14360293782301; 
duration=4sec

{code}
The split of the region then started 2 ms later:
{code}
2012-08-28 06:15:34,109 INFO 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region 
TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
{code}
The parent region is then offlined in META here:

{code}
2012-08-28 06:15:34,788 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Offlined parent region 
TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
 in META
{code}
So the region had already been closed before this point. I feel the store file 
list should have been updated by then, no?
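
The gaps between the three log lines quoted above can be checked with a small sketch. The class `LogGap` and its method name are hypothetical; the pattern assumes the log4j-style timestamp layout seen in these excerpts (comma before the milliseconds).

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for comparing regionserver log timestamps. */
final class LogGap {
    // Matches timestamps such as "2012-08-28 06:15:34,107".
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    /** Returns the number of milliseconds from {@code earlier} to {@code later}. */
    static long millisBetween(String earlier, String later) {
        LocalDateTime a = LocalDateTime.parse(earlier, FMT);
        LocalDateTime b = LocalDateTime.parse(later, FMT);
        return Duration.between(a, b).toMillis();
    }
}
```

For the timestamps above, `millisBetween("2012-08-28 06:15:34,107", "2012-08-28 06:15:34,109")` gives 2 (compaction completed to split start), and the same call from 06:15:34,109 to 06:15:34,788 gives 679 (split start to the parent being offlined in META).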


 RegionServer aborts due to race between compaction and split
 

 Key: HBASE-6679
 URL: https://issues.apache.org/jira/browse/HBASE-6679
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: rs-crash-parallel-compact-split.log


 In our nightlies, we have seen RS aborts due to compaction and split racing. 
 Original parent file gets deleted after the compaction, and hence, the 
 daughters don't find the parent data file. The RS kills itself when this 
 happens. Will attach a snippet of the relevant RS logs.



[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463543#comment-13463543
 ] 

stack commented on HBASE-4565:
--

[~svarma] So we should apply the patch to 0.92 and 0.94?  The v3 patch still 
works on windows?  Thanks for checking trunk.

 Maven HBase build broken on cygwin with copynativelib.sh call.
 --

 Key: HBASE-4565
 URL: https://issues.apache.org/jira/browse/HBASE-4565
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
 Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
  Labels: build, maven
 Fix For: 0.96.0

 Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, 
 HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch


 This is broken in both 0.92 as well as trunk pom.xml
 Here's a sample maven log snippet from trunk (from Mayuresh on user mailing 
 list)
 [INFO] [antrun:run {execution: package}]
 [INFO] Executing tasks
 main:
[mkdir] Created dir: 
 D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
 [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: 
 No such file or directory
 [exec] tar (child): Cannot connect to D: resolve failed
 [INFO] 
 
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] An Ant BuildException has occured: exec returned: 3328
 There are two issues: 
 1) The ant run task below doesn't resolve the windows file separator returned 
 by the project.build.directory - this causes the above resolve failed.
 <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
 <echo file="${project.build.directory}/copynativelibs.sh">
 if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then
 2) The tar argument value below also has a similar issue in that the path arg 
 doesn't resolve right.
 <!-- Using Unix tar to preserve symlinks -->
 <exec executable="tar" failonerror="yes" 
 dir="${project.build.directory}/${project.artifactId}-${project.version}">
 <arg value="czf"/>
 <arg 
 value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
 <arg value="."/>
 </exec>
 In both cases, the fix would probably be to use a cross-platform way to 
 handle the directory locations. 
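
One way to picture the path handling described above is the sketch below. It is not the attached patch: the class `CygwinPaths` and its method are hypothetical, and it assumes the default `/cygdrive` mount prefix that also appears in the tar argument quoted above. Rewriting the drive letter matters for tar in particular, since the log shows tar treating `D:` as a remote host.

```java
/** Hypothetical helper; assumes the default cygwin "/cygdrive" prefix. */
final class CygwinPaths {
    /**
     * Converts a Windows path such as D:\workspace\target into the
     * cygwin form /cygdrive/d/workspace/target. Paths without a drive
     * letter only have their backslashes turned into forward slashes.
     */
    static String toCygwinPath(String windowsPath) {
        String p = windowsPath.replace('\\', '/');
        boolean hasDrive = p.length() >= 2
            && Character.isLetter(p.charAt(0))
            && p.charAt(1) == ':';
        if (hasDrive) {
            // Drop the "D:" prefix and mount the drive under /cygdrive.
            return "/cygdrive/" + Character.toLowerCase(p.charAt(0)) + p.substring(2);
        }
        return p;
    }
}
```

On a real cygwin install the `cygpath -u` utility performs this conversion and honors non-default mount prefixes, so a build script would more likely call that than reimplement it.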



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463548#comment-13463548
 ] 

Devaraj Das commented on HBASE-6679:


bq. For sure the regions was not doubly-assigned? Split happened of the region 
on one server but on another server, the same region was being compacted? You'd 
need the master logs to figure it a dbl-assign

Unfortunately, I didn't save the master logs when the failure happened.

bq. Can you figure a place where we'd be running compactions on a region 
concurrent w/ our splitting it? Compacting we take out write lock. Doesnt look 
like any locks while SplitTransaction is running (closing parent, it'll need 
write lock... thats after daughters open though).

I can't figure out a place where this could happen in the natural execution of 
the regionserver.

bq. Storefiles are an ImmutableList.

Yes.. but that still could be exposed to the problems of memory inconsistencies 
when multiple threads are accessing the object in unsynchronized/non-volatile 
ways, no?

bq. @Deva

After a long time, someone addressed me by that name :-)

bq. So before this itself the region got closed. I feel the store file list 
should have been updated by the time. No ?

Can't say for sure, Ram. There is no guarantee unless the accesses (reads/writes) 
are synchronized or the field is declared volatile.
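
To illustrate the visibility point, here is a minimal sketch assuming a simplified holder class. `StoreFileHolder` and its methods are hypothetical and are not HBase's Store implementation. An immutable list protects the list contents, but a reader thread is only guaranteed to see the new reference if the field is volatile or every access goes through a lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder, not HBase's Store class: shows safe publication only. */
final class StoreFileHolder {
    // volatile: a write to this reference happens-before any later read of it,
    // so a reader that takes no lock still sees the newest list.
    private volatile List<String> storeFiles = Collections.emptyList();

    /** Swaps the compacted inputs for the compaction result in one reference write. */
    synchronized void replaceAfterCompaction(List<String> compacted, String result) {
        List<String> next = new ArrayList<>(storeFiles);
        next.removeAll(compacted);
        next.add(result);
        storeFiles = Collections.unmodifiableList(next);
    }

    /** Lock-free read; returns the most recently published list. */
    List<String> snapshot() {
        return storeFiles;
    }
}
```

Without the volatile keyword, and without synchronizing the read, the Java memory model would allow a split thread to keep seeing the pre-compaction list, which is the kind of stale view being discussed here.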

