[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503923#comment-14503923 ] Hudson commented on HBASE-13514: SUCCESS: Integrated in HBase-1.1 #414 (See [https://builds.apache.org/job/HBase-1.1/414/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev b9eac01704586488683c34a591ed52712f21e292) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
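The failure mode boils down to a minimum-timeout floor: in branch-1 and branch-1.1, RpcRetryingCaller silently raises any configured hbase.rpc.timeout below its MIN_RPC_TIMEOUT. A minimal self-contained sketch of that clamp (class and method names here are illustrative, not the actual HBase code):

```java
// Hypothetical sketch of the MIN_RPC_TIMEOUT floor described above.
// A configured timeout below the floor is silently raised, so the
// test's 0.5 second setting never takes effect on these branches.
public class RpcTimeoutFloor {
    static final int MIN_RPC_TIMEOUT = 2000; // ms, illustrative value

    static int effectiveTimeout(int configuredMs) {
        return Math.max(configuredMs, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // 500 ms configured -> 2000 ms effective
        System.out.println(effectiveTimeout(500));
    }
}
```

This is why the fix lands only in the test on master (where the floor no longer exists) while branch patches must work around the clamp.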
[jira] [Updated] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Nishtala updated HBASE-13471: Attachment: HBASE-13471.patch Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch
{code}
Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537):
  State: WAITING
  Blocked count: 131
  Waited count: 228
  Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371)
    org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509)
    org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    java.lang.Thread.run(Thread.java:745)
{code}
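The trace shows the split thread parked in HRegion.doClose() waiting on a ReentrantReadWriteLock write lock that some read-lock holder never releases. The blocking behaviour itself is easy to reproduce standalone (ReentrantReadWriteLock does not allow upgrading a held read lock to a write lock, and a writer cannot proceed while any reader holds the lock):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal illustration of the lock ordering behind the deadlock above:
// while a read lock is held, an attempt to take the write lock
// (as doClose() does) cannot succeed.
public class RwLockDemo {
    static boolean writerBlocked() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();          // e.g. an in-flight operation on the region
        try {
            // doClose()'s write-lock acquisition would park here forever;
            // tryLock() lets us observe the failure without hanging.
            return !lock.writeLock().tryLock();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("writer blocked: " + writerBlocked());
    }
}
```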
[jira] [Created] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
Enis Soztutar created HBASE-13515: - Summary: Handle FileNotFoundException in region replica replay for flush/compaction events Key: HBASE-13515 URL: https://issues.apache.org/jira/browse/HBASE-13515 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 I had this patch laying around that somehow dropped from my plate. We should skip replaying compaction / flush and region open event markers if the files (from flush or compaction) can no longer be found from the secondary. If we do not skip, the replay will be retried forever, effectively blocking the replication further. Bulk load already does this, we just need to do it for flush / compaction and region open events as well.
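The proposed behaviour is simple to state: a FileNotFoundException during event replay on the secondary should cause the marker to be skipped rather than retried forever. A hedged sketch of that control flow (the Replay interface and method names are illustrative, not HBase API):

```java
import java.io.FileNotFoundException;

// Hypothetical sketch of the skip-on-missing-files behaviour proposed above.
public class ReplicaReplaySketch {
    interface Replay { void apply() throws FileNotFoundException; }

    // Returns true if the flush/compaction/region-open marker was applied,
    // false if it was skipped because its files are gone on the secondary.
    static boolean replayOrSkip(Replay event) {
        try {
            event.apply();
            return true;
        } catch (FileNotFoundException e) {
            // Retrying would block replication indefinitely, so drop the event.
            return false;
        }
    }
}
```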
[jira] [Updated] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13515: -- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13515: -- Attachment: hbase-13515_v1.patch Attaching straightforward patch.
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for (was: Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Description: The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. (was: The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds.)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503636#comment-14503636 ] Ted Yu commented on HBASE-13514: Test now passes. +1
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-13514: --- Hadoop Flags: Reviewed
[jira] [Updated] (HBASE-10070) HBase read high-availability using timeline-consistent region replicas
[ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10070: -- Fix Version/s: 1.1.0 2.0.0 HBase read high-availability using timeline-consistent region replicas -- Key: HBASE-10070 URL: https://issues.apache.org/jira/browse/HBASE-10070 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: HighAvailabilityDesignforreadsApachedoc.pdf In the present HBase architecture, it is hard, probably impossible, to satisfy constraints like 99th percentile of the reads will be served under 10 ms. One of the major factors that affects this is the MTTR for regions. There are three phases in the MTTR process - detection, assignment, and recovery. Of these, the detection is usually the longest and is presently in the order of 20-30 seconds. During this time, the clients would not be able to read the region data. However, some clients will be better served if regions will be available for reads during recovery for doing eventually consistent reads. This will help with satisfying low latency guarantees for some class of applications which can work with stale reads. For improving read availability, we propose a replicated read-only region serving design, also referred to as secondary regions, or region shadows. Extending the current model of a region being opened for reads and writes in a single region server, the region will also be opened for reading in other region servers. The region server which hosts the region for reads and writes (as in current case) will be declared as PRIMARY, while 0 or more region servers might be hosting the region as SECONDARY. There may be more than one secondary (replica count > 2). Will attach a design doc shortly which contains most of the details and some thoughts about development approaches. Reviews are more than welcome. 
We also have a proof of concept patch, which includes the master and region server side of changes. Client side changes will be coming soon as well.
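The read model described above (fresh reads from the PRIMARY, possibly stale reads from a SECONDARY during recovery) can be sketched in miniature. All names here are illustrative, not the eventual client API:

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hedged sketch of a timeline-consistent read: try the primary replica
// first; if it cannot answer (e.g. during recovery), fall back to a
// secondary and flag the result as potentially stale.
public class TimelineReadSketch {
    static final class Result {
        final String value;
        final boolean stale;
        Result(String value, boolean stale) { this.value = value; this.stale = stale; }
    }

    static Result timelineGet(Supplier<Optional<String>> primary,
                              Supplier<String> secondary) {
        return primary.get()
            .map(v -> new Result(v, false))                      // fresh read
            .orElseGet(() -> new Result(secondary.get(), true)); // stale read
    }
}
```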
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503806#comment-14503806 ] Lars Hofhansl commented on HBASE-13389: --- Thanks [~jeffreyz], just discussed a bit with [~stack]... If we kept the in-order compactions, we won't need MVCC stamps in the HFile beyond the oldest scanner, right? I feel like I am missing something. Could you show an example of when we need MVCC stamps in the HFile beyond the oldest scanner when you have some time? The issue has to do with Puts/Deletes happening in the same millisecond, right? [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be many of us would argue but as a side-effect of this change, read-time optimizations that helped speed scans were undone by this change. In this issue, lets see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. 
{quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96
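The optimization being argued over reduces to one comparison: if the largest mvcc/sequenceid stored in an HFile is no newer than the oldest concurrent scanner's read point, every cell in the file is visible to all readers and the per-cell mvcc need not be parsed or compared at scan time. A hedged sketch of that check (method and parameter names are illustrative):

```java
// Hypothetical form of the skip-mvcc read-time check discussed above.
public class MvccSkipCheck {
    // True when per-cell mvcc parsing can be skipped for this file:
    // no live scanner could observe a difference.
    static boolean canSkipMvcc(long fileMaxMvcc, long smallestReadPoint) {
        return fileMaxMvcc <= smallestReadPoint;
    }
}
```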
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503889#comment-14503889 ] Jonathan Lawlor commented on HBASE-13082: - I'm a little late to the party but this versioned data structure sounds neat. If I'm understanding correctly, it sounds like this versioned data structure would also allow us to remove the lingering lock in updateReaders (and potentially remove updateReaders completely?). Instead of having to update the readers, the compaction/flush would occur in the background and be made visible to new readers via a new latest version in the data structure, is that correct? In other words, would the introduction of this new versioned data structure make StoreScanner single threaded (and thus remove any need for synchronization)? Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left off. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able to finalize the set of files. 
I'll have a patch soon.
[jira] [Commented] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503934#comment-14503934 ] Hadoop QA commented on HBASE-13515: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726685/hbase-13515_v1.patch against master branch at commit eb82b8b3098d6a9ac62aa50189f9d4b289f38472. ATTACHMENT ID: 12726685 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//console This message is automatically generated.
[jira] [Commented] (HBASE-13502) Deprecate/remove getRowComparator() in TableName
[ https://issues.apache.org/jira/browse/HBASE-13502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503993#comment-14503993 ] stack commented on HBASE-13502: --- Ok, if two places then, yeah, sounds like KVComparator is going to be sticking around a while (if deprecated). HRI having a getComparator makes more sense but it should not be public so, deprecate here... I think you can set the IA.Private annotation on a method? Could do that too for these two getComparator calls. There are only two comparator types (four if you include reverse comparators) and even then, the comparators only differ in how they compare rows... The switch is table name (meta and user table name -- later, if we bring back root, it will be a 3rd dimension on comparators...). Would be good to shutdown the places we go when comparator is not plain (i.e. we didn't read the comparator to use from hfile, etc.)... say have a static or a factory on CellComparator that took a TableName instance... and use that in place of these methods. Deprecate/remove getRowComparator() in TableName Key: HBASE-13502 URL: https://issues.apache.org/jira/browse/HBASE-13502 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13502.patch
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503995#comment-14503995 ] stack commented on HBASE-13516: --- Yes, given you've done the research. Only needed in jdk8. Increase PermSize to 128MB -- Key: HBASE-13516 URL: https://issues.apache.org/jira/browse/HBASE-13516 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 HBase uses ~40MB, and with Phoenix we use ~56MB of Perm space out of 64MB by default. Every Filter and Coprocessor increases that. Running out of perm space triggers a stop the world full GC of the entire heap. We have seen this in a misconfigured cluster. Should we default to {{-XX:PermSize=128m -XX:MaxPermSize=128m}} out of the box as a convenience for users?
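The proposal above amounts to a one-line change in the JVM options HBase ships with. A hedged sketch of how it might look in hbase-env.sh (exact variable and placement may differ from the eventual patch):

```shell
# Proposed default perm-gen sizing from HBASE-13516 (illustrative).
# Note: only meaningful on JDK 7 and earlier -- JDK 8 removed the
# permanent generation and warns that these flags are ignored.
export HBASE_OPTS="$HBASE_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
```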
[jira] [Updated] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13517: -- Description: Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark
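A shaded artifact of this kind is typically built by relocating the troublesome dependency packages at package time. A hedged sketch using maven-shade-plugin (the relocated package names and module layout here are illustrative, not the eventual HBase configuration):

```xml
<!-- Illustrative shading config: relocate Guava so the client's copy
     cannot clash with Hadoop's or the user's. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.hbase.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Publishing this alongside the unshaded jar gives users the choice the description asks for: depend on the shaded client when transitive-dependency versions conflict, or the plain one when MR and friends need the real transitive deps.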
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504070#comment-14504070 ] Andrew Purtell commented on HBASE-13516: bq. Only needed in jdk8. I think Stack meant not needed in jdk8 since perm gen went away and using these options will cause the JVM to throw up warnings.
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504069#comment-14504069 ] Andrew Purtell commented on HBASE-13516: +1
[jira] [Assigned] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor reassigned HBASE-13514: --- Assignee: Jonathan Lawlor
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Attachment: HBASE-13514-branch-1.patch HBASE-13514-branch-1.1.patch HBASE-13514.patch Attaching a patch for each branch to get a QA run on each. The patch addresses the test failure and also adds a deleteTable in test cleanup. [~tedyu] got some time to take a quick looksee?
[jira] [Commented] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503824#comment-14503824 ] stack commented on HBASE-13469: --- [~syuanjiang] bq. I think we should spend our energy to clean up handler code in 1.2 and make procedure robust. Ok. Sounds reasonable. Took a look at the last patch and not much code and it has a test (only nit comment is why not have the enum name same as the configuration value that turns on the state: i.e. name enums unused, disable, enabled... then you could compare the configuration and the enum toString'd... No biggie) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework.
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503943#comment-14503943 ] stack commented on HBASE-13389: --- bq. This may be hard to achieve because out of order puts can be flushed at different time. Do 'out of order' puts happen at DLR time only [~jeffreyz]? i.e. WALs can be replayed in any order since they are farmed out over the cluster. We also cannot guarantee when a region that is receiving DLR edits will flush hfiles; e.g. we could get row1/logSeqId=2 during DLR and flush because we had memory pressure, but then later row1/logSeqId=1 might arrive and be flushed into a newer hfile. The fix for this is to not let compactions happen when region is in recovery -- this is probably the case already (or let compactions go on but preserve mvcc while in recovery)? So, the Lars fix would be to drop mvcc if no scanner outstanding with a span that includes mvcc in current hfile AND we are not in DLR recovery mode? Are there other places where we might have out-of-order puts? (Flushes are single threaded and edits go into FSHLog and MemStore in order caveat Elliott and Nate's recent find: https://issues.apache.org/jira/browse/HBASE-12751?focusedCommentId=14377157&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14377157). bq. ...and only keep mvcc around during region recovery time so that we can still keep HBASE-12600 goal Yes. On keeping seqid in the KV in hfiles so we can do ...out of order in minor compactions. ...don't we mean compacting non-adjacent files rather than out-of-order here? So, yeah, if we preserved mvcc always, we could do any order and non-adjacent. Would be nice. Otherwise, as I see it, if we want to do non-adjacent compactions (which as [~lhofhansl] says above, we do not currently have), then we could do it if all files under a Store have zero for mvcc and we just order the edits by the hfile meta data mvcc number. 
When there are files with an mvcc per KV, then we should probably merge those first... Would have to think it through more. It gets a little complicated though if the Store has some files with a hfile meta data mvcc number but other files have an mvcc per KV. We could not include a file that has an mvcc per KV in a non-adjacent compaction. But for files with zero mvcc, if we have the Lars optimization, we could do non-adjacent compactions as long as we respected the hfile seqid order. It gets tricky if a file has mvcc in the KV and all the rest do not. Files with mvccs in their KVs need to be compacted together ahead of the rest. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side-effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. 
{quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503941#comment-14503941 ] zhangduo commented on HBASE-13259: -- I can pick this up and address the 'ugly ByteBufferArray'. But we do not have enough time to test it on a large dataset if we want to catch up with the first RC of 1.1, I think. It is tuning work; the time we need is unpredictable. We can file a new issue to hold the tuning work and resolve this issue before the first RC of 1.1. What do you think? [~ndimiduk] Thanks. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4462: -- Affects Version/s: (was: 0.90.4) Properly treating SocketTimeoutException Key: HBASE-4462 URL: https://issues.apache.org/jira/browse/HBASE-4462 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Fix For: 0.90.8 Attachments: HBASE-4462_0.90.x.patch, unittest_that_shows_us_retrying_sockettimeout.txt SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened. I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner). So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once. There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-5110) code enhancement - remove unnecessary if-checks in every loop in HLog class
[ https://issues.apache.org/jira/browse/HBASE-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5110: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) code enhancement - remove unnecessary if-checks in every loop in HLog class --- Key: HBASE-5110 URL: https://issues.apache.org/jira/browse/HBASE-5110 Project: HBase Issue Type: Improvement Components: wal Affects Versions: 0.90.1, 0.90.2, 0.90.4, 0.92.0 Reporter: Mikael Sitruk Priority: Minor Attachments: HBASE-5110_1.patch The HLog class (method findMemstoresWithEditsEqualOrOlderThan) has an unnecessary if-check in a loop.
{code}
static byte[][] findMemstoresWithEditsEqualOrOlderThan(final long oldestWALseqid,
    final Map<byte[], Long> regionsToSeqids) {
  // This method is static so it can be unit tested the easier.
  List<byte[]> regions = null;
  for (Map.Entry<byte[], Long> e : regionsToSeqids.entrySet()) {
    if (e.getValue().longValue() <= oldestWALseqid) {
      if (regions == null) regions = new ArrayList<byte[]>();
      regions.add(e.getKey());
    }
  }
  return regions == null ? null : regions.toArray(new byte[][] {HConstants.EMPTY_BYTE_ARRAY});
}
{code}
The following change is suggested:
{code}
static byte[][] findMemstoresWithEditsEqualOrOlderThan(final long oldestWALseqid,
    final Map<byte[], Long> regionsToSeqids) {
  // This method is static so it can be unit tested the easier.
  List<byte[]> regions = new ArrayList<byte[]>();
  for (Map.Entry<byte[], Long> e : regionsToSeqids.entrySet()) {
    if (e.getValue().longValue() <= oldestWALseqid) {
      regions.add(e.getKey());
    }
  }
  return regions.size() == 0 ? null : regions.toArray(new byte[][] {HConstants.EMPTY_BYTE_ARRAY});
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8720) Only one snapshot region task can run at a time
[ https://issues.apache.org/jira/browse/HBASE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8720: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) Only one snapshot region task can run at a time - Key: HBASE-8720 URL: https://issues.apache.org/jira/browse/HBASE-8720 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.0 Reporter: binlijin Attachments: 8720-v2.txt, HBASE-8720.patch
{code}
SnapshotSubprocedurePool(String name, Configuration conf) {
  // configure the executor service
  long keepAlive = conf.getLong(
      RegionServerSnapshotManager.SNAPSHOT_TIMEOUT_MILLIS_KEY,
      RegionServerSnapshotManager.SNAPSHOT_TIMEOUT_MILLIS_DEFAULT);
  int threads = conf.getInt(CONCURENT_SNAPSHOT_TASKS_KEY, DEFAULT_CONCURRENT_SNAPSHOT_TASKS);
  this.name = name;
  executor = new ThreadPoolExecutor(1, threads, keepAlive, TimeUnit.MILLISECONDS,
      new LinkedBlockingQueue<Runnable>(),
      new DaemonThreadFactory("rs(" + name + ")-snapshot-pool"));
  taskPool = new ExecutorCompletionService<Void>(executor);
}
{code}
ThreadPoolExecutor: corePoolSize: 1, maximumPoolSize: 3, workQueue: LinkedBlockingQueue (unbounded). So when a new task is submitted to the ThreadPoolExecutor while another task is running, the new task is queued, and all snapshot region tasks execute one by one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
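The behavior described above follows from how ThreadPoolExecutor works: extra threads beyond corePoolSize are only created when the work queue rejects a task, and an unbounded LinkedBlockingQueue never rejects. A minimal standalone sketch (not HBase code; class and names are illustrative) demonstrating that the pool never grows past one thread:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDemo {
    public static void main(String[] args) throws Exception {
        // Same shape as SnapshotSubprocedurePool: core=1, max=3, unbounded queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        AtomicInteger concurrent = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        for (int i = 0; i < 6; i++) {
            pool.execute(() -> {
                // Track how many tasks ever run at the same time.
                int now = concurrent.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                try { Thread.sleep(100); } catch (InterruptedException ignored) { }
                concurrent.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // The unbounded queue absorbs every submission, so maximumPoolSize=3
        // is never reached and tasks run strictly one at a time.
        System.out.println(maxSeen.get()); // prints 1
    }
}
```

Setting corePoolSize equal to the desired concurrency (or using a bounded queue) is the usual way to make maximumPoolSize take effect.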
[jira] [Updated] (HBASE-7218) Rename Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7218: -- Fix Version/s: (was: hbase-6055) Assignee: (was: Matteo Bertozzi) Affects Version/s: (was: hbase-6055) Status: Open (was: Patch Available) Cancelling stale patch. Close this? Rename Snapshot --- Key: HBASE-7218 URL: https://issues.apache.org/jira/browse/HBASE-7218 Project: HBase Issue Type: New Feature Components: snapshots Reporter: Matteo Bertozzi Priority: Minor Attachments: HBASE-7218-v0.patch, HBASE-7218-v1.patch Add the ability to rename a snapshot. HBaseAdmin.renameSnapshot(oldName, newName) shell: snapshot_rename 'oldName', 'newName' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8064) hbase connection could not reuse
[ https://issues.apache.org/jira/browse/HBASE-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8064: -- Resolution: Cannot Reproduce Assignee: (was: Yuan Kang) Release Note: (was: hbase connection manager can't resuse the connection for this code,the patch resolve it) Status: Resolved (was: Patch Available) hbase connection could not reuse Key: HBASE-8064 URL: https://issues.apache.org/jira/browse/HBASE-8064 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.0 Environment: hadoop-1.0.2 hbase-0.94.0 Reporter: Yuan Kang Labels: patch Attachments: HConnectionManager-connection-could-not-reuse.patch When an HConnection is used by one machine, the connection returns to the pool. If another machine gets the connection again, it should be reusable, but in the code the caching map is not managed correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503549#comment-14503549 ] Eshcar Hillel commented on HBASE-13071: --- Done rebase. Thanks to HBASE-13090 next and loadCache methods are separated so this rebase wasn't too painful (thanks [~jonathan.lawlor]). I also changed some new scanner tests to account for the change in scanner cache interface (it is now a Queue). Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBASE-13071_trunk_rebase_1.0.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.delay.png, gc.eshcar.png, gc.png, hits.delay.png, hits.eshcar.png, hits.png, latency.delay.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. 
More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
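The producer-consumer idea described above can be sketched with a background prefetch thread that refills a bounded client-side cache while the application drains it. This is an illustrative standalone sketch, not the actual HBase patch; the class, the `fetchFromServer` stand-in, and the batch sizes are all hypothetical:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a fetcher thread keeps the scanner cache full
// so the consumer rarely blocks waiting on an RPC round trip.
class PrefetchScanner {
    private static final List<String> POISON = List.of(); // end-of-scan sentinel
    private final BlockingQueue<List<String>> cache = new ArrayBlockingQueue<>(2);
    private final Thread fetcher;

    PrefetchScanner(int batches) {
        fetcher = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    cache.put(fetchFromServer(b)); // blocks when the cache is full
                }
                cache.put(POISON);
            } catch (InterruptedException ignored) { }
        });
        fetcher.start();
    }

    // Stand-in for the scanner RPC to the region server.
    private List<String> fetchFromServer(int batch) {
        return List.of("row-" + (2 * batch), "row-" + (2 * batch + 1));
    }

    // Returns the next prefetched batch, or null when the scan is done.
    List<String> nextBatch() throws InterruptedException {
        List<String> batch = cache.take();
        return batch == POISON ? null : batch;
    }

    public static void main(String[] args) throws Exception {
        PrefetchScanner s = new PrefetchScanner(3);
        int rows = 0;
        for (List<String> b; (b = s.nextBatch()) != null; ) {
            rows += b.size();
        }
        System.out.println(rows); // prints 6
    }
}
```

The bounded queue gives back-pressure: the fetcher stops issuing RPCs once the cache holds enough unconsumed batches.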
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: HBASE-13469.v1-branch-1.1.patch [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13516) Increase PermSize to 128MB
Enis Soztutar created HBASE-13516: - Summary: Increase PermSize to 128MB Key: HBASE-13516 URL: https://issues.apache.org/jira/browse/HBASE-13516 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 HBase uses ~40MB, and with Phoenix we use ~56MB of Perm space out of 64MB by default. Every Filter and Coprocessor increases that. Running out of perm space triggers a stop-the-world full GC of the entire heap. We have seen this in misconfigured clusters. Should we default to {{-XX:PermSize=128m -XX:MaxPermSize=128m}} out of the box as a convenience for users? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503985#comment-14503985 ] Hadoop QA commented on HBASE-13514: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726661/HBASE-13514-branch-1.patch against branch-1 branch at commit 702aea5b38ed6ad0942b0c59c3accca476b46873. ATTACHMENT ID: 12726661 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1, 2.5.2, 2.6.0). {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100. {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//console This message is automatically generated. Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504049#comment-14504049 ] Nick Dimiduk commented on HBASE-13259: -- Right. Sounds good. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7750) We should throw IOE when calling HRegionServer#replicateLogEntries if ReplicationSink is null
[ https://issues.apache.org/jira/browse/HBASE-7750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7750: -- Resolution: Incomplete Assignee: (was: Jieshan Bean) Status: Resolved (was: Patch Available) We should throw IOE when calling HRegionServer#replicateLogEntries if ReplicationSink is null - Key: HBASE-7750 URL: https://issues.apache.org/jira/browse/HBASE-7750 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4, 0.95.2 Reporter: Jieshan Bean Attachments: HBASE-7750-94.patch, HBASE-7750-trunk.patch It may be expected behavior, but I think it's better to do something. We configured hbase.replication as true in the master cluster, and added a peer, but forgot to configure hbase.replication on the slave cluster side. ReplicationSource read the HLog, shipped log edits, and logged the position. Everything seemed alright, but the data was not present in the slave cluster. So I think the slave cluster should throw an exception back to the master cluster instead of returning directly:
{code}
public void replicateLogEntries(final HLog.Entry[] entries) throws IOException {
  checkOpen();
  if (this.replicationSinkHandler == null) return;
  this.replicationSinkHandler.replicateLogEntries(entries);
}
{code}
I would like to hear your comments on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
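The proposal above amounts to failing loudly instead of dropping the edits. A minimal standalone sketch of that behavior (not the HBase patch; the stub class and message are illustrative):

```java
import java.io.IOException;

// Hypothetical stub: fail with an IOException when the replication sink
// is not configured, instead of silently returning to the caller.
class ReplicationSinkStub {
    private final Object replicationSinkHandler; // null when hbase.replication is off

    ReplicationSinkStub(Object handler) { this.replicationSinkHandler = handler; }

    void replicateLogEntries(Object[] entries) throws IOException {
        if (replicationSinkHandler == null) {
            // The master-side ReplicationSource sees this error and can retry
            // or alert, rather than advancing its log position past lost edits.
            throw new IOException("Replication sink is not enabled; cannot apply "
                + entries.length + " edits");
        }
        // ... hand the entries to the sink here ...
    }

    public static void main(String[] args) {
        try {
            new ReplicationSinkStub(null).replicateLogEntries(new Object[3]);
            System.out.println("no exception");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```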
[jira] [Updated] (HBASE-8378) add 'force' option for drop table
[ https://issues.apache.org/jira/browse/HBASE-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8378: -- Status: Open (was: Patch Available) Cancelling stale patch. Close? add 'force' option for drop table - Key: HBASE-8378 URL: https://issues.apache.org/jira/browse/HBASE-8378 Project: HBase Issue Type: Improvement Components: shell, Usability Affects Versions: 0.95.0, 0.94.6.1 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: 0001-HBASE-8378-shell-add-force-option-to-drop.patch, 0001-HBASE-8378-shell-add-force-option-to-drop.patch Does this logic look familiar?
{noformat}
def drop_table(name)
  return unless admin.table_exists?(name)
  admin.disable_table(name) if admin.enabled?(name)
  admin.drop_table(name)
end
{noformat}
Let's add a force option to 'drop' that does exactly this. We'll save 6 lines of code for thousands of developers in millions of scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13259: - Fix Version/s: (was: 1.1.0) 1.2.0 mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13518) Typo in hbase.hconnection.meta.lookup.threads.core parameter
Enis Soztutar created HBASE-13518: - Summary: Typo in hbase.hconnection.meta.lookup.threads.core parameter Key: HBASE-13518 URL: https://issues.apache.org/jira/browse/HBASE-13518 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Devaraj Das Fix For: 2.0.0, 1.1.0 A possible typo coming from the patch in HBASE-13036. I think we want {{hbase.hconnection.meta.lookup.threads.core}}, not {{hbase.hconnection.meta.lookup.threads.max.core}}, to be in line with the regular thread pool configuration.
{code}
// To start with, threads.max.core threads can hit the meta (including replicas).
// After that, requests will get queued up in the passed queue, and only after
// the queue is full, a new thread will be started
this.metaLookupPool = getThreadPool(
    conf.getInt("hbase.hconnection.meta.lookup.threads.max", 128),
    conf.getInt("hbase.hconnection.meta.lookup.threads.max.core", 10),
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7937) Retry log rolling to support HA NN scenario
[ https://issues.apache.org/jira/browse/HBASE-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7937: -- Resolution: Incomplete Assignee: (was: Himanshu Vashishtha) Status: Resolved (was: Patch Available) Retry log rolling to support HA NN scenario --- Key: HBASE-7937 URL: https://issues.apache.org/jira/browse/HBASE-7937 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.5 Reporter: Himanshu Vashishtha Attachments: HBASE-7937-trunk.patch, HBASE-7937-v1.patch, HBase-7937-0.94.txt, HBase-7937-trunk.txt A failure in log rolling causes regionserver abort. In case of HA NN, it will be good if there is a retry mechanism to roll the logs. A corresponding jira for MemStore retries is HBASE-7507. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4535) hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR
[ https://issues.apache.org/jira/browse/HBASE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4535: -- Resolution: Incomplete Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR - Key: HBASE-4535 URL: https://issues.apache.org/jira/browse/HBASE-4535 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.90.3 Reporter: Ramya Sunil Attachments: HBASE-4535.patch After a hbase rpm install, hbase-env.sh does not define HBASE_CONF_DIR. This needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503574#comment-14503574 ] Jonathan Lawlor commented on HBASE-13090: - Filed HBASE-13514 to address the test failures in branch-1 and branch-1.1 Progress heartbeats for long running scanners - Key: HBASE-13090 URL: https://issues.apache.org/jira/browse/HBASE-13090 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Jonathan Lawlor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: 13090-branch-1.addendum, HBASE-13090-v1.patch, HBASE-13090-v2.patch, HBASE-13090-v3.patch, HBASE-13090-v3.patch, HBASE-13090-v4.patch, HBASE-13090-v6.patch, HBASE-13090-v7.patch It can be necessary to set very long timeouts for clients that issue scans over large regions when all data in the region might be filtered out depending on scan criteria. This is a usability concern because it can be hard to identify what worst case timeout to use until scans are occasionally/intermittently failing in production, depending on variable scan criteria. It would be better if the client-server scan protocol can send back periodic progress heartbeats to clients as long as server scanners are alive and making progress. This is related but orthogonal to streaming scan (HBASE-13071). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1
Jonathan Lawlor created HBASE-13514: --- Summary: Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1 Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
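The failure mechanism described above can be illustrated with a small sketch. In branch-1, the issue text says {{RpcRetryingCaller}} has a MIN_RPC_TIMEOUT field that acts as a floor on the configured timeout; the clamp below is an illustrative stand-in for that behavior, not the actual HBase code:

```java
// Hypothetical sketch of the timeout floor described in this issue:
// the configured hbase.rpc.timeout is clamped up to a minimum, so the
// test's 0.5 s setting silently becomes 2 s in branch-1/branch-1.1.
class RpcTimeoutClamp {
    static final int MIN_RPC_TIMEOUT = 2000; // ms; the 2-second floor per the issue

    static int effectiveTimeout(int configuredMs) {
        return Math.max(configuredMs, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // The test sets hbase.rpc.timeout to 500 ms, but the caller uses 2000 ms.
        System.out.println(effectiveTimeout(500));  // prints 2000
        // Values above the floor pass through unchanged.
        System.out.println(effectiveTimeout(3000)); // prints 3000
    }
}
```

This is why the test's timing assumptions only hold on master, where the floor was removed.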
[jira] [Updated] (HBASE-13482) Phoenix is failing to scan tables on secure environments.
[ https://issues.apache.org/jira/browse/HBASE-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13482: --- Fix Version/s: 0.98.13 Cherry picked to 0.98, thanks for the heads up! Phoenix is failing to scan tables on secure environments. -- Key: HBASE-13482 URL: https://issues.apache.org/jira/browse/HBASE-13482 Project: HBase Issue Type: Bug Reporter: Alicia Ying Shu Assignee: Alicia Ying Shu Fix For: 1.1.0, 0.98.13 Attachments: Hbase-13482-v1.patch, Hbase-13482.patch When executed on secure environments, phoenix query is getting the following exception message: java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: User 'null' is not the scanner owner! org.apache.hadoop.hbase.security.access.AccessController.requireScannerOwner(AccessController.java:2048) org.apache.hadoop.hbase.security.access.AccessController.preScannerNext(AccessController.java:2022) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$53.call(RegionCoprocessorHost.java:1336) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1671) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1746) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1720) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerNext(RegionCoprocessorHost.java:1331) org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2227) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503740#comment-14503740 ] Ted Yu commented on HBASE-13514: Thanks for the patch, Jonathan. Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-13514: --- Resolution: Fixed Status: Resolved (was: Patch Available) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
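The root cause described in these messages — a configured hbase.rpc.timeout being silently raised to a floor in branch-1 and branch-1.1 — can be sketched as follows. This is an illustrative model only: the helper method and class name are invented here, and the 2000 ms constant is taken from the issue text ("cannot be less than 2 seconds"), not read from the actual RpcRetryingCaller source.

```java
// Illustrative sketch: per the issue, branch-1's RpcRetryingCaller enforces a
// minimum rpc timeout (MIN_RPC_TIMEOUT), so a test that configures 500 ms
// actually runs with a 2000 ms timeout and its heartbeat expectations fail.
public class RpcTimeoutClampSketch {
    // Assumed value; the issue states the effective minimum is 2 seconds.
    static final int MIN_RPC_TIMEOUT = 2000;

    static int effectiveTimeout(int configuredMillis) {
        // The configured value is clamped up to the minimum.
        return Math.max(configuredMillis, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // The test sets hbase.rpc.timeout to 0.5 s, but the caller enforces 2 s.
        System.out.println(effectiveTimeout(500));   // 2000
        System.out.println(effectiveTimeout(5000));  // 5000
    }
}
```

On master the clamp no longer exists, which is why the test only fails on the older branches.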
[jira] [Updated] (HBASE-6639) Class.newInstance() can throw any checked exceptions and must be encapsulated with catching Exception
[ https://issues.apache.org/jira/browse/HBASE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6639: -- Resolution: Incomplete Assignee: (was: Hiroshi Ikeda) Status: Resolved (was: Patch Available) Class.newInstance() can throw any checked exceptions and must be encapsulated with catching Exception - Key: HBASE-6639 URL: https://issues.apache.org/jira/browse/HBASE-6639 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Attachments: HBASE-6639-V2.patch, HBASE-6639-V3.patch, HBASE-6639.patch There are some logics to call Class.newInstance() without catching Exception, for example, in the method CoprocessorHost.loadInstance(). Class.newInstance() is declared to throw InstantiationException and IllegalAccessException but indeed the method can throw any checked exceptions without declaration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6289: -- Resolution: Cannot Reproduce Assignee: (was: Maryann Xue) Status: Resolved (was: Patch Available) Reopen if reproducible with current release code. ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Priority: Critical Attachments: HBASE-6289-v2.patch, HBASE-6289-v2.patch, HBASE-6289.patch The ROOT RS has a network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. {code} private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } {code} After a few moments, this RS encounters a DFS write problem and decides to abort. 
The RS then soon gets restarted from commandline, and constantly report: {code} 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation
[ https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5790: -- Status: Open (was: Patch Available) Cancelling stale patch. We're up to minimum necessary ZK now I'd say. Revisit? Or close. ZKUtil deleteRecursively should be a recoverable operation -- Key: HBASE-5790 URL: https://issues.apache.org/jira/browse/HBASE-5790 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Labels: zookeeper Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
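The race the description worries about — a child node appearing between deleting a node's children and deleting the node itself — requires deletes to happen child-first and, ideally, atomically via the multi-operation transactions ZooKeeper gained in 3.4. A minimal sketch of the child-first ordering, using an in-memory TreeMap as a stand-in for a live ZooKeeper tree (the class and method names are illustrative, not ZKUtil's actual API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of the ordering behind a recursive znode delete: children must be
// deleted before their parents. A real implementation would issue these as
// ZooKeeper delete ops — and, per the issue, could batch them into a single
// multi() transaction so that a child created mid-delete fails the whole
// batch atomically instead of leaving an orphan.
public class RecursiveDeleteSketch {
    // Returns the subtree rooted at `root` in deepest-first (safe deletion) order.
    static List<String> deletionOrder(NavigableMap<String, String> tree, String root) {
        List<String> order = new ArrayList<>();
        // A child's path always extends its parent's path, so descending
        // lexicographic order yields children before parents.
        for (String path : tree.descendingKeySet()) {
            if (path.equals(root) || path.startsWith(root + "/")) {
                order.add(path);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> zk = new TreeMap<>();
        zk.put("/hbase", "");
        zk.put("/hbase/splitlog", "");
        zk.put("/hbase/splitlog/task1", "");
        System.out.println(deletionOrder(zk, "/hbase/splitlog"));
        // [/hbase/splitlog/task1, /hbase/splitlog]
    }
}
```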
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504047#comment-14504047 ] Rajesh Nishtala commented on HBASE-13471: - In fairness I think there are two bugs here. (1) the client has a row / region mismatch under some circumstances that are yet TBD and (2) when that occurs there's a possible infinite loop. This addresses the latter by propagating up the wrong region information to the client. With this fix in, we can hopefully find the cause of (1) with the extra debugging information that results from the fix for (2). Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8489) Fix HBASE-8482 on trunk
[ https://issues.apache.org/jira/browse/HBASE-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8489: -- Resolution: Incomplete Status: Resolved (was: Patch Available) Fix HBASE-8482 on trunk --- Key: HBASE-8489 URL: https://issues.apache.org/jira/browse/HBASE-8489 Project: HBase Issue Type: Bug Reporter: Nicolas Liochon Attachments: 8482.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4462: -- Resolution: Incomplete Assignee: (was: ramkrishna.s.vasudevan) Status: Resolved (was: Patch Available) Properly treating SocketTimeoutException Key: HBASE-4462 URL: https://issues.apache.org/jira/browse/HBASE-4462 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Attachments: HBASE-4462_0.90.x.patch, unittest_that_shows_us_retrying_sockettimeout.txt SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened. I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner). So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once. There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
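One of the options the description floats — treating SocketTimeoutException as non-retriable (the caller already waited the full timeout) while other IOExceptions keep retrying — can be sketched like this. The names are illustrative; this is not the actual HCM.getRegionServerWithRetries code.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Sketch: rethrow SocketTimeoutException immediately (we already waited the
// full hbase.rpc.timeout), but retry other transient IOExceptions.
public class RetrySketch {
    static <T> T callWithRetries(Callable<T> op, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                throw ste; // do not retry: the operation may well have completed server-side
            } catch (IOException ioe) {
                last = ioe; // transient failure: try again
            }
        }
        if (last == null) throw new IOException("no attempts made");
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // A flaky call that fails twice with a plain IOException, then succeeds.
        Object ok = callWithRetries(() -> {
            attempts[0]++;
            if (attempts[0] < 3) throw new IOException("flaky");
            return "done";
        }, 5);
        System.out.println(ok + " after " + attempts[0] + " attempts"); // done after 3 attempts
    }
}
```

Note this sketch sidesteps the harder question raised at the end of the description: for mutations, a timeout leaves the client unsure whether the write actually landed.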
[jira] [Updated] (HBASE-3577) enables Thrift client to get the Region location
[ https://issues.apache.org/jira/browse/HBASE-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-3577: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) enables Thrift client to get the Region location Key: HBASE-3577 URL: https://issues.apache.org/jira/browse/HBASE-3577 Project: HBase Issue Type: Improvement Components: Thrift Reporter: Kazuki Ohta Attachments: HBASE3577-1.patch, HBASE3577-2.patch The current thrift interface has the getTableRegions() interface like below. {code}
list<TRegionInfo> getTableRegions(
  /** table name */
  1:Text tableName) throws (1:IOError io)
{code} {code}
struct TRegionInfo {
  1:Text startKey,
  2:Text endKey,
  3:i64 id,
  4:Text name,
  5:byte version
}
{code} But the method doesn't have the region location information (where the region is located). I want to add the Thrift interfaces like below in HTable.java. {code}
public Map<HRegionInfo, HServerAddress> getRegionsInfo() throws IOException
{code} {code}
public HRegionLocation getRegionLocation(final String row)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout (was: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for ) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (was: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13481) Master should respect master (old) DNS/bind related configurations
[ https://issues.apache.org/jira/browse/HBASE-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503667#comment-14503667 ] Enis Soztutar commented on HBASE-13481: --- yes. A test is failing since this went in org.apache.hadoop.hbase.regionserver.TestRegionServerHostname.testRegionServerHostname Sorry my b. v2 patch passed hadoopqa, but I committed v3 without waiting for another, because I was in a hurry to spin 1.0.1 RC. Anyway, Ted's addendum fixed the test already. Master should respect master (old) DNS/bind related configurations -- Key: HBASE-13481 URL: https://issues.apache.org/jira/browse/HBASE-13481 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.0.1, 1.1.0 Attachments: 13481-addendum.txt, hbase-13481_v1.patch, hbase-13481_v2.patch, hbase-13481_v3-branch-1.0.patch, hbase-13481_v3.patch This is a continuation of parent HBASE-13453. We should continue respecting the following parameters that 1.0.0 does not: {code} hbase.master.dns.interface hbase.master.dns.nameserver hbase.master.ipc.address {code} Credit goes to [~jerryhe] for pointing that out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503950#comment-14503950 ] stack commented on HBASE-13259: --- I suggest we kick it out of 1.1 then. It should be finished with a definitive story before it gets committed IMO. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached perf-measured CPU usage breakdown as a flame graph. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
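The one-line configuration example from the description, written out as an hbase-site.xml fragment. The mmap: path is the one given in the issue; in practice it should point at a tmpfs or other memory-backed file sized to hold the cache.

```xml
<!-- hbase-site.xml: select the mmap()-backed BucketCache IOEngine -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>mmap:/dev/shm/bucketcache.0</value>
</property>
```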
[jira] [Created] (HBASE-13517) Publish a client artifact with shaded dependencies
Elliott Clark created HBASE-13517: - Summary: Publish a client artifact with shaded dependencies Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4916) LoadTest MR Job
[ https://issues.apache.org/jira/browse/HBASE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4916: -- Resolution: Incomplete Assignee: (was: Karthik Ranganathan) Status: Resolved (was: Patch Available) LoadTest MR Job --- Key: HBASE-4916 URL: https://issues.apache.org/jira/browse/HBASE-4916 Project: HBase Issue Type: Sub-task Components: Client, regionserver Reporter: Nicolas Spiegelberg Attachments: ASF.LICENSE.NOT.GRANTED--HBASE-4916.D741.1.patch Add a script to start a streaming map-reduce job where each map tasks runs an instance of the load tester for a partition of the key-space. Ensure that the load tester takes a parameter indicating the start key for write operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-6480) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall
[ https://issues.apache.org/jira/browse/HBASE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6480: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall Key: HBASE-6480 URL: https://issues.apache.org/jira/browse/HBASE-6480 Project: HBase Issue Type: Improvement Reporter: binlijin Attachments: HBASE-6480-94.patch, HBASE-6480-trunk.patch Currently, if the callQueueSize exceeds maxQueueSize, all calls are rejected. Should we let priority calls pass through? Current: {code}
if ((callSize + callQueueSize.get()) > maxQueueSize) {
  Call callTooBig = xxx
  return;
}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code} Should we change it to: {code}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  if ((callSize + callQueueSize.get()) > maxQueueSize) {
    Call callTooBig = xxx
    return;
  }
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503897#comment-14503897 ] Hadoop QA commented on HBASE-13071: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726649/HBASE-13071_trunk_rebase_1.0.patch against master branch at commit 702aea5b38ed6ad0942b0c59c3accca476b46873. ATTACHMENT ID: 12726649 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1902 checkstyle errors (more than the master's current 1898 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//console This message is automatically generated. Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBASE-13071_trunk_rebase_1.0.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.delay.png, gc.eshcar.png, gc.png, hits.delay.png, hits.eshcar.png, hits.png, latency.delay.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed at which the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. 
Currently this queue is synchronous, i.e., blocking. More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
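The asynchronous cache proposed above can be sketched as a bounded blocking queue with a background prefetcher, so the application's next() only blocks when it outruns the fetcher. This is a self-contained toy under assumed names: integers stand in for scan Results, and nothing here is the patch's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of a streaming scan: a fetcher thread plays the role of the HBase
// client issuing scanner RPCs and refilling the cache, while the main thread
// plays the application draining it batch by batch.
public class PrefetchScanSketch {
    static int scanAll(int totalBatches, int batchSize) throws InterruptedException {
        // Bounded cache: at most two batches are buffered ahead of the consumer.
        BlockingQueue<List<Integer>> cache = new ArrayBlockingQueue<>(2);

        Thread fetcher = new Thread(() -> {
            try {
                for (int b = 0; b < totalBatches; b++) {
                    List<Integer> batch = new ArrayList<>(batchSize);
                    for (int r = 0; r < batchSize; r++) batch.add(b * batchSize + r);
                    cache.put(batch); // blocks only if the application lags far behind
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();

        int consumed = 0;
        for (int b = 0; b < totalBatches; b++) {
            consumed += cache.take().size(); // the application-side next() path
        }
        fetcher.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        // 3 RPC batches of 100 rows each — 100 mirrors the default caching value.
        System.out.println(scanAll(3, 100)); // 300
    }
}
```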
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503932#comment-14503932 ] Hudson commented on HBASE-13514: FAILURE: Integrated in HBase-1.2 #9 (See [https://builds.apache.org/job/HBase-1.2/9/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev cac134c14af9df7d4219bd77abf817a84c975499) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13519) Support coupled compactions with secondary index
tristartom created HBASE-13519: -- Summary: Support coupled compactions with secondary index Key: HBASE-13519 URL: https://issues.apache.org/jira/browse/HBASE-13519 Project: HBase Issue Type: New Feature Reporter: tristartom Hi, DELI (DEferred Lightweight Indexing) is our research prototype from Syracuse University, in collaboration with Georgia Tech and IBM Research. In DELI, we propose that when supporting a secondary index on HBase, the index-to-base-table sync-up should be coupled with compaction. The benefit is that online Puts stay append-only and write performance is preserved. The code of DELI is shared on GitHub: https://github.com/tristartom/nosql-indexing Details can be found in the following research paper published in CCGrid 2015: http://tristartom.github.io/docs/ccgrid15.pdf We are grateful to the HBase community, and any comments/suggestions are appreciated. Yuzhe -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504036#comment-14504036 ] Rajesh Nishtala commented on HBASE-13471: - The fix is up at https://reviews.facebook.net/D37437. Looks like there's a possible infinite loop that can occur in doMiniBatchMutation with the readLock held causing the doClose() to never be able to grab its lock. Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504038#comment-14504038 ] Hudson commented on HBASE-13514: SUCCESS: Integrated in HBase-TRUNK #6394 (See [https://builds.apache.org/job/HBase-TRUNK/6394/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev eb82b8b3098d6a9ac62aa50189f9d4b289f38472) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4635) Remove dependency of java for rpm/deb packaging
[ https://issues.apache.org/jira/browse/HBASE-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4635: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) Remove dependency of java for rpm/deb packaging --- Key: HBASE-4635 URL: https://issues.apache.org/jira/browse/HBASE-4635 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.92.0 Environment: Java, Ubuntu, RHEL Reporter: Eric Yang Attachments: HBASE-4635-1.patch, HBASE-4635.patch Comment from HBASE-3606: Eric, it looks like hbase rpm spec file sets dependency on jdk. Can we remove the jdk dependency ? As everyone will not be installing jdk through rpm. There are multiple ways to install Java on Linux. It would be better to remove Java dependency declaration for packaging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4523) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places
[ https://issues.apache.org/jira/browse/HBASE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4523: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Hadoop Flags: (was: Reviewed) Status: Resolved (was: Patch Available) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places Key: HBASE-4523 URL: https://issues.apache.org/jira/browse/HBASE-4523 Project: HBase Issue Type: Bug Affects Versions: 0.90.4, 0.92.0 Reporter: Arpit Gupta Attachments: HBASE-4523.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4337: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Release Note: (was: Added binary only profile for building binary only tar ball.) Status: Resolved (was: Patch Available) Update HBase directory structure layout to be aligned with Hadoop - Key: HBASE-4337 URL: https://issues.apache.org/jira/browse/HBASE-4337 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Eric Yang Attachments: HBASE-4337-1.patch, HBASE-4337-2.patch, HBASE-4337-3.patch, HBASE-4337-4.patch, HBASE-4337-5.patch, HBASE-4337-6.patch, HBASE-4337.patch, hbase-4337-7.patch In HADOOP-6255, a proposal was made for common directory layout for Hadoop ecosystem. This jira is to track the necessary work for making HBase directory structure aligned with Hadoop for better integration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)
[ https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4415: -- Resolution: Later Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) Add configuration script for setup HBase (hbase-setup-conf.sh) -- Key: HBASE-4415 URL: https://issues.apache.org/jira/browse/HBASE-4415 Project: HBase Issue Type: New Feature Components: scripts Affects Versions: 0.90.4, 0.92.0 Environment: Java 6, Linux Reporter: Eric Yang Attachments: HBASE-4415-1.patch, HBASE-4415-2.patch, HBASE-4415-3.patch, HBASE-4415-4.patch, HBASE-4415-5.patch, HBASE-4415-6.patch, HBASE-4415-7.patch, HBASE-4415-8.patch, HBASE-4415-9.patch, HBASE-4415.patch The goal of this jira is to provide an installation script for configuring the HBase environment and configuration, using the same *-setup-conf.sh pattern as other Hadoop-related projects. For HBase, the usage of the script looks like this:
{noformat}
usage: ./hbase-setup-conf.sh parameters
Optional parameters:
  --hadoop-conf=/etc/hadoop                Set Hadoop configuration directory location
  --hadoop-home=/usr                       Set Hadoop directory location
  --hadoop-namenode=localhost              Set Hadoop namenode hostname
  --hadoop-replication=3                   Set HDFS replication
  --hbase-home=/usr                        Set HBase directory location
  --hbase-conf=/etc/hbase                  Set HBase configuration directory location
  --hbase-log=/var/log/hbase               Set HBase log directory location
  --hbase-pid=/var/run/hbase               Set HBase pid directory location
  --hbase-user=hbase                       Set HBase user
  --java-home=/usr/java/default            Set JAVA_HOME directory location
  --kerberos-realm=KERBEROS.EXAMPLE.COM    Set Kerberos realm
  --kerberos-principal-id=_HOST            Set Kerberos principal ID
  --keytab-dir=/etc/security/keytabs       Set keytab directory
  --regionservers=localhost                Set regionservers hostnames
  --zookeeper-home=/usr                    Set ZooKeeper directory location
  --zookeeper-quorum=localhost             Set ZooKeeper Quorum
  --zookeeper-snapshot=/var/lib/zookeeper  Set ZooKeeper snapshot location
{noformat} -- 
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504088#comment-14504088 ] Devaraj Das commented on HBASE-13515: - LGTM Handle FileNotFoundException in region replica replay for flush/compaction events - Key: HBASE-13515 URL: https://issues.apache.org/jira/browse/HBASE-13515 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: hbase-13515_v1.patch I had this patch laying around that somehow dropped from my plate. We should skip replaying compaction / flush and region open event markers if the files (from flush or compaction) can no longer be found from the secondary. If we do not skip, the replay will be retried forever, effectively blocking the replication further. Bulk load already does this, we just need to do it for flush / compaction and region open events as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
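The skip-on-missing-files behavior described in HBASE-13515 can be sketched as follows. This is a hypothetical illustration of the control flow only; the class and method names are made up and are not the actual HBase replay code.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: when replaying a flush/compaction event on a secondary
// replica, a missing store file should cause the event to be skipped rather
// than retried forever (which would block replication).
public class ReplaySketch {
    // Hypothetical event carrying the store files it references.
    static class CompactionEvent {
        final List<String> files;
        CompactionEvent(List<String> files) { this.files = files; }
    }

    // Stand-in for a real file-system existence check.
    static boolean fileExists(String name) {
        return !name.contains("missing");
    }

    /** Returns true if the event was replayed, false if it was skipped. */
    static boolean replay(CompactionEvent event) {
        try {
            for (String f : event.files) {
                if (!fileExists(f)) {
                    throw new FileNotFoundException(f);
                }
            }
            // ... apply the event to the secondary region here ...
            return true;
        } catch (FileNotFoundException e) {
            // Skip instead of rethrowing: rethrowing would make the caller
            // retry the same event forever, blocking further replication.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(replay(new CompactionEvent(Arrays.asList("hfile-1"))));
        System.out.println(replay(new CompactionEvent(Arrays.asList("hfile-missing"))));
    }
}
```

As the description notes, bulk load already behaves this way; the patch extends the same treatment to flush, compaction, and region open events.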
[jira] [Assigned] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari reassigned HBASE-13517: - Assignee: Virag Kothari Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Virag Kothari Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. Shading all of the time would break people who require the transitive dependencies for MR or other things, so let's provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504328#comment-14504328 ] Josh Elser commented on HBASE-13520: I thought about this for a little bit. I'm undecided on whether or not it's a good idea to avoid returning null. On one hand, we made the conscious decision which states the underlying cell's tags should never be accessed again by this object. This implies that it would be an error if the caller tries to access this array when it is null (leads me to think something like {{assert null != this.tags}} could be added). On the other hand, we might avoid a future bug if we fail gracefully to an empty byte array. I couldn't make up my mind if one was better than the other, so I didn't make a change. I'm happy to make a change if there are those who are more strongly opinionated than me :) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. 
{noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. 
This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were nulled out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf, which means the optimization isn't performed. When a TagRewriteCell passes through two (or more) observers (so a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this enables the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
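The two-observer failure shape described above can be sketched in a few lines. This is a minimal, hypothetical reconstruction: the class and method names are illustrative, not the real HBase {{TagRewriteCell}} code.

```java
// A wrapping cell nulls out its delegate's tags, so any method that consults
// the delegate instead of `this` hits the nulled array once two wraps occur.
public class TagWrapSketch {
    interface SimpleCell {
        byte[] getTagsArray();
        int getTagsLength();
    }

    static class BaseCell implements SimpleCell {
        byte[] tags;
        BaseCell(byte[] tags) { this.tags = tags; }
        public byte[] getTagsArray() { return tags; }
        public int getTagsLength() { return tags.length; }
    }

    static class RewriteCell implements SimpleCell {
        private final SimpleCell cell;
        private byte[] tags;
        RewriteCell(SimpleCell cell, byte[] newTags) {
            this.cell = cell;
            this.tags = newTags;
            // "Free" the old tags when wrapping another rewrite cell,
            // mirroring the optimization the issue describes.
            if (cell instanceof RewriteCell) {
                ((RewriteCell) cell).tags = null;
            }
        }
        public byte[] getTagsArray() { return tags; }
        public int getTagsLength() { return tags.length; }

        // Buggy: asks the delegate, whose tags a later wrap may have nulled.
        int heapSizeBuggy() { return 16 + cell.getTagsLength(); }
        // Fixed: asks this object for its own tags length.
        int heapSizeFixed() { return 16 + this.getTagsLength(); }
    }

    public static void main(String[] args) {
        BaseCell base = new BaseCell(new byte[4]);
        RewriteCell first = new RewriteCell(base, new byte[8]);
        System.out.println(first.heapSizeBuggy()); // one wrap: no NPE
        RewriteCell second = new RewriteCell(first, new byte[2]); // nulls first.tags
        try {
            second.heapSizeBuggy(); // delegate's tags are now null
        } catch (NullPointerException expected) {
            System.out.println("NPE, as described");
        }
        System.out.println(second.heapSizeFixed()); // uses its own tags: fine
    }
}
```

With a single wrap the buggy method happens to work, which matches the observation that only multi-observer pipelines trigger the NPE.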
[jira] [Commented] (HBASE-13501) Deprecate/Remove getComparator() in HRegionInfo.
[ https://issues.apache.org/jira/browse/HBASE-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504327#comment-14504327 ] ramkrishna.s.vasudevan commented on HBASE-13501: When we talk about removing getComparator in HRegionInfo, which is marked public, ideally HRegionInfo should not have been public in the first place. The only place where we expose it is in Admin.java {code} /** * Close a region. For expert-admins Runs close on the regionserver. The master will not be * informed of the close. * * @param sn * @param hri * @throws IOException */ void closeRegion(final ServerName sn, final HRegionInfo hri) throws IOException; {code} Here we really don't need an HRegionInfo; it could always have been created from a TableName. I would say we could deprecate/remove these methods so that HRegionInfo can move to LimitedPrivate, so that at least CPs can use it and it is not a direct client-facing interface. Thoughts? Deprecate/Remove getComparator() in HRegionInfo. Key: HBASE-13501 URL: https://issues.apache.org/jira/browse/HBASE-13501 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504347#comment-14504347 ] Lars Hofhansl commented on HBASE-13471: --- +1 on patch. Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13471: -- Affects Version/s: 1.1.0 2.0.0 Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-13520: --- Status: Patch Available (was: Open) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504316#comment-14504316 ] Anoop Sam John commented on HBASE-13520: Thanks for the find and fix, Josh. My bad.. I missed the null check in this place (added in another place, I guess). +1. Nit on the test case: need to add the SmallTests category. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504308#comment-14504308 ] Anoop Sam John commented on HBASE-13375: bq. what's the difference between system and super users? Super users are those that come from the xml configuration, whereas the system user is the user who started the server process. All of these are considered super users of HBase. Make sense? But we can make the name simply getSuperUsers() and just add a doc note on which users it includes. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
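The superuser/system-user distinction discussed in the comment above can be sketched like this. It is a hedged illustration of the idea, not HBase's actual RPC scheduler API; the class name and QOS constants are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: treat both the configured superusers and the
// process-starting system user as "super", and give their calls a higher
// RPC priority so admin operations are never starved by normal handlers.
public class PrioritySketch {
    static final int NORMAL_QOS = 0;
    static final int ADMIN_QOS = 200;

    private final Set<String> superUsers;

    PrioritySketch(String systemUser, String... configuredSuperUsers) {
        this.superUsers = new HashSet<>(Arrays.asList(configuredSuperUsers));
        // The system user counts as a superuser alongside the xml-configured ones.
        this.superUsers.add(systemUser);
    }

    int getPriority(String caller) {
        return superUsers.contains(caller) ? ADMIN_QOS : NORMAL_QOS;
    }

    public static void main(String[] args) {
        PrioritySketch p = new PrioritySketch("hbase", "admin");
        System.out.println(p.getPriority("admin")); // elevated
        System.out.println(p.getPriority("alice")); // normal
    }
}
```

A single getSuperUsers()-style collection covering both sources, as suggested in the comment, keeps the priority check to one set lookup per call.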
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504309#comment-14504309 ] Mikhail Antonov commented on HBASE-13375: - Right. Will update the patch later today, incorporating the feedback. Going to make these collection static fields of User class with lazy loading from conf (as that was the idea in other jira - makes sense to me). Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504325#comment-14504325 ] ramkrishna.s.vasudevan commented on HBASE-13520: {code} @Override public byte[] getTagsArray() { return this.tags; } {code} Just asking: for any caller calling getTagsArray() when tags == null, is it better to return an EMPTY_BYTE_ARRAY? +1 on patch. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11758) Meta region location should be cached
[ https://issues.apache.org/jira/browse/HBASE-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11758: -- Assignee: (was: Virag Kothari) Meta region location should be cached - Key: HBASE-11758 URL: https://issues.apache.org/jira/browse/HBASE-11758 Project: HBase Issue Type: Sub-task Reporter: Virag Kothari ZK-less assignment involves only the master updating meta, and this can be faster if we cache the meta location instead of reading the meta znode every time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
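The caching idea in HBASE-11758 reduces to keeping the last-known meta location and taking the slow ZooKeeper read only on a cache miss. A minimal sketch, with all names (including the server address) purely illustrative:

```java
// Hypothetical sketch: cache the meta region location; re-read the meta
// znode only on a miss or after an explicit invalidation (e.g. a failed RPC
// revealing that meta moved).
public class MetaCacheSketch {
    private volatile String cachedLocation;
    private int zkReads; // counts slow-path reads, for illustration only

    // Stand-in for reading the meta znode from ZooKeeper.
    private String readFromZooKeeper() {
        zkReads++;
        return "server-1:16020";
    }

    String getMetaLocation() {
        String loc = cachedLocation;
        if (loc == null) {
            loc = readFromZooKeeper();
            cachedLocation = loc;
        }
        return loc;
    }

    // Called when a meta move is detected.
    void invalidate() { cachedLocation = null; }

    int zkReadCount() { return zkReads; }

    public static void main(String[] args) {
        MetaCacheSketch cache = new MetaCacheSketch();
        cache.getMetaLocation();
        cache.getMetaLocation(); // served from cache, no second ZK read
        cache.invalidate();
        cache.getMetaLocation(); // re-reads after invalidation
        System.out.println(cache.zkReadCount());
    }
}
```

Since only the master updates meta under ZK-less assignment, invalidation points are few, which is what makes the cache worthwhile.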
[jira] [Updated] (HBASE-11289) Speedup balance
[ https://issues.apache.org/jira/browse/HBASE-11289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11289: -- Assignee: (was: Virag Kothari) Speedup balance --- Key: HBASE-11289 URL: https://issues.apache.org/jira/browse/HBASE-11289 Project: HBase Issue Type: Sub-task Reporter: Francis Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504354#comment-14504354 ] Elliott Clark commented on HBASE-13517: --- Here's a patch that adds in hbase-shaded, hbase-shaded-client, and hbase-shaded-server. When using the shaded versions there is a trade-off: you can't use HBaseTestingUtil because of all of the jsp, jersey, and servlet classloading. If someone has a fix for this I'd be all ears. It's just beyond me. Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13517.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13501) Deprecate/Remove getComparator() in HRegionInfo.
[ https://issues.apache.org/jira/browse/HBASE-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504389#comment-14504389 ] stack commented on HBASE-13501: --- That seems like the right direction. HRI is all over the code base but I can't think of a good reason why it should be popping up in public client methods. Not sure though about calling a close region and passing a table. How are we for sure going to close the right region? On the other hand, HRI is 'wrong'. How is it used internally? To find the 'name' or region id? Deprecate/Remove getComparator() in HRegionInfo. Key: HBASE-13501 URL: https://issues.apache.org/jira/browse/HBASE-13501 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13078) IntegrationTestSendTraceRequests is a noop
[ https://issues.apache.org/jira/browse/HBASE-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504142#comment-14504142 ] Josh Elser commented on HBASE-13078: Friends, any chance we can get this committed in all but 0.98 for now? We can deal with whether or not this makes it into 0.98 after HBASE-12938 like Andrew stated. IntegrationTestSendTraceRequests is a noop -- Key: HBASE-13078 URL: https://issues.apache.org/jira/browse/HBASE-13078 Project: HBase Issue Type: Test Components: integration tests Reporter: Nick Dimiduk Assignee: Josh Elser Priority: Critical Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13078-0.98-removal.patch, HBASE-13078-0.98-v1.patch, HBASE-13078-v1.patch, HBASE-13078.patch While pair-debugging with [~jeffreyz] on HBASE-13077, we noticed that IntegrationTestSendTraceRequests doesn't actually assert anything. This test should be converted to use a mini cluster, setup a POJOSpanReceiver, and then verify the spans collected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-13520: --- Attachment: HBASE-13520.patch Grossly simple patch for what was a rather convoluted bug to track down. Applies cleanly to master, branch-1, branch-1.1, and branch-1.0. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
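The wrapper-cell bug described above can be sketched in a few lines. This is a hypothetical, stripped-down model (SimpleCell, RewriteCell, and the byte counts are illustrative, not the real HBase classes): nulling the wrapped cell's tags is harmless as long as the wrapped cell is an ordinary cell, but once the wrapped cell is itself a rewrite wrapper, sizing via the wrapped cell dereferences the nulled array.

```java
// Hypothetical model of the TagRewriteCell NPE (not the real HBase classes).
class SimpleCell {
    byte[] tags;
    SimpleCell(byte[] tags) { this.tags = tags; }
    // Ordinary cells tolerate null tags.
    int getTagsLength() { return tags == null ? 0 : tags.length; }
    // Toy heap-size accounting: fixed overhead plus tag bytes.
    long heapSize() { return 16 + getTagsLength(); }
}

class RewriteCell extends SimpleCell {
    final SimpleCell wrapped;
    RewriteCell(SimpleCell cell, byte[] newTags) {
        super(newTags);
        this.wrapped = cell;
        if (cell instanceof RewriteCell) {
            cell.tags = null; // "free" the old wrapper's tags, as described above
        }
    }
    @Override
    int getTagsLength() { return tags.length; } // no null check, as in the buggy version
    @Override
    long heapSize() {
        // BUG: sizes the *wrapped* cell's tags instead of this cell's own tags,
        // so a doubly-wrapped cell hits the nulled array.
        return 16 + wrapped.getTagsLength();
    }
    long heapSizeFixed() { return 16 + this.getTagsLength(); } // fix: read own field
}
```

With one level of wrapping heapSize() still returns a value; with two levels it throws, matching the "two or more observers" condition in the report.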
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: HBASE-13469.v1-branch-1.1.patch [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504304#comment-14504304 ] Anoop Sam John commented on HBASE-13375: IMO it can be done now, as using VisibilityUtil in core server areas like QoS looks a bit strange. Yes, as it is becoming used by many areas of code, we can now improve the method name and signature. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
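The idea in the description above can be sketched as a priority-assignment step at RPC dispatch time. This is an illustrative sketch, not HBase's actual QoS code: the class name, `ADMIN_QOS` constant, and method signature are assumptions; the real implementation hangs off the annotation-driven priority function mentioned in HBASE-13351.

```java
import java.util.Set;

// Hypothetical sketch: bump RPC priority for configured superusers so admin
// calls are served even when all normal-priority handlers are occupied.
class PriorityAssigner {
    static final int NORMAL_QOS = 0;
    static final int ADMIN_QOS = 200; // above user traffic; value is illustrative

    private final Set<String> superusers; // e.g. from hbase.superuser

    PriorityAssigner(Set<String> superusers) { this.superusers = superusers; }

    int getPriority(String requestUser, int annotatedPriority) {
        if (superusers.contains(requestUser)) {
            // Never lower a priority that a method annotation already raised.
            return Math.max(annotatedPriority, ADMIN_QOS);
        }
        return annotatedPriority;
    }
}
```

A superuser request then lands in the high-priority handler pool, while ordinary users keep whatever priority the method annotation assigned.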
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: (was: HBASE-13469.v1-branch-1.1.patch) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504350#comment-14504350 ] Lars Hofhansl commented on HBASE-13082: --- Correct. No more locking other than to fix the current version of the access data structure at the beginning of the scan, and StoreScanner would indeed be single threaded (which it is 99% of the time already :) ). That would be a bigger change. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left off. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() as required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compactions with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able to finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11290) Unlock RegionStates
[ https://issues.apache.org/jira/browse/HBASE-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504392#comment-14504392 ] Mikhail Antonov commented on HBASE-11290: - Yeah, I think that would just work. After all, the lock names are encoded region names, so even for a cluster with 1M regions with the edge case of bulk re-assignment the cache shouldn't take more than a few hundred MB of RAM; HMaster should be able to handle it. Alternatively, I guess we could just have a wrapper class around CHM<name, lock> with lock(), unlock() methods, but the current patch would work, too (maybe as a further improvement we can limit the size of the cache and make getLock() block if the cache is waiting for GC?) What's funny, the length of this thread (http://stackoverflow.com/questions/5639870/simple-java-name-based-locks/) suggests that simple named locks aren't that simple ;) Unlock RegionStates --- Key: HBASE-11290 URL: https://issues.apache.org/jira/browse/HBASE-11290 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Virag Kothari Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11290-0.98.patch, HBASE-11290-0.98_v2.patch, HBASE-11290.draft.patch Even though RegionStates is a highly accessed data structure in HMaster, most of its methods are synchronized, which limits concurrency. Even simply making some of the getters non-synchronized by using concurrent data structures has helped with region assignments. We can go as simple as this approach or create locks per region or a bucket lock per region bucket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
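The per-region named locking discussed in this thread can be sketched with a `ConcurrentHashMap` keyed by encoded region name. This is a minimal illustration, not the patch's LockCache: unlike the soft-reference cache in the patch, this simple version never evicts, so with ~1M regions the map would hold ~1M small lock objects (the "few hundred MB" worst case mentioned above).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Minimal named-lock sketch: one lock per encoded region name, created on
// demand. computeIfAbsent guarantees all callers for the same name share
// exactly one lock instance, even under concurrent first access.
class NamedLocks {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock getLock(String name) {
        return locks.computeIfAbsent(name, n -> new ReentrantLock());
    }

    int size() { return locks.size(); }
}
```

A caller would do `locks.getLock(encodedName).lock()` around a per-region state change; since the lock is a `ReentrantLock`, re-acquisition by the same thread is safe, which is exactly what the non-reentrant IdLock (mentioned later in the thread) cannot offer.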
[jira] [Updated] (HBASE-13420) RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load
[ https://issues.apache.org/jira/browse/HBASE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13420: --- Attachment: 1M-0.98.13-SNAPSHOT.svg 1M-0.98.12.svg I did a quick comparison using LoadTestTool on an all-localhost HDFS+HBase cluster between 0.98.12 and an 0.98.13-SNAPSHOT which was .12 plus this patch. The server has 32 GB of RAM and 12 cores, Xeon E5-1660s running at 3.70GHz. All JVMs except the regionserver were given 1 GB heap. The regionserver ran with 8 GB. (No particular reason for that heap size, just reusing a setting from another test.) I installed the AccessController with hbase.security.authorization set to false so every region would run with a coprocessor (largely inert) so we'd exercise this change. CMS GC. LoadTestTool arguments: -read 100:10 -write 1:1024:10 -update 20:10 -num_keys 100 *0.98.12* ||read|| ||update|| ||write|| || ||keys_sec||latency_ms||keys_sec||latency_ms||keys_sec||latency_ms|| |19831.5102|0|786.3265306|5.285714286|3929.142857|2.102040816| *0.98.13-SNAPSHOT* ||read|| ||update|| ||write|| || ||keys_sec||latency_ms||keys_sec||latency_ms||keys_sec||latency_ms|| |19377.10204|0|783.755102|5.265306122|3924.530612|2.102040816| Profiles attached. They look almost identical with a quick glance. I will run a longer comparison tomorrow with 25M keys. RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load --- Key: HBASE-13420 URL: https://issues.apache.org/jira/browse/HBASE-13420 Project: HBase Issue Type: Improvement Reporter: John Leach Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: 1M-0.98.12.svg, 1M-0.98.13-SNAPSHOT.svg, HBASE-13420.patch, HBASE-13420.txt, hbase-13420.tar.gz, offerExecutionLatency.tiff Original Estimate: 3h Remaining Estimate: 3h The ArrayBlockingQueue blocks threads for 20s during a performance run focusing on creating numerous small scans. 
I see a buffer size of (100): private final BlockingQueue<Long> coprocessorTimeNanos = new ArrayBlockingQueue<Long>(LATENCY_BUFFER_SIZE); and then I see a drain coming from MetricsRegionWrapperImpl with a 45 second executor (HRegionMetricsWrapperRunable): RegionCoprocessorHost#getCoprocessorExecutionStatistics() RegionCoprocessorHost#getExecutionLatenciesNanos() Am I missing something? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
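The latency buffer in the report can be sketched as below. This is an illustrative model, not the HBase source: `offer()` on an `ArrayBlockingQueue` never waits for space (it returns false when the buffer is full), but every call still serializes on the queue's single internal lock, which is where many handler threads pile up under heavy load with only a 100-slot buffer drained every 45 seconds.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the coprocessor latency buffer described above (illustrative).
class LatencyBuffer {
    private static final int LATENCY_BUFFER_SIZE = 100;
    private final BlockingQueue<Long> coprocessorTimeNanos =
        new ArrayBlockingQueue<>(LATENCY_BUFFER_SIZE);

    boolean offerExecutionLatency(long nanos) {
        // Non-blocking: drops the sample (returns false) when the buffer is full,
        // but still contends on the queue's one internal lock.
        return coprocessorTimeNanos.offer(nanos);
    }

    int drain() {
        // Periodic drain, as the 45s metrics executor does in the report.
        int n = 0;
        while (coprocessorTimeNanos.poll() != null) n++;
        return n;
    }
}
```

Between drains, at most 100 samples survive and the rest are dropped; the contention on that single lock, rather than waiting for space, is the plausible source of the observed thread pile-ups.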
[jira] [Created] (HBASE-13520) NullPointerException in TagRewriteCell
Josh Elser created HBASE-13520: -- Summary: NullPointerException in TagRewriteCell Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by 
creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-13517: -- Assignee: (was: Virag Kothari) Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13482) Phoenix is failing to scan tables on secure environments.
[ https://issues.apache.org/jira/browse/HBASE-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504337#comment-14504337 ] Hudson commented on HBASE-13482: SUCCESS: Integrated in HBase-0.98 #955 (See [https://builds.apache.org/job/HBase-0.98/955/]) HBASE-13482. Phoenix is failing to scan tables on secure environments. (Alicia Shu) (apurtell: rev 50010ca31ed0587e3bf112a5789ec42185a9b939) * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Phoenix is failing to scan tables on secure environments. -- Key: HBASE-13482 URL: https://issues.apache.org/jira/browse/HBASE-13482 Project: HBase Issue Type: Bug Reporter: Alicia Ying Shu Assignee: Alicia Ying Shu Fix For: 1.1.0, 0.98.13 Attachments: Hbase-13482-v1.patch, Hbase-13482.patch When executed on secure environments, phoenix query is getting the following exception message: java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: User 'null' is not the scanner owner! 
org.apache.hadoop.hbase.security.access.AccessController.requireScannerOwner(AccessController.java:2048) org.apache.hadoop.hbase.security.access.AccessController.preScannerNext(AccessController.java:2022) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$53.call(RegionCoprocessorHost.java:1336) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1671) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1746) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1720) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerNext(RegionCoprocessorHost.java:1331) org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2227) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11290) Unlock RegionStates
[ https://issues.apache.org/jira/browse/HBASE-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504339#comment-14504339 ] Virag Kothari commented on HBASE-11290: --- IdLock is not reentrant, so we won't be able to use that. So probably we have to go with the LockCache impl as in the patch. For eviction, currently values are wrapped using soft references, so garbage collection will be triggered on them on demand. It's not a great eviction policy and will create memory pressure for a large number of regions. As LockCache uses Guava's cache builder, it can support quite a few eviction schemes (https://code.google.com/p/guava-libraries/wiki/CachesExplained). I think they can be investigated and added later on in a new jira. Thoughts? Unlock RegionStates --- Key: HBASE-11290 URL: https://issues.apache.org/jira/browse/HBASE-11290 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Virag Kothari Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11290-0.98.patch, HBASE-11290-0.98_v2.patch, HBASE-11290.draft.patch Even though RegionStates is a highly accessed data structure in HMaster, most of its methods are synchronized, which limits concurrency. Even simply making some of the getters non-synchronized by using concurrent data structures has helped with region assignments. We can go as simple as this approach or create locks per region or a bucket lock per region bucket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504286#comment-14504286 ] stack commented on HBASE-13517: --- Guava first! Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504318#comment-14504318 ] Josh Elser commented on HBASE-13520: bq. nit on test case, need add SmallTests Category Ack! Forgot about that. Thanks. Will post a new version shortly. bq. I missed the null check in this place. (added in another I place I guess) That was the confusing part. The heapSize implementation made it seem like it wasn't being handled correctly, but the fact that it only appeared with 1 RegionObserver was very misleading :) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-13520: --- Attachment: HBASE-13520-v1.patch NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at 
org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13320) 'hbase.bucketcache.size' configuration value is not correct in hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504319#comment-14504319 ] ramkrishna.s.vasudevan commented on HBASE-13320: bq. Looks like we explain it in the book but failed to do it in hbase-default HBASE-13281 did it but was not aware that the default was there in hbase-default. Anyway, the value is not right, seeing the code and its calculation. bq. If remove it, why not remove all to do w/ bucketcache since all but one value are unset (hbase.bucketcache.sizes description should list default values too?) +1 on doing this.
{code}
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value></value>
  <description>Where to store the contents of the bucketcache. One of: onheap, offheap, or file. If a file, set it to file:PATH_TO_FILE. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html for more information.</description>
</property>
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>true</value>
  <description>Whether or not the bucketcache is used in league with the LRU on-heap block cache. In this mode, indices and blooms are kept in the LRU blockcache and the data blocks are kept in the bucketcache.</description>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>65536</value>
  <description>The size of the buckets for the bucketcache if you only use a single size. Defaults to the default blocksize, which is 64 * 1024.</description>
</property>
<property>
  <name>hbase.bucketcache.sizes</name>
  <value></value>
  <description>A comma-separated list of sizes for buckets for the bucketcache if you use multiple sizes. Should be a list of block sizes in order from smallest to largest. The sizes you use will depend on your data access patterns.</description>
</property>
{code}
Currently even hbase.bucketcache.sizes is not set. We can just describe the default value.
'hbase.bucketcache.size' configuration value is not correct in hbase-default.xml - Key: HBASE-13320 URL: https://issues.apache.org/jira/browse/HBASE-13320 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Y. SREENIVASULU REDDY Fix For: 2.0.0 Attachments: HBASE-13320.patch, HBASE-v2-13320.patch In the hbase-default.xml file, 'hbase.bucketcache.size' is not correct: we either specify it as a float or in MBs, and the default value that is mentioned is never used.
{code}
<property>
  <name>hbase.bucketcache.size</name>
  <value>65536</value>
  <source>hbase-default.xml</source>
</property>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11286) BulkDisabler should use a bulk RPC call for opening regions (just like BulkAssigner)
[ https://issues.apache.org/jira/browse/HBASE-11286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11286: -- Assignee: (was: Virag Kothari) BulkDisabler should use a bulk RPC call for opening regions (just like BulkAssigner) Key: HBASE-11286 URL: https://issues.apache.org/jira/browse/HBASE-11286 Project: HBase Issue Type: Sub-task Reporter: Francis Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)