[jira] [Updated] (HBASE-13937) Partially revert HBASE-13172
[ https://issues.apache.org/jira/browse/HBASE-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13937: Fix Version/s: (was: 1.3.0) Partially revert HBASE-13172 - Key: HBASE-13937 URL: https://issues.apache.org/jira/browse/HBASE-13937 Project: HBase Issue Type: Sub-task Components: Region Assignment Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.14, 1.0.2, 1.2.0, 1.1.1 Attachments: hbase-13937_v1.patch, hbase-13937_v2.patch, hbase-13937_v3-branch-1.1.patch, hbase-13937_v3.patch, hbase-13937_v3.patch HBASE-13172 is supposed to fix a UT issue, but causes other problems that the parent jira (HBASE-13605) is attempting to fix. However, HBASE-13605 patch v4 uncovers at least 2 different issues which are, to put it mildly, major design flaws in AM / RS. Regardless of 13605, the issue with 13172 is that we catch {{ServerNotRunningYetException}} from {{isServerReachable()}} and return false, which then puts the server on the {{RegionStates.deadServers}} list. Once it is in that list, we can still assign and unassign regions to the RS after it has started (because regular assignment does not check whether the server is in {{RegionStates.deadServers}}). However, after the first assign and unassign, we cannot assign the region again since then the check for the lastServer will think that the server is dead. It turns out that a proper patch for 13605 is very hard without fixing the rest of the broken AM assumptions (see HBASE-13605, HBASE-13877 and HBASE-13895 for a colorful history). For 1.1.1, I think we should just revert parts of HBASE-13172 for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
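Below is a rough, hypothetical sketch of the failure mode described above, using stand-in names rather than the actual ServerManager/RegionStates code: a ServerNotRunningYetException means the RS is starting, not dead, so treating it as a retryable condition is one plausible shape of the behaviour the partial revert moves back toward.
{code}
// Hedged sketch with stand-in names; not the actual ServerManager / AssignmentManager source.
import java.io.IOException;

class ReachabilityCheckSketch {
  /** Stand-in for org.apache.hadoop.hbase.ipc.ServerNotRunningYetException. */
  static class ServerNotRunningYetException extends IOException {}

  interface RegionServerPing {
    void ping(String serverName) throws IOException;
  }

  private final RegionServerPing rpc;

  ReachabilityCheckSketch(RegionServerPing rpc) {
    this.rpc = rpc;
  }

  /**
   * HBASE-13172 returned false on ServerNotRunningYetException, which lands a
   * merely-starting server in RegionStates.deadServers. Treating it as a
   * retryable condition keeps a starting RS out of the dead-servers list.
   */
  boolean isServerReachable(String serverName) {
    for (int attempt = 0; attempt < 10; attempt++) {
      try {
        rpc.ping(serverName);
        return true;
      } catch (ServerNotRunningYetException e) {
        // RS process is up but still initializing: retry instead of declaring it dead
        try {
          Thread.sleep(100);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return false;
        }
      } catch (IOException e) {
        return false; // genuinely unreachable
      }
    }
    return false;
  }
}
{code}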
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613225#comment-14613225 ] Ashish Singhi commented on HBASE-14021: --- Thanks Ted for the quick review. Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13984) Add option to allow caller to know the heartbeat and scanner position when scanner timeout
[ https://issues.apache.org/jira/browse/HBASE-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613193#comment-14613193 ] He Liangliang commented on HBASE-13984: --- [~jonathan.lawlor] Hold on before I make the update Add option to allow caller to know the heartbeat and scanner position when scanner timeout -- Key: HBASE-13984 URL: https://issues.apache.org/jira/browse/HBASE-13984 Project: HBase Issue Type: Improvement Components: Scanners Reporter: He Liangliang Assignee: He Liangliang Attachments: HBASE-13984-V1.diff HBASE-13090 introduced scanner heartbeats. However, there are still some limitations (see HBASE-13215). In some applications, for example, an operation accesses HBase to scan table data, and there is a strict limit that the call must return within a fixed interval. At the same time, the call is stateless, so it must return the next position from which to continue the scan. This is a typical use case for online applications. Based on this requirement, some improvements are proposed: 1. Allow the client to set a flag controlling whether the heartbeat (a fake row) is passed to the caller (via ResultScanner next) 2. Allow the client to pass a timeout to the server, which can override the server-side default value 3. When requested by the client, have the server peek the next cell and return it to the client in the heartbeat message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
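A hedged illustration of how a caller might use the proposed options. The commented-out calls use invented names (setAllowHeartbeatResults, setScannerTimeLimitMs, isHeartbeat) that are NOT part of the HBase client API; they only mirror proposals 1-3. The uncommented part compiles against the ordinary client classes.
{code}
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

final class StatelessScanSketch {
  /** Scans one slice; returns the row to resume from, or null when the slice finished. */
  static byte[] scanSlice(Table table, byte[] startRow, byte[] stopRow) throws Exception {
    Scan scan = new Scan(startRow, stopRow);
    // scan.setAllowHeartbeatResults(true); // proposal 1: surface heartbeats to the caller
    // scan.setScannerTimeLimitMs(200);     // proposal 2: client-chosen time limit
    try (ResultScanner scanner = table.getScanner(scan)) {
      for (Result r : scanner) {
        // proposal 3: a heartbeat "fake row" would carry the next position peeked by the server
        // if (r.isHeartbeat()) { return r.getRow(); }
        process(r);
      }
    }
    return null; // finished within the time limit; no resume position needed
  }

  private static void process(Result r) {
    // application logic goes here
  }
}
{code}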
[jira] [Commented] (HBASE-13984) Add option to allow caller to know the heartbeat and scanner position when scanner timeout
[ https://issues.apache.org/jira/browse/HBASE-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613207#comment-14613207 ] He Liangliang commented on HBASE-13984: --- [~stack] What is the expected behavior if the user wants to peek the next cell and also accepts partial results? Since the KVs in the store heap and the joint heap are not in sorted order, what should the next position be if the time limit is reached between scanning the two heaps? Looks like the next cell is applicable only to non-partial results. Add option to allow caller to know the heartbeat and scanner position when scanner timeout -- Key: HBASE-13984 URL: https://issues.apache.org/jira/browse/HBASE-13984 Project: HBase Issue Type: Improvement Components: Scanners Reporter: He Liangliang Assignee: He Liangliang Attachments: HBASE-13984-V1.diff HBASE-13090 introduced scanner heartbeats. However, there are still some limitations (see HBASE-13215). In some applications, for example, an operation accesses HBase to scan table data, and there is a strict limit that the call must return within a fixed interval. At the same time, the call is stateless, so it must return the next position from which to continue the scan. This is a typical use case for online applications. Based on this requirement, some improvements are proposed: 1. Allow the client to set a flag controlling whether the heartbeat (a fake row) is passed to the caller (via ResultScanner next) 2. Allow the client to pass a timeout to the server, which can override the server-side default value 3. When requested by the client, have the server peek the next cell and return it to the client in the heartbeat message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613224#comment-14613224 ] Ashish Singhi commented on HBASE-8642: -- bq. some way for a user to know which set of snapshots got successfully deleted or failed. This is how it will be displayed on the console.
{noformat}
hbase(main):002:0> delete_table_snapshots 't', '.*'
SNAPSHOT   TABLE + CREATION TIME
 a         t (Fri Jul 03 19:40:17 +0530 2015)
 s         t (Fri Jul 03 19:40:19 +0530 2015)
 s1        t (Fri Jul 03 19:40:21 +0530 2015)

Delete the above 3 snapshots (y/n)?
y
Successfully deleted snapshot: a
Failed to delete snapshot: s, due to below exception,
org.apache.hadoop.hbase.snapshot.SnapshotDoesNotExistException: org.apache.hadoop.hbase.snapshot.SnapshotDoesNotExistException: Snapshot 's' doesn't exist on the filesystem
    at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:282)
    at org.apache.hadoop.hbase.master.MasterRpcServices.deleteSnapshot(MasterRpcServices.java:464)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:49860)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2132)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
Successfully deleted snapshot: s1
0 row(s) in 0.0460 seconds
{noformat}
[Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, HBASE-8642.patch Support list and delete snapshots by table names. User scenario: A user wants to delete all the snapshots which were taken in January month for a table 't' where snapshot names starts with 'Jan'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13394) Failed to recreate a table when quota is enabled
[ https://issues.apache.org/jira/browse/HBASE-13394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13394: Fix Version/s: (was: 1.2.0) Failed to recreate a table when quota is enabled Key: HBASE-13394 URL: https://issues.apache.org/jira/browse/HBASE-13394 Project: HBase Issue Type: Bug Components: security Affects Versions: 2.0.0, 1.1.0 Reporter: Y. SREENIVASULU REDDY Assignee: Ashish Singhi Labels: quota Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13394-branch-1.1.patch, HBASE-13394-v1.patch, HBASE-13394-v2.patch, HBASE-13394-v3.patch, HBASE-13394-v4.patch, HBASE-13394.patch Steps to reproduce. Enable quota by setting {{hbase.quota.enabled}} to true. Create a table, say with name 't1', and make sure the creation fails after this table's entry has been added into the namespace quota cache. Now correct the failure and recreate the table 't1'. It fails with the below exception.
{noformat}
2015-04-02 14:23:53,729 | ERROR | FifoRpcScheduler.handler1-thread-23 | Unexpected throwable object | org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2154)
java.lang.IllegalStateException: Table already in the cache t1
    at org.apache.hadoop.hbase.namespace.NamespaceTableAndRegionInfo.addTable(NamespaceTableAndRegionInfo.java:97)
    at org.apache.hadoop.hbase.namespace.NamespaceStateManager.addTable(NamespaceStateManager.java:171)
    at org.apache.hadoop.hbase.namespace.NamespaceStateManager.checkAndUpdateNamespaceTableCount(NamespaceStateManager.java:147)
    at org.apache.hadoop.hbase.namespace.NamespaceAuditor.checkQuotaToCreateTable(NamespaceAuditor.java:76)
    at org.apache.hadoop.hbase.quotas.MasterQuotaManager.checkNamespaceTableAndRegionQuota(MasterQuotaManager.java:344)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1781)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1818)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:42273)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2116)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
P.S: Line numbers may not be in sync. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
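A small illustrative sketch (not the committed HBASE-13394 patch) of making the namespace-quota cache tolerant of an entry left behind by an earlier failed create; the class and method names are stand-ins for NamespaceStateManager's bookkeeping.
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NamespaceQuotaCacheSketch {
  private final Map<String, Set<String>> tablesByNamespace = new HashMap<>();

  synchronized void checkAndUpdateTableCount(String namespace, String table, int maxTables) {
    Set<String> tables = tablesByNamespace.computeIfAbsent(namespace, ns -> new HashSet<>());
    if (tables.contains(table)) {
      // Stale entry from a previously failed create: do not throw
      // IllegalStateException("Table already in the cache"), just reuse it.
      return;
    }
    if (tables.size() + 1 > maxTables) {
      throw new IllegalStateException("Namespace " + namespace + " table quota exceeded");
    }
    tables.add(table);
  }

  /** Called when create-table fails, so the cache does not keep a phantom table. */
  synchronized void removeTable(String namespace, String table) {
    Set<String> tables = tablesByNamespace.get(namespace);
    if (tables != null) {
      tables.remove(table);
    }
  }
}
{code}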
[jira] [Updated] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13090: Fix Version/s: (was: 1.2.0) Progress heartbeats for long running scanners - Key: HBASE-13090 URL: https://issues.apache.org/jira/browse/HBASE-13090 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Jonathan Lawlor Fix For: 2.0.0, 1.1.0 Attachments: 13090-branch-1.addendum, HBASE-13090-v1.patch, HBASE-13090-v2.patch, HBASE-13090-v3.patch, HBASE-13090-v3.patch, HBASE-13090-v4.patch, HBASE-13090-v6.patch, HBASE-13090-v7.patch It can be necessary to set very long timeouts for clients that issue scans over large regions when all data in the region might be filtered out depending on scan criteria. This is a usability concern because it can be hard to identify what worst case timeout to use until scans are occasionally/intermittently failing in production, depending on variable scan criteria. It would be better if the client-server scan protocol can send back periodic progress heartbeats to clients as long as server scanners are alive and making progress. This is related but orthogonal to streaming scan (HBASE-13071). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13149) HBase MR is broken on Hadoop 2.5+ Yarn
[ https://issues.apache.org/jira/browse/HBASE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13149: Fix Version/s: (was: 1.2.0) HBase MR is broken on Hadoop 2.5+ Yarn -- Key: HBASE-13149 URL: https://issues.apache.org/jira/browse/HBASE-13149 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 2.0.0, 0.98.10.1 Reporter: Jerry He Assignee: Jerry He Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13149-0.98.patch, HBASE-13149-master.patch, jackson-core-asl-compat_report.html, jackson-jaxrs-compat_report.html, jackson-mapper-asl-compat_report.html, jackson-xc-compat_report.html, jackson_1.8_to_1.9_compat_report.html Running the server MR tools is not working on Yarn version 2.5+. Running org.apache.hadoop.hbase.mapreduce.Export:
{noformat}
Exception in thread "main" java.lang.NoSuchMethodError: org.codehaus.jackson.map.ObjectMapper.setSerializationInclusion(Lorg/codehaus/jackson/map/annotate/JsonSerialize$Inclusion;)Lorg/codehaus/jackson/map/ObjectMapper;
    at org.apache.hadoop.yarn.webapp.YarnJacksonJaxbJsonProvider.configObjectMapper(YarnJacksonJaxbJsonProvider.java:59)
    at org.apache.hadoop.yarn.util.timeline.TimelineUtils.<clinit>(TimelineUtils.java:47)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.serviceInit(ResourceMgrDelegate.java:102)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.<init>(ResourceMgrDelegate.java:96)
    at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:112)
    at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:95)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1266)
    at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1262)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1261)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:189)
{noformat}
The problem seems to be the jackson jar version. HADOOP-10104 updated the jackson version to 1.9.13. YARN-2092 reported a problem as well. HBase is using jackson 1.8.8. This version of the jar in the classpath seems to cause the problem. Should we upgrade to jackson 1.9.13? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613222#comment-14613222 ] Ashish Singhi commented on HBASE-8642: -- Patch addressing Matteo comment from RB. Also ensured that there is some way for a user to know which set of snapshots got successfully deleted or failed. Please review. [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, HBASE-8642.patch Support list and delete snapshots by table names. User scenario: A user wants to delete all the snapshots which were taken in January month for a table 't' where snapshot names starts with 'Jan'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-8642: - Attachment: HBASE-8642-v4.patch [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, HBASE-8642.patch Support list and delete snapshots by table names. User scenario: A user wants to delete all the snapshots which were taken in January month for a table 't' where snapshot names starts with 'Jan'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13255) Bad grammar in RegionServer status page
[ https://issues.apache.org/jira/browse/HBASE-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13255: Fix Version/s: (was: 1.2.0) Bad grammar in RegionServer status page --- Key: HBASE-13255 URL: https://issues.apache.org/jira/browse/HBASE-13255 Project: HBase Issue Type: Improvement Components: monitoring Reporter: Josh Elser Assignee: Josh Elser Priority: Trivial Fix For: 2.0.0, 1.1.0 Attachments: 0001-HBASE-13255-Fix-grammar-in-Regions-description-parag.patch, HBASE-13255.patch Noticed on the rs-status page, the blurb under the Regions section could use some grammatical improvements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang reassigned HBASE-14017: -- Assignee: Stephen Yuan Jiang (was: Matteo Bertozzi) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14017: --- Assignee: Matteo Bertozzi (was: Stephen Yuan Jiang) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14017: --- Status: Patch Available (was: In Progress) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.0.1, 2.0.0, 1.2.0 Reporter: Matteo Bertozzi Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613299#comment-14613299 ] Matteo Bertozzi commented on HBASE-14017: - you are looking at the specific table implementation; try to look more at the RunQueue object alone. that acquireDelete() is a poor man's ref-count. it has nothing to do with insert/update/delete. if we implement a refcount for that object you'll have an unref() == 0 instead of acquireDelete(). the fact that we have a read/write lock in the table is because we have read/write operation support, and since we don't have a refcount in the base RunQueue object we can just implement acquireDelete() as a tryExclusiveLock(). but acquireDelete() has no knowledge of the delete operation in terms of table deletion. it is equivalent to a refcounted unref() == 0; how it is implemented is just a shortcut that uses what we already have. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
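To make the two viewpoints concrete, here is a hedged sketch (not the actual MasterProcedureQueue code) showing "acquireDelete" expressed once as tryExclusiveLock() on an empty queue and once as a ref-count reaching zero; the invariant is the same either way, and it is exactly the invariant the race above violated.
{code}
import java.util.concurrent.atomic.AtomicInteger;

class RunQueueSketch {
  private boolean exclusiveLock = false;
  private final AtomicInteger refCount = new AtomicInteger();

  synchronized boolean tryExclusiveLock() {
    if (exclusiveLock) {
      return false;
    }
    exclusiveLock = true;
    return true;
  }

  synchronized void releaseExclusiveLock() {
    exclusiveLock = false;
  }

  /** "acquireDelete" as a shortcut over the existing exclusive lock. */
  synchronized boolean acquireDeleteAsLock(boolean queueIsEmpty) {
    return queueIsEmpty && tryExclusiveLock();
  }

  /** The same invariant expressed as a ref-count: deletable only when unref() == 0. */
  boolean acquireDeleteAsRefCount() {
    return refCount.decrementAndGet() == 0;
  }
}
{code}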
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613330#comment-14613330 ] Anoop Sam John commented on HBASE-13387: Worked on a patch for this. Mainly in the CellComparator and CellUtil matching APIs, added the ByteBufferedCell instance check. When the cell is an instance of ByteBufferedCell, we will use the getXXXByteBuffer() API rather than getXXXArray(). (Pls note that HBASE-12345 added an Unsafe based compare in ByteBufferUtil to compare BBs). ByteBufferedCell is created as an interface extending Cell. Doing a perf test with PE (range scan with a 10K range and all cells filtered out at the server) I was seeing a 7% perf drop.
{code}
public int compareRows(final Cell left, final Cell right) {
  if (left instanceof ByteBufferedCell && right instanceof ByteBufferedCell) {
    return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(),
        ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(),
        ((ByteBufferedCell) right).getRowByteBuffer(),
        ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength());
  }
  if (left instanceof ByteBufferedCell) {
    return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(),
        ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(),
        right.getRowArray(), right.getRowOffset(), right.getRowLength());
  }
  if (right instanceof ByteBufferedCell) {
    return -(ByteBufferUtils.compareTo(((ByteBufferedCell) right).getRowByteBuffer(),
        ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength(),
        left.getRowArray(), left.getRowOffset(), left.getRowLength()));
  }
  return Bytes.compareTo(left.getRowArray(), left.getRowOffset(), left.getRowLength(),
      right.getRowArray(), right.getRowOffset(), right.getRowLength());
}
{code}
Basically the code is like this and we are still not making cells of type ByteBufferedCell. Then I tested by changing ByteBufferedCell into an abstract class instead of an interface. The difference is quite visible: there is no perf degradation with this. Calling a non-overridden method via an interface type seems to perform worse. Should be related to Java runtime optimization and inlining. Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: ByteBufferedCell.docx, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch This came in btw the discussion abt the parent Jira and recently Stack added as a comment on the E2E patch on the parent Jira. The idea is to add a new Interface 'ByteBufferedCell' in which we can add new buffer based getter APIs and getters for position in components in BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an Object implementing this new interface. The Comparators has to be aware abt this new Cell extension and has to use the BB based APIs rather than getXXXArray(). Also give util APIs in CellUtil to abstract the checks for new Cell type. (Like matchingXXX APIs, getValueAstype APIs etc) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613364#comment-14613364 ] Andrew Purtell commented on HBASE-12988: bq. If not don't worry I'll do it as part of checking the 0.98.14 RC (this change will be in it for sure). I'll check it with the new setting at 1 (default for 0.98) and 10... [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt we can apply edits to slave cluster in parallel on table-level to speed up replication . update : per conversation blow , it's better to apply edits on row-level in parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14002) Add --noReplicationSetup option to IntegrationTestReplication
[ https://issues.apache.org/jira/browse/HBASE-14002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613371#comment-14613371 ] Andrew Purtell commented on HBASE-14002: Ok, let me commit this and pick it back through to 0.98 Add --noReplicationSetup option to IntegrationTestReplication - Key: HBASE-14002 URL: https://issues.apache.org/jira/browse/HBASE-14002 Project: HBase Issue Type: Improvement Components: integration tests Reporter: Dima Spivak Assignee: Dima Spivak Attachments: HBASE-14002_master.patch IntegrationTestReplication has been flaky for me on pre-1.1 versions of HBase because of not-actually-synchronous operations in HBaseAdmin/Admin, which hamper its setupTablesAndReplication method. To get around this, I'd like to add a \-nrs/--noReplicationSetup option to the test to allow it to be run on clusters in which the necessary tables and replication have already been setup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13925) Use zookeeper multi to clear znodes in ZKProcedureUtil
[ https://issues.apache.org/jira/browse/HBASE-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613372#comment-14613372 ] Andrew Purtell commented on HBASE-13925: Ok, I will pick back HBASE-7847 and then commit this back through to 0.98. Use zookeeper multi to clear znodes in ZKProcedureUtil -- Key: HBASE-13925 URL: https://issues.apache.org/jira/browse/HBASE-13925 Project: HBase Issue Type: Improvement Affects Versions: 0.98.13 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: HBASE-13925-v1-again.patch, HBASE-13925-v1.patch, HBASE-13925.patch Address the TODO in the ZKProcedureUtil clearChildZNodes() and clearZNodes() methods -- This message was sent by Atlassian JIRA (v6.3.4#6332)
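For reference, this is roughly what clearing child znodes with a single multi looks like against the plain ZooKeeper client API; the actual patch goes through HBase's ZKProcedureUtil/ZKUtil wrappers, so paths and error handling differ.
{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooKeeper;

final class ZkMultiDeleteSketch {
  static void clearChildZNodes(ZooKeeper zk, String parent)
      throws KeeperException, InterruptedException {
    List<Op> ops = new ArrayList<>();
    for (String child : zk.getChildren(parent, false)) {
      // version -1 means "any version"; one round trip instead of one delete per child
      ops.add(Op.delete(parent + "/" + child, -1));
    }
    if (!ops.isEmpty()) {
      zk.multi(ops); // all child deletes succeed or fail atomically
    }
  }
}
{code}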
[jira] [Updated] (HBASE-14007) Writing to table through MR should fail upfront if table does not exist/disabled
[ https://issues.apache.org/jira/browse/HBASE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14007: --- Status: Open (was: Patch Available) Writing to table through MR should fail upfront if table does not exist/disabled Key: HBASE-14007 URL: https://issues.apache.org/jira/browse/HBASE-14007 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 1.1.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Priority: Minor Labels: mapreduce Fix For: 2.0.0, 1.3.0 Attachments: HBASE-14007.patch TableOutputFormat.checkOutputSpecs() needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613410#comment-14613410 ] Andrew Purtell commented on HBASE-12988: bq. Not sure I follow the part with the DEBUG message. Oh I just mean making what's going on with the new logic available in cluster logs at DEBUG level. No need if you don't think it worth it. [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt we can apply edits to slave cluster in parallel on table-level to speed up replication . update : per conversation blow , it's better to apply edits on row-level in parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14017: --- Status: In Progress (was: Patch Available) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.0.1, 2.0.0, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14017: --- Attachment: HBASE-14017.v1-branch1.1.patch Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14020: --- Attachment: HBASE-14020.patch Tested the patch with the PE tool:
./hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --addColumns=false --rows=100 scanRange1 20
I can see 4.2% better performance. Did not test with a JMH micro benchmark, which would show much more IMO. Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
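A minimal sketch of the idea, not the HBASE-14020 patch itself: replace per-put ByteBuffer writes with one bulk Unsafe.copyMemory into the stream's backing array.
{code}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeWriteSketch {
  private static final Unsafe UNSAFE;
  private static final long BYTE_ARRAY_BASE;

  static {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      UNSAFE = (Unsafe) f.get(null);
      BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  private byte[] buf = new byte[64 * 1024];
  private int pos = 0;

  void write(byte[] src, int off, int len) {
    if (pos + len > buf.length) {
      byte[] bigger = new byte[Math.max(buf.length * 2, pos + len)];
      System.arraycopy(buf, 0, bigger, 0, pos);
      buf = bigger;
    }
    // single bulk copy rather than a per-put ByteBuffer path
    UNSAFE.copyMemory(src, BYTE_ARRAY_BASE + off, buf, BYTE_ARRAY_BASE + pos, len);
    pos += len;
  }
}
{code}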
[jira] [Updated] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14020: --- Status: Patch Available (was: Open) Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613337#comment-14613337 ] stack commented on HBASE-13387: --- A little birdy (smile) told me that you did your perf testing using both JMH and PE, is that true [~anoop.hbase]? Nice work and interesting finding boss. Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: ByteBufferedCell.docx, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch This came in btw the discussion abt the parent Jira and recently Stack added as a comment on the E2E patch on the parent Jira. The idea is to add a new Interface 'ByteBufferedCell' in which we can add new buffer based getter APIs and getters for position in components in BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an Object implementing this new interface. The Comparators has to be aware abt this new Cell extension and has to use the BB based APIs rather than getXXXArray(). Also give util APIs in CellUtil to abstract the checks for new Cell type. (Like matchingXXX APIs, getValueAstype APIs etc) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13977) Convert getKey and related APIs to Cell
[ https://issues.apache.org/jira/browse/HBASE-13977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613334#comment-14613334 ] ramkrishna.s.vasudevan commented on HBASE-13977: Thanks for the reviews. The git server seems to be down. Will commit it once it is up. Convert getKey and related APIs to Cell --- Key: HBASE-13977 URL: https://issues.apache.org/jira/browse/HBASE-13977 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: HBASE-13977.patch, HBASE-13977_1.patch, HBASE-13977_2.patch, HBASE-13977_3.patch, HBASE-13977_4.patch, HBASE-13977_4.patch, HBASE-13977_5.patch During the course of changes for HBASE-11425 felt that more APIs can be converted to return Cell instead of BB like getKey, getLastKey. We can also rename the getKeyValue to getCell. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13977) Convert getKey and related APIs to Cell
[ https://issues.apache.org/jira/browse/HBASE-13977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-13977: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.0 Status: Resolved (was: Patch Available) The server just came back. Pushed to master. Thanks for the detailed reviews Anoop and Stack. Convert getKey and related APIs to Cell --- Key: HBASE-13977 URL: https://issues.apache.org/jira/browse/HBASE-13977 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13977.patch, HBASE-13977_1.patch, HBASE-13977_2.patch, HBASE-13977_3.patch, HBASE-13977_4.patch, HBASE-13977_4.patch, HBASE-13977_5.patch During the course of changes for HBASE-11425 felt that more APIs can be converted to return Cell instead of BB like getKey, getLastKey. We can also rename the getKeyValue to getCell. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613401#comment-14613401 ] Lars Hofhansl commented on HBASE-12988: --- Thanks [~apurtell]. Absolutely yes on the constant and hbase-defaults.xml. Not sure I follow the part with the DEBUG message. Future.get() will rethrow any error it encounters and then the outer catch will process those in the same way we do now. So in effect the error handling is unchanged. The only difference is when multiple tasks encounter an exception we only remember the last one... Is that part what you meant? I wanted to give the other tasks a chance to run and then only reschedule those that actually did encounter an error. [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt we can apply edits to slave cluster in parallel on table-level to speed up replication . update : per conversation blow , it's better to apply edits on row-level in parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
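A hedged sketch of the error-handling shape described here, with invented helper names rather than the actual ReplicationSink code: Future.get() rethrows task failures, only the last exception is remembered, and only the batches that actually failed are kept for rescheduling.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class ParallelApplySketch {
  interface Batch {
    void apply() throws Exception;
  }

  /** Applies all batches once in parallel; returns the batches that failed. */
  static List<Batch> applyOnce(ExecutorService pool, List<Batch> batches)
      throws InterruptedException {
    List<Future<?>> futures = new ArrayList<>();
    for (Batch b : batches) {
      futures.add(pool.submit(() -> { b.apply(); return null; }));
    }
    List<Batch> failed = new ArrayList<>();
    Exception lastError = null;
    for (int i = 0; i < futures.size(); i++) {
      try {
        futures.get(i).get(); // rethrows whatever the task threw
      } catch (ExecutionException e) {
        lastError = e;              // only the last error is remembered
        failed.add(batches.get(i)); // this batch will be rescheduled
      }
    }
    if (lastError != null) {
      lastError.printStackTrace();  // stand-in for the existing error handling
    }
    return failed;
  }
}
{code}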
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14017: --- Affects Version/s: (was: 1.1.0.1) 1.3.0 1.1.1 Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613291#comment-14613291 ] Ted Yu commented on HBASE-14017: Stephen: Jenkins machines are being upgraded. Hopefully the upgrade will be completed soon. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613339#comment-14613339 ] Ted Yu commented on HBASE-14020: Thanks for the fast response. +1 Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch, HBASE-14020_v2.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13500) Deprecate KVComparator and move to CellComparator
[ https://issues.apache.org/jira/browse/HBASE-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan resolved HBASE-13500. Resolution: Fixed Hadoop Flags: Reviewed All sub-JIRAs are closed. If anything further is needed, we can raise new JIRAs. For now, closing this parent task. Deprecate KVComparator and move to CellComparator - Key: HBASE-13500 URL: https://issues.apache.org/jira/browse/HBASE-13500 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613400#comment-14613400 ] Ted Yu commented on HBASE-13702: TestImportTsv passed with patch v3. Nice job, Apekshit. ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Fix For: 2.0.0, 1.3.0 Attachments: HBASE-13702-branch-1-v2.patch, HBASE-13702-branch-1-v3.patch, HBASE-13702-branch-1.patch, HBASE-13702-v2.patch, HBASE-13702-v3.patch, HBASE-13702-v4.patch, HBASE-13702-v5.patch, HBASE-13702.patch The ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such tools, which essentially does a quick run of the tool without making any changes but reports any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
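A toy sketch of the dry-run idea from the description, with illustrative names rather than the committed ImportTsv code: bad lines are logged either way, and the mutation writes are gated behind the dry-run flag.
{code}
final class DryRunImportSketch {
  interface Emitter {
    void emit(String rowKey, String value);
  }

  private final boolean dryRun;
  private long badLineCount = 0;

  DryRunImportSketch(boolean dryRun) {
    this.dryRun = dryRun;
  }

  void processLine(String line, Emitter out) {
    String[] parts = line.split("\t");
    if (parts.length < 2 || parts[0].isEmpty()) {
      badLineCount++;
      // log the offending row so corrupted input can be located in one pass
      System.err.println("Bad line: " + line);
      return;
    }
    if (!dryRun) {
      out.emit(parts[0], parts[1]); // in the real tool this would write a Put/KeyValue
    }
  }

  long getBadLineCount() {
    return badLineCount;
  }
}
{code}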
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613282#comment-14613282 ] Stephen Yuan Jiang commented on HBASE-14017: I am fine with your approach, it's not a big deal (though I disagree - it is a little bit of over-engineering - a Read (shared) / Write (exclusive) lock is the standard practice; we don't want to expand to Insert/Update/Delete locks, so that we keep the flexibility to implement a different approach in the future - all of them are just the Write lock). Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14015) Allow setting a richer state value when toString a pv2
[ https://issues.apache.org/jira/browse/HBASE-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14015: -- Attachment: 14015.addendum.to.fix.compile.issue.on.branch-1.branch-1.2.txt Addendum I applied to fix the compile issue found by [~ashish singhi] (Thanks Ashish). Applied to branch-1 and branch-1.2. Allow setting a richer state value when toString a pv2 -- Key: HBASE-14015 URL: https://issues.apache.org/jira/browse/HBASE-14015 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: stack Assignee: stack Priority: Minor Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: 0001-HBASE-14015-Allow-setting-a-richer-state-value-when-.patch, 14015.addendum.to.fix.compile.issue.on.branch-1.branch-1.2.txt Debugging, my procedure after a crash was loaded out of the store and its state was RUNNING. It would help if I knew which of the StateMachineProcedure states it was going to start RUNNING at. Chatting w/ Matteo, he suggested allowing Procedures to customize the String. Here is a patch that makes it so StateMachineProcedure will now print out the base state -- RUNNING, FINISHED -- followed by a ':' and then the StateMachineProcedure state, e.g. SimpleStateMachineProcedure state=RUNNABLE:SERVER_CRASH_ASSIGN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
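A simplified, hedged sketch of the resulting behaviour (not the real Procedure/StateMachineProcedure classes): the subclass appends its state-machine step to the framework state, yielding strings like "RUNNABLE:SERVER_CRASH_ASSIGN".
{code}
abstract class ProcedureSketch {
  enum BaseState { RUNNABLE, WAITING, FINISHED }

  private BaseState state = BaseState.RUNNABLE;

  /** Subclasses may append a richer state, e.g. the state-machine step. */
  protected String toStringState() {
    return state.name();
  }

  @Override
  public String toString() {
    return getClass().getSimpleName() + " state=" + toStringState();
  }
}

class ServerCrashProcedureSketch extends ProcedureSketch {
  enum Step { SERVER_CRASH_START, SERVER_CRASH_ASSIGN, SERVER_CRASH_FINISH }

  private Step step = Step.SERVER_CRASH_ASSIGN;

  @Override
  protected String toStringState() {
    return super.toStringState() + ":" + step;
  }
}
{code}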
[jira] [Updated] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14020: --- Attachment: HBASE-14020_v2.patch Addressing Ted's comments. Thanks Ted. Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch, HBASE-14020_v2.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13639) SyncTable - rsync for HBase tables
[ https://issues.apache.org/jira/browse/HBASE-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-13639. Resolution: Fixed Re-resolving after commit of Hadoop 1 build fix addendum. SyncTable - rsync for HBase tables -- Key: HBASE-13639 URL: https://issues.apache.org/jira/browse/HBASE-13639 Project: HBase Issue Type: New Feature Reporter: Dave Latham Assignee: Dave Latham Fix For: 2.0.0, 0.98.14, 1.2.0 Attachments: HBASE-13639-0.98-addendum-hadoop-1.patch, HBASE-13639-0.98.patch, HBASE-13639-v1.patch, HBASE-13639-v2.patch, HBASE-13639-v3-0.98.patch, HBASE-13639-v3.patch, HBASE-13639.patch Given HBase tables in remote clusters with similar but not identical data, efficiently update a target table such that the data in question is identical to a source table. Efficiency in this context means using far less network traffic than would be required to ship all the data from one cluster to the other. Takes inspiration from rsync. Design doc: https://docs.google.com/document/d/1-2c9kJEWNrXf5V4q_wBcoIXfdchN7Pxvxv1IO6PW0-U/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613349#comment-14613349 ] Anoop Sam John commented on HBASE-14020: [~stack] are you fine with this patch? Seems build machines are not yet back.. Will submit for QA run once it is back. Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch, HBASE-14020_v2.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613328#comment-14613328 ] Ted Yu commented on HBASE-14020: lgtm
{code}
359   * for {@code in}. The position and limit of the {@code in} buffer to be set properly by callee.
{code}
'to be set properly by callee' -> 'should be set properly by caller'
{code}
656   * @param offset offset in the ByteBuffer
{code}
I don't see offset in putInt(ByteBuffer buffer, int val)
{code}
661       UnsafeAccess.putInt(buffer, buffer.position(), val);
662       buffer.position(buffer.position() + Bytes.SIZEOF_INT);
{code}
Looks like you can utilize the return value of UnsafeAccess.putInt() in the buffer.position() call. Unsafe based optimized write in ByteBufferOutputStream -- Key: HBASE-14020 URL: https://issues.apache.org/jira/browse/HBASE-14020 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14020.patch We use this class to build the cellblock at RPC layer. The write operation is doing puts to java ByteBuffer which is having lot of overhead. Instead we can do Unsafe based copy to buffer operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
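A tiny sketch of that suggestion, with an illustrative stand-in for UnsafeAccess.putInt: have the put return the advanced offset and pass it straight to position().
{code}
import java.nio.ByteBuffer;

final class PutIntSketch {
  /** Stand-in for UnsafeAccess.putInt: absolute write that returns the new offset. */
  static int putInt(ByteBuffer buffer, int offset, int val) {
    buffer.putInt(offset, val);
    return offset + Integer.BYTES;
  }

  static void writeInt(ByteBuffer buffer, int val) {
    buffer.position(putInt(buffer, buffer.position(), val));
  }
}
{code}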
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613343#comment-14613343 ] Anoop Sam John commented on HBASE-13387: Yes boss.. I have done the perf test with JMH as well as PE.. The JMH test suite I will attach to this Jira ... Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: ByteBufferedCell.docx, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch This came in btw the discussion abt the parent Jira and recently Stack added as a comment on the E2E patch on the parent Jira. The idea is to add a new Interface 'ByteBufferedCell' in which we can add new buffer based getter APIs and getters for position in components in BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an Object implementing this new interface. The Comparators has to be aware abt this new Cell extension and has to use the BB based APIs rather than getXXXArray(). Also give util APIs in CellUtil to abstract the checks for new Cell type. (Like matchingXXX APIs, getValueAstype APIs etc) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613288#comment-14613288 ] Stephen Yuan Jiang commented on HBASE-14017: I am not sure why Jenkins is not running on the patch in master. Attaching the patch for branch-1.1 (I tested with the newly added UT; without the patch, it reproduces consistently; with the patch, the problem disappears) and trying to re-trigger the Jenkins job. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13925) Use zookeeper multi to clear znodes in ZKProcedureUtil
[ https://issues.apache.org/jira/browse/HBASE-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613375#comment-14613375 ] Ashish Singhi commented on HBASE-13925: --- Thanks Andrew. Just let me know if I can be any helpful here. Use zookeeper multi to clear znodes in ZKProcedureUtil -- Key: HBASE-13925 URL: https://issues.apache.org/jira/browse/HBASE-13925 Project: HBase Issue Type: Improvement Affects Versions: 0.98.13 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: HBASE-13925-v1-again.patch, HBASE-13925-v1.patch, HBASE-13925.patch Address the TODO in ZKProcedureUtil clearChildZNodes() and clearZNodes methods -- This message was sent by Atlassian JIRA (v6.3.4#6332)
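For reference, the kind of change being discussed, sketched against the plain ZooKeeper client API (the real code goes through HBase's ZK utilities, and this assumes the child znodes have no further descendants):
{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooKeeper;

public class ZkMultiDeleteSketch {
  // Delete all children of a znode in one atomic multi() call instead of one
  // delete round-trip per child.
  public static void clearChildZNodes(ZooKeeper zk, String parent)
      throws KeeperException, InterruptedException {
    List<Op> ops = new ArrayList<Op>();
    for (String child : zk.getChildren(parent, false)) {
      ops.add(Op.delete(parent + "/" + child, -1)); // -1 means any version
    }
    if (!ops.isEmpty()) {
      zk.multi(ops); // all-or-nothing on the server side
    }
  }
}
{code}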
[jira] [Comment Edited] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613330#comment-14613330 ] Anoop Sam John edited comment on HBASE-13387 at 7/3/15 5:42 PM: Worked on a patch for this. Mainly in CellComparator and CellUtil matching APIs, added the ByteBufferedCell instance check. When the cell is an instance of ByteBufferedCell, we will use the getXXXByteBuffer() API rather than getXXXArray(). (Pls note that HBASE-12345 added Unsafe based compare in ByteBufferUtil to compare BBs). ByteBufferedCell is created as an interface extending Cell. Doing perf test with PE (range scan with 10K range and all cells filtered out at server) I was seeing a 7% perf down. (This is with this patch alone, which adds the extra burden of the below kind of type check in our compare methods) {code} public int compareRows(final Cell left, final Cell right) { if (left instanceof ByteBufferedCell && right instanceof ByteBufferedCell) { return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(), ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(), ((ByteBufferedCell) right).getRowByteBuffer(), ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength()); } if (left instanceof ByteBufferedCell) { return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(), ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(), right.getRowArray(), right.getRowOffset(), right.getRowLength()); } if (right instanceof ByteBufferedCell) { return -(ByteBufferUtils.compareTo(((ByteBufferedCell) right).getRowByteBuffer(), ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength(), left.getRowArray(), left.getRowOffset(), left.getRowLength())); } return Bytes.compareTo(left.getRowArray(), left.getRowOffset(), left.getRowLength(), right.getRowArray(), right.getRowOffset(), right.getRowLength()); } {code} Basically the code is like this and we are still not making the cell of type ByteBufferedCell. Then tested by changing ByteBufferedCell into an abstract class instead of an interface. The diff is quite visible. There is no perf degradation with this. Calling a non-overridden method via an interface type seems to perform worse. Should be related to java run time optimization and inlining. was (Author: anoop.hbase): Worked on a patch for this. Mainly in CellComparator and CellUtil matching APIs, added the ByteBufferedCell instance check. When the cell is an instance of ByteBufferedCell, we will use the getXXXByteBuffer() API rather than getXXXArray(). (Pls note that HBASE-12345 added Unsafe based compare in ByteBufferUtil to compare BBs). ByteBufferedCell is created as an interface extending Cell. Doing perf test with PE (range scan with 10K range and all cells filtered out at server) I was seeing a 7% perf down. 
{code} public int compareRows(final Cell left, final Cell right) { if (left instanceof ByteBufferedCell && right instanceof ByteBufferedCell) { return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(), ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(), ((ByteBufferedCell) right).getRowByteBuffer(), ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength()); } if (left instanceof ByteBufferedCell) { return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(), ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(), right.getRowArray(), right.getRowOffset(), right.getRowLength()); } if (right instanceof ByteBufferedCell) { return -(ByteBufferUtils.compareTo(((ByteBufferedCell) right).getRowByteBuffer(), ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength(), left.getRowArray(), left.getRowOffset(), left.getRowLength())); } return Bytes.compareTo(left.getRowArray(), left.getRowOffset(), left.getRowLength(), right.getRowArray(), right.getRowOffset(), right.getRowLength()); } {code} Basically the code is like this and we are still not making the cell of type ByteBufferedCell. Then tested by changing ByteBufferedCell into an abstract class instead of an interface. The diff is quite visible. There is no perf degradation with this. Calling a non-overridden method via an interface type seems to perform worse. Should be related to java run time optimization and inlining. Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task
[jira] [Resolved] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-14018. Resolution: Invalid Please mail u...@hbase.apache.org for troubleshooting advice and help. This is the project dev tracker. Thanks! RegionServer is aborted when flushing memstore. --- Key: HBASE-14018 URL: https://issues.apache.org/jira/browse/HBASE-14018 Project: HBase Issue Type: Bug Components: hadoop2, hbase Affects Versions: 1.0.1.1 Environment: CentOS x64 Server Reporter: Dinh Duong Mai Attachments: hbase-hadoop-master-node1.vmcluster.log, hbase-hadoop-regionserver-node1.vmcluster.log, hbase-hadoop-zookeeper-node1.vmcluster.log + Pseudo-distributed Hadoop (2.6.0), ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1]
[jira] [Commented] (HBASE-13500) Deprecate KVComparator and move to CellComparator
[ https://issues.apache.org/jira/browse/HBASE-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613345#comment-14613345 ] Anoop Sam John commented on HBASE-13500: Great.. This was a nice clean up.. We changed all the other places to Cell, yet this area still had KV style handling.. Thanks.. Deprecate KVComparator and move to CellComparator - Key: HBASE-13500 URL: https://issues.apache.org/jira/browse/HBASE-13500 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13861) BucketCacheTmpl.jamon has wrong bucket free and used labels
[ https://issues.apache.org/jira/browse/HBASE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613344#comment-14613344 ] Andrew Purtell commented on HBASE-13861: bq. Isn't this basically the /jmx endpoint? I would hope so. BucketCacheTmpl.jamon has wrong bucket free and used labels --- Key: HBASE-13861 URL: https://issues.apache.org/jira/browse/HBASE-13861 Project: HBase Issue Type: Bug Components: regionserver, UI Affects Versions: 1.1.0 Reporter: Lars George Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: hbase-13861-v1.patch See this from the template, and note the label and actual values for the last two columns. {noformat} <table class="table table-striped"> <tr> <th>Bucket Offset</th> <th>Allocation Size</th> <th>Free Count</th> <th>Used Count</th> </tr> <%for Bucket bucket: buckets %> <tr> <td><% bucket.getBaseOffset() %></td> <td><% bucket.getItemAllocationSize() %></td> <td><% bucket.getFreeBytes() %></td> <td><% bucket.getUsedBytes() %></td> </tr> {noformat} They are labeled counts but are bytes, duh. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613361#comment-14613361 ] Andrew Purtell commented on HBASE-12988: bq. In 1.1+ I plan to default the number of threads to 10. Before that (1.0 and 0.98) to 1, so that the behavior and performance characteristics do not change. Sounds reasonable. Can you pull this out into a constant and javadoc it? Should go into hbase-defaults.xml too I think. {quote} {code} replication.source.maxthreads {code} {quote} What do you think about DEBUG level logging where we are submitting futures and then handling errors here? {quote} {code} +List<Future<Integer>> futures = new ArrayList<Future<Integer>>(entryLists.size()); +for (int i=0; i<entryLists.size(); i++) { + if (!entryLists.get(i).isEmpty()) { +futures.add(exec.submit(new Replicator(entryLists.get(i), i))); + } +} +IOException iox = null; +for (Future<Integer> f : futures) { + try { +// wait for all futures, remove successful parts +entryLists.remove(f.get()); + } catch (InterruptedException ie) { +iox = new IOException(ie); + } catch (ExecutionException ee) { +// cause must be an IOException +iox = (IOException)ee.getCause(); + } +} +if (iox != null) { + // if we had any exception, try again + throw iox; +} {code} {quote} Have you by chance had a chance to run ITR with the latest changes applied? It's a bit of a PITA to set up, you'll need to have two single node clusters running on the same node as minimum. If not don't worry I'll do it as part of checking the 0.98.14 RC (this change will be in it for sure). [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt we can apply edits to slave cluster in parallel on table-level to speed up replication . update : per conversation blow , it's better to apply edits on row-level in parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
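A hedged sketch of the "pull this out into a constant" suggestion; the constant name, placement, and default chosen in the actual patch may differ from what is shown here:
{code}
import org.apache.hadoop.conf.Configuration;

public class ReplicationSourceConfigSketch {
  public static final String REPLICATION_SOURCE_MAXTHREADS_KEY =
      "replication.source.maxthreads";
  // 10 matches the default discussed above for 1.1+; 0.98/1.0 would keep 1
  // so behavior does not change on those branches.
  public static final int REPLICATION_SOURCE_MAXTHREADS_DEFAULT = 10;

  public static int getMaxReplicationThreads(Configuration conf) {
    return conf.getInt(REPLICATION_SOURCE_MAXTHREADS_KEY,
        REPLICATION_SOURCE_MAXTHREADS_DEFAULT);
  }
}
{code}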
[jira] [Commented] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613367#comment-14613367 ] Ted Yu commented on HBASE-12213: For MultiByteBufferInputStream : {code} +if (len <= 0) { + return 0; +} {code} The above check can be lifted to beginning of method. {code} + public long skip(long n) { +long k = Math.min(n, available()); +if (k < 0) { + k = 0; +} {code} When k is 0, we can directly return from method, right ? HFileBlock backed by Array of ByteBuffers - Key: HBASE-12213 URL: https://issues.apache.org/jira/browse/HBASE-12213 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12213_1.patch, HBASE-12213_2.patch In L2 cache (offheap) an HFile block might have been cached into multiple chunks of buffers. If HFileBlock need single BB, we will end up in recreation of bigger BB and copying. Instead we can make HFileBlock to serve data from an array of BBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
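A self-contained toy version of the two review points above, over a single ByteBuffer rather than the multi-buffer stream in the patch: hoist the len <= 0 check to the top of read(), and return from skip() as soon as there is nothing to skip.
{code}
import java.nio.ByteBuffer;

class ByteBufferInputStreamSketch {
  private final ByteBuffer buf;

  ByteBufferInputStreamSketch(ByteBuffer buf) {
    this.buf = buf;
  }

  int available() {
    return buf.remaining();
  }

  int read(byte[] b, int off, int len) {
    if (len <= 0) {
      return 0;                      // checked before any other work
    }
    int avail = available();
    if (avail <= 0) {
      return -1;
    }
    int toRead = Math.min(len, avail);
    buf.get(b, off, toRead);
    return toRead;
  }

  long skip(long n) {
    long k = Math.min(n, available());
    if (k <= 0) {
      return 0;                      // nothing to skip, return directly
    }
    buf.position(buf.position() + (int) k);
    return k;
  }
}
{code}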
[jira] [Updated] (HBASE-13978) Variable never assigned in SimpleTotalOrderPartitioner.getPartition()
[ https://issues.apache.org/jira/browse/HBASE-13978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13978: --- Fix Version/s: 0.98.14 Picked to 0.98 Variable never assigned in SimpleTotalOrderPartitioner.getPartition() -- Key: HBASE-13978 URL: https://issues.apache.org/jira/browse/HBASE-13978 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 1.1.0.1 Reporter: Lars George Assignee: Bhupendra Kumar Jain Labels: beginner Fix For: 2.0.0, 0.98.14, 1.2.0 Attachments: 0001-HBASE-13978-Variable-never-assigned-in-SimpleTotalOr.patch See https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/SimpleTotalOrderPartitioner.java#L104, which has an {{if}} statement that tries to limit the code to run only once, but since it does not assign {{this.lastReduces}} it will always trigger and recompute the splits (and log them). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
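A hedged sketch of the fix shape described above (the committed change is in the attached patch): cache the reduce count the splits were computed for, so the recompute-and-log branch really runs only when the number of reducers changes.
{code}
import org.apache.hadoop.hbase.util.Bytes;

class SplitCacheSketch {
  private byte[][] splits;
  private int lastReduces = -1;

  byte[][] splitsFor(byte[] startkey, byte[] endkey, int reduces) {
    if (lastReduces != reduces) {
      splits = Bytes.split(startkey, endkey, reduces - 1);
      lastReduces = reduces;  // the assignment that was missing in getPartition()
    }
    return splits;
  }
}
{code}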
[jira] [Updated] (HBASE-13498) Add more docs and a basic check for storage policy handling
[ https://issues.apache.org/jira/browse/HBASE-13498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13498: Fix Version/s: (was: 1.2.0) Add more docs and a basic check for storage policy handling --- Key: HBASE-13498 URL: https://issues.apache.org/jira/browse/HBASE-13498 Project: HBase Issue Type: Sub-task Components: wal Reporter: Sean Busbey Assignee: Sean Busbey Priority: Minor Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13498.1.patch.txt, HBASE-13498.2.patch.txt some minor clean up: * make sure our javadocs contain enough info for someone unfamiliar with HDFS storage policies to get started. * add a basic test that verifies things happily continue on non-supported versions, or with non-supported policies * clarify some log messages -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13490) foreground daemon start re-executes ulimit output
[ https://issues.apache.org/jira/browse/HBASE-13490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13490: Fix Version/s: (was: 1.2.0) foreground daemon start re-executes ulimit output - Key: HBASE-13490 URL: https://issues.apache.org/jira/browse/HBASE-13490 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 2.0.0, 1.1.0, 0.98.11 Reporter: Y. SREENIVASULU REDDY Assignee: Y. SREENIVASULU REDDY Priority: Minor Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-13490.patch Command execution fails while starting HBase processes (HMaster or RegionServer) using the hbase-daemon.sh file: the ulimit command output is executed as a command. {code} echo "`date` Starting $command on `hostname`" >> ${HBASE_LOGLOG} `ulimit -a` >> "$HBASE_LOGLOG" 2>&1 {code} The log message is as follows. {noformat} Thu Apr 16 19:24:25 IST 2015 Starting regionserver on HOST-10 /opt/hdfsdata/HA/install/hbase/regionserver/bin/hbase-daemon.sh: line 207: core: command not found {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13499) AsyncRpcClient test cases failure in powerpc
[ https://issues.apache.org/jira/browse/HBASE-13499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13499: Fix Version/s: (was: 1.2.0) AsyncRpcClient test cases failure in powerpc Key: HBASE-13499 URL: https://issues.apache.org/jira/browse/HBASE-13499 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 2.0.0, 1.1.0, 1.2.0 Reporter: sangamesh Assignee: Duo Zhang Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13499.patch The new AsyncRpcClient feature added through the jira defect HBASE-12684 causing some test cases failures in powerpc64 environment. I am testing it in master branch. Looks like the version of netty (4.0.23) doesn't provide a support for non amd64 platforms and suggested to use pure java netty Here is the discussion on that https://github.com/aphyr/riemann/pull/508 So new Async test cases will fail in ppc64 and other non amd64 platforms too. Here is the output of the error. Running org.apache.hadoop.hbase.ipc.TestAsyncIPC Tests run: 24, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 2.802 sec FAILURE! - in org.apache.hadoop.hbase.ipc.TestAsyncIPC testRTEDuringAsyncConnectionSetup[3](org.apache.hadoop.hbase.ipc.TestAsyncIPC) Time elapsed: 0.048 sec ERROR! java.lang.UnsatisfiedLinkError: /tmp/libnetty-transport-native-epoll4286512618055650929.so: /tmp/libnetty-transport-native-epoll4286512618055650929.so: cannot open shared object file: No such file or directory (Possible cause: can't load AMD 64-bit .so on a Power PC 64-bit platform) at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13491) Issue in FuzzyRowFilter#getNextForFuzzyRule
[ https://issues.apache.org/jira/browse/HBASE-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13491: Fix Version/s: (was: 1.2.0) Issue in FuzzyRowFilter#getNextForFuzzyRule --- Key: HBASE-13491 URL: https://issues.apache.org/jira/browse/HBASE-13491 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 1.0.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13491-branch-1.1.patch, HBASE-13491-branch-1.1.patch, HBASE-13491.patch {code} for (int i = 0; i < result.length; i++) { if (i >= fuzzyKeyMeta.length || fuzzyKeyMeta[i] == 1) { result[i] = row[offset + i]; if (!order.isMax(row[i])) { // this is non-fixed position and is not at max value, hence we can increase it toInc = i; } } {code} See we take the row bytes without considering the row offset. The test cases are passing as we pass row bytes with a 0 offset. A change in the test will reveal the bug. Came across this when I was working on HBASE-11425 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
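A hedged sketch of the offset-aware loop the description calls for, reusing the variable names from the quoted snippet; Order here is a stand-in interface for FuzzyRowFilter's internal row-order helper, and the real fix is in the attached patches.
{code}
class FuzzyRowSketch {
  interface Order {
    boolean isMax(byte b);
  }

  static int nextForFuzzyRule(byte[] row, int offset, byte[] fuzzyKeyMeta,
      byte[] result, Order order) {
    int toInc = -1;
    for (int i = 0; i < result.length; i++) {
      if (i >= fuzzyKeyMeta.length || fuzzyKeyMeta[i] == 1) {
        result[i] = row[offset + i];
        if (!order.isMax(row[offset + i])) {  // was row[i]: the offset was dropped
          toInc = i;                          // non-fixed position, can still be increased
        }
      }
    }
    return toInc;
  }
}
{code}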
[jira] [Updated] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13515: Fix Version/s: (was: 1.2.0) Handle FileNotFoundException in region replica replay for flush/compaction events - Key: HBASE-13515 URL: https://issues.apache.org/jira/browse/HBASE-13515 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: hbase-13515_v1.patch, hbase-13515_v1.patch I had this patch laying around that somehow dropped from my plate. We should skip replaying compaction / flush and region open event markers if the files (from flush or compaction) can no longer be found from the secondary. If we do not skip, the replay will be retried forever, effectively blocking the replication further. Bulk load already does this, we just need to do it for flush / compaction and region open events as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13516: Fix Version/s: (was: 1.2.0) Increase PermSize to 128MB -- Key: HBASE-13516 URL: https://issues.apache.org/jira/browse/HBASE-13516 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: hbase-13516_v1.patch, hbase-13516_v2.patch HBase uses ~40MB, and with Phoenix we use ~56MB of Perm space out of 64MB by default. Every Filter and Coprocessor increases that. Running out of perm space triggers a stop the world full GC of the entire heap. We have seen this in misconfigured cluster. Should we default to {{-XX:PermSize=128m -XX:MaxPermSize=128m}} out of the box as a convenience for users? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13514: Fix Version/s: (was: 1.2.0) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13518) Typo in hbase.hconnection.meta.lookup.threads.core parameter
[ https://issues.apache.org/jira/browse/HBASE-13518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13518: Fix Version/s: (was: 1.2.0) Typo in hbase.hconnection.meta.lookup.threads.core parameter Key: HBASE-13518 URL: https://issues.apache.org/jira/browse/HBASE-13518 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Devaraj Das Fix For: 2.0.0, 1.1.0 Attachments: 13518-1.branch-1.patch, 13518-1.txt A possible typo coming from patch in HBASE-13036. I think we want {{hbase.hconnection.meta.lookup.threads.core}}, not {{hbase.hconnection.meta.lookup.threads.max.core}} to be in line with the regular thread pool configuration. {code} //To start with, threads.max.core threads can hit the meta (including replicas). //After that, requests will get queued up in the passed queue, and only after //the queue is full, a new thread will be started this.metaLookupPool = getThreadPool( conf.getInt("hbase.hconnection.meta.lookup.threads.max", 128), conf.getInt("hbase.hconnection.meta.lookup.threads.max.core", 10), {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
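A hedged sketch of the intended key pair from the comment above: the second lookup should use "...threads.core" rather than "...threads.max.core". The surrounding names are illustrative only.
{code}
import org.apache.hadoop.conf.Configuration;

class MetaLookupPoolConfigSketch {
  static final String META_LOOKUP_THREADS_MAX =
      "hbase.hconnection.meta.lookup.threads.max";
  static final String META_LOOKUP_THREADS_CORE =
      "hbase.hconnection.meta.lookup.threads.core";   // no ".max.core"

  static int[] poolSizes(Configuration conf) {
    int maxThreads = conf.getInt(META_LOOKUP_THREADS_MAX, 128);
    int coreThreads = conf.getInt(META_LOOKUP_THREADS_CORE, 10);
    return new int[] { maxThreads, coreThreads };
  }
}
{code}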
[jira] [Updated] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13517: Fix Version/s: (was: 1.2.0) Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13517-v1.patch, HBASE-13517-v2.patch, HBASE-13517-v3.patch, HBASE-13517.patch Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613517#comment-14613517 ] Lars Hofhansl commented on HBASE-7912: -- [~eclark] had mentioned something he did with incremental backups using snapshots once. HBase Backup/Restore Based on HBase Snapshot Key: HBASE-7912 URL: https://issues.apache.org/jira/browse/HBASE-7912 Project: HBase Issue Type: Sub-task Reporter: Richard Ding Assignee: Vladimir Rodionov Labels: backup Fix For: 2.0.0 Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf Finally, we completed the implementation of our backup/restore solution, and would like to share with community through this jira. We are leveraging existing hbase snapshot feature, and provide a general solution to common users. Our full backup is using snapshot to capture metadata locally and using exportsnapshot to move data to another cluster; the incremental backup is using offline-WALplayer to backup HLogs; we also leverage global distribution rolllog and flush to improve performance; other added-on values such as convert, merge, progress report, and CLI commands. So that a common user can backup hbase data without in-depth knowledge of hbase. Our solution also contains some usability features for enterprise users. The detail design document and CLI command will be attached in this jira. We plan to use 10~12 subtasks to share each of the following features, and document the detail implement in the subtasks: * *Full Backup* : provide local and remote back/restore for a list of tables * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental backup) * *distributed* Logroll and distributed flush * Backup *Manifest* and history * *Incremental* backup: to build on top of full backup as daily/weekly backup * *Convert* incremental backup WAL files into hfiles * *Merge* several backup images into one(like merge weekly into monthly) * *add and remove* table to and from Backup image * *Cancel* a backup process * backup progress *status* * full backup based on *existing snapshot* *-* *Below is the original description, to keep here as the history for the design and discussion back in 2013* There have been attempts in the past to come up with a viable HBase backup/restore solution (e.g., HBASE-4618). Recently, there are many advancements and new features in HBase, for example, FileLink, Snapshot, and Distributed Barrier Procedure. This is a proposal for a backup/restore solution that utilizes these new features to achieve better performance and consistency. A common practice of backup and restore in database is to first take full baseline backup, and then periodically take incremental backup that capture the changes since the full baseline backup. HBase cluster can store massive amount data. Combination of full backups with incremental backups has tremendous benefit for HBase as well. The following is a typical scenario for full and incremental backup. # The user takes a full backup of a table or a set of tables in HBase. # The user schedules periodical incremental backups to capture the changes from the full backup, or from last incremental backup. # The user needs to restore table data to a past point of time. # The full backup is restored to the table(s) or to different table name(s). 
Then the incremental backups that are up to the desired point in time are applied on top of the full backup. We would support the following key features and capabilities. * Full backup uses HBase snapshot to capture HFiles. * Use HBase WALs to capture incremental changes, but we use bulk load of HFiles for fast incremental restore. * Support single table or a set of tables, and column family level backup and restore. * Restore to different table names. * Support adding additional tables or CF to backup set without interruption of incremental backup schedule. * Support rollup/combining of incremental backups into longer period and bigger incremental backups. * Unified command line interface for all the above. The solution will support HBase backup to FileSystem, either on the same cluster or across clusters. It has the flexibility to support backup to other devices and servers in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13446) Add docs warning about missing data for downstream on versions prior to HBASE-13262
[ https://issues.apache.org/jira/browse/HBASE-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-13446: -- Assignee: (was: Lars Hofhansl) Add docs warning about missing data for downstream on versions prior to HBASE-13262 --- Key: HBASE-13446 URL: https://issues.apache.org/jira/browse/HBASE-13446 Project: HBase Issue Type: Task Components: documentation Affects Versions: 0.98.0, 1.0.0 Reporter: Sean Busbey Priority: Critical Fix For: 2.0.0, 0.98.14, 1.0.2 From conversation at the end of HBASE-13262: [~davelatham] {quote} Should we put a warning somewhere (mailing list? book?) about this? Something like: IF (client OR server is >= 0.98.11/1.0.0) AND server has a smaller value for hbase.client.scanner.max.result.size than client does, THEN scan requests that reach the server's hbase.client.scanner.max.result.size are likely to miss data. In particular, 0.98.11 defaults hbase.client.scanner.max.result.size to 2MB but other versions default to larger values, so be very careful using 0.98.11 servers with any other client version. {quote} [~busbey] {quote} How about we add a note in the ref guide for upgrades and for troubleshooting? {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13352) Add hbase.import.version to Import usage.
[ https://issues.apache.org/jira/browse/HBASE-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613520#comment-14613520 ] Lars Hofhansl commented on HBASE-13352: --- +1 Add hbase.import.version to Import usage. - Key: HBASE-13352 URL: https://issues.apache.org/jira/browse/HBASE-13352 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 2.0.0, 0.98.14, 1.0.2, 1.1.2, 1.3.0, 1.2.1 Attachments: 13352-v2.txt, 13352.txt, hbase-13352_v3.patch We just tried to export some (small amount of) data out of a 0.94 cluster to a 0.98 cluster. We used Export/Import for that. By default we found that the import M/R job correctly reports the number of records seen, but _silently_ does not import anything. After looking at the 0.98 code it's obvious there's an hbase.import.version option (-Dhbase.import.version=0.94) to make this work. Two issues: # -Dhbase.import.version=0.94 should be shown with the Import usage # If not given, it should not just silently import nothing In this issue I'll just trivially add this option to the Import tool's usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613541#comment-14613541 ] Matteo Bertozzi commented on HBASE-7912: [~lhofhansl] snapshots are already incremental in case you don't compact. you just export the new files. I don't think Elliott did more than that. but this is different, here we are incremental because we copy the WAL. In case of no compaction the snapshot-incremental is probably better but as soon you start having compactions you have to copy the compacted files with the data you already have exported, since snapshot are only file based (so we may copy more data compared to the set of wals we are copying). HBase Backup/Restore Based on HBase Snapshot Key: HBASE-7912 URL: https://issues.apache.org/jira/browse/HBASE-7912 Project: HBase Issue Type: Sub-task Reporter: Richard Ding Assignee: Vladimir Rodionov Labels: backup Fix For: 2.0.0 Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf Finally, we completed the implementation of our backup/restore solution, and would like to share with community through this jira. We are leveraging existing hbase snapshot feature, and provide a general solution to common users. Our full backup is using snapshot to capture metadata locally and using exportsnapshot to move data to another cluster; the incremental backup is using offline-WALplayer to backup HLogs; we also leverage global distribution rolllog and flush to improve performance; other added-on values such as convert, merge, progress report, and CLI commands. So that a common user can backup hbase data without in-depth knowledge of hbase. Our solution also contains some usability features for enterprise users. The detail design document and CLI command will be attached in this jira. We plan to use 10~12 subtasks to share each of the following features, and document the detail implement in the subtasks: * *Full Backup* : provide local and remote back/restore for a list of tables * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental backup) * *distributed* Logroll and distributed flush * Backup *Manifest* and history * *Incremental* backup: to build on top of full backup as daily/weekly backup * *Convert* incremental backup WAL files into hfiles * *Merge* several backup images into one(like merge weekly into monthly) * *add and remove* table to and from Backup image * *Cancel* a backup process * backup progress *status* * full backup based on *existing snapshot* *-* *Below is the original description, to keep here as the history for the design and discussion back in 2013* There have been attempts in the past to come up with a viable HBase backup/restore solution (e.g., HBASE-4618). Recently, there are many advancements and new features in HBase, for example, FileLink, Snapshot, and Distributed Barrier Procedure. This is a proposal for a backup/restore solution that utilizes these new features to achieve better performance and consistency. A common practice of backup and restore in database is to first take full baseline backup, and then periodically take incremental backup that capture the changes since the full baseline backup. HBase cluster can store massive amount data. Combination of full backups with incremental backups has tremendous benefit for HBase as well. 
The following is a typical scenario for full and incremental backup. # The user takes a full backup of a table or a set of tables in HBase. # The user schedules periodical incremental backups to capture the changes from the full backup, or from last incremental backup. # The user needs to restore table data to a past point of time. # The full backup is restored to the table(s) or to different table name(s). Then the incremental backups that are up to the desired point in time are applied on top of the full backup. We would support the following key features and capabilities. * Full backup uses HBase snapshot to capture HFiles. * Use HBase WALs to capture incremental changes, but we use bulk load of HFiles for fast incremental restore. * Support single table or a set of tables, and column family level backup and restore. * Restore to different table names. * Support adding additional tables or CF to backup set without interruption of incremental backup schedule. * Support rollup/combining of incremental backups into longer period
[jira] [Updated] (HBASE-13417) batchCoprocessorService() does not handle NULL keys
[ https://issues.apache.org/jira/browse/HBASE-13417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13417: Fix Version/s: (was: 1.2.0) batchCoprocessorService() does not handle NULL keys --- Key: HBASE-13417 URL: https://issues.apache.org/jira/browse/HBASE-13417 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 1.0.0 Reporter: Lars George Assignee: Abhishek Singh Chouhan Priority: Minor Labels: beginner Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13417.patch The JavaDoc for {{batchCoprocessorService()}} reads: {noformat} * @param startKey * start region selection with region containing this row. If {@code null}, the * selection will start with the first table region. * @param endKey * select regions up to and including the region containing this row. If {@code null}, * selection will continue through the last table region. {noformat} Setting the call to {{null}} keys like so {code} Map<byte[], CountResponse> results = table.batchCoprocessorService( RowCountService.getDescriptor().findMethodByName("getRowCount"), request, null, null, CountResponse.getDefaultInstance()); {code} yields an exception: {noformat} java.lang.NullPointerException at org.apache.hadoop.hbase.util.Bytes.compareTo(Bytes.java:1187) at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:744) at org.apache.hadoop.hbase.client.HTable.getKeysAndRegionsInRange(HTable.java:723) at org.apache.hadoop.hbase.client.HTable.batchCoprocessorService(HTable.java:1801) at org.apache.hadoop.hbase.client.HTable.batchCoprocessorService(HTable.java:1778) at coprocessor.EndpointBatchExample.main(EndpointBatchExample.java:60) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) {noformat} This is caused by the call shipping off the keys to {{getStartKeysInRange()}} as-is and that method uses {{Bytes.compareTo()}} on the {{null}} keys and fails. One can work around using {{HConstants.EMPTY_START_ROW, HConstants.EMPTY_END_ROW}} instead, but that is not documented, nor standard behavior. Need to handle {{null}} keys as advertised. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
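A hedged sketch of the advertised null handling: normalize null boundaries to the empty start/end rows before they ever reach Bytes.compareTo(). The actual change is in the attached patch.
{code}
import org.apache.hadoop.hbase.HConstants;

class KeyRangeSketch {
  static byte[] normalizeStartKey(byte[] startKey) {
    return startKey == null ? HConstants.EMPTY_START_ROW : startKey;
  }

  static byte[] normalizeEndKey(byte[] endKey) {
    return endKey == null ? HConstants.EMPTY_END_ROW : endKey;
  }
}
{code}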
[jira] [Commented] (HBASE-12091) Optionally ignore edits for dropped tables for replication.
[ https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613529#comment-14613529 ] Lars Hofhansl commented on HBASE-12091: --- Another option to do this: Wait until we actually get a TableNotFound exception from the sink. When that happens, check whether that table exists on the source and ignore all edits for this table on the retry. The disadvantage is that we're trying at least twice in this case (but the case is rare). Optionally ignore edits for dropped tables for replication. --- Key: HBASE-12091 URL: https://issues.apache.org/jira/browse/HBASE-12091 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl We just ran into a scenario where we dropped a table from both the source and the sink, but the source still has outstanding edits that it now cannot get rid of. Now all replication is backed up behind these unreplicatable edits. We should have an option to ignore edits for tables dropped at the source. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
Andrew Purtell created HBASE-14022: -- Summary: TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ Key: HBASE-14022 URL: https://issues.apache.org/jira/browse/HBASE-14022 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.98.14 Only applicable to 0.98. Another instance where minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13991) Hierarchical Layout for Humongous Tables
[ https://issues.apache.org/jira/browse/HBASE-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613446#comment-14613446 ] Andrew Purtell commented on HBASE-13991: bq. I think we agree that it may make sense to switchover entirely to the new layout instead of making it optional. +1 Agree there's broad agreement on that. Looks like also we should be able to split the need to haves from the nice to haves to scope work into increments and make it possible for some changes to go further back than master. Hierarchical Layout for Humongous Tables Key: HBASE-13991 URL: https://issues.apache.org/jira/browse/HBASE-13991 Project: HBase Issue Type: Sub-task Reporter: Ben Lau Assignee: Ben Lau Attachments: HBASE-13991-master.patch, HumongousTableDoc.pdf Add support for humongous tables via a hierarchical layout for regions on filesystem. Credit for most of this code goes to Huaiyu Zhu. Latest version of the patch is available on the review board: https://reviews.apache.org/r/36029/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13482) Phoenix is failing to scan tables on secure environments.
[ https://issues.apache.org/jira/browse/HBASE-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13482: Fix Version/s: (was: 1.2.0) Phoenix is failing to scan tables on secure environments. -- Key: HBASE-13482 URL: https://issues.apache.org/jira/browse/HBASE-13482 Project: HBase Issue Type: Bug Reporter: Alicia Ying Shu Assignee: Alicia Ying Shu Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: Hbase-13482-v1.patch, Hbase-13482.patch When executed on secure environments, phoenix query is getting the following exception message: java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: User 'null' is not the scanner owner! org.apache.hadoop.hbase.security.access.AccessController.requireScannerOwner(AccessController.java:2048) org.apache.hadoop.hbase.security.access.AccessController.preScannerNext(AccessController.java:2022) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$53.call(RegionCoprocessorHost.java:1336) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1671) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1746) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1720) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerNext(RegionCoprocessorHost.java:1331) org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2227) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13496) Make Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo inlineable
[ https://issues.apache.org/jira/browse/HBASE-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13496: Fix Version/s: (was: 1.2.0) Make Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo inlineable - Key: HBASE-13496 URL: https://issues.apache.org/jira/browse/HBASE-13496 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: ByteBufferUtils.java, HBASE-13496.patch, HBASE-13496.patch, OffheapVsOnHeapCompareTest.java, onheapoffheapcompare.tgz While testing with some other perf comparisons I have noticed that the above method (which is very hot in read path) is not getting inline bq.@ 16 org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo (364 bytes) hot method too big We can do minor refactoring to make it inlineable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
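An illustration, not the attached patch, of the kind of minor refactoring meant by "make it inlineable": keep the hot entry point small so it fits the JIT's frequent-inline budget (roughly 325 bytecodes by default on HotSpot) and move the bulky comparison body into its own method.
{code}
class CompareSketch {
  static int compareTo(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
    if (a == b && aOff == bOff && aLen == bLen) {
      return 0;                                  // tiny fast path, easily inlined
    }
    return compareToUnequal(a, aOff, aLen, b, bOff, bLen);
  }

  private static int compareToUnequal(byte[] a, int aOff, int aLen,
      byte[] b, int bOff, int bLen) {
    int min = Math.min(aLen, bLen);
    for (int i = 0; i < min; i++) {              // stand-in for the Unsafe word-at-a-time loop
      int diff = (a[aOff + i] & 0xff) - (b[bOff + i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return aLen - bLen;
  }
}
{code}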
[jira] [Updated] (HBASE-13431) Allow to skip store file range check based on column family while creating reference files in HRegionFileSystem#splitStoreFile
[ https://issues.apache.org/jira/browse/HBASE-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13431: Fix Version/s: (was: 1.2.0) Allow to skip store file range check based on column family while creating reference files in HRegionFileSystem#splitStoreFile -- Key: HBASE-13431 URL: https://issues.apache.org/jira/browse/HBASE-13431 Project: HBase Issue Type: Improvement Reporter: Rajeshbabu Chintaguntla Assignee: Rajeshbabu Chintaguntla Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13431.patch, HBASE-13431_branch-1.0.patch, HBASE-13431_branch-1.1.patch, HBASE-13431_branch-1.patch, HBASE-13431_v2.patch, HBASE-13431_v3.patch, HBASE-13431_v4.patch APPROACH #3 at PHOENIX-1734 helps to implement local indexing without much changes in HBase. For split we need one kernel change to allow creating both top and bottom reference files for index column family store files even when the split key not in the storefile key range. The changes helps in this case are 1) pass boolean to HRegionFileSystem#splitStoreFile to allow to skip the storefile key range check. 2) move the splitStoreFile with extra boolean parameter to the new interface introduced at HBASE-12975. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13437) ThriftServer leaks ZooKeeper connections
[ https://issues.apache.org/jira/browse/HBASE-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13437: Fix Version/s: (was: 1.2.0) ThriftServer leaks ZooKeeper connections Key: HBASE-13437 URL: https://issues.apache.org/jira/browse/HBASE-13437 Project: HBase Issue Type: Bug Components: Thrift Affects Versions: 0.98.8 Reporter: Winger Pun Assignee: Winger Pun Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13437_1.patch, HBASE-13437_1.patch, hbase-13437-fix.patch HBase ThriftServer caches ZooKeeper connections in memory using org.apache.hadoop.hbase.util.ConnectionCache. This class has a chore mechanism to clean up connections that have been idle for too long (default is 10 min). But the timedOut method, which tests whether a connection has been idle for more than maxIdleTime, always returns false, so the ZooKeeper connection is never released. If we keep sending requests to ThriftServer, it will soon hold thousands of ZooKeeper connections. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
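A hedged sketch of what a working idle check looks like; in the reported bug the equivalent of timedOut() always returned false, so connections were never cleaned up. Field and method names here are illustrative, not the real ConnectionCache internals.
{code}
import java.util.concurrent.atomic.AtomicLong;

class IdleConnectionSketch {
  private final AtomicLong lastAccessTime = new AtomicLong(System.currentTimeMillis());

  void markAccess() {
    lastAccessTime.set(System.currentTimeMillis());
  }

  boolean timedOut(long maxIdleTimeMs) {
    return System.currentTimeMillis() - lastAccessTime.get() > maxIdleTimeMs;
  }
}
{code}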
[jira] [Updated] (HBASE-13466) Document deprecations in 1.x - Part 1
[ https://issues.apache.org/jira/browse/HBASE-13466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13466: Fix Version/s: (was: 1.2.0) Document deprecations in 1.x - Part 1 - Key: HBASE-13466 URL: https://issues.apache.org/jira/browse/HBASE-13466 Project: HBase Issue Type: Sub-task Reporter: Lars Francke Assignee: Lars Francke Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13466-v1-branch-1.patch, HBASE-13466-v2-branch-1.patch, HBASE-13466-v3-branch-1.patch, HBASE-13466-v4-branch-1.patch, HBASE-13466.1.patch This documents deprecations for the following issues: * HBASE-6038 * HBASE-1502 * HBASE-5453 * HBASE-5357 * HBASE-9870 * HBASE-10870 * HBASE-12363 * HBASE-9508 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12311) Version stats in HFiles?
[ https://issues.apache.org/jira/browse/HBASE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-12311. --- Resolution: Invalid This is no longer needed. I added much better heuristics now to decide when we should SEEK and when we should SKIP. Version stats in HFiles? Key: HBASE-12311 URL: https://issues.apache.org/jira/browse/HBASE-12311 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 12311-indexed-0.98-v2.txt, 12311-indexed-0.98.txt, 12311-v2.txt, 12311-v3.txt, 12311.txt, CellStatTracker.java In HBASE-9778 I basically punted the decision on whether doing repeated scanner.next() called instead of the issueing (re)seeks to the user. I think we can do better. One way do that is maintain simple stats of what the maximum number of versions we've seen for any row/col combination and store these in the HFile's metadata (just like the timerange, oldest Put, etc). Then we estimate fairly accurately whether we have to expect lots of versions (i.e. seek between columns is better) or not (in which case we'd issue repeated next()'s). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12765) SplitTransaction creates too many threads (potentially)
[ https://issues.apache.org/jira/browse/HBASE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-12765. --- Resolution: Invalid SplitTransaction creates too many threads (potentially) --- Key: HBASE-12765 URL: https://issues.apache.org/jira/browse/HBASE-12765 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl Attachments: 12765.txt In splitStoreFiles(...) we create a new thread pool with as many threads as there are files to split. We should be able to do better. During times of very heavy write loads there might be a lot of files to split and multiple splits might be going on at the same time on the same region server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
[ https://issues.apache.org/jira/browse/HBASE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14022: --- Status: Patch Available (was: Open) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ - Key: HBASE-14022 URL: https://issues.apache.org/jira/browse/HBASE-14022 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.98.14 Attachments: HBASE-14022-0.98.patch Only applicable to 0.98. Another instance where minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
[ https://issues.apache.org/jira/browse/HBASE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14022: --- Attachment: HBASE-14022-0.98.patch TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ - Key: HBASE-14022 URL: https://issues.apache.org/jira/browse/HBASE-14022 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.98.14 Attachments: HBASE-14022-0.98.patch Only applicable to 0.98. Another instance where minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7847) Use zookeeper multi to clear znodes
[ https://issues.apache.org/jira/browse/HBASE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7847: -- Fix Version/s: 1.0.2 0.98.14 Picked back to 0.98 and branch-1.0 Use zookeeper multi to clear znodes --- Key: HBASE-7847 URL: https://issues.apache.org/jira/browse/HBASE-7847 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Rakesh R Fix For: 2.0.0, 1.1.0, 0.98.14, 1.0.2 Attachments: 7847-v1.txt, 7847_v6.patch, 7847_v6.patch, HBASE-7847.patch, HBASE-7847.patch, HBASE-7847.patch, HBASE-7847_v4.patch, HBASE-7847_v5.patch, HBASE-7847_v6.patch, HBASE-7847_v7.1.patch, HBASE-7847_v7.patch, HBASE-7847_v9.patch In ZKProcedureUtil, clearChildZNodes() and clearZNodes(String procedureName) should utilize zookeeper multi so that they're atomic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13925) Use zookeeper multi to clear znodes in ZKProcedureUtil
[ https://issues.apache.org/jira/browse/HBASE-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13925: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.1.2 1.2.0 1.0.2 Status: Resolved (was: Patch Available) Pushed to 0.98 and up Use zookeeper multi to clear znodes in ZKProcedureUtil -- Key: HBASE-13925 URL: https://issues.apache.org/jira/browse/HBASE-13925 Project: HBase Issue Type: Improvement Affects Versions: 0.98.13 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13925-v1-again.patch, HBASE-13925-v1.patch, HBASE-13925.patch Address the TODO in ZKProcedureUtil clearChildZNodes() and clearZNodes methods -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-12988: -- Attachment: 12988-v5.txt v5 * constants in HConstants * default in hbase-default.xml * improved comments * trace messages when enqueueing tasks [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988-v5.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt we can apply edits to slave cluster in parallel on table-level to speed up replication . update : per conversation blow , it's better to apply edits on row-level in parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613419#comment-14613419 ] Andrew Purtell commented on HBASE-12988: +1 [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988-v5.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt We can apply edits to the slave cluster in parallel at table level to speed up replication. Update: per the conversation below, it's better to apply edits at row level in parallel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14002) Add --noReplicationSetup option to IntegrationTestReplication
[ https://issues.apache.org/jira/browse/HBASE-14002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14002: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.3.0 1.1.2 1.2.0 0.98.14 2.0.0 Status: Resolved (was: Patch Available) Pushed to 0.98 and up, except branch-1.0 where we are missing IntegrationTestReplication. Shall I rectify that [~enis]? Add --noReplicationSetup option to IntegrationTestReplication - Key: HBASE-14002 URL: https://issues.apache.org/jira/browse/HBASE-14002 Project: HBase Issue Type: Improvement Components: integration tests Reporter: Dima Spivak Assignee: Dima Spivak Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14002_master.patch IntegrationTestReplication has been flaky for me on pre-1.1 versions of HBase because of not-actually-synchronous operations in HBaseAdmin/Admin, which hamper its setupTablesAndReplication method. To get around this, I'd like to add a -nrs/--noReplicationSetup option to the test to allow it to be run on clusters in which the necessary tables and replication have already been set up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
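A minimal sketch of how such a switch might be wired into a test driver (illustrative only; the class, flag handling, and setup method shown here are not the actual IntegrationTestReplication code):
{code}
/** Illustrative only; not the actual IntegrationTestReplication code. */
public class NoReplicationSetupFlagExample {
  static boolean noReplicationSetup = false;

  public static void main(String[] args) {
    for (String arg : args) {
      // Accept either the short or the long form of the switch.
      if ("-nrs".equals(arg) || "--noReplicationSetup".equals(arg)) {
        noReplicationSetup = true;
      }
    }
    if (!noReplicationSetup) {
      setupTablesAndReplication(); // only when tables/replication are not pre-created
    }
  }

  static void setupTablesAndReplication() {
    System.out.println("creating tables and replication peers...");
  }
}
{code}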
[jira] [Commented] (HBASE-12988) [Replication]Parallel apply edits across regions
[ https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613424#comment-14613424 ] Lars Hofhansl commented on HBASE-12988: --- Just remembering now... [~abhishek.chouhan], you had added a bunch of log messages locally to track where time is spent during replication. Should we fold those in here? Or a separate jira? [Replication]Parallel apply edits across regions Key: HBASE-12988 URL: https://issues.apache.org/jira/browse/HBASE-12988 Project: HBase Issue Type: Improvement Components: Replication Reporter: hongyu bi Assignee: Lars Hofhansl Attachments: 12988-v2.txt, 12988-v3.txt, 12988-v4.txt, 12988-v5.txt, 12988.txt, HBASE-12988-0.98.patch, ParallelReplication-v2.txt We can apply edits to the slave cluster in parallel at table level to speed up replication. Update: per the conversation below, it's better to apply edits at row level in parallel. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613440#comment-14613440 ] Jerry He commented on HBASE-7912: - I went through the v5 doc. Looks good. There are some changes in there that I see as good. One of them is to use a new system table hbase:backup instead of ZooKeeper. I definitely see the benefit of using a system table. There is also a section about security support that is consistent with the existing HBase security. Regarding the section First incremental after full backup restore: yes, there could be data duplicated in the two backups (the full and the incremental). It is better to fix it during the backup. [~mbertozzi]'s filesystem layout question is mainly a concern for upgrade and migration, I think. We've faced such problems. We will need to make sure the current backup images (data and metadata) can be used to restore to future HBase releases. Will continue to read, and may comment later. HBase Backup/Restore Based on HBase Snapshot Key: HBASE-7912 URL: https://issues.apache.org/jira/browse/HBASE-7912 Project: HBase Issue Type: Sub-task Reporter: Richard Ding Assignee: Vladimir Rodionov Labels: backup Fix For: 2.0.0 Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf Finally, we completed the implementation of our backup/restore solution, and would like to share it with the community through this jira. We are leveraging the existing HBase snapshot feature and providing a general solution for common users. Our full backup uses snapshots to capture metadata locally and ExportSnapshot to move data to another cluster; the incremental backup uses an offline WALPlayer to back up HLogs; we also leverage distributed log roll and distributed flush to improve performance; other add-on features include convert, merge, progress report, and CLI commands, so that a common user can back up HBase data without in-depth knowledge of HBase. Our solution also contains some usability features for enterprise users. The detailed design document and CLI commands will be attached to this jira. We plan to use 10~12 subtasks to share each of the following features, and document the detailed implementation in the subtasks: * *Full Backup* : provide local and remote backup/restore for a list of tables * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental backup) * *distributed* log roll and distributed flush * Backup *Manifest* and history * *Incremental* backup: to build on top of full backup as daily/weekly backup * *Convert* incremental backup WAL files into HFiles * *Merge* several backup images into one (like merging weekly into monthly) * *add and remove* a table to and from a backup image * *Cancel* a backup process * backup progress *status* * full backup based on an *existing snapshot* *-* *Below is the original description, kept here as the history of the design and discussion back in 2013* There have been attempts in the past to come up with a viable HBase backup/restore solution (e.g., HBASE-4618). Recently, there have been many advancements and new features in HBase, for example, FileLink, Snapshot, and Distributed Barrier Procedure. This is a proposal for a backup/restore solution that utilizes these new features to achieve better performance and consistency. 
A common practice of backup and restore in databases is to first take a full baseline backup, and then periodically take incremental backups that capture the changes since the full baseline backup. An HBase cluster can store a massive amount of data. Combining full backups with incremental backups has tremendous benefit for HBase as well. The following is a typical scenario for full and incremental backup. # The user takes a full backup of a table or a set of tables in HBase. # The user schedules periodic incremental backups to capture the changes from the full backup, or from the last incremental backup. # The user needs to restore table data to a past point in time. # The full backup is restored to the table(s) or to different table name(s). Then the incremental backups that are up to the desired point in time are applied on top of the full backup. We would support the following key features and capabilities. * Full backup uses HBase snapshots to capture HFiles. * Use HBase WALs to capture incremental changes, but we use bulk load of HFiles for fast incremental restore. * Support single table or a set of
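For context, the two existing primitives the full-backup design builds on (snapshot plus ExportSnapshot) can be driven roughly as below. This is an illustrative sketch, not the proposed backup tool's API; the snapshot name, table name, and destination URL are placeholders.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
import org.apache.hadoop.util.ToolRunner;

/** Illustrative only: snapshot + export, the two primitives behind the full backup. */
public class FullBackupSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // 1. Capture table metadata and HFiles locally via a snapshot.
      admin.snapshot("backup-t1-20150703", TableName.valueOf("t1"));
    }
    // 2. Ship the snapshot data to the backup cluster/filesystem.
    int rc = ToolRunner.run(conf, new ExportSnapshot(),
        new String[] { "-snapshot", "backup-t1-20150703",
                       "-copy-to", "hdfs://backup-cluster:9000/hbase" });
    System.exit(rc);
  }
}
{code}
The incremental piece described above would then roll and collect WALs between such full backups, which is where the offline WALPlayer and distributed log roll features come in.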
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Description: + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2015-07-03 16:43:27,942 INFO [MemStoreFlusher.1] regionserver.HRegionServer: Dump of metrics as JSON on abort: { beans : [ { name : java.lang:type=Memory, modelerType : sun.management.MemoryImpl, HeapMemoryUsage : { committed : 170471424, init : 31457280, max : 476512256, used : 151515136 }, Verbose : false, ObjectPendingFinalizationCount : 0, NonHeapMemoryUsage : { committed : 72335360, init : 2555904, max : -1, used : 70646296 }, ObjectName : java.lang:type=Memory } ], beans : [ { name :
[jira] [Updated] (HBASE-14019) Hbase table import throws RetriesExhaustedException
[ https://issues.apache.org/jira/browse/HBASE-14019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wesley Connor updated HBASE-14019: -- Attachment: error.txt Hbase table import throws RetriesExhaustedException --- Key: HBASE-14019 URL: https://issues.apache.org/jira/browse/HBASE-14019 Project: HBase Issue Type: Bug Components: hadoop2, hbase Affects Versions: 0.98.9 Environment: hbase-0.98.9-hadoop2 hadoop-2.6 Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2+deb7u1 x86_64 GNU/Linux Oracle jdk1.8.0_45/ Reporter: Wesley Connor Attachments: error.txt hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Import item_restore /data/item_backup fails with numerous RetriesExhaustedException. The export process, e.g. hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Export item /data/item_backup, works flawlessly and the file item_backup is created. Import of the same file to a table of a different name fails. Please see the attached job log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14019) Hbase table import throws RetriesExhaustedException
Wesley Connor created HBASE-14019: - Summary: Hbase table import throws RetriesExhaustedException Key: HBASE-14019 URL: https://issues.apache.org/jira/browse/HBASE-14019 Project: HBase Issue Type: Bug Components: hadoop2, hbase Affects Versions: 0.98.9 Environment: hbase-0.98.9-hadoop2 hadoop-2.6 Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2+deb7u1 x86_64 GNU/Linux Oracle jdk1.8.0_45/ Reporter: Wesley Connor Attachments: error.txt hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Import item_restore /data/item_backup fails with numerous RetriesExhaustedException. The export process, e.g. hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Export item /data/item_backup, works flawlessly and the file item_backup is created. Import of the same file to a table of a different name fails. Please see the attached job log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12016) Reduce number of versions in Meta table. Make it configurable
[ https://issues.apache.org/jira/browse/HBASE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-12016: Fix Version/s: 1.0.0 Reduce number of versions in Meta table. Make it configurable - Key: HBASE-12016 URL: https://issues.apache.org/jira/browse/HBASE-12016 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Priority: Minor Fix For: 1.0.0, 2.0.0 Attachments: HBASE-12016.patch, HBASE-12016.patch, HBASE-12016.patch, HBASE-12016.patch, HBASE-12016.patch, HBASE-12016.patch, HBASE-12016.patch Currently meta keeps up to 10 versions of each KV. For big metas this leads to substantial memory overhead and scan slowdowns (see https://issues.apache.org/jira/browse/HBASE-11165 ). We need to keep a reasonable number of versions (suggested value is 3). The number of versions is configurable via the parameter hbase.meta.versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
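Assuming the knob lands under the property name given in the description (hbase.meta.versions; the final name could differ in the committed patch), setting it would be an ordinary Configuration entry, for example:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MetaVersionsConfigExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Keep only 3 versions of each meta KV instead of the historical default of 10.
    // Property name taken from the issue description; the exact key may differ.
    conf.setInt("hbase.meta.versions", 3);
    System.out.println(conf.getInt("hbase.meta.versions", 10));
  }
}
{code}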
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Description: + Pseudo-distributed Hadoop (2.6.0), ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] === HMaster logs === 2015-07-03 13:29:20,671 INFO [RegionOpenAndInitThread-tsdb-meta-1] regionserver.HRegion: creating HRegion tsdb-meta HTD == 'tsdb-meta', {NAME = 'name', BLOOMFILTER = 'ROW', VERSIONS = '1', IN_MEMORY = 'false', KEEP_DELETED_CELLS = 'FALSE', DATA_BLOCK_ENCODING = 'NONE', TTL = 'FOREVER', COMPRESSION = 'NONE', MIN_VERSIONS = '0', BLOCKCACHE = 'true', BLOCKSIZE = '65536', REPLICATION_SCOPE = '1'} RootDir = hdfs://node1.vmcluster:9000/hbase/.tmp Table name == tsdb-meta 2015-07-03 13:29:20,696 INFO [RegionOpenAndInitThread-tsdb-meta-1]
[jira] [Created] (HBASE-14018) RegionServer is aborted when flushing memstore.
Dinh Duong Mai created HBASE-14018: -- Summary: RegionServer is aborted when flushing memstore. Key: HBASE-14018 URL: https://issues.apache.org/jira/browse/HBASE-14018 Project: HBase Issue Type: Bug Affects Versions: 1.0.1.1 Environment: CentOS x64 Server Reporter: Dinh Duong Mai + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. RegionServer logs: 2015-07-03 16:43:27,320 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435894216453.35f5c254751fef111cdd3788f8465324., current region memstore size 128.12 MB 2015-07-03 16:43:27,955 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node2.vmcluster,16040,1435897661557: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435894216453.35f5c254751fef111cdd3788f8465324. at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,956 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Attachment: hbase-hadoop-zookeeper-node1.vmcluster.log hbase-hadoop-regionserver-node1.vmcluster.log hbase-hadoop-master-node1.vmcluster.log check at 2015-07-03 16:43:27 RegionServer is aborted when flushing memstore. --- Key: HBASE-14018 URL: https://issues.apache.org/jira/browse/HBASE-14018 Project: HBase Issue Type: Bug Affects Versions: 1.0.1.1 Environment: CentOS x64 Server Reporter: Dinh Duong Mai Attachments: hbase-hadoop-master-node1.vmcluster.log, hbase-hadoop-regionserver-node1.vmcluster.log, hbase-hadoop-zookeeper-node1.vmcluster.log + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Description: + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] === HMaster logs === 2015-07-03 13:29:20,671 INFO [RegionOpenAndInitThread-tsdb-meta-1] regionserver.HRegion: creating HRegion tsdb-meta HTD == 'tsdb-meta', {NAME = 'name', BLOOMFILTER = 'ROW', VERSIONS = '1', IN_MEMORY = 'false', KEEP_DELETED_CELLS = 'FALSE', DATA_BLOCK_ENCODING = 'NONE', TTL = 'FOREVER', COMPRESSION = 'NONE', MIN_VERSIONS = '0', BLOCKCACHE = 'true', BLOCKSIZE = '65536', REPLICATION_SCOPE = '1'} RootDir = hdfs://node1.vmcluster:9000/hbase/.tmp Table name == tsdb-meta 2015-07-03 13:29:20,696 INFO [RegionOpenAndInitThread-tsdb-meta-1] regionserver.HRegion:
[jira] [Commented] (HBASE-12596) bulkload needs to follow locality
[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613007#comment-14613007 ] Victor Xu commented on HBASE-12596: --- Thanks for your suggestion. The default value should be true, and I need to revise some unit tests before that. New patches will be updated soon. bulkload needs to follow locality - Key: HBASE-12596 URL: https://issues.apache.org/jira/browse/HBASE-12596 Project: HBase Issue Type: Improvement Components: HFile, regionserver Affects Versions: 0.98.8 Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7 Reporter: Victor Xu Assignee: Victor Xu Fix For: 0.98.14 Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596.patch Normally, we have 2 steps to perform a bulkload: 1. use a job to write the HFiles to be loaded; 2. move these HFiles to the right HDFS directory. However, the locality could be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the withFavoredNodes method, and we just need to call it in HFileOutputFormat's getNewWriter(). This feature is disabled by default, and we could use 'hbase.bulkload.locality.sensitive.enabled=true' to enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
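A simplified sketch of the mechanism the description outlines: resolve the region location for the writer's key range and pass it to StoreFile.WriterBuilder.withFavoredNodes, so the HFile's blocks land on the DataNode co-located with the serving region server. This is illustrative only; it uses the 1.x client API and omits writer options the real getNewWriter() also sets (bloom type, block size, compression, and so on).
{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.regionserver.StoreFile;

/** Illustrative sketch only; the real change lives in HFileOutputFormat's getNewWriter(). */
public class LocalityAwareWriterSketch {

  public static StoreFile.Writer newWriter(Configuration conf, FileSystem fs, Path familyDir,
      Connection conn, TableName table, byte[] firstRowOfRegion) throws IOException {
    // Resolve which region server will serve this key range, and use its host
    // as a favored node so the HFile blocks are written to the co-located DataNode.
    InetSocketAddress[] favoredNodes;
    try (RegionLocator locator = conn.getRegionLocator(table)) {
      HRegionLocation loc = locator.getRegionLocation(firstRowOfRegion);
      favoredNodes = new InetSocketAddress[] {
          new InetSocketAddress(loc.getHostname(), loc.getPort()) };
    }
    return new StoreFile.WriterBuilder(conf, new CacheConfig(conf), fs)
        .withOutputDir(familyDir)
        .withComparator(KeyValue.COMPARATOR)
        .withFileContext(new HFileContextBuilder().build())
        .withFavoredNodes(favoredNodes)
        .build();
  }
}
{code}
Favored nodes are only a hint to HDFS block placement, so an unresolved or wrong address should degrade to default placement rather than fail the write.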
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Description: + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] === HMaster logs === 2015-07-03 13:29:20,671 INFO [RegionOpenAndInitThread-tsdb-meta-1] regionserver.HRegion: creating HRegion tsdb-meta HTD == 'tsdb-meta', {NAME = 'name', BLOOMFILTER = 'ROW', VERSIONS = '1', IN_MEMORY = 'false', KEEP_DELETED_CELLS = 'FALSE', DATA_BLOCK_ENCODING = 'NONE', TTL = 'FOREVER', COMPRESSION = 'NONE', MIN_VERSIONS = '0', BLOCKCACHE = 'true', BLOCKSIZE = '65536', REPLICATION_SCOPE = '1'} RootDir = hdfs://node1.vmcluster:9000/hbase/.tmp Table name == tsdb-meta 2015-07-03 13:29:20,696 INFO [RegionOpenAndInitThread-tsdb-meta-1] regionserver.HRegion:
[jira] [Commented] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612998#comment-14612998 ] Anoop Sam John commented on HBASE-12213: bq.We get around 9 to 10 % improvement. That is great Ram.. Thanks for doing the perf test for every sub task.. Yes, with the MBB and its usage we tried to save every op as much as possible. When we avoid the temp on-heap buffer creation and copy, and the MBB actually wraps only the underlying offheap BBs, we will be able to see much more gain. Random read performance is the main thing we are expecting there too. HFileBlock backed by Array of ByteBuffers - Key: HBASE-12213 URL: https://issues.apache.org/jira/browse/HBASE-12213 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12213_1.patch, HBASE-12213_2.patch In the L2 cache (offheap) an HFile block might have been cached into multiple chunks of buffers. If HFileBlock needs a single BB, we will end up recreating a bigger BB and copying. Instead we can make HFileBlock serve data from an array of BBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
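To make the idea concrete, here is a toy reader over an array of ByteBuffers (purely illustrative, not HBase's actual multi-buffer class): absolute gets walk the chunk list instead of first copying everything into one big buffer. It assumes each chunk's data runs from position 0 to its limit.
{code}
import java.nio.ByteBuffer;

/** Toy multi-buffer reader; illustrates the idea only, not HBase's implementation. */
public class MultiBufferReader {
  private final ByteBuffer[] buffers;

  public MultiBufferReader(ByteBuffer[] buffers) {
    this.buffers = buffers;
  }

  /** Absolute get across buffer boundaries, without copying into a single BB. */
  public byte get(int globalOffset) {
    int remaining = globalOffset;
    for (ByteBuffer bb : buffers) {
      if (remaining < bb.limit()) {
        return bb.get(remaining); // positional read, does not move the buffer's cursor
      }
      remaining -= bb.limit();
    }
    throw new IndexOutOfBoundsException("offset " + globalOffset + " beyond total length");
  }

  public int totalLength() {
    int len = 0;
    for (ByteBuffer bb : buffers) {
      len += bb.limit();
    }
    return len;
  }
}
{code}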
[jira] [Updated] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinh Duong Mai updated HBASE-14018: --- Component/s: hbase hadoop2 RegionServer is aborted when flushing memstore. --- Key: HBASE-14018 URL: https://issues.apache.org/jira/browse/HBASE-14018 Project: HBase Issue Type: Bug Components: hadoop2, hbase Affects Versions: 1.0.1.1 Environment: CentOS x64 Server Reporter: Dinh Duong Mai Attachments: hbase-hadoop-master-node1.vmcluster.log, hbase-hadoop-regionserver-node1.vmcluster.log, hbase-hadoop-zookeeper-node1.vmcluster.log + Pseudo-distributed Hadoop, ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are:
[jira] [Commented] (HBASE-14015) Allow setting a richer state value when toString a pv2
[ https://issues.apache.org/jira/browse/HBASE-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612890#comment-14612890 ] Ashish Singhi commented on HBASE-14015: --- [~saint@gmail.com], this seems to be introducing a compilation error in at least branch-1. {noformat} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-procedure: Compilation failure: Compilation failure: [ERROR] /home/root1/Code/H-b-1/hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureToString.java:[28,49] error: cannot find symbol [ERROR] symbol: class MasterTests [ERROR] location: package org.apache.hadoop.hbase.testclassification [ERROR] /home/root1/Code/H-b-1/hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureToString.java:[33,11] error: cannot find symbol {noformat} Allow setting a richer state value when toString a pv2 -- Key: HBASE-14015 URL: https://issues.apache.org/jira/browse/HBASE-14015 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: stack Assignee: stack Priority: Minor Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: 0001-HBASE-14015-Allow-setting-a-richer-state-value-when-.patch Debugging, my procedure after a crash was loaded out of the store and its state was RUNNING. It would help if I knew which of the StateMachineProcedure states it was going to start RUNNING at. Chatting w/ Matteo, he suggested allowing Procedures to customize the String. Here is a patch that makes it so StateMachineProcedure will now print out the base state -- RUNNING, FINISHED -- followed by a ':' and then the StateMachineProcedure state: e.g. SimpleStateMachineProcedure state=RUNNABLE:SERVER_CRASH_ASSIGN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
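The shape of the change described above, as a hedged sketch: it assumes a protected hook (called toStringState here; the actual method name in the patch may differ) that subclasses override to append their state-machine step after the base state.
{code}
/**
 * Illustrative only; not the actual pv2 Procedure class. Running main prints
 * something like "ServerCrashProcedureSketch state=RUNNABLE:SERVER_CRASH_ASSIGN".
 */
abstract class ProcedureSketch {
  enum BaseState { RUNNABLE, FINISHED }
  BaseState state = BaseState.RUNNABLE;

  /** Subclasses append a richer state, e.g. "RUNNABLE:SERVER_CRASH_ASSIGN". */
  protected void toStringState(StringBuilder builder) {
    builder.append(state);
  }

  @Override
  public String toString() {
    StringBuilder sb = new StringBuilder(getClass().getSimpleName()).append(" state=");
    toStringState(sb);
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(new ServerCrashProcedureSketch());
  }
}

class ServerCrashProcedureSketch extends ProcedureSketch {
  String machineState = "SERVER_CRASH_ASSIGN";

  @Override
  protected void toStringState(StringBuilder builder) {
    super.toStringState(builder);              // base state: RUNNABLE / FINISHED
    builder.append(':').append(machineState);  // state-machine step
  }
}
{code}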
[jira] [Commented] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612900#comment-14612900 ] ramkrishna.s.vasudevan commented on HBASE-12213: Did some perf testing using the Performance evaluation tool. For range scan there is no difference between with and without patch ScanRange1 rows=1 with 25 threads, filterAll = true without patch {code} 2015-07-03 17:06:35,332 INFO [main] hbase.PerformanceEvaluation: [RandomScanWithRange1Test] Summary of timings (ms): [573388, 574365, 573071, 571351, 572586, 572020, 572732, 573370, 573627, 573146, 573245, 574017, 573049, 574283, 571470, 574312, 574539, 574515, 571540, 574108, 573530, 574177, 574143, 572927, 574541] 2015-07-03 17:06:35,333 INFO [main] hbase.PerformanceEvaluation: [RandomScanWithRange1Test]Min: 571351ms Max: 574541ms Avg: 573362ms {code} with patch {code} 2015-07-03 16:49:57,711 INFO [main] hbase.PerformanceEvaluation: [RandomScanWithRange1Test] Summary of timings (ms): [572276, 569970, 572610, 571517, 572027, 570168, 570932, 572951, 570492, 572410, 572037, 571495, 572238, 572055, 572114, 571142, 572500, 569813, 572383, 572223, 571191, 572197, 571473, 571235, 569209] 2015-07-03 16:49:57,715 INFO [main] hbase.PerformanceEvaluation: [RandomScanWithRange1Test]Min: 569209ms Max: 572951ms Avg: 571546ms {code} But when we do gets which involves lot of blockSeek code randomReads = 10 threads, multiGets = 100, filterAll = true withoutpatch {code} 2015-07-03 17:13:37,172 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [158292, 158191, 157503, 158375, 158182, 158901, 157680, 158583, 158388, 158279] 2015-07-03 17:13:37,173 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 157503ms Max: 158901ms Avg: 158237ms 2015-07-03 17:33:36,854 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [163757, 163478, 163175, 163876, 163490, 163187, 163813, 163937, 162839, 163398] 2015-07-03 17:33:36,855 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 162839ms Max: 163937ms Avg: 163495ms 2015-07-03 17:36:44,255 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [161414, 162715, 162266, 162409, 161884, 162161, 162235, 162077, 162852, 161575] 2015-07-03 17:36:44,256 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 161414ms Max: 162852ms Avg: 162158ms {code} with patch {code} 2015-07-03 17:26:02,496 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [150311, 150721, 150350, 149602, 150339, 150593, 151117, 149046, 150403, 150087] 2015-07-03 17:26:02,497 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 149046ms Max: 151117ms Avg: 150256ms 2015-07-03 17:43:52,968 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [148812, 148077, 148482, 148160, 148479, 147835, 148461, 147683, 148499, 149299] 2015-07-03 17:43:52,970 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 147683ms Max: 149299ms Avg: 148378ms 2015-07-03 17:46:30,247 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Summary of timings (ms): [149254, 148839, 148658, 148982, 148443, 147960, 148590, 148508, 148855, 148767] 2015-07-03 17:46:30,248 INFO [main] hbase.PerformanceEvaluation: [RandomReadTest] Min: 147960ms Max: 149254ms Avg: 148685ms {code} We see significant gain. Currently note that we are copying the offheap buckets to the onheap single BB. 
The blockSeek positional reads and some optimization in the KeyOnlyKV are the reasons for this performance gain. Even without the MBB we should be able to see the same gain, because right now the MBB just acts as a wrapper over the single BB that was formed from the underlying offheap buckets. HFileBlock backed by Array of ByteBuffers - Key: HBASE-12213 URL: https://issues.apache.org/jira/browse/HBASE-12213 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12213_1.patch In the L2 cache (offheap) an HFile block might have been cached into multiple chunks of buffers. If HFileBlock needs a single BB, we will end up recreating a bigger BB and copying. Instead we can make HFileBlock serve data from an array of BBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612902#comment-14612902 ] ramkrishna.s.vasudevan commented on HBASE-12213: We get around a 9 to 10% improvement. HFileBlock backed by Array of ByteBuffers - Key: HBASE-12213 URL: https://issues.apache.org/jira/browse/HBASE-12213 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12213_1.patch In the L2 cache (offheap) an HFile block might have been cached into multiple chunks of buffers. If HFileBlock needs a single BB, we will end up recreating a bigger BB and copying. Instead we can make HFileBlock serve data from an array of BBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-12213: --- Status: Open (was: Patch Available) HFileBlock backed by Array of ByteBuffers - Key: HBASE-12213 URL: https://issues.apache.org/jira/browse/HBASE-12213 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12213_1.patch In the L2 cache (offheap) an HFile block might have been cached into multiple chunks of buffers. If HFileBlock needs a single BB, we will end up recreating a bigger BB and copying. Instead we can make HFileBlock serve data from an array of BBs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12596) bulkload needs to follow locality
[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612875#comment-14612875 ] Ashish Singhi commented on HBASE-12596: --- Skimmed the patch. Can we add some message here? {code} LOG.warn(, e); {code} Can we add isDebugEnabled and isTraceEnabled checks for debug and trace level log messages? Can we add a log inside this if check saying that locality is enabled? {code} if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {code} bulkload needs to follow locality - Key: HBASE-12596 URL: https://issues.apache.org/jira/browse/HBASE-12596 Project: HBase Issue Type: Improvement Components: HFile, regionserver Affects Versions: 0.98.8 Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7 Reporter: Victor Xu Assignee: Victor Xu Fix For: 0.98.14 Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596.patch Normally, we have 2 steps to perform a bulkload: 1. use a job to write the HFiles to be loaded; 2. move these HFiles to the right HDFS directory. However, the locality could be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the withFavoredNodes method, and we just need to call it in HFileOutputFormat's getNewWriter(). This feature is disabled by default, and we could use 'hbase.bulkload.locality.sensitive.enabled=true' to enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
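What the review asks for, in generic form (illustrative class and messages; HBase 1.x uses commons-logging): give the warning a message and guard debug-level output behind isDebugEnabled().
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/** Illustrative only: the logging style the review comment asks for. */
public class GuardedLoggingExample {
  private static final Log LOG = LogFactory.getLog(GuardedLoggingExample.class);

  void example(boolean localitySensitive, Exception e) {
    // Give the warning a message, not just the exception.
    LOG.warn("Failed to resolve favored nodes; falling back to default block placement", e);

    if (localitySensitive) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("bulkload locality-sensitive mode is enabled");
      }
    }
  }
}
{code}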