[jira] [Commented] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656439#comment-15656439
 ] 

Duo Zhang commented on HBASE-15484:
---

setBatch is almost the same as allowPartial, since we do not guarantee that we 
will always return exactly as many cells as the batch you set. I think you can 
use setMaxResultSize together with allowPartial instead.

And for caching it is the same reasoning: you can just use setMaxResultSize to 
limit the total size of the cells returned. I also think a 'limit' option would 
be more useful, because the RS could close the scanner once the returned 
results reach the limit.
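To illustrate the point above, here is a plain-Java sketch (not the HBase client API; all names are made up for the example) of how a max-result-size limit chunks one row's cells into partial results, which is why setMaxResultSize plus allowPartial can stand in for setBatch:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of how a max-result-size limit splits one row's cells into
// size-bounded chunks; each chunk models one partial Result.
public class PartialResultSketch {
    static List<List<Integer>> chunkBySize(int[] cellSizes, int maxResultSize) {
        List<List<Integer>> results = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentBytes = 0;
        for (int size : cellSizes) {
            if (currentBytes + size > maxResultSize && !current.isEmpty()) {
                results.add(current);          // flush a partial result
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            results.add(current);
        }
        return results;
    }

    public static void main(String[] args) {
        // Five 100-byte cells under a 250-byte limit -> 3 partial results.
        System.out.println(chunkBySize(new int[] {100, 100, 100, 100, 100}, 250).size());
    }
}
```

Note the split points depend only on the accumulated byte size, not on any fixed cell count, which is the practical difference from setBatch.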

> Correct the semantic of batch and partial
> -
>
> Key: HBASE-15484
> URL: https://issues.apache.org/jira/browse/HBASE-15484
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0
>
> Attachments: HBASE-15484-v1.patch, HBASE-15484-v2.patch, 
> HBASE-15484-v3.patch, HBASE-15484-v4.patch
>
>
> Follow-up to HBASE-15325: as discussed, the meaning of setBatch and 
> setAllowPartialResults should not be the same. We should not regard setBatch 
> as setAllowPartialResults.
> And isPartial should be defined accurately:
> (considering getBatch==MaxInt if we don't setBatch) if result.rawCells().length 
> is less than the number of cells in the row, isPartial==true; otherwise 
> isPartial==false. So if the user doesn't setAllowPartialResults(true), 
> isPartial should always be false.
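The proposed rule can be sketched as a small predicate (a hedged illustration, not the actual HBase implementation; names are invented for the example):

```java
// Sketch of the isPartial semantics proposed in the description: a Result is
// partial only when the client allowed partials AND the Result carries fewer
// cells than the row actually has.
public class IsPartialRule {
    static boolean isPartial(boolean allowPartialResults, int cellsInResult, int cellsInRow) {
        if (!allowPartialResults) {
            return false; // without setAllowPartialResults(true), never partial
        }
        return cellsInResult < cellsInRow;
    }

    public static void main(String[] args) {
        System.out.println(isPartial(true, 3, 10));  // the row was split up
        System.out.println(isPartial(false, 3, 10)); // partials were not allowed
    }
}
```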



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656408#comment-15656408
 ] 

Anoop Sam John commented on HBASE-15484:


Sorry, I missed this one for a long time.  Thanks for reviving the discussion.
setBatch might still be useful for a paging kind of results presentation?  
Maybe then we also let users fetch data based on the max result size; either 
way the data comes to the client result cache, and only the needed data gets 
displayed.
Ya, now we have caching, batch, and max result size...  Too complicated. 
But is caching still useful?






[jira] [Created] (HBASE-17073) Increase the max number of buffers in ByteBufferPool

2016-11-10 Thread Anoop Sam John (JIRA)
Anoop Sam John created HBASE-17073:
--

 Summary: Increase the max number of buffers in ByteBufferPool
 Key: HBASE-17073
 URL: https://issues.apache.org/jira/browse/HBASE-17073
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0


Before the HBASE-15525 fix, we had variable-sized buffers in our buffer pool; 
the max size to which one buffer could grow was 2 MB.  Now we have changed it 
to a fixed-size BBPool, with 64 KB as the default size of each buffer.  But the 
max number of BBs allowed in the pool was not changed: it is still twice the 
number of handlers.  Maybe we should increase it now?  To match the old 2 MB 
capacity, we would need 32 * 2 * handlers.  There is no initial number of BBs 
anyway.  2 MB is our default max response size, and on the write path 2 MB is 
also the default flush limit for BufferedMutator.  I believe we can make 
32 * #handlers the default max number of BBs.
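The arithmetic above can be checked quickly (the 2 MB and 64 KB figures come from the description; the handler count here is hypothetical):

```java
// Back-of-the-envelope check of the buffer-pool sizing argument: one old
// max-size (2 MB) buffer corresponds to 32 of the new fixed 64 KB buffers.
public class BufferPoolSizing {
    static int buffersPerOldMax(int oldMaxBufferBytes, int newBufferBytes) {
        return oldMaxBufferBytes / newBufferBytes;
    }

    public static void main(String[] args) {
        int perOldMax = buffersPerOldMax(2 * 1024 * 1024, 64 * 1024);
        int handlers = 30; // hypothetical handler count
        System.out.println(perOldMax);                // 32
        System.out.println(perOldMax * 2 * handlers); // matches old 2-buffers-per-handler capacity
        System.out.println(perOldMax * handlers);     // the proposed 32 * #handlers default
    }
}
```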






[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656390#comment-15656390
 ] 

stack commented on HBASE-17072:
---

Nice analysis. The header prefetch can save a seek when scans cross hfile block 
boundaries. The thread-local caching has been there a long time; let's revisit 
it in light of the findings here (yeah, we don't carry over the header to the 
cache anymore). At a minimum, add [~esteban]'s suggestion.

> CPU usage starts to climb up to 90-100% when using G1GC
> ---
>
> Key: HBASE-17072
> URL: https://issues.apache.org/jira/browse/HBASE-17072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Eiichi Sato
> Attachments: disable-block-header-cache.patch, mat-threadlocals.png, 
> mat-threads.png, metrics.png, slave1.svg, slave2.svg, slave3.svg, slave4.svg
>
>
> h5. Problem
> CPU usage of a region server in our CDH 5.4.5 cluster, at some point, starts 
> to gradually get higher up to nearly 90-100% when using G1GC.  We've also run 
> into this problem on CDH 5.7.3 and CDH 5.8.2.
> In our production cluster, it normally takes a few weeks for this to happen 
> after restarting a RS.  We reproduced this on our test cluster and attached 
> the results.  Please note that, to make it easy to reproduce, we did some 
> "anti-tuning" on a table when running tests.
> In metrics.png, soon after we started running some workloads against a test 
> cluster (CDH 5.8.2) at about 7 p.m. CPU usage of the two RSs started to rise. 
>  Flame Graphs (slave1.svg to slave4.svg) are generated from jstack dumps of 
> each RS process around 10:30 a.m. the next day.
> After investigating heapdumps from another occurrence on a test cluster 
> running CDH 5.7.3, we found that the ThreadLocalMap contains a lot of 
> contiguous entries of {{HFileBlock$PrefetchedHeader}} probably due to primary 
> clustering.  This caused more loops in 
> {{ThreadLocalMap#expungeStaleEntries()}}, consuming a certain amount of CPU 
> time.  What is worse is that the method is called from RPC metrics code, 
> which means even a small amount of per-RPC time soon adds up to a huge amount 
> of CPU time.
> This is very similar to the issue in HBASE-16616, but we have many 
> {{HFileBlock$PrefetchedHeader}} instances, not only {{Counter$IndexHolder}} 
> instances.  
> Here are some OQL counts from Eclipse Memory Analyzer (MAT).  This shows a 
> number of ThreadLocal instances in the ThreadLocalMap of a single handler 
> thread.
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = 
> "org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"
> #=> 10980 instances
> {code}
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"
> #=> 2052 instances
> {code}
> Although as described in HBASE-16616 this somewhat seems to be an issue in 
> G1GC side regarding weakly-reachable objects, we should keep ThreadLocal 
> usage minimal and avoid creating an indefinite number (in this case, a number 
> of HFiles) of ThreadLocal instances.
> HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve 
> the issue (I just saw the patch, never tested it at all), but the 
> {{HFileBlock$PrefetchedHeader}} are still there in the ThreadLocalMap, which 
> may cause issues in the future again.
> h5. Our Solution
> We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and 
> fortunately we didn't notice any performance degradation for our production 
> workloads.
> Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are 
> handled randomly in any of the handlers, small Get or small Scan RPCs do not 
> benefit from the caching (See HBASE-10676 and HBASE-11402 for the details).  
> Probably, we need to see how well reads are saved by the caching for large 
> Scan or Get RPCs and especially for compactions if we really remove the 
> caching. It's probably better if we can remove ThreadLocals without breaking 
> the current caching behavior.
> FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.
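To make the failure mode in the report concrete, here is a small, self-contained Java sketch (illustrative names, not HBase code) of the pattern described: one ThreadLocal per reader object means the per-thread ThreadLocalMap grows with the number of open HFiles.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadLocalGrowth {
    static class Reader {
        // One ThreadLocal per Reader instance, mirroring the pattern of
        // HFileBlock$PrefetchedHeader (one per open HFile reader).
        final ThreadLocal<byte[]> prefetchedHeader =
            ThreadLocal.withInitial(() -> new byte[33]);
    }

    // Open n "readers" and touch each one's ThreadLocal from the current
    // thread, so the thread's ThreadLocalMap gains one entry per reader.
    static List<Reader> openAndTouch(int n) {
        List<Reader> readers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Reader r = new Reader();
            r.prefetchedHeader.get(); // registers an entry in this thread's map
            readers.add(r);
        }
        return readers;
    }

    public static void main(String[] args) {
        // After this, the current thread's ThreadLocalMap holds ~10,000 extra
        // entries; they are only expunged lazily after the Readers are GC'd,
        // which is the cost the report observed in expungeStaleEntries().
        System.out.println(openAndTouch(10_000).size());
    }
}
```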





[jira] [Commented] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656382#comment-15656382
 ] 

Yu Li commented on HBASE-17071:
---

+1 for the patch, reasonable to move {{USEMSLAB_KEY}} from {{SegmentFactory}} 
into {{MemStoreLAB}}

> Do not initialize MemstoreChunkPool when use mslab option is turned off
> ---
>
> Key: HBASE-17071
> URL: https://issues.apache.org/jira/browse/HBASE-17071
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-17071.patch
>
>
> This is a 2.0-only issue, introduced by HBASE-16407. 
> We now initialize the MSLAB chunk pool at RS start itself (so that it can be 
> passed as a HeapMemoryTuneObserver).
> When MSLAB is turned off (i.e. hbase.hregion.memstore.mslab.enabled is 
> configured false) we should not initialize the MSLAB chunk pool at all.  By 
> default the initial chunk count created will be 0 anyway, but it is still 
> better to avoid creating the pool.
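A minimal sketch of the intended guard (method and class names here are illustrative, not the actual HBase code): consult the mslab-enabled flag before constructing the chunk pool at RS startup.

```java
import java.util.HashMap;
import java.util.Map;

public class ChunkPoolInit {
    static final String USEMSLAB_KEY = "hbase.hregion.memstore.mslab.enabled";

    // Return the pool only when MSLAB is enabled; otherwise skip creation.
    static Object initChunkPoolIfEnabled(Map<String, String> conf) {
        boolean mslabEnabled = Boolean.parseBoolean(conf.getOrDefault(USEMSLAB_KEY, "true"));
        if (!mslabEnabled) {
            return null; // MSLAB off: do not create the chunk pool at all
        }
        return new Object(); // stands in for the real MemstoreChunkPool
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(USEMSLAB_KEY, "false");
        System.out.println(initChunkPoolIfEnabled(conf) == null); // pool skipped
    }
}
```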





[jira] [Commented] (HBASE-17055) Disabling table not getting enabled after clean cluster restart.

2016-11-10 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656378#comment-15656378
 ] 

Stephen Yuan Jiang commented on HBASE-17055:


Thanks, [~sreenivasulureddy].  I will look at the code for this.  This is 
interesting; for some reason SSH does not re-assign the region.  

> Disabling table not getting enabled after clean cluster restart.
> 
>
> Key: HBASE-17055
> URL: https://issues.apache.org/jira/browse/HBASE-17055
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 1.3.0
>Reporter: Y. SREENIVASULU REDDY
>Assignee: Stephen Yuan Jiang
> Fix For: 1.3.0
>
>
> Scenario:
> 1. Disable the table.
> 2. While the disable is still in progress, restart the whole HBase service.
> 3. Then enable the table.
> The above operations lead to the region staying in RIT continuously.
> Please find the logs below for understanding.
> While the table was being disabled, the whole HBase service went down.
> The following are the master logs:
> {noformat}
> 2016-11-09 19:32:55,102 INFO  
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] master.HMaster: 
> Client=seenu//host-1 disable testTable
> 2016-11-09 19:32:55,257 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> procedure2.ProcedureExecutor: Procedure DisableTableProcedure 
> (table=testTable) id=8 owner=seenu state=RUNNABLE:DISABLE_TABLE_PREPARE added 
> to the store.
> 2016-11-09 19:32:55,264 DEBUG [ProcedureExecutor-5] 
> lock.ZKInterProcessLockBase: Acquired a lock for 
> /hbase/table-lock/testTable/write-master:165
> 2016-11-09 19:32:55,285 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,386 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,513 INFO  [ProcedureExecutor-5] 
> zookeeper.ZKTableStateManager: Moving table testTable state from DISABLING to 
> DISABLING
> 2016-11-09 19:32:55,587 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,628 INFO  [ProcedureExecutor-5] 
> procedure.DisableTableProcedure: Offlining 1 regions.
> .
> .
> .
> .
> .
> .
> .
> .
> 2016-11-09 19:33:02,871 INFO  [AM.ZK.Worker-pool2-t7] master.RegionStates: 
> Offlined 1890fa9c085dcc2ee0602f4bab069d10 from host-1,16040,1478690163056
> Wed Nov  9 19:33:02 CST 2016 Terminating master
> {noformat}
> Here we need to observe:
> {color:red} Offlined 1890fa9c085dcc2ee0602f4bab069d10 from 
> host-1,16040,1478690163056 {color}
> Then the HMaster went down, and all RegionServers were also brought down.
> After the HMaster and RegionServers were restarted, an enable-table operation 
> was executed on the table.
> {panel:title=HMaster 
> Logs|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE}
> {noformat}
> 2016-11-09 19:49:57,059 INFO  
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] master.HMaster: 
> Client=seenu//host-1 enable testTable
> 2016-11-09 19:49:57,325 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> procedure2.ProcedureExecutor: Procedure EnableTableProcedure 
> (table=testTable) id=9 owner=seenu state=RUNNABLE:ENABLE_TABLE_PREPARE added 
> to the store.
> 2016-11-09 19:49:57,333 DEBUG [ProcedureExecutor-2] 
> lock.ZKInterProcessLockBase: Acquired a lock for 
> /hbase/table-lock/testTable/write-master:168
> 2016-11-09 19:49:57,335 DEBUG [hconnection-0x745317ee-shared--pool3-t11] 
> ipc.RpcClientImpl: Use SIMPLE authentication for service ClientService, 
> sasl=false
> 2016-11-09 19:49:57,335 DEBUG [hconnection-0x745317ee-shared--pool3-t11] 
> ipc.RpcClientImpl: Connecting to host-1:16040
> 2016-11-09 19:49:57,347 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,449 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,579 INFO  [ProcedureExecutor-2] 
> procedure.EnableTableProcedure: Attempting to enable the table testTable
> 2016-11-09 19:49:57,580 INFO  [ProcedureExecutor-2] 
> zookeeper.ZKTableStateManager: Moving table testTable state from DISABLED to 
> ENABLING
> 2016-11-09 19:49:57,655 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000] 
> master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,707 INFO  [ProcedureExecutor-2] 
> procedure.EnableTableProcedure: Table 'testTable' has 1 regions, of which 1 

[jira] [Updated] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-17071:
---
Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)






[jira] [Commented] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656374#comment-15656374
 ] 

stack commented on HBASE-17071:
---

+1






[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656376#comment-15656376
 ] 

Anoop Sam John commented on HBASE-17072:


Ya, when blocks are there in the BC, we won't deal with this ThreadLocal 
variable at all.  When we cache the blocks into the BC, do we cache the next 
block's header also?  I think not.  Some issue fixed by Stack removed this, I 
believe.






[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656372#comment-15656372
 ] 

Esteban Gutierrez commented on HBASE-17072:
---

[~sato_eiichi], take a look at HBASE-17017; removing the per-region metrics 
could probably help.






[jira] [Updated] (HBASE-17056) Remove checked in PB generated files

2016-11-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-17056:
--
Attachment: 0002-HBASE-17056-Remove-checked-in-PB-generated-files.patch

Example patch. It covers a few modules, with protos generated on the fly; 
checked-in generated files are removed from hbase-endpoint, hbase-rsgroup, and 
hbase-spark.

> Remove checked in PB generated files 
> -
>
> Key: HBASE-17056
> URL: https://issues.apache.org/jira/browse/HBASE-17056
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
> Fix For: 2.0.0
>
> Attachments: 
> 0002-HBASE-17056-Remove-checked-in-PB-generated-files.patch
>
>
> Now that we have the new PB maven plugin, there is no need to have the PB 
> files checked in to the repo. The reason we did that was to ease up developer 
> env setup. 





[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656368#comment-15656368
 ] 

Esteban Gutierrez commented on HBASE-17072:
---

As I think about this problem as described by [~sato_eiichi], I see some value 
in optionally disabling the prefetching of headers for some workloads (lots of 
regions, very large HFiles, SSDs); it could be done via an HCD attribute like 
PREFETCH_BLOCKS_ON_OPEN. Regarding the CPU usage, though, I think the counters 
and our per-region metrics are very expensive in general.






[jira] [Updated] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-17071:
---
Attachment: HBASE-17071.patch

> Do not initialize MemstoreChunkPool when use mslab option is turned off
> ---
>
> Key: HBASE-17071
> URL: https://issues.apache.org/jira/browse/HBASE-17071
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-17071.patch
>
>
> This is a 2.0-only issue, induced by HBASE-16407. 
> We now initialize the MSLAB chunk pool at RS start itself (to pass it as a 
> HeapMemoryTuneObserver).
> When MSLAB is turned off (i.e. hbase.hregion.memstore.mslab.enabled is 
> configured false) we should not be initializing the MSLAB chunk pool at all. 
> By default the initial chunk count to be created will be 0 anyway, but it is 
> still better to avoid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656355#comment-15656355
 ] 

Hadoop QA commented on HBASE-17047:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
10s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 
26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} scalac {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 54s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} |
| {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s 
{color} | {color:green} hbase-spark in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
6s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 6s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838514/HBASE-17047_v2.patch |
| JIRA Issue | HBASE-17047 |
| Optional Tests |  asflicense  scalac  scaladoc  unit  compile  |
| uname | Linux 3fd26d956c44 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f9c6b66 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4431/testReport/ |
| modules | C: hbase-spark U: hbase-spark |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4431/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch, HBASE-17047_v2.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656344#comment-15656344
 ] 

Duo Zhang commented on HBASE-15484:
---

I think only allowPartial and maxResultSize are needed. And for implementing 
small scan, we need to add a 'limit' option to scan.
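The semantics under discussion can be modeled as a small predicate. Below is a JDK-only sketch (method and parameter names are illustrative, not the HBase client API) of one reading of the proposal: with batch defaulting to Integer.MAX_VALUE, a result is partial only when it holds fewer cells than both the batch and the full row, and only callers who opted in via setAllowPartialResults(true) ever observe isPartial == true.

```java
// Illustrative model of the proposed isPartial semantic; not HBase client code.
public class PartialSemanticSketch {

    /**
     * @param cellsInResult number of cells actually returned for the row
     * @param cellsInRow    total number of cells the row contains
     * @param batch         configured batch, Integer.MAX_VALUE when unset
     * @param allowPartial  whether setAllowPartialResults(true) was called
     */
    static boolean isPartial(int cellsInResult, int cellsInRow, int batch,
            boolean allowPartial) {
        if (!allowPartial) {
            return false; // users who did not opt in never see partial results
        }
        return cellsInResult < Math.min(batch, cellsInRow);
    }

    public static void main(String[] args) {
        // Cut short by a size limit mid-row: partial.
        System.out.println(isPartial(3, 10, Integer.MAX_VALUE, true));  // true
        // Exactly the batch was returned: not partial under the proposal.
        System.out.println(isPartial(5, 10, 5, true));                  // false
        // Without opting in, isPartial is always false.
        System.out.println(isPartial(3, 10, Integer.MAX_VALUE, false)); // false
    }
}
```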

> Correct the semantic of batch and partial
> -
>
> Key: HBASE-15484
> URL: https://issues.apache.org/jira/browse/HBASE-15484
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0
>
> Attachments: HBASE-15484-v1.patch, HBASE-15484-v2.patch, 
> HBASE-15484-v3.patch, HBASE-15484-v4.patch
>
>
> Follow-up to HBASE-15325, as discussed, the meaning of setBatch and 
> setAllowPartialResults should not be the same. We should not regard setBatch 
> as setAllowPartialResults.
> And isPartial should be defined accurately.
> (Considering getBatch==MaxInt if we don't setBatch.) If 
> result.rawcells.length is less than both the batch and the number of cells 
> in the row, isPartial==true; otherwise isPartial == false. So if the user 
> doesn't setAllowPartialResults(true), isPartial should always be false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17056) Remove checked in PB generated files

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656330#comment-15656330
 ] 

stack commented on HBASE-17056:
---

If we don't check in the shaded and generated files, then IDEs will report 
classes as not findable, since they are not in our src tree. To get them, you'll 
need to run a mvn build first. Is that ok to require of our IDE users?

On timing, for the smaller modules where there are five or six protos -- e.g. 
hbase-endpoint -- the build time for a 'clean install -DskipTests' goes 
from 5 seconds to 7 seconds. Acceptable, I'd say.

On checkStaleness, it is set already, but I'd think that rather than generating 
the protos into our src tree, we'd generate them under the target dir at 
pre-compile and then tell the compiler to pick up the generated sources along 
w/ src/main/java when it goes to compile. That means a mvn clean will remove 
the generated protos, and no protos dirtying our src? (But IDEs will report 
missing files when you look at endpoints, etc.)

> Remove checked in PB generated files 
> -
>
> Key: HBASE-17056
> URL: https://issues.apache.org/jira/browse/HBASE-17056
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
> Fix For: 2.0.0
>
>
> Now that we have the new PB maven plugin, there is no need to have the PB 
> files checked in to the repo. The reason we did that was to ease up developer 
> env setup. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656306#comment-15656306
 ] 

Phil Yang edited comment on HBASE-15484 at 11/11/16 6:44 AM:
-

These days [~Apache9] is doing some work on async scan. It may be time to 
reconsider this issue? In the current implementation, we treat setBatch and 
setAllowPartialResults(true) as having the same meaning. My original idea was 
to distinguish them, but I agree with [~enis] that we can remove batch/cache. 
Like caching, batching may also be an old-style feature? We have 
allowPartialResults, so we can use that to limit size/time for a large row.

What do you think? Thanks.


was (Author: yangzhe1991):
These days [~Apache9] is doing some work on async scan. It may be time to 
reconsider this issue? In the current implementation, we treat setBatch and 
setAllowPartialResults(true) as having the same meaning. Like caching, batching 
may also be an old-style feature? We have allowPartialResults, so we can use 
that to limit size/time for a large row. Should we distinguish the two methods 
or remove setBatch in 2.0?

What do you think? Thanks.

> Correct the semantic of batch and partial
> -
>
> Key: HBASE-15484
> URL: https://issues.apache.org/jira/browse/HBASE-15484
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0
>
> Attachments: HBASE-15484-v1.patch, HBASE-15484-v2.patch, 
> HBASE-15484-v3.patch, HBASE-15484-v4.patch
>
>
> Follow-up to HBASE-15325, as discussed, the meaning of setBatch and 
> setAllowPartialResults should not be the same. We should not regard setBatch 
> as setAllowPartialResults.
> And isPartial should be defined accurately.
> (Considering getBatch==MaxInt if we don't setBatch.) If 
> result.rawcells.length is less than both the batch and the number of cells 
> in the row, isPartial==true; otherwise isPartial == false. So if the user 
> doesn't setAllowPartialResults(true), isPartial should always be false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656321#comment-15656321
 ] 

ramkrishna.s.vasudevan commented on HBASE-17072:


Maybe this is more helpful when the blocks are not cached than when they are 
cached?

> CPU usage starts to climb up to 90-100% when using G1GC
> ---
>
> Key: HBASE-17072
> URL: https://issues.apache.org/jira/browse/HBASE-17072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Eiichi Sato
> Attachments: disable-block-header-cache.patch, mat-threadlocals.png, 
> mat-threads.png, metrics.png, slave1.svg, slave2.svg, slave3.svg, slave4.svg
>
>
> h5. Problem
> CPU usage of a region server in our CDH 5.4.5 cluster, at some point, starts 
> to gradually get higher up to nearly 90-100% when using G1GC.  We've also run 
> into this problem on CDH 5.7.3 and CDH 5.8.2.
> In our production cluster, it normally takes a few weeks for this to happen 
> after restarting a RS.  We reproduced this on our test cluster and attached 
> the results.  Please note that, to make it easy to reproduce, we did some 
> "anti-tuning" on a table when running tests.
> In metrics.png, soon after we started running some workloads against a test 
> cluster (CDH 5.8.2) at about 7 p.m., the CPU usage of the two RSs started to 
> rise.  Flame Graphs (slave1.svg to slave4.svg) are generated from jstack 
> dumps of each RS process around 10:30 a.m. the next day.
> After investigating heapdumps from another occurrence on a test cluster 
> running CDH 5.7.3, we found that the ThreadLocalMap contain a lot of 
> contiguous entries of {{HFileBlock$PrefetchedHeader}} probably due to primary 
> clustering.  This caused more loops in 
> {{ThreadLocalMap#expungeStaleEntries()}}, consuming a certain amount of CPU 
> time.  What is worse is that the method is called from RPC metrics code, 
> which means even a small amount of per-RPC time soon adds up to a huge amount 
> of CPU time.
> This is very similar to the issue in HBASE-16616, but we have many 
> {{HFileBlock$PrefetchedHeader}} not only {{Counter$IndexHolder}} instances.  
> Here are some OQL counts from Eclipse Memory Analyzer (MAT).  This shows a 
> number of ThreadLocal instances in the ThreadLocalMap of a single handler 
> thread.
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = 
> "org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"
> #=> 10980 instances
> {code}
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"
> #=> 2052 instances
> {code}
> Although, as described in HBASE-16616, this somewhat seems to be an issue on 
> the G1GC side regarding weakly-reachable objects, we should keep ThreadLocal 
> usage minimal and avoid creating an indefinite number (in this case, a number 
> of HFiles) of ThreadLocal instances.
> HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve 
> the issue (I have only seen the patch, never tested it), but the 
> {{HFileBlock$PrefetchedHeader}} instances are still there in the 
> ThreadLocalMap, which may cause issues in the future again.
> h5. Our Solution
> We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and 
> fortunately we didn't notice any performance degradation for our production 
> workloads.
> Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are 
> handled randomly in any of the handlers, small Get or small Scan RPCs do not 
> benefit from the caching (See HBASE-10676 and HBASE-11402 for the details).  
> Before really removing the caching, we probably need to see how many reads 
> it saves for large Scan or Get RPCs, and especially for compactions. It 
> would be better still if we could remove the ThreadLocals without breaking 
> the current caching behavior.
> FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656313#comment-15656313
 ] 

Hadoop QA commented on HBASE-15484:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s {color} 
| {color:red} HBASE-15484 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12797329/HBASE-15484-v4.patch |
| JIRA Issue | HBASE-15484 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4432/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Correct the semantic of batch and partial
> -
>
> Key: HBASE-15484
> URL: https://issues.apache.org/jira/browse/HBASE-15484
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0
>
> Attachments: HBASE-15484-v1.patch, HBASE-15484-v2.patch, 
> HBASE-15484-v3.patch, HBASE-15484-v4.patch
>
>
> Follow-up to HBASE-15325, as discussed, the meaning of setBatch and 
> setAllowPartialResults should not be the same. We should not regard setBatch 
> as setAllowPartialResults.
> And isPartial should be defined accurately.
> (Considering getBatch==MaxInt if we don't setBatch.) If 
> result.rawcells.length is less than both the batch and the number of cells 
> in the row, isPartial==true; otherwise isPartial == false. So if the user 
> doesn't setAllowPartialResults(true), isPartial should always be false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656311#comment-15656311
 ] 

Yu Li commented on HBASE-17071:
---

+1 for the idea, makes sense.

> Do not initialize MemstoreChunkPool when use mslab option is turned off
> ---
>
> Key: HBASE-17071
> URL: https://issues.apache.org/jira/browse/HBASE-17071
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
>
> This is a 2.0-only issue, induced by HBASE-16407. 
> We now initialize the MSLAB chunk pool at RS start itself (to pass it as a 
> HeapMemoryTuneObserver).
> When MSLAB is turned off (i.e. hbase.hregion.memstore.mslab.enabled is 
> configured false) we should not be initializing the MSLAB chunk pool at all. 
> By default the initial chunk count to be created will be 0 anyway, but it is 
> still better to avoid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15484) Correct the semantic of batch and partial

2016-11-10 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656306#comment-15656306
 ] 

Phil Yang commented on HBASE-15484:
---

These days [~Apache9] is doing some work on async scan. It may be time to 
reconsider this issue? In the current implementation, we treat setBatch and 
setAllowPartialResults(true) as having the same meaning. Like caching, batching 
may also be an old-style feature? We have allowPartialResults, so we can use 
that to limit size/time for a large row. Should we distinguish the two methods 
or remove setBatch in 2.0?

What do you think? Thanks.

> Correct the semantic of batch and partial
> -
>
> Key: HBASE-15484
> URL: https://issues.apache.org/jira/browse/HBASE-15484
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
> Fix For: 2.0.0
>
> Attachments: HBASE-15484-v1.patch, HBASE-15484-v2.patch, 
> HBASE-15484-v3.patch, HBASE-15484-v4.patch
>
>
> Follow-up to HBASE-15325, as discussed, the meaning of setBatch and 
> setAllowPartialResults should not be the same. We should not regard setBatch 
> as setAllowPartialResults.
> And isPartial should be defined accurately.
> (Considering getBatch==MaxInt if we don't setBatch.) If 
> result.rawcells.length is less than both the batch and the number of cells 
> in the row, isPartial==true; otherwise isPartial == false. So if the user 
> doesn't setAllowPartialResults(true), isPartial should always be false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656297#comment-15656297
 ] 

Anoop Sam John commented on HBASE-17072:


So the prefetching of the new block's header is not useful at all as per the 
patch.  Ya, in the case of compaction it would have been beneficial at least. I 
agree on the point about RPCs being handled by random handlers.  Large scans 
might be impacted, since a single RPC next() call can itself touch more than 
one block.  Nice digging in..


> CPU usage starts to climb up to 90-100% when using G1GC
> ---
>
> Key: HBASE-17072
> URL: https://issues.apache.org/jira/browse/HBASE-17072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Eiichi Sato
> Attachments: disable-block-header-cache.patch, mat-threadlocals.png, 
> mat-threads.png, metrics.png, slave1.svg, slave2.svg, slave3.svg, slave4.svg
>
>
> h5. Problem
> CPU usage of a region server in our CDH 5.4.5 cluster, at some point, starts 
> to gradually get higher up to nearly 90-100% when using G1GC.  We've also run 
> into this problem on CDH 5.7.3 and CDH 5.8.2.
> In our production cluster, it normally takes a few weeks for this to happen 
> after restarting a RS.  We reproduced this on our test cluster and attached 
> the results.  Please note that, to make it easy to reproduce, we did some 
> "anti-tuning" on a table when running tests.
> In metrics.png, soon after we started running some workloads against a test 
> cluster (CDH 5.8.2) at about 7 p.m., the CPU usage of the two RSs started to 
> rise.  Flame Graphs (slave1.svg to slave4.svg) are generated from jstack 
> dumps of each RS process around 10:30 a.m. the next day.
> After investigating heapdumps from another occurrence on a test cluster 
> running CDH 5.7.3, we found that the ThreadLocalMap contains a lot of 
> contiguous entries of {{HFileBlock$PrefetchedHeader}}, probably due to primary 
> clustering.  This caused more loops in 
> {{ThreadLocalMap#expungeStaleEntries()}}, consuming a certain amount of CPU 
> time.  What is worse is that the method is called from RPC metrics code, 
> which means even a small amount of per-RPC time soon adds up to a huge amount 
> of CPU time.
> This is very similar to the issue in HBASE-16616, but we have many 
> {{HFileBlock$PrefetchedHeader}} not only {{Counter$IndexHolder}} instances.  
> Here are some OQL counts from Eclipse Memory Analyzer (MAT).  This shows a 
> number of ThreadLocal instances in the ThreadLocalMap of a single handler 
> thread.
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = 
> "org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"
> #=> 10980 instances
> {code}
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"
> #=> 2052 instances
> {code}
> Although, as described in HBASE-16616, this somewhat seems to be an issue on 
> the G1GC side regarding weakly-reachable objects, we should keep ThreadLocal 
> usage minimal and avoid creating an indefinite number (in this case, a number 
> of HFiles) of ThreadLocal instances.
> HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve 
> the issue (I have only seen the patch, never tested it), but the 
> {{HFileBlock$PrefetchedHeader}} instances are still there in the 
> ThreadLocalMap, which may cause issues in the future again.
> h5. Our Solution
> We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and 
> fortunately we didn't notice any performance degradation for our production 
> workloads.
> Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are 
> handled randomly in any of the handlers, small Get or small Scan RPCs do not 
> benefit from the caching (See HBASE-10676 and HBASE-11402 for the details).  
> Before really removing the caching, we probably need to see how many reads 
> it saves for large Scan or Get RPCs, and especially for compactions. It 
> would be better still if we could remove the ThreadLocals without breaking 
> the current caching behavior.
> FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread Weiqing Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656291#comment-15656291
 ] 

Weiqing Yang commented on HBASE-17047:
--

Thanks for the review, [~stack] 

> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch, HBASE-17047_v2.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread Weiqing Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656290#comment-15656290
 ] 

Weiqing Yang commented on HBASE-17047:
--

The second patch is to fix the scaladoc warning "warning: Could not find any 
member to link for 'HBaseConnectionCache'".

> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch, HBASE-17047_v2.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656282#comment-15656282
 ] 

Hudson commented on HBASE-17020:


SUCCESS: Integrated in Jenkins build HBase-1.2-JDK7 #69 (See 
[https://builds.apache.org/job/HBase-1.2-JDK7/69/])
HBASE-17020 keylen in midkey() dont computed correctly (liyu: rev 
bf9614f72e3104ec0110ed018fb0b6d0174c6366)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java


> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length: 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length;
> the code that records it is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
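The arithmetic behind the bug can be shown with plain offsets. Below is a JDK-only sketch (the overhead value and the subtraction-based fix are assumptions for illustration, mirroring the description above; this is not the actual HFileBlockIndex code): each secondary index entry records the cumulative non-root entry size, so the delta between two consecutive offsets is overhead plus key length, and the overhead must be subtracted to recover the key length.

```java
// Illustrative model of the midkey() keyLen computation; not HBase source.
public class MidkeyKeyLenSketch {
    // Per the description, each entry adds SECONDARY_INDEX_ENTRY_OVERHEAD +
    // firstKey.length; 12 is an assumed overhead value for illustration.
    static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 12;

    public static void main(String[] args) {
        int[] keyLens = {7, 9, 5};

        // Build the cumulative offsets the way add() in the description does.
        int[] offsets = new int[keyLens.length + 1];
        for (int i = 0; i < keyLens.length; i++) {
            offsets[i + 1] = offsets[i] + SECONDARY_INDEX_ENTRY_OVERHEAD + keyLens[i];
        }

        int midKeyEntry = 2; // last entry, where the overrun was observed
        int keyRelOffset = offsets[midKeyEntry];
        // Delta between consecutive offsets = overhead + key length.
        int buggyKeyLen = offsets[midKeyEntry + 1] - keyRelOffset;
        int fixedKeyLen = buggyKeyLen - SECONDARY_INDEX_ENTRY_OVERHEAD;

        System.out.println(buggyKeyLen); // 17: reads past the key, hence the AIOOBE
        System.out.println(fixedKeyLen); // 5: the actual key length
    }
}
```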



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread Weiqing Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiqing Yang updated HBASE-17047:
-
Attachment: HBASE-17047_v2.patch

> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch, HBASE-17047_v2.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16838) Implement basic scan

2016-11-10 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656277#comment-15656277
 ] 

Duo Zhang commented on HBASE-16838:
---

{quote}
Is a Scan always against a Region (is the Region redundant?).
{quote}
A scan can cross regions; the AsyncScanRegionRpcRetryingCaller is only used to 
scan one region, and during a scan we may create many 
AsyncScanRegionRpcRetryingCallers. Maybe AsyncScanOneRegionRpcRetryingCaller is 
a better name?

{quote}
Looking at the Response, we need Scan in there? The Scan in Response is 
different from originalScan?
{quote}
They are the same. But I think it is a little confusing that we use the 
originalScan in AsyncClientScanner but modify it in another place. Anyway, 
we do change it, so people cannot use it as the 'original'... Let me add some 
javadoc here...

{quote}
I think you should stick the above comment on the scan timeout so it is clear 
what the scan timeout means. It helps.
{quote}
I've added some comments in AsyncConnectionConfiguration. Let me add the above 
comments too.

{quote}
Is there example code on how I'd do an async Scan? I create a ScanConsumer and 
pass it in then it will get called with Results as the Scan progresses? The 
AsyncTable#scan returns immediately? Perhaps stick it in javadoc for the scan 
method? Is SimpleScanObserver a good example or just a stop gap with its queue?
{quote}
I plan to introduce an example when implementing getScanner, where I plan to add 
flow control support. This method is intended for writing high-performance 
event-driven programs, so it is not very user friendly... [~carp84] also pointed 
out that even for other methods such as get and put, completing the 
CompletableFuture inside the rpc framework's thread is not safe, as users may 
also do time-consuming work when consuming the CompletableFuture. Maybe we need 
a new 'SafeAsyncTable' interface? Or change the name of this interface to 
'RawAsyncTable' or 'UnsafeAsyncTable'? As the async client is still marked as 
Unstable, I think we can do this in a follow-on issue.
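
The hand-off pattern implied above (keep slow user work out of the rpc 
framework thread) can be sketched in plain Java. Everything here is 
illustrative: SketchScanConsumer is a stand-in for the patch's ScanConsumer, 
not the real HBase API.

```java
import java.util.List;
import java.util.concurrent.Executor;

// Stand-in for the ScanConsumer idea discussed above; illustrative only.
interface SketchScanConsumer<T> {
  void onNext(List<T> results); // called directly by the rpc framework thread
  void onComplete();
}

// Wraps a user consumer so its (possibly slow) callbacks run on a separate
// executor instead of blocking the rpc framework thread.
final class OffloadingScanConsumer<T> implements SketchScanConsumer<T> {
  private final SketchScanConsumer<T> delegate;
  private final Executor pool;

  OffloadingScanConsumer(SketchScanConsumer<T> delegate, Executor pool) {
    this.delegate = delegate;
    this.pool = pool;
  }

  @Override
  public void onNext(List<T> results) {
    // hand off; the rpc framework thread returns immediately
    pool.execute(() -> delegate.onNext(results));
  }

  @Override
  public void onComplete() {
    pool.execute(delegate::onComplete);
  }
}
```

A 'safe' table wrapper could apply this decoration to every consumer it is 
given, which is roughly what the SafeAsyncTable/RawAsyncTable split would buy.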

{quote}
Dont kill me but should ScanConsumer be ScanResultConsumer (can do in followup 
if makes sense) or just ScanResult?
{quote}
I think it should be ScanResultConsumer, as I've already introduced a 
'ScanResultCache' (what's wrong with my brain...).

{quote}
CompleteResultScanResultCache should be CompleteScanResultCache to match 
AllowPartialScanResultCache?
{quote}

Fine. Will change.

> Implement basic scan
> 
>
> Key: HBASE-16838
> URL: https://issues.apache.org/jira/browse/HBASE-16838
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, 
> HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call: all returned results 
> will be passed to a ScanConsumer. The methods of the consumer will be called 
> directly in the rpc framework threads, so it is not allowed to do 
> time-consuming work in them. In general, only experts or the 
> implementations of other methods in AsyncTable should call this method 
> directly; that's why I call it 'basic scan'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Eiichi Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eiichi Sato updated HBASE-17072:

Attachment: mat-threadlocals.png
metrics.png
disable-block-header-cache.patch
mat-threads.png
slave1.svg
slave2.svg
slave3.svg
slave4.svg

> CPU usage starts to climb up to 90-100% when using G1GC
> ---
>
> Key: HBASE-17072
> URL: https://issues.apache.org/jira/browse/HBASE-17072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Eiichi Sato
> Attachments: disable-block-header-cache.patch, mat-threadlocals.png, 
> mat-threads.png, metrics.png, slave1.svg, slave2.svg, slave3.svg, slave4.svg
>
>
> h5. Problem
> CPU usage of a region server in our CDH 5.4.5 cluster, at some point, starts 
> to climb gradually to nearly 90-100% when using G1GC.  We've also run 
> into this problem on CDH 5.7.3 and CDH 5.8.2.
> In our production cluster, it normally takes a few weeks for this to happen 
> after restarting a RS.  We reproduced this on our test cluster and attached 
> the results.  Please note that, to make it easy to reproduce, we did some 
> "anti-tuning" on a table when running tests.
> In metrics.png, soon after we started running some workloads against a test 
> cluster (CDH 5.8.2) at about 7 p.m., CPU usage of the two RSs started to rise. 
>  Flame Graphs (slave1.svg to slave4.svg) were generated from jstack dumps of 
> each RS process around 10:30 a.m. the next day.
> After investigating heapdumps from another occurrence on a test cluster 
> running CDH 5.7.3, we found that the ThreadLocalMap contains a lot of 
> contiguous entries of {{HFileBlock$PrefetchedHeader}}, probably due to primary 
> clustering.  This caused more loops in 
> {{ThreadLocalMap#expungeStaleEntries()}}, consuming a certain amount of CPU 
> time.  What is worse is that the method is called from RPC metrics code, 
> which means even a small amount of per-RPC time soon adds up to a huge amount 
> of CPU time.
> This is very similar to the issue in HBASE-16616, but we have many 
> {{HFileBlock$PrefetchedHeader}} instances, not only {{Counter$IndexHolder}} 
> instances.  
> Here are some OQL counts from Eclipse Memory Analyzer (MAT).  This shows a 
> number of ThreadLocal instances in the ThreadLocalMap of a single handler 
> thread.
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = 
> "org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"
> #=> 10980 instances
> {code}
> {code}
> SELECT *
> FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
> FROM OBJECTS 0x4ee380430) obj
> WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"
> #=> 2052 instances
> {code}
> Although, as described in HBASE-16616, this somewhat seems to be an issue on 
> the G1GC side regarding weakly-reachable objects, we should keep ThreadLocal 
> usage minimal and avoid creating an indefinite number (in this case, a number 
> of HFiles) of ThreadLocal instances.
> HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve 
> the issue (I just saw the patch, never tested it at all), but the 
> {{HFileBlock$PrefetchedHeader}} entries are still there in the ThreadLocalMap, 
> which may cause issues again in the future.
> h5. Our Solution
> We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and 
> fortunately we didn't notice any performance degradation for our production 
> workloads.
> Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are 
> handled randomly in any of the handlers, small Get or small Scan RPCs do not 
> benefit from the caching (See HBASE-10676 and HBASE-11402 for the details).  
> Probably, we need to see how well reads are saved by the caching for large 
> Scan or Get RPCs and especially for compactions if we really remove the 
> caching. It's probably better if we can remove ThreadLocals without breaking 
> the current caching behavior.
> FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17072) CPU usage starts to climb up to 90-100% when using G1GC

2016-11-10 Thread Eiichi Sato (JIRA)
Eiichi Sato created HBASE-17072:
---

 Summary: CPU usage starts to climb up to 90-100% when using G1GC
 Key: HBASE-17072
 URL: https://issues.apache.org/jira/browse/HBASE-17072
 Project: HBase
  Issue Type: Bug
  Components: Performance, regionserver
Affects Versions: 1.2.0, 1.0.0
Reporter: Eiichi Sato


h5. Problem

CPU usage of a region server in our CDH 5.4.5 cluster, at some point, starts to 
climb gradually to nearly 90-100% when using G1GC.  We've also run into 
this problem on CDH 5.7.3 and CDH 5.8.2.

In our production cluster, it normally takes a few weeks for this to happen 
after restarting a RS.  We reproduced this on our test cluster and attached the 
results.  Please note that, to make it easy to reproduce, we did some 
"anti-tuning" on a table when running tests.

In metrics.png, soon after we started running some workloads against a test 
cluster (CDH 5.8.2) at about 7 p.m., CPU usage of the two RSs started to rise.  
Flame Graphs (slave1.svg to slave4.svg) were generated from jstack dumps of each 
RS process around 10:30 a.m. the next day.

After investigating heapdumps from another occurrence on a test cluster running 
CDH 5.7.3, we found that the ThreadLocalMap contains a lot of contiguous entries 
of {{HFileBlock$PrefetchedHeader}}, probably due to primary clustering.  This 
caused more loops in {{ThreadLocalMap#expungeStaleEntries()}}, consuming a 
certain amount of CPU time.  What is worse is that the method is called from 
RPC metrics code, which means even a small amount of per-RPC time soon adds up 
to a huge amount of CPU time.

This is very similar to the issue in HBASE-16616, but we have many 
{{HFileBlock$PrefetchedHeader}} instances, not only {{Counter$IndexHolder}} 
instances.  
Here are some OQL counts from Eclipse Memory Analyzer (MAT).  This shows a 
number of ThreadLocal instances in the ThreadLocalMap of a single handler 
thread.

{code}
SELECT *
FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
  FROM OBJECTS 0x4ee380430) obj
WHERE obj.@clazz.@name = 
"org.apache.hadoop.hbase.io.hfile.HFileBlock$PrefetchedHeader"

#=> 10980 instances
{code}

{code}
SELECT *
FROM OBJECTS (SELECT AS RETAINED SET OBJECTS value
  FROM OBJECTS 0x4ee380430) obj
WHERE obj.@clazz.@name = "org.apache.hadoop.hbase.util.Counter$IndexHolder"

#=> 2052 instances
{code}

Although, as described in HBASE-16616, this somewhat seems to be an issue on the 
G1GC side regarding weakly-reachable objects, we should keep ThreadLocal usage 
minimal and avoid creating an indefinite number (in this case, a number of 
HFiles) of ThreadLocal instances.

HBASE-16146 removes ThreadLocals from the RPC metrics code.  That may solve the 
issue (I just saw the patch, never tested it at all), but the 
{{HFileBlock$PrefetchedHeader}} entries are still there in the ThreadLocalMap, 
which may cause issues again in the future.


h5. Our Solution

We simply removed the whole {{HFileBlock$PrefetchedHeader}} caching and 
fortunately we didn't notice any performance degradation for our production 
workloads.

Because the PrefetchedHeader caching uses ThreadLocal and because RPCs are 
handled randomly in any of the handlers, small Get or small Scan RPCs do not 
benefit from the caching (See HBASE-10676 and HBASE-11402 for the details).  
Probably, we need to see how well reads are saved by the caching for large Scan 
or Get RPCs and especially for compactions if we really remove the caching. 
It's probably better if we can remove ThreadLocals without breaking the current 
caching behavior.

FWIW, I'm attaching the patch we applied. It's for CDH 5.4.5.
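
One ThreadLocal-free alternative to the caching described above is a single 
shared slot per reader instead of one ThreadLocalMap entry per HFile per 
thread. A minimal sketch with assumed names, in plain Java; this is not the 
attached patch:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: cache the most recently prefetched header in one AtomicReference
// shared by all handler threads, so nothing accumulates in ThreadLocalMaps.
final class PrefetchedHeaderCache {
  static final class Header {
    final long offset;    // block offset the header was read for
    final byte[] bytes;   // the cached header bytes
    Header(long offset, byte[] bytes) {
      this.offset = offset;
      this.bytes = bytes;
    }
  }

  private final AtomicReference<Header> last = new AtomicReference<>();

  void store(long offset, byte[] bytes) {
    last.set(new Header(offset, bytes));
  }

  byte[] lookup(long offset) {
    Header h = last.get(); // one volatile read; no ThreadLocal lookup at all
    return (h != null && h.offset == offset) ? h.bytes : null;
  }
}
```

The trade-off is a smaller hit rate under concurrency (threads overwrite each 
other's entry), which is why measuring large Scan/compaction reads, as 
suggested above, would matter before adopting it.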



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656268#comment-15656268
 ] 

Hadoop QA commented on HBASE-17062:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
31s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
47s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
50s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
29m 18s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 46s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
14s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 119m 0s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hbase.security.access.TestCellACLWithMultipleVersions |
|   | org.apache.hadoop.hbase.security.access.TestAccessController |
|   | org.apache.hadoop.hbase.security.access.TestTablePermissions |
|   | org.apache.hadoop.hbase.security.access.TestScanEarlyTermination |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838498/HBASE-17062.003.patch 
|
| JIRA Issue | HBASE-17062 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 84868f85ec0c 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 62e3b1e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4430/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/4430/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4430/testReport/ |
| modules | C: 

[jira] [Commented] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656255#comment-15656255
 ] 

stack commented on HBASE-16962:
---

Don't think so. Open one I'd say.

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656243#comment-15656243
 ] 

stack commented on HBASE-17047:
---

+1 for commit. Thanks. 

> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-16962:
--
Hadoop Flags: Reviewed  (was: Incompatible change,Reviewed)

When is it safe to remove the old APIs?

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656240#comment-15656240
 ] 

stack commented on HBASE-16962:
---

True.

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14882) Provide a Put API that adds the provided family, qualifier, value without copying

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656231#comment-15656231
 ] 

Anoop Sam John commented on HBASE-14882:


Sorry, missed this somehow.  Will look at the latest patch today. Thanks.

> Provide a Put API that adds the provided family, qualifier, value without 
> copying
> -
>
> Key: HBASE-14882
> URL: https://issues.apache.org/jira/browse/HBASE-14882
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: Xiang Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14882.master.000.patch, 
> HBASE-14882.master.001.patch, HBASE-14882.master.002.patch, 
> HBASE-14882.master.003.patch
>
>
> In the Put API, we have addImmutable()
> {code}
>  /**
>* See {@link #addColumn(byte[], byte[], byte[])}. This version expects
>* that the underlying arrays won't change. It's intended
>* for usage internal HBase to and for advanced client applications.
>*/
>   public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being 
> copied locally to create kv.
> Hopefully we should provide an API that truly uses immutable family, 
> qualifier and value.
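
A minimal sketch of the "no copy" idea above: keep references to the caller's 
arrays instead of flattening them into one backing kv buffer. The class name is 
made up for illustration; the real patch works at the Cell/KeyValue level.

```java
// Hypothetical holder: family/qualifier/value are stored by reference.
// The caller promises the arrays won't change after being handed over.
final class ImmutableArraysCell {
  final byte[] family;
  final byte[] qualifier;
  final byte[] value;

  ImmutableArraysCell(byte[] family, byte[] qualifier, byte[] value) {
    // no System.arraycopy here, unlike the current addImmutable() path
    this.family = family;
    this.qualifier = qualifier;
    this.value = value;
  }
}
```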



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656226#comment-15656226
 ] 

Anoop Sam John commented on HBASE-16962:


Changed the release notes a little. It is not that we won't call the old CP API 
at all: if the user has implemented only the old hook, not the new one, we will 
still call the old one. The BaseRegionObserver will make the old API call from 
its default impl of the new one.
So this is not an incompatible change. [~saint@gmail.com]  We can remove 
that marker too, I believe.
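
The compatibility pattern described above can be sketched with default methods: 
the new hook's default implementation falls back to the old hook, so a 
coprocessor that only overrides the old signature is still reached. Types and 
names here are simplified stand-ins, not the real RegionObserver API.

```java
// Illustrative stand-in for the RegionObserver hook pair.
interface SketchObserver {
  // old hook (pre-readPoint shape)
  default String preCompactScannerOpen(String store) {
    return null;
  }

  // new hook carrying the readPoint; by default it delegates to the old hook,
  // which is why deprecating the old one is not an incompatible change
  default String preCompactScannerOpen(String store, long readPoint) {
    return preCompactScannerOpen(store);
  }
}

// A "legacy" coprocessor that only knows the old signature.
class LegacyObserver implements SketchObserver {
  @Override
  public String preCompactScannerOpen(String store) {
    return "legacy-scanner-for-" + store;
  }
}
```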

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16838) Implement basic scan

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656227#comment-15656227
 ] 

stack commented on HBASE-16838:
---

Looking. It's much nicer now, I think. We can work on the one-rpc small scan over 
in the linked issue.

Pity names get to be like this: AsyncScanRegionRpcRetryingCaller... which runs 
AsyncScanRegionRpcRetryingCallables.  The RpcRetryingCaller stuff and 
RpcRetryingCallable were all there before you, so not your fault... just saying. 
Is a Scan always against a Region (is the Region redundant?).

Looking at the Response, we need Scan in there? The Scan in Response is 
different from originalScan? nvm... I see this Scan carries state of the 
general Scan as we progress.

locateToPreviousRegion is for when doing a reverse Scan?

Thanks for the comment on scan timeout. So scan timeout is different from 
operation timeout? It is, at least according to your comment up in rb: "As we 
now have heartbeat support for scan, ideally a scan will never time out unless 
the RS crashes. The RS will always return something before the rpc timeout or 
scan timeout to tell the client that it is still alive.
The scan timeout is used as the operation timeout for every operation in a scan, 
such as openScanner or next."

I think you should stick the above comment on the scan timeout so it is clear 
what the scan timeout means. It helps.

Update doc on commit:

   * The basic scan API uses the observer pattern. All results that 
match the given scan object will
    * be passed to the given {@code scanObserver} by calling {@link 
ScanConsumer#onNext(Result[])}.

... you changed observer to be a consumer.

Is there example code on how I'd do an async Scan? I create a ScanConsumer and 
pass it in then it will get called with Results as the Scan progresses? The 
AsyncTable#scan returns immediately? Perhaps stick it in javadoc for the scan 
method? Is SimpleScanObserver a good example or just a stop gap with its queue?

Don't kill me, but should ScanConsumer be ScanResultConsumer (can do in followup 
if it makes sense) or just ScanResult?

CompleteResultScanResultCache should be CompleteScanResultCache to match 
AllowPartialScanResultCache?

I'd be good committing this as is and addressing what remains in follow-on. +1

> Implement basic scan
> 
>
> Key: HBASE-16838
> URL: https://issues.apache.org/jira/browse/HBASE-16838
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, 
> HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call: all returned results 
> will be passed to a ScanConsumer. The methods of the consumer will be called 
> directly in the rpc framework threads, so it is not allowed to do 
> time-consuming work in them. In general, only experts or the 
> implementations of other methods in AsyncTable should call this method 
> directly; that's why I call it 'basic scan'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-16962:
---
Release Note: 
The following RegionObserver methods are deprecated

InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s)
    throws IOException;

InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType,
    final long earliestPutTs, final InternalScanner s, CompactionRequest request)

Instead, use the following methods:

InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s,
    final long readPoint) throws IOException;

InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType,
    final long earliestPutTs, final InternalScanner s, final CompactionRequest request,
    final long readPoint) throws IOException

  was:
The following RegionObserver methods are deprecated and would no longer be 
called in hbase 2.0:

InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s)
    throws IOException;

InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType,
    final long earliestPutTs, final InternalScanner s, CompactionRequest request)

Instead, use the following methods:

InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s,
    final long readPoint) throws IOException;

InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType,
    final long earliestPutTs, final InternalScanner s, final CompactionRequest request,
    final long readPoint) throws IOException


> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-16962:
---
  Resolution: Fixed
Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
  Status: Resolved  (was: Patch Available)

Pushed to branch-1 and  master.  Thanks for the patch.
Thanks all for the reviews and discussion.

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656203#comment-15656203
 ] 

Hudson commented on HBASE-17020:


SUCCESS: Integrated in Jenkins build HBase-1.2-JDK8 #63 (See 
[https://builds.apache.org/job/HBase-1.2-JDK8/63/])
HBASE-17020 keylen in midkey() dont computed correctly (liyu: rev 
bf9614f72e3104ec0110ed018fb0b6d0174c6366)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java


> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the local variable keyLen obtained this way is actually the total length of: 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length;
> the code is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
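The arithmetic above can be illustrated with a small self-contained sketch. This is not the actual HBase code; the overhead constant and the offset-mark layout are assumptions modeled on the description (each secondary-index entry occupies SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length bytes, so subtracting two offset marks yields the whole entry size, not the key length):

```java
class MidkeyKeyLenDemo {
    // Assumed per-entry overhead (block offset + on-disk size), per add() above.
    static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 12;

    // Offset marks as accumulated by add(): marks[i] is the total size of entries 0..i-1.
    static int[] offsetMarks(int[] keyLens) {
        int[] marks = new int[keyLens.length + 1];
        for (int i = 0; i < keyLens.length; i++) {
            marks[i + 1] = marks[i] + SECONDARY_INDEX_ENTRY_OVERHEAD + keyLens[i];
        }
        return marks;
    }

    // Buggy: the difference of two marks still includes the per-entry overhead.
    static int buggyKeyLen(int[] marks, int i) {
        return marks[i + 1] - marks[i];
    }

    // Fixed: subtract the overhead to recover firstKey.length.
    static int fixedKeyLen(int[] marks, int i) {
        return marks[i + 1] - marks[i] - SECONDARY_INDEX_ENTRY_OVERHEAD;
    }
}
```

Reading those extra overhead bytes for the last entry of a leaf block runs past the end of the buffer, which matches the ArrayIndexOutOfBoundsException in the stack trace above.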



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16962) Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656189#comment-15656189
 ] 

Anoop Sam John commented on HBASE-16962:


Am not sure. [~saint@gmail.com],  [~mbertozzi] ?

> Add readPoint to preCompactScannerOpen() and preFlushScannerOpen() API
> --
>
> Key: HBASE-16962
> URL: https://issues.apache.org/jira/browse/HBASE-16962
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.006.patch, HBASE-16962.master.001.patch, 
> HBASE-16962.master.002.patch, HBASE-16962.master.003.patch, 
> HBASE-16962.master.004.patch, HBASE-16962.rough.patch
>
>
> Similar to HBASE-15759, I would like to add readPoint to the 
> preCompactScannerOpen() API.
> I have a CP where I create a StoreScanner() as part of the 
> preCompactScannerOpen() API. I need the readpoint which was obtained in 
> Compactor.compact() method to be consistent.





[jira] [Commented] (HBASE-16838) Implement basic scan

2016-11-10 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656185#comment-15656185
 ] 

Duo Zhang commented on HBASE-16838:
---

So what's your opinion on the patch, sir? Thanks.

> Implement basic scan
> 
>
> Key: HBASE-16838
> URL: https://issues.apache.org/jira/browse/HBASE-16838
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, 
> HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call: all returned results 
> will be passed to a ScanConsumer. The methods of the consumer will be called 
> directly on the rpc framework threads, so it is not allowed to do 
> time-consuming work in those methods. In general only experts or the 
> implementations of other methods in AsyncTable should call this method 
> directly; that's why I call it a 'basic scan'.
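As a rough illustration of the consumer-style contract described above, here is a toy sketch. The interface and class names are invented for this example and are not the actual AsyncTable API:

```java
import java.util.Arrays;
import java.util.List;

// Invented stand-in for the consumer contract: callbacks run on the RPC
// framework thread, so implementations must return quickly.
interface ScanConsumer {
    void onNext(List<String> results); // one batch of scan results
    void onComplete();                 // scan finished
}

class BasicScanDemo {
    // Toy driver: in the real client each onNext corresponds to one RPC response.
    static void scan(List<String> rows, ScanConsumer consumer) {
        consumer.onNext(rows);
        consumer.onComplete();
    }

    static int countRows(List<String> rows) {
        final int[] seen = {0};
        scan(rows, new ScanConsumer() {
            public void onNext(List<String> results) { seen[0] += results.size(); }
            public void onComplete() { }
        });
        return seen[0];
    }
}
```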





[jira] [Commented] (HBASE-15788) Use Offheap ByteBuffers from BufferPool to read RPC requests.

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656183#comment-15656183
 ] 

Anoop Sam John commented on HBASE-15788:


As of now I cannot see any such need, so I will leave it as is in the patch. 
If you strongly feel we should not, I can remove it. I do not think it will 
affect things too much.

> Use Offheap ByteBuffers from BufferPool to read RPC requests.
> -
>
> Key: HBASE-15788
> URL: https://issues.apache.org/jira/browse/HBASE-15788
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15788.patch, HBASE-15788_V4.patch, 
> HBASE-15788_V5.patch, HBASE-15788_V6.patch, HBASE-15788_V7.patch, 
> HBASE-15788_V8.patch
>
>
> Right now, when an RPC request reaches RpcServer, we read the request into an 
> on-demand-created byte[]. When it is a write request including many 
> mutations, the request size will be somewhat larger, and we end up creating 
> many temp on-heap byte[]s, causing more GCs.
> We have a ByteBufferPool of fixed-size off-heap BBs. This is used at 
> RpcServer while sending read responses only. We can make use of the same 
> pool while reading requests too. Instead of reading the whole of the request 
> bytes into a single BB, we can read into N BBs (based on the request size). 
> When a BB is not available from the pool, we fall back to the old way of 
> creating an on-heap byte[] on demand.
> Remember these are off-heap BBs. We read many proto objects from these 
> request bytes (header, Mutation protos, etc.). Thanks to PB 3 and our 
> shading work, protobuf supports off-heap BBs now. The payload cells are also 
> in these DBBs now. The codec decoder can work on these and create off-heap 
> BBs. The whole of our write path works with Cells now. At the time of 
> addition to the memstore, these cells are by default copied to MSLAB (an 
> off-heap-based pooled MSLAB issue will follow this one). If the MSLAB copy 
> is not possible, we do a copy to an on-heap byte[].
> One possible downside of this is:
> Before adding to the memstore, we write to the WAL. So the Cells created out 
> of the off-heap BBs (Codec#Decoder) will be used to write to the WAL. The 
> default FSHLog works with an OutputStream obtained from DFSClient, which has 
> only standard byte[]-based write APIs. So just to write to the WAL, we end 
> up with a temp on-heap copy for each Cell. The other WAL implementation 
> (i.e. AsyncWAL) supports writing off-heap Cells directly. We have work in 
> progress to make AsyncWAL the default. Also we can raise an HDFS request to 
> support BB-based write APIs in their client OutputStream. Until then, we 
> will try a temp workaround solution. The patch says more on this.
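The "read into N BBs with a heap fallback" idea can be sketched as follows. The pool and reader classes here are invented for illustration under the assumptions in the description; they are not the HBase ByteBufferPool or RpcServer code:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Toy fixed-size direct-buffer pool; names are invented, not the HBase classes.
class SimpleBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufSize;

    SimpleBufferPool(int bufSize, int count) {
        this.bufSize = bufSize;
        for (int i = 0; i < count; i++) {
            free.add(ByteBuffer.allocateDirect(bufSize));
        }
    }

    ByteBuffer getBuffer() { return free.poll(); } // null when exhausted
    int bufferSize() { return bufSize; }
}

class RequestReader {
    // Spread a reqSize-byte request across N pooled buffers; when the pool runs
    // dry, fall back to one on-heap buffer for the remainder (the old behavior).
    static List<ByteBuffer> allocateFor(SimpleBufferPool pool, int reqSize) {
        List<ByteBuffer> bufs = new ArrayList<>();
        int remaining = reqSize;
        while (remaining > 0) {
            ByteBuffer bb = pool.getBuffer();
            if (bb == null) {
                bufs.add(ByteBuffer.allocate(remaining)); // heap fallback
                return bufs;
            }
            bb.clear();
            bb.limit(Math.min(pool.bufferSize(), remaining));
            bufs.add(bb);
            remaining -= bb.limit();
        }
        return bufs;
    }
}
```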





[jira] [Commented] (HBASE-15788) Use Offheap ByteBuffers from BufferPool to read RPC requests.

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656180#comment-15656180
 ] 

Anoop Sam John commented on HBASE-15788:


Yes, not all BBs may be from the pool, so we will need extensions of SBB and 
MBB that can tell whether a buffer came from the pool or not; we cannot mark 
a single BB itself as pooled. Otherwise we would have to go with the hard 
assumption that if a BB is a DBB, it is from the pool. That does hold true 
today, but I dislike that kind of hard assumption in code. It is only one 
place, and I renamed the method to better convey that it has a side effect, 
and added some comments as well. Will go with this approach. Thanks boss.
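Carrying an explicit "from pool" flag instead of assuming "direct implies pooled" can be sketched like this. TaggedBuffer and SimplePool are invented names for illustration, not the HBase SBB/MBB extensions:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

class TaggedBuffer {
    final ByteBuffer buf;
    final boolean fromPool; // explicit origin; no reliance on buf.isDirect()

    TaggedBuffer(ByteBuffer buf, boolean fromPool) {
        this.buf = buf;
        this.fromPool = fromPool;
    }
}

class SimplePool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    SimplePool(int bufSize, int count) {
        for (int i = 0; i < count; i++) {
            free.add(ByteBuffer.allocateDirect(bufSize));
        }
    }

    // Hand out a pooled buffer when available, else an on-heap fallback.
    TaggedBuffer get(int size) {
        ByteBuffer bb = free.poll();
        return bb != null ? new TaggedBuffer(bb, true)
                          : new TaggedBuffer(ByteBuffer.allocate(size), false);
    }

    // Only buffers that actually came from the pool are returned to it.
    void release(TaggedBuffer tb) {
        if (tb.fromPool) {
            tb.buf.clear();
            free.add(tb.buf);
        }
    }

    int freeCount() { return free.size(); }
}
```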

> Use Offheap ByteBuffers from BufferPool to read RPC requests.
> -
>
> Key: HBASE-15788
> URL: https://issues.apache.org/jira/browse/HBASE-15788
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15788.patch, HBASE-15788_V4.patch, 
> HBASE-15788_V5.patch, HBASE-15788_V6.patch, HBASE-15788_V7.patch, 
> HBASE-15788_V8.patch
>
>
> Right now, when an RPC request reaches RpcServer, we read the request into an 
> on-demand-created byte[]. When it is a write request including many 
> mutations, the request size will be somewhat larger, and we end up creating 
> many temp on-heap byte[]s, causing more GCs.
> We have a ByteBufferPool of fixed-size off-heap BBs. This is used at 
> RpcServer while sending read responses only. We can make use of the same 
> pool while reading requests too. Instead of reading the whole of the request 
> bytes into a single BB, we can read into N BBs (based on the request size). 
> When a BB is not available from the pool, we fall back to the old way of 
> creating an on-heap byte[] on demand.
> Remember these are off-heap BBs. We read many proto objects from these 
> request bytes (header, Mutation protos, etc.). Thanks to PB 3 and our 
> shading work, protobuf supports off-heap BBs now. The payload cells are also 
> in these DBBs now. The codec decoder can work on these and create off-heap 
> BBs. The whole of our write path works with Cells now. At the time of 
> addition to the memstore, these cells are by default copied to MSLAB (an 
> off-heap-based pooled MSLAB issue will follow this one). If the MSLAB copy 
> is not possible, we do a copy to an on-heap byte[].
> One possible downside of this is:
> Before adding to the memstore, we write to the WAL. So the Cells created out 
> of the off-heap BBs (Codec#Decoder) will be used to write to the WAL. The 
> default FSHLog works with an OutputStream obtained from DFSClient, which has 
> only standard byte[]-based write APIs. So just to write to the WAL, we end 
> up with a temp on-heap copy for each Cell. The other WAL implementation 
> (i.e. AsyncWAL) supports writing off-heap Cells directly. We have work in 
> progress to make AsyncWAL the default. Also we can raise an HDFS request to 
> support BB-based write APIs in their client OutputStream. Until then, we 
> will try a temp workaround solution. The patch says more on this.





[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-11-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656173#comment-15656173
 ] 

Anoop Sam John commented on HBASE-16417:


bq. I did a mistake when running data compaction in previous rounds. I turned 
off the mslab flag but did not remove the chunk pool parameters and as a result 
a chunk pool was allocated but not used.
That is not entirely your mistake. When MSLAB is turned off by config, we 
should not initialize the pool even if pool-related configs are present; it 
will never get used anyway. That is the way it used to work; only in trunk 
did this behavior change. My bad, HBASE-16407 is the culprit. Raised 
HBASE-17071 to correct it in trunk. Thanks for the nice find.

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-16417-benchmarkresults-20161101.pdf, 
> HBASE-16417-benchmarkresults-20161110.pdf
>
>






[jira] [Commented] (HBASE-16783) Use ByteBufferPool for the header and message during Rpc response

2016-11-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656172#comment-15656172
 ] 

ramkrishna.s.vasudevan commented on HBASE-16783:


Linking related JIRAs

> Use ByteBufferPool for the header and message during Rpc response
> -
>
> Key: HBASE-16783
> URL: https://issues.apache.org/jira/browse/HBASE-16783
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16783.patch, HBASE-16783_1.patch, 
> HBASE-16783_2.patch, HBASE-16783_3.patch, HBASE-16783_4.patch, 
> HBASE-16783_5.patch, HBASE-16783_6.patch, HBASE-16783_7.patch, 
> HBASE-16783_7.patch
>
>
> With ByteBufferPool in place we could avoid the byte[] creation in 
> RpcServer#createHeaderAndMessageBytes and try using the Buffer from the pool 
> rather than creating byte[] every time.





[jira] [Created] (HBASE-17071) Do not initialize MemstoreChunkPool when use mslab option is turned off

2016-11-10 Thread Anoop Sam John (JIRA)
Anoop Sam John created HBASE-17071:
--

 Summary: Do not initialize MemstoreChunkPool when use mslab option 
is turned off
 Key: HBASE-17071
 URL: https://issues.apache.org/jira/browse/HBASE-17071
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0


This is a 2.0-only issue, induced by HBASE-16407.
We now initialize the MSLAB chunk pool along with RS start itself (to pass it 
as a HeapMemoryTuneObserver).
When MSLAB is turned off (i.e. hbase.hregion.memstore.mslab.enabled is 
configured false) we should not be initializing the MSLAB chunk pool at all. 
By default the initial chunk count to be created will be 0, but it is still 
better to avoid creating the pool.
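The proposed guard amounts to a simple config check. The config key is the real one named above; the Map-based conf and method name are simplifications invented for this sketch:

```java
import java.util.HashMap;
import java.util.Map;

class ChunkPoolInit {
    static final String MSLAB_ENABLED_KEY = "hbase.hregion.memstore.mslab.enabled";

    // Only create the chunk pool at RS start when MSLAB is actually enabled
    // (the default is true); pool-related configs alone should not trigger init.
    static boolean shouldInitChunkPool(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(MSLAB_ENABLED_KEY, "true"));
    }
}
```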





[jira] [Commented] (HBASE-15324) Jitter may cause desiredMaxFileSize overflow in ConstantSizeRegionSplitPolicy and trigger unexpected split

2016-11-10 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656170#comment-15656170
 ] 

huaxiang sun commented on HBASE-15324:
--

Agree with [~liyu]. Or we can just check for overflow when (jitterValue > 0).

> Jitter may cause desiredMaxFileSize overflow in ConstantSizeRegionSplitPolicy 
> and trigger unexpected split
> --
>
> Key: HBASE-15324
> URL: https://issues.apache.org/jira/browse/HBASE-15324
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.1.3
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.5, 1.1.8
>
> Attachments: HBASE-15324.patch, HBASE-15324_v2.patch, 
> HBASE-15324_v3.patch, HBASE-15324_v3.patch
>
>
> We introduce jitter for region split decision in HBASE-13412, but the 
> following line in {{ConstantSizeRegionSplitPolicy}} may cause long value 
> overflow if MAX_FILESIZE is specified to Long.MAX_VALUE:
> {code}
> this.desiredMaxFileSize += (long)(desiredMaxFileSize * (RANDOM.nextFloat() - 
> 0.5D) * jitter);
> {code}
> In our case we set MAX_FILESIZE to Long.MAX_VALUE to prevent the target 
> region from splitting.
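The overflow and one possible guard can be demonstrated in a few lines. This sketch only illustrates the failure mode and the "skip jitter when it would overflow" idea; it is not the code from the attached patches:

```java
class JitterOverflowDemo {
    // Guarded jitter application: skip the jitter when adding it would
    // overflow a long, leaving desiredMaxFileSize untouched.
    static long applyJitter(long desiredMaxFileSize, long jitterValue) {
        if (jitterValue > 0 && desiredMaxFileSize > Long.MAX_VALUE - jitterValue) {
            return desiredMaxFileSize;
        }
        return desiredMaxFileSize + jitterValue;
    }

    // A positive sample of (RANDOM.nextFloat() - 0.5D) * jitter with jitter = 0.25.
    static long sampleJitterValue(long desiredMaxFileSize) {
        return (long) (desiredMaxFileSize * 0.4D * 0.25D);
    }
}
```

With MAX_FILESIZE at Long.MAX_VALUE, the unchecked `+=` wraps to a negative desiredMaxFileSize, so every region looks oversized and splits unexpectedly; the guard avoids that.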





[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-11-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656167#comment-15656167
 ] 

ramkrishna.s.vasudevan commented on HBASE-16417:


In the figure that represents the write-only workload, you see more GC when 
MSLAB is enabled (with no compaction)?
How many regions in your PE tool and YCSB runs? You have only 16G memory, and 
0.42 of it is 6.72G for the blocking memstore. So the number of regions may 
be important to check here; otherwise you can easily overload a region with a 
lot of blocking updates. Just saying.

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-16417-benchmarkresults-20161101.pdf, 
> HBASE-16417-benchmarkresults-20161110.pdf
>
>






[jira] [Commented] (HBASE-16972) Log more details for Scan#next request when responseTooSlow

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656165#comment-15656165
 ] 

Hudson commented on HBASE-16972:


FAILURE: Integrated in Jenkins build HBase-1.3-JDK8 #77 (See 
[https://builds.apache.org/job/HBase-1.3-JDK8/77/])
HBASE-16972 Log more details for Scan#next request when responseTooSlow (liyu: 
rev 996b4847fa3867e9b69e6f35727732836354f7a3)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServerInterface.java


> Log more details for Scan#next request when responseTooSlow
> ---
>
> Key: HBASE-16972
> URL: https://issues.apache.org/jira/browse/HBASE-16972
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Affects Versions: 1.2.3, 1.1.7
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8
>
> Attachments: HBASE-16972.patch, HBASE-16972.v2.patch, 
> HBASE-16972.v3.patch
>
>
> Currently, if responseTooSlow happens on the scan.next call, we will get a 
> warn log like below:
> {noformat}
> 2016-10-31 11:43:23,430 WARN  
> [RpcServer.FifoWFPBQ.priority.handler=5,queue=1,port=60193] 
> ipc.RpcServer(2574):
> (responseTooSlow): 
> {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)",
> "starttimems":1477885403428,"responsesize":52,"method":"Scan","param":"scanner_id:
>  11 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true
> track_scan_metrics: false renew: 
> false","processingtimems":2,"client":"127.0.0.1:60254","queuetimems":0,"class":"HMaster"}
> {noformat}
> From this we only have a {{scanner_id}}, and it is impossible to know what 
> exactly this scan is about, e.g. against which region of which table.
> After this JIRA, we will improve the message to something like below (notice 
> the last line):
> {noformat}
> 2016-10-31 11:43:23,430 WARN  
> [RpcServer.FifoWFPBQ.priority.handler=5,queue=1,port=60193] 
> ipc.RpcServer(2574):
> (responseTooSlow): 
> {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)",
> "starttimems":1477885403428,"responsesize":52,"method":"Scan","param":"scanner_id:
>  11 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true
> track_scan_metrics: false renew: 
> false","processingtimems":2,"client":"127.0.0.1:60254","queuetimems":0,"class":"HMaster",
> "scandetails":"table: hbase:meta region: hbase:meta,,1.1588230740"}
> {noformat}





[jira] [Commented] (HBASE-17047) Add an API to get HBase connection cache statistics

2016-11-10 Thread Weiqing Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656155#comment-15656155
 ] 

Weiqing Yang commented on HBASE-17047:
--

Thanks for the reply. HBaseConnectionCacheStat calculates statistics for 
HBaseConnectionCache only. Spark users who care about the number of concurrent 
HBase connections or the cost of database connect/disconnect may want to use 
this API.

> Add an API to get HBase connection cache statistics
> ---
>
> Key: HBASE-17047
> URL: https://issues.apache.org/jira/browse/HBASE-17047
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Weiqing Yang
>Assignee: Weiqing Yang
>Priority: Minor
> Attachments: HBASE-17047_v1.patch
>
>
> This patch will add a function "getStat" for the user to get the statistics 
> of the HBase connection cache.





[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-11-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656146#comment-15656146
 ] 

ramkrishna.s.vasudevan commented on HBASE-16417:


bq.I also ran no-compaction option with no mslabs and no chunk pool which 
turned out to be the best performing setting. (See full details in the latest 
report.)
Can you tell us more about this? We recently found that in the default 
memstore case, enabling mslab and the chunk pool had gains in terms of PE 
latency and GC.

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-16417-benchmarkresults-20161101.pdf, 
> HBASE-16417-benchmarkresults-20161110.pdf
>
>






[jira] [Commented] (HBASE-17043) parallelize select() work in mob compaction

2016-11-10 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656128#comment-15656128
 ] 

huaxiang sun commented on HBASE-17043:
--

Correct a typo, it should be 700k files instead of 70k files.

> parallelize select() work in mob compaction
> ---
>
> Key: HBASE-17043
> URL: https://issues.apache.org/jira/browse/HBASE-17043
> Project: HBase
>  Issue Type: Improvement
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> Today in 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L141,
> the select() is single-threaded. Given a large number of files, it will take 
> several seconds to finish the job. We will see how this work can be divided 
> to speed up the processing.
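One way to divide the work is to partition the candidate files and evaluate each partition on its own thread. This is a hedged sketch of the general pattern; the class name and the selection predicate are placeholders, not the PartitionedMobCompactor logic:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSelect {
    // Partition the file list into roughly equal chunks and filter each chunk
    // concurrently, then concatenate the results in submission order.
    static List<String> select(List<String> files, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = Math.max(1, (files.size() + threads - 1) / threads);
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < files.size(); i += chunk) {
                List<String> part = files.subList(i, Math.min(files.size(), i + chunk));
                futures.add(pool.submit(() -> {
                    List<String> selected = new ArrayList<>();
                    for (String f : part) {
                        if (f.endsWith(".mob")) { // placeholder selection predicate
                            selected.add(f);
                        }
                    }
                    return selected;
                }));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                result.addAll(f.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```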





[jira] [Comment Edited] (HBASE-17043) parallelize select() work in mob compaction

2016-11-10 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648370#comment-15648370
 ] 

huaxiang sun edited comment on HBASE-17043 at 11/11/16 4:48 AM:


For 700k files, I found that it took 6-7 seconds to finish the select logic. 
Compared with the file compaction (I/O) this is nothing, but I will still see 
how to speed it up to reduce this 6-7 second cost.


was (Author: huaxiang):
For 70k files, found that it took 6 ~ 7 seconds to finish the select logic. 
Compared with the file compact (I/O), it is nothing, still will see how to 
speed up to reduce this 6 ~ 7 seconds time.

> parallelize select() work in mob compaction
> ---
>
> Key: HBASE-17043
> URL: https://issues.apache.org/jira/browse/HBASE-17043
> Project: HBase
>  Issue Type: Improvement
>  Components: mob
>Affects Versions: 2.0.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> Today in 
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L141,
> the select() is single-threaded. Given a large number of files, it will take 
> several seconds to finish the job. We will see how this work can be divided 
> to speed up the processing.





[jira] [Commented] (HBASE-16838) Implement basic scan

2016-11-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656086#comment-15656086
 ] 

stack commented on HBASE-16838:
---

Link to issue to address Enis remarks.

> Implement basic scan
> 
>
> Key: HBASE-16838
> URL: https://issues.apache.org/jira/browse/HBASE-16838
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, 
> HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call: all returned results 
> will be passed to a ScanConsumer. The methods of the consumer will be called 
> directly on the rpc framework threads, so it is not allowed to do 
> time-consuming work in those methods. In general only experts or the 
> implementations of other methods in AsyncTable should call this method 
> directly; that's why I call it a 'basic scan'.





[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Attachment: HBASE-17062.003.patch

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch, 
> HBASE-17062.003.patch
>
>
> RegionSplitter throws Exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)





[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Status: Patch Available  (was: Open)

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch, 
> HBASE-17062.003.patch
>
>
> RegionSplitter throws Exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)





[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Status: Open  (was: Patch Available)

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch
>
>
> RegionSplitter throws Exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)





[jira] [Commented] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656038#comment-15656038
 ] 

Ted Yu commented on HBASE-17062:


Almost there.
{code}
23  import java.util.*;
{code}
Please don't use '*' in import.

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch
>
>
> RegionSplitter throws Exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)





[jira] [Commented] (HBASE-17021) Use RingBuffer to reduce the contention in AsyncFSWAL

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656035#comment-15656035
 ] 

Hudson commented on HBASE-17021:


ABORTED: Integrated in Jenkins build HBase-Trunk_matrix #1942 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1942/])
HBASE-17021 Use RingBuffer to reduce the contention in AsyncFSWAL (zhangduo: 
rev 3b629d632ae660b618422b3e2f67533a6fdc7106)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestAsyncFSWAL.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/RingBufferTruck.java
* (edit) pom.xml
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AsyncFSWAL.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestAsyncWALReplay.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/AbstractTestWALReplay.java


> Use RingBuffer to reduce the contention in AsyncFSWAL
> -
>
> Key: HBASE-17021
> URL: https://issues.apache.org/jira/browse/HBASE-17021
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: AsyncWAL_disruptor_7.patch, HBASE-17021-v1.patch, 
> HBASE-17021-v2.patch, HBASE-17021-v3.patch, HBASE-17021.patch
>
>
> The WALPE result in HBASE-16890 shows that with disruptor's RingBuffer we can 
> get better performance.



--


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656032#comment-15656032
 ] 

Hudson commented on HBASE-17020:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-17020 keylen in midkey() dont computed correctly (liyu: rev 
18b31fdd32cbd59da0e41ec1083473023746f264)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java


> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length: 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length,
> as can be seen from the code that records the offsets:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
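The arithmetic in question can be sketched outside HBase: each secondary-index entry records a fixed overhead (assumed here to be a long offset plus an int on-disk size, 12 bytes) before the key bytes, so the delta between consecutive relative offsets is overhead + key length, and the key length must subtract the overhead. A hedged sketch based on the quoted add() code, not the committed patch:

```java
public class MidKeyLenDemo {
    // Assumption: long blockOffset (8 bytes) + int onDiskDataSize (4 bytes)
    // are recorded per entry before the key bytes.
    static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 12;

    // Build relative offsets the way the quoted add() does:
    // each entry advances the running total by overhead + key length.
    static int[] relOffsets(int... keyLens) {
        int[] offsets = new int[keyLens.length + 1];
        for (int i = 0; i < keyLens.length; i++) {
            offsets[i + 1] = offsets[i] + SECONDARY_INDEX_ENTRY_OVERHEAD + keyLens[i];
        }
        return offsets;
    }

    // Buggy: the raw offset delta still includes the per-entry overhead.
    static int keyLenBuggy(int[] offsets, int entry) {
        return offsets[entry + 1] - offsets[entry];
    }

    // Fixed: subtract the overhead to recover the actual key length.
    static int keyLenFixed(int[] offsets, int entry) {
        return offsets[entry + 1] - offsets[entry] - SECONDARY_INDEX_ENTRY_OVERHEAD;
    }
}
```

Reading past the end of the key bytes by that overhead is what can push the copy out of bounds for the last entry of a leaf-level block.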



--


[jira] [Commented] (HBASE-17017) Remove the current per-region latency histogram metrics

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656036#comment-15656036
 ] 

Hudson commented on HBASE-17017:


ABORTED: Integrated in Jenkins build HBase-Trunk_matrix #1942 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1942/])
HBASE-17017 Remove the current per-region latency histogram metrics (enis: rev 
03bc884ea085197a651b50ddc21f575560854f1c)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegion.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java


> Remove the current per-region latency histogram metrics
> ---
>
> Key: HBASE-17017
> URL: https://issues.apache.org/jira/browse/HBASE-17017
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 2.0.0, 1.3.0, 1.4.0
>Reporter: Duo Zhang
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: Screen Shot 2016-11-04 at 3.00.21 PM.png, Screen Shot 
> 2016-11-04 at 3.38.42 PM.png, hbase-17017_v1.patch, hbase-17017_v2.patch
>
>




--


[jira] [Commented] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656034#comment-15656034
 ] 

Hudson commented on HBASE-17039:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-17039 SimpleLoadBalancer schedules large amount of invalid region (liyu: 
rev d248d6b0b3d3f6f0b7a265d5f8607d5f5c62eefb)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java


> SimpleLoadBalancer schedules large amount of invalid region moves
> -
>
> Key: HBASE-17039
> URL: https://issues.apache.org/jira/browse/HBASE-17039
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0, 1.1.7, 1.2.4
>Reporter: Charlie Qiangeng Xu
>Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0, 1.4.0, 1.2.5, 1.1.8
>
> Attachments: HBASE-17039.patch
>
>
> After increasing one of our clusters to 1600 nodes, we observed a large 
> amount of invalid region moves (more than 30k moves) fired by the balance 
> chore. We simulated the problem and printed out the balance plan, only to 
> find that many servers holding two regions of a certain table (we balance 
> by table) sent both regions to two other servers that had zero regions. 
> In the SimpleLoadBalancer's balanceCluster function,
> the code block that determines the underLoadedServers might have a problem:
> {code}
>   if (load >= min && load > 0) {
> continue; // look for other servers which haven't reached min
>   }
>   int regionsToPut = min - load;
>   if (regionsToPut == 0)
>   {
> regionsToPut = 1;
>   }
> {code}
> If min is zero, a server with a load of zero (which equals min) would be 
> marked as underloaded, causing the phenomenon mentioned above.
> Since we increased the cluster's size to 1600+, many tables that have only 
> 1000 regions now encounter this issue.
> After fixing it, the balance plan went back to normal.
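A hedged sketch of the corrected check, isolated as a helper for clarity (names follow the quoted snippet; this illustrates the fix direction, not the committed patch): when min is 0, a zero-load server must not be treated as underloaded.

```java
public class UnderloadDemo {
    // Returns how many regions to move onto a server with the given load,
    // or 0 if the server is not underloaded. Because load >= min covers
    // the min == 0 case, the spurious "regionsToPut = 1" bump for
    // zero-load servers never happens.
    static int regionsToPut(int load, int min) {
        if (load >= min) {
            return 0; // not underloaded; schedule no moves
        }
        return min - load; // always >= 1 here
    }
}
```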



--


[jira] [Commented] (HBASE-16985) TestClusterId failed due to wrong hbase rootdir

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656033#comment-15656033
 ] 

Hudson commented on HBASE-16985:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-16985 TestClusterId failed due to wrong hbase rootdir (stack: rev 
e929156f96de004b2b8a0535463eff7fe8c38116)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> TestClusterId failed due to wrong hbase rootdir
> ---
>
> Key: HBASE-16985
> URL: https://issues.apache.org/jira/browse/HBASE-16985
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Minor
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 1.1.8
>
> Attachments: HBASE-16985-branch-1.1.patch, 
> HBASE-16985-branch-1.2.patch, HBASE-16985-branch-1.patch, 
> HBASE-16985-branch-1.patch, HBASE-16985-v1.patch, HBASE-16985-v1.patch, 
> HBASE-16985.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/4253/testReport/org.apache.hadoop.hbase.regionserver/TestClusterId/testClusterId/
> {code}
> java.io.IOException: Shutting down
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:230)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:409)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:227)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:96)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1071)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1037)
>   at 
> org.apache.hadoop.hbase.regionserver.TestClusterId.testClusterId(TestClusterId.java:85)
> {code}
> The cluster can not start up because there is no active master. The active 
> master can not finish initializing because the hbase:namespace region can not 
> be assigned. 
> In the TestClusterId unit test, TEST_UTIL.startMiniHBaseCluster sets a new 
> hbase root dir. But the regionserver thread which started first used a 
> different hbase root dir. If the hbase:namespace region is assigned to this 
> regionserver, the assignment fails because there is no tableinfo under the 
> wrong hbase root dir.
> When the regionserver reports to the master, it gets back some new config. 
> But FSTableDescriptors has already been initialized, so its root dir does 
> not change.
> {code}
> if (LOG.isDebugEnabled()) {
> LOG.info("Config from master: " + key + "=" + value);
> }
> {code} 
> I think FSTableDescriptors needs to update its rootdir when the regionserver 
> gets the report back from the master.
> The master branch has the same problem too, but the balancer always assigns 
> the hbase:namespace region to the master, so this unit test passes on the 
> master branch.
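The proposed fix direction can be sketched as follows (TableDescriptorsStub is a hypothetical stand-in for FSTableDescriptors, and the exact config-handling is an assumption): re-resolve the root dir when the master's config arrives, instead of keeping the value captured at construction time.

```java
import java.util.Map;

public class RootDirRefreshDemo {
    // Hypothetical stand-in for FSTableDescriptors, which caches the root dir.
    static final class TableDescriptorsStub {
        private String rootDir;
        TableDescriptorsStub(String rootDir) { this.rootDir = rootDir; }
        String getRootDir() { return rootDir; }
        void setRootDir(String rootDir) { this.rootDir = rootDir; }
    }

    // Apply config received in the master's report response.
    // "hbase.rootdir" is the standard HBase key for the root directory.
    static void applyConfigFromMaster(Map<String, String> fromMaster,
                                      TableDescriptorsStub tds) {
        String newRoot = fromMaster.get("hbase.rootdir");
        if (newRoot != null && !newRoot.equals(tds.getRootDir())) {
            tds.setRootDir(newRoot); // refresh instead of keeping the stale dir
        }
    }
}
```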



--


[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656028#comment-15656028
 ] 

Hudson commented on HBASE-16570:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-16570 Compute region locality in parallel at startup (addendum) (liyu: 
rev dac73eceb03bf871ce6def7982b39950e68be1e2)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionLocationFinder.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java


> Compute region locality in parallel at startup
> --
>
> Key: HBASE-16570
> URL: https://issues.apache.org/jira/browse/HBASE-16570
> Project: HBase
>  Issue Type: Sub-task
>Reporter: binlijin
>Assignee: binlijin
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16570-master_V1.patch, 
> HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, 
> HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, 
> HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch, 
> HBASE-16570_addnum_v3.patch, HBASE-16570_addnum_v4.patch, 
> HBASE-16570_addnum_v5.patch, HBASE-16570_addnum_v6.patch, 
> HBASE-16570_addnum_v7.patch
>
>




--


[jira] [Commented] (HBASE-16938) TableCFsUpdater maybe failed due to no write permission on peerNode

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656030#comment-15656030
 ] 

Hudson commented on HBASE-16938:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-16938 TableCFsUpdater maybe failed due to no write permission on (enis: 
rev a6397e3b0c5a9c938c0a00cb5d3cd762d498afd1)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/master/TableCFsUpdater.java


> TableCFsUpdater maybe failed due to no write permission on peerNode
> ---
>
> Key: HBASE-16938
> URL: https://issues.apache.org/jira/browse/HBASE-16938
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16938.patch, HBASE-16938.patch
>
>
> After HBASE-11393, replication table-cfs use a PB object, so the old string 
> config needs to be copied to the new PB object when upgrading a cluster. In 
> our use case, we have a different kerberos principal for each cluster, e.g. 
> an online serving cluster and an offline processing cluster, and we use a 
> unified global admin kerberos principal for all clusters. The peer node is 
> created by the client, so only the global admin has write permission on it. 
> When upgrading a cluster, HMaster doesn't have write permission on the peer 
> node, so it may fail to copy the old table-cfs string to the new PB object. 
> I think we need a client-side tool to do this copy job.



--


[jira] [Commented] (HBASE-17054) Compactor#preCreateCoprocScanner should be passed user

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656027#comment-15656027
 ] 

Hudson commented on HBASE-17054:


ABORTED: Integrated in Jenkins build HBase-1.4 #528 (See 
[https://builds.apache.org/job/HBase-1.4/528/])
HBASE-17054 Compactor#preCreateCoprocScanner should be passed user (tedyu: rev 
1e322e68a5f383b59011d50c7f09257c459831c3)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java


> Compactor#preCreateCoprocScanner should be passed user
> --
>
> Key: HBASE-17054
> URL: https://issues.apache.org/jira/browse/HBASE-17054
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 17054.v1.txt
>
>
> As Anoop mentioned at the end of HBASE-16962:
> {code}
>   ScanType scanType = scannerFactory.getScanType(request);
>   scanner = preCreateCoprocScanner(request, scanType, fd.earliestPutTs, 
> scanners);
> {code}
> the user should be passed to preCreateCoprocScanner(); otherwise a null User 
> would be used.



--


[jira] [Commented] (HBASE-17017) Remove the current per-region latency histogram metrics

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656011#comment-15656011
 ] 

Hudson commented on HBASE-17017:


ABORTED: Integrated in Jenkins build HBase-1.4 #527 (See 
[https://builds.apache.org/job/HBase-1.4/527/])
HBASE-17017 Remove the current per-region latency histogram metrics (enis: rev 
123d26ed907a9d1532386ce965ff2c388e44fe39)
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegion.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Remove the current per-region latency histogram metrics
> ---
>
> Key: HBASE-17017
> URL: https://issues.apache.org/jira/browse/HBASE-17017
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 2.0.0, 1.3.0, 1.4.0
>Reporter: Duo Zhang
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: Screen Shot 2016-11-04 at 3.00.21 PM.png, Screen Shot 
> 2016-11-04 at 3.38.42 PM.png, hbase-17017_v1.patch, hbase-17017_v2.patch
>
>




--


[jira] [Commented] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656002#comment-15656002
 ] 

Hadoop QA commented on HBASE-17062:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 52s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 28s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 13s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 133m 16s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestJMXListener |
| Timed out junit tests | 
org.apache.hadoop.hbase.master.procedure.TestDeleteTableProcedure |
|   | 
org.apache.hadoop.hbase.master.procedure.TestMasterProcedureSchedulerConcurrency
 |
|   | org.apache.hadoop.hbase.master.procedure.TestRestoreSnapshotProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestMasterProcedureWalLease |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838475/HBASE-17062.002.patch 
|
| JIRA Issue | HBASE-17062 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux e00239b26f40 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 62e3b1e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4428/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/4428/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 

[jira] [Commented] (HBASE-17068) Procedure v2 - inherit region locks

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655972#comment-15655972
 ] 

Hadoop QA commented on HBASE-17068:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
44s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 14s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 59s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 109m 6s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded |
|   | org.apache.hadoop.hbase.util.TestMergeTable |
|   | org.apache.hadoop.hbase.util.TestMergeTool |
|   | org.apache.hadoop.hbase.util.TestConnectionCache |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838481/HBASE-17068-v1.patch |
| JIRA Issue | HBASE-17068 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 477969be7b6b 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 62e3b1e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4429/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/4429/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4429/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4429/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was 

[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17039:
--
   Resolution: Fixed
Fix Version/s: (was: 1.3.1)
   Status: Resolved  (was: Patch Available)

Closing this one and will track the backport for 1.3.1 through HBASE-17069. 
Thanks [~xharlie] for the patch and thanks all for review.

> SimpleLoadBalancer schedules large amount of invalid region moves
> -
>
> Key: HBASE-17039
> URL: https://issues.apache.org/jira/browse/HBASE-17039
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0, 1.1.7, 1.2.4
>Reporter: Charlie Qiangeng Xu
>Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0, 1.4.0, 1.2.5, 1.1.8
>
> Attachments: HBASE-17039.patch
>
>
> After increasing one of our clusters to 1600 nodes, we observed a large 
> amount of invalid region moves (more than 30k moves) fired by the balance 
> chore. We simulated the problem and printed out the balance plan, only to 
> find that many servers holding two regions of a certain table (we balance 
> by table) sent both regions to two other servers that had zero regions. 
> In the SimpleLoadBalancer's balanceCluster function,
> the code block that determines the underLoadedServers might have a problem:
> {code}
>   if (load >= min && load > 0) {
> continue; // look for other servers which haven't reached min
>   }
>   int regionsToPut = min - load;
>   if (regionsToPut == 0)
>   {
> regionsToPut = 1;
>   }
> {code}
> If min is zero, a server with a load of zero (which equals min) would be 
> marked as underloaded, causing the phenomenon mentioned above.
> Since we increased the cluster's size to 1600+, many tables that have only 
> 1000 regions now encounter this issue.
> After fixing it, the balance plan went back to normal.



--


[jira] [Comment Edited] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655937#comment-15655937
 ] 

Yu Li edited comment on HBASE-17039 at 11/11/16 2:54 AM:
-

Closing this one and will track the backport for 1.3.1 through HBASE-17059. 
Thanks [~xharlie] for the patch and thanks all for review.


was (Author: carp84):
Closing this one and will track the backport for 1.3.1 through HBASE-17069. 
Thanks [~xharlie] for the patch and thanks all for review.

> SimpleLoadBalancer schedules large amount of invalid region moves
> -
>
> Key: HBASE-17039
> URL: https://issues.apache.org/jira/browse/HBASE-17039
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0, 1.1.7, 1.2.4
>Reporter: Charlie Qiangeng Xu
>Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0, 1.4.0, 1.2.5, 1.1.8
>
> Attachments: HBASE-17039.patch
>
>
> After growing one of our clusters to 1600 nodes, we observed a large 
> number of invalid region moves (more than 30k) fired by the balancer 
> chore. We simulated the problem and printed out the balance plan, only 
> to find that many servers holding two regions of a certain table (we use 
> the by-table strategy) sent both regions to two other servers that held 
> zero regions. 
> In the SimpleLoadBalancer's balanceCluster function,
> the code block that determines the underLoadedServers might have a problem:
> {code}
>   if (load >= min && load > 0) {
> continue; // look for other servers which haven't reached min
>   }
>   int regionsToPut = min - load;
>   if (regionsToPut == 0)
>   {
> regionsToPut = 1;
>   }
> {code}
> If min is zero, a server with a load of zero (which equals min) would be 
> marked as underloaded, which causes the phenomenon described above.
> Since we grew the cluster to 1600+ nodes, many tables with only 1000 
> regions now encounter this issue.
> After fixing it, the balance plan went back to normal.





[jira] [Comment Edited] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652918#comment-15652918
 ] 

Yu Li edited comment on HBASE-17039 at 11/11/16 2:53 AM:
-

Created HBASE-17059 for backporting to 1.3.1


was (Author: carp84):
Created sub-task for backporting to 1.3.1 and updated fix versions accordingly.

Leaving this JIRA open until the sub-task is done.

> SimpleLoadBalancer schedules large amount of invalid region moves
> -
>
> Key: HBASE-17039
> URL: https://issues.apache.org/jira/browse/HBASE-17039
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 2.0.0, 1.3.0, 1.1.7, 1.2.4
>Reporter: Charlie Qiangeng Xu
>Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 1.1.8
>
> Attachments: HBASE-17039.patch
>
>
> After growing one of our clusters to 1600 nodes, we observed a large 
> number of invalid region moves (more than 30k) fired by the balancer 
> chore. We simulated the problem and printed out the balance plan, only 
> to find that many servers holding two regions of a certain table (we use 
> the by-table strategy) sent both regions to two other servers that held 
> zero regions. 
> In the SimpleLoadBalancer's balanceCluster function,
> the code block that determines the underLoadedServers might have a problem:
> {code}
>   if (load >= min && load > 0) {
> continue; // look for other servers which haven't reached min
>   }
>   int regionsToPut = min - load;
>   if (regionsToPut == 0)
>   {
> regionsToPut = 1;
>   }
> {code}
> If min is zero, a server with a load of zero (which equals min) would be 
> marked as underloaded, which causes the phenomenon described above.
> Since we grew the cluster to 1600+ nodes, many tables with only 1000 
> regions now encounter this issue.
> After fixing it, the balance plan went back to normal.





[jira] [Updated] (HBASE-17059) backport HBASE-17039 to 1.3.1

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17059:
--
Affects Version/s: 1.3.0
Fix Version/s: (was: 1.1.8)
   (was: 1.2.5)
   (was: 1.4.0)
   (was: 2.0.0)
   1.3.1

> backport HBASE-17039 to 1.3.1
> -
>
> Key: HBASE-17059
> URL: https://issues.apache.org/jira/browse/HBASE-17059
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 1.3.0
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 1.3.1
>
>
> Currently the branch-1.3 code is frozen for the 1.3.0 release; we need to 
> backport HBASE-17039 to 1.3.1 afterwards.





[jira] [Updated] (HBASE-17059) backport HBASE-17039 to 1.3.1

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17059:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HBASE-17039)

> backport HBASE-17039 to 1.3.1
> -
>
> Key: HBASE-17059
> URL: https://issues.apache.org/jira/browse/HBASE-17059
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.4.0, 1.2.5, 1.1.8
>
>
> Currently the branch-1.3 code is frozen for the 1.3.0 release; we need to 
> backport HBASE-17039 to 1.3.1 afterwards.





[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17020:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed the commit into branch-0.98 since HadoopQA looks good, and opened 
HBASE-17070 for branch-1.3.

Closing this JIRA since all work here is done. Thanks [~haoran] for the patch and 
thanks all for review.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total of 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not the key length alone, 
> because each secondary index entry is recorded as:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
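The offset arithmetic behind the bug can be reproduced in a standalone toy (not the real HFileBlockIndex code; SECONDARY_INDEX_ENTRY_OVERHEAD is assumed here to be SIZEOF_INT + SIZEOF_LONG = 12 bytes). Because add() accumulates overhead plus key length, the difference of two consecutive secondary-index offsets overstates the key length by the overhead, and reading that many bytes for the last entry runs past the end of the buffer:

```java
public class MidKeyLen {
  // Assumed layout constant: one int + one long per entry, as in add() above.
  static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 4 + 8;

  // Difference of consecutive secondary-index offsets around entry `mid`,
  // mimicking how the offsets are accumulated when entries are added.
  static int offsetDiff(int[] keyLens, int mid) {
    int[] offsets = new int[keyLens.length + 1];
    for (int i = 0; i < keyLens.length; i++) {
      offsets[i + 1] = offsets[i] + SECONDARY_INDEX_ENTRY_OVERHEAD + keyLens[i];
    }
    return offsets[mid + 1] - offsets[mid];
  }

  public static void main(String[] args) {
    int[] keyLens = {10, 20, 30};
    int buggyKeyLen = offsetDiff(keyLens, 1);                        // 32: overhead included
    int realKeyLen = buggyKeyLen - SECONDARY_INDEX_ENTRY_OVERHEAD;   // 20: actual key length
    System.out.println(buggyKeyLen + " vs " + realKeyLen);
  }
}
```

Subtracting the overhead from the offset difference restores the true key length, which is the essence of the fix.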





[jira] [Created] (HBASE-17070) backport HBASE-17020 to 1.3.1

2016-11-10 Thread Yu Li (JIRA)
Yu Li created HBASE-17070:
-

 Summary: backport HBASE-17020 to 1.3.1
 Key: HBASE-17070
 URL: https://issues.apache.org/jira/browse/HBASE-17070
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Yu Li
Assignee: Yu Li
 Fix For: 1.3.1


As titled, backport HBASE-17020 after 1.3.0 is released.





[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655917#comment-15655917
 ] 

Yu Li commented on HBASE-17020:
---

Thanks for confirming, opened HBASE-17070 to track this.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total of 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not the key length alone, 
> because each secondary index entry is recorded as:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}





[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655906#comment-15655906
 ] 

Hadoop QA commented on HBASE-17020:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 33s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
10s {color} | {color:green} 0.98 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} 0.98 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
22s {color} | {color:green} 0.98 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} 0.98 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 38s 
{color} | {color:red} hbase-server in 0.98 has 84 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} 0.98 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m 17s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 117m 44s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 156m 59s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:568b3f7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838464/HBASE-17020.branch-0.98.patch
 |
| JIRA Issue | HBASE-17020 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux eed811a9368f 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | 0.98 / 5f9cd86 |
| Default Java | 1.7.0_80 |
| findbugs | v2.0.1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4427/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4427/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4427/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>

[jira] [Updated] (HBASE-16972) Log more details for Scan#next request when responseTooSlow

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-16972:
--
Fix Version/s: 1.3.0

Pushed commit into branch-1.3, thanks [~mantonov] for the message in 
HBASE-17011.

> Log more details for Scan#next request when responseTooSlow
> ---
>
> Key: HBASE-16972
> URL: https://issues.apache.org/jira/browse/HBASE-16972
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Affects Versions: 1.2.3, 1.1.7
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8
>
> Attachments: HBASE-16972.patch, HBASE-16972.v2.patch, 
> HBASE-16972.v3.patch
>
>
> Currently, if responseTooSlow happens on the scan.next call, we get a warn 
> log like below:
> {noformat}
> 2016-10-31 11:43:23,430 WARN  
> [RpcServer.FifoWFPBQ.priority.handler=5,queue=1,port=60193] 
> ipc.RpcServer(2574):
> (responseTooSlow): 
> {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)",
> "starttimems":1477885403428,"responsesize":52,"method":"Scan","param":"scanner_id:
>  11 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true
> track_scan_metrics: false renew: 
> false","processingtimems":2,"client":"127.0.0.1:60254","queuetimems":0,"class":"HMaster"}
> {noformat}
> From this we only have a {{scanner_id}}, making it impossible to know what 
> exactly the scan is about, e.g. against which region of which table.
> After this JIRA, the message is improved to something like below (notice 
> the last line):
> {noformat}
> 2016-10-31 11:43:23,430 WARN  
> [RpcServer.FifoWFPBQ.priority.handler=5,queue=1,port=60193] 
> ipc.RpcServer(2574):
> (responseTooSlow): 
> {"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)",
> "starttimems":1477885403428,"responsesize":52,"method":"Scan","param":"scanner_id:
>  11 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true
> track_scan_metrics: false renew: 
> false","processingtimems":2,"client":"127.0.0.1:60254","queuetimems":0,"class":"HMaster",
> "scandetails":"table: hbase:meta region: hbase:meta,,1.1588230740"}
> {noformat}





[jira] [Resolved] (HBASE-17011) backport HBASE-16972 to 1.3.1

2016-11-10 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li resolved HBASE-17011.
---
   Resolution: Invalid
Fix Version/s: (was: 1.3.1)

Marking this JIRA as invalid; will commit to branch-1.3 in HBASE-16972 itself.

> backport HBASE-16972 to 1.3.1
> -
>
> Key: HBASE-17011
> URL: https://issues.apache.org/jira/browse/HBASE-17011
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.0
>Reporter: Yu Li
>Assignee: Yu Li
>
> As discussed in HBASE-16972, holding commits for now until 1.3.0 comes out.





[jira] [Commented] (HBASE-17011) backport HBASE-16972 to 1.3.1

2016-11-10 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655894#comment-15655894
 ] 

Yu Li commented on HBASE-17011:
---

Sure, no problem, let me get this into 1.3.0. Thanks for the message sir 
[~mantonov]

> backport HBASE-16972 to 1.3.1
> -
>
> Key: HBASE-17011
> URL: https://issues.apache.org/jira/browse/HBASE-17011
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.0
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 1.3.1
>
>
> As discussed in HBASE-16972, holding commits for now until 1.3.0 comes out.





[jira] [Updated] (HBASE-16841) Data loss in MOB files after cloning a snapshot and deleting that snapshot

2016-11-10 Thread Jingcheng Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HBASE-16841:
-
Attachment: HBASE-16841-V5.patch

Thanks [~tedyu]!
Uploaded a new patch V5 according to Ted's comments and fixed the checkstyle issues.

> Data loss in MOB files after cloning a snapshot and deleting that snapshot
> --
>
> Key: HBASE-16841
> URL: https://issues.apache.org/jira/browse/HBASE-16841
> Project: HBase
>  Issue Type: Bug
>  Components: mob, snapshots
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HBASE-16841-V2.patch, HBASE-16841-V3.patch, 
> HBASE-16841-V4.patch, HBASE-16841-V5.patch, HBASE-16841.patch
>
>
> Running the following steps will probably lose MOB data when working with 
> snapshots.
> 1. Create a mob-enabled table by running create 't1', {NAME => 'f1', IS_MOB 
> => true, MOB_THRESHOLD => 0}.
> 2. Put millions of data.
> 3. Run {{snapshot 't1','t1_snapshot'}} to take a snapshot for this table t1.
> 4. Run {{clone_snapshot 't1_snapshot','t1_cloned'}} to clone this snapshot.
> 5. Run {{delete_snapshot 't1_snapshot'}} to delete this snapshot.
> 6. Run {{disable 't1'}} and {{delete 't1'}} to delete the table.
> 7. Now go to the archive directory of t1: the number of .link directories 
> differs from the number of hfiles, which means some data will be lost after 
> the hfile cleaner runs.
> This is because, when taking a snapshot of an enabled mob table, each region 
> flushes itself and takes a snapshot, and the mob snapshot is taken only when 
> the current region is the first region of the table. At that point the 
> flushing of some other regions might not have finished, so some mob files are 
> not yet flushed to disk. Consequently those mob files are not recorded in the 
> snapshot manifest.
> To solve this, we need to take the mob snapshot last, after the snapshots of 
> all the online and offline regions have finished, in 
> {{EnabledTableSnapshotHandler}}.
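The ordering issue can be illustrated with a toy model (hypothetical code, not the real EnabledTableSnapshotHandler API): the mob snapshot can only record mob files that have been flushed by the time it runs, so piggybacking it on the first region misses files flushed by the remaining regions.

```java
import java.util.ArrayList;
import java.util.List;

public class MobSnapshotOrdering {
  // Returns the mob files the mob snapshot would record under each ordering.
  static List<String> snapshotMobFiles(List<String> regions, boolean mobLast) {
    List<String> flushed = new ArrayList<>();
    List<String> manifest = new ArrayList<>();
    for (int i = 0; i < regions.size(); i++) {
      flushed.add("mobfile-" + regions.get(i)); // each region flush writes a mob file
      if (!mobLast && i == 0) {
        // Buggy ordering: mob snapshot taken with the first region,
        // before the other regions have flushed their mob files.
        manifest.addAll(flushed);
      }
    }
    if (mobLast) {
      manifest.addAll(flushed); // Fix: take the mob snapshot last.
    }
    return manifest;
  }

  public static void main(String[] args) {
    List<String> regions = List.of("r1", "r2", "r3");
    System.out.println(snapshotMobFiles(regions, false)); // [mobfile-r1]
    System.out.println(snapshotMobFiles(regions, true));  // all three mob files
  }
}
```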





[jira] [Commented] (HBASE-17069) RegionServer writes invalid META entries for split daughters in some circumstances

2016-11-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655835#comment-15655835
 ] 

Andrew Purtell commented on HBASE-17069:


Sounds fair. I don't think HBASE-17044 is a blocker; nobody seems to have hit 
it.

> RegionServer writes invalid META entries for split daughters in some 
> circumstances
> --
>
> Key: HBASE-17069
> URL: https://issues.apache.org/jira/browse/HBASE-17069
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.4
>Reporter: Andrew Purtell
>Priority: Critical
> Attachments: daughter_1_d55ef81c2f8299abbddfce0445067830.log, 
> daughter_2_08629d59564726da2497f70451aafcdb.log, logs.tar.gz, 
> parent-393d2bfd8b1c52ce08540306659624f2.log
>
>
> I have been seeing frequent ITBLL failures testing various versions of 1.2.x. 
> Over the lifetime of 1.2.x the following issues have been fixed:
> - HBASE-15315 (Remove always set super user call as high priority)
> - HBASE-16093 (Fix splits failed before creating daughter regions leave meta 
> inconsistent)
> And this one is pending:
> - HBASE-17044 (Fix merge failed before creating merged region leaves meta 
> inconsistent)
> I can apply all of the above to branch-1.2 and still see this failure: 
> *The life of stillborn region d55ef81c2f8299abbddfce0445067830*
> *Master sees SPLITTING_NEW*
> {noformat}
> 2016-11-08 04:23:21,186 INFO  [AM.ZK.Worker-pool2-t82] master.RegionStates: 
> Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW, 
> ts=1478579001186, server=node-3.cluster,16020,1478578389506}
> {noformat}
> *The RegionServer creates it*
> {noformat}
> 2016-11-08 04:23:26,035 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for GomnU: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,038 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for big: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,442 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for meta: blockCache=LruBlockCache{blockCount=63, 
> currentSize=17187656, freeSize=12821524664, maxSize=12838712320, 
> heapSize=17187656, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,713 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for nwmrW: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,715 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for piwbr: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,717 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for tiny: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, 

[jira] [Commented] (HBASE-17011) backport HBASE-16972 to 1.3.1

2016-11-10 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655829#comment-15655829
 ] 

Mikhail Antonov commented on HBASE-17011:
-

[~carp84] since I found a flaky/broken compaction test (HBASE-16852) on 1.3.0RC0 
and am looking into it, do you guys want to piggyback on that and get this one 
into 1.3.0? Seems good to me at the moment (won't wait for it specifically, but 
would be ok to commit). Sorry for any additional labor this may have caused :) 

> backport HBASE-16972 to 1.3.1
> -
>
> Key: HBASE-17011
> URL: https://issues.apache.org/jira/browse/HBASE-17011
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.0
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 1.3.1
>
>
> As discussed in HBASE-16972, holding commits for now until 1.3.0 comes out.





[jira] [Commented] (HBASE-16838) Implement basic scan

2016-11-10 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655825#comment-15655825
 ] 

Duo Zhang commented on HBASE-16838:
---

Ping [~stack].

Will commit later today if no other objections.

Thanks.

> Implement basic scan
> 
>
> Key: HBASE-16838
> URL: https://issues.apache.org/jira/browse/HBASE-16838
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, 
> HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call: all returned results 
> are passed to a ScanConsumer. The consumer's methods are called directly in 
> the rpc framework threads, so it is not allowed to do time-consuming work in 
> them. In general only experts or the implementations of other methods in 
> AsyncTable should call this method directly; that's why I call it 'basic 
> scan'.





[jira] [Commented] (HBASE-17069) RegionServer writes invalid META entries for split daughters in some circumstances

2016-11-10 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655813#comment-15655813
 ] 

Mikhail Antonov commented on HBASE-17069:
-

[~apurtell] I don't remember seeing that, but that might just be a factor of 
having a slightly different configuration for ITBLL.. 

My point was more that, since this issue (and possibly others) is present in 1.2, 
and we moved the stable pointer to 1.2, and assuming it's also present in 1.3 
(which I don't know yet to be true, just assuming), what's your call on cutting 
RC0 with that outstanding?

Unless there are objections, I'd aim to roll the first RC regardless and get more 
feedback.

> RegionServer writes invalid META entries for split daughters in some 
> circumstances
> --
>
> Key: HBASE-17069
> URL: https://issues.apache.org/jira/browse/HBASE-17069
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.4
>Reporter: Andrew Purtell
>Priority: Critical
> Attachments: daughter_1_d55ef81c2f8299abbddfce0445067830.log, 
> daughter_2_08629d59564726da2497f70451aafcdb.log, logs.tar.gz, 
> parent-393d2bfd8b1c52ce08540306659624f2.log
>
>
> I have been seeing frequent ITBLL failures testing various versions of 1.2.x. 
> Over the lifetime of 1.2.x the following issues have been fixed:
> - HBASE-15315 (Remove always set super user call as high priority)
> - HBASE-16093 (Fix splits failed before creating daughter regions leave meta 
> inconsistent)
> And this one is pending:
> - HBASE-17044 (Fix merge failed before creating merged region leaves meta 
> inconsistent)
> I can apply all of the above to branch-1.2 and still see this failure: 
> *The life of stillborn region d55ef81c2f8299abbddfce0445067830*
> *Master sees SPLITTING_NEW*
> {noformat}
> 2016-11-08 04:23:21,186 INFO  [AM.ZK.Worker-pool2-t82] master.RegionStates: 
> Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW, 
> ts=1478579001186, server=node-3.cluster,16020,1478578389506}
> {noformat}
> *The RegionServer creates it*
> {noformat}
> 2016-11-08 04:23:26,035 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for GomnU: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,038 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for big: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,442 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for meta: blockCache=LruBlockCache{blockCount=63, 
> currentSize=17187656, freeSize=12821524664, maxSize=12838712320, 
> heapSize=17187656, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,713 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for nwmrW: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,715 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for piwbr: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> 

[jira] [Commented] (HBASE-16938) TableCFsUpdater maybe failed due to no write permission on peerNode

2016-11-10 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655806#comment-15655806
 ] 

Guanghao Zhang commented on HBASE-16938:


We have used the same tool on our production cluster. Thanks [~enis].

> TableCFsUpdater maybe failed due to no write permission on peerNode
> ---
>
> Key: HBASE-16938
> URL: https://issues.apache.org/jira/browse/HBASE-16938
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16938.patch, HBASE-16938.patch
>
>
> After HBASE-11393, replication table-cfs uses a PB object, so the old string 
> config needs to be copied to the new PB object when upgrading a cluster. In 
> our use case we have different Kerberos credentials for different clusters, 
> e.g. the online serving cluster and the offline processing cluster, and we 
> use a unified global admin Kerberos principal for all clusters. The peer 
> node is created by the client, so only the global admin has write permission 
> on it. When upgrading the cluster, HMaster doesn't have write permission on 
> the peer node, so it may fail to copy the old table-cfs string to the new PB 
> object. I think a client-side tool is needed to do this copy job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14882) Provide a Put API that adds the provided family, qualifier, value without copying

2016-11-10 Thread Xiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655802#comment-15655802
 ] 

Xiang Li commented on HBASE-14882:
--

[~anoop.hbase] Would you please help to review patch 003 when you have time ^_^

> Provide a Put API that adds the provided family, qualifier, value without 
> copying
> -
>
> Key: HBASE-14882
> URL: https://issues.apache.org/jira/browse/HBASE-14882
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: Xiang Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14882.master.000.patch, 
> HBASE-14882.master.001.patch, HBASE-14882.master.002.patch, 
> HBASE-14882.master.003.patch
>
>
> In the Put API, we have addImmutable()
> {code}
>  /**
>* See {@link #addColumn(byte[], byte[], byte[])}. This version expects
>* that the underlying arrays won't change. It's intended
>* for usage internal to HBase and for advanced client applications.
>*/
>   public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being 
> copied locally to create the KeyValue.
> Ideally we should provide an API that truly uses the immutable family, 
> qualifier and value without copying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
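The copy-vs-reference distinction at issue above can be seen in a toy example (illustrative only, not HBase's Put internals): a copying add is insulated from later mutation of the caller's array, while a zero-copy add keeps the reference, which avoids the allocation but obliges the caller never to modify the array afterwards.

```java
import java.util.Arrays;

public class ImmutableAddSketch {
    static byte[] addWithCopy(byte[] value) {
        return Arrays.copyOf(value, value.length); // extra allocation + copy
    }

    static byte[] addZeroCopy(byte[] value) {
        return value; // keeps the caller's reference; no allocation
    }

    // Returns {first byte of the copied value, first byte of the referenced value}
    // after the caller mutates the source array.
    static int[] demo() {
        byte[] value = {1, 2, 3};
        byte[] copied = addWithCopy(value);
        byte[] referenced = addZeroCopy(value);
        value[0] = 9; // caller mutates the array after the add
        return new int[] {copied[0], referenced[0]};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0]); // 1: the copy is insulated from the mutation
        System.out.println(r[1]); // 9: the zero-copy add sees the change
    }
}
```

This is why a truly zero-copy addImmutable() shifts the "won't change" guarantee from the implementation onto the caller.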


[jira] [Updated] (HBASE-17063) Cleanup TestHRegion : remove duplicate variables for method name and two unused params in initRegion

2016-11-10 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-17063:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~stack] for the review.

> Cleanup TestHRegion : remove duplicate variables for method name and two 
> unused params in initRegion
> 
>
> Key: HBASE-17063
> URL: https://issues.apache.org/jira/browse/HBASE-17063
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
>  Labels: cleanup, tests
> Fix For: 2.0.0
>
> Attachments: HBASE-17063.master.001.patch
>
>
> - Replace test-function-local table names and method names with those 
> initialized in the setup() function
> - Remove unused params from initHRegion().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17063) Cleanup TestHRegion : remove duplicate variables for method name and two unused params in initRegion

2016-11-10 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-17063:
-
Fix Version/s: 2.0.0

> Cleanup TestHRegion : remove duplicate variables for method name and two 
> unused params in initRegion
> 
>
> Key: HBASE-17063
> URL: https://issues.apache.org/jira/browse/HBASE-17063
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
>  Labels: cleanup, tests
> Fix For: 2.0.0
>
> Attachments: HBASE-17063.master.001.patch
>
>
> - Replace test-function-local table names and method names with those 
> initialized in the setup() function
> - Remove unused params from initHRegion().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17063) Cleanup TestHRegion : remove duplicate variables for method name and two unused params in initRegion

2016-11-10 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-17063:
-
Labels: cleanup tests  (was: )

> Cleanup TestHRegion : remove duplicate variables for method name and two 
> unused params in initRegion
> 
>
> Key: HBASE-17063
> URL: https://issues.apache.org/jira/browse/HBASE-17063
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
>  Labels: cleanup, tests
> Fix For: 2.0.0
>
> Attachments: HBASE-17063.master.001.patch
>
>
> - Replace test-function-local table names and method names with those 
> initialized in the setup() function
> - Remove unused params from initHRegion().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17068) Procedure v2 - inherit region locks

2016-11-10 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-17068:

Attachment: HBASE-17068-v1.patch

> Procedure v2 - inherit region locks 
> 
>
> Key: HBASE-17068
> URL: https://issues.apache.org/jira/browse/HBASE-17068
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-17068-v0.patch, HBASE-17068-v1.patch
>
>
> Add support for inherited region locks. 
> e.g. Split will have Assign/Unassign as child which will take the lock on the 
> same region split is running on



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
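The inherited-lock idea described above (a Split's Assign/Unassign children taking the lock their parent already holds) can be sketched as lock ownership keyed by the root procedure id, so any procedure in the owner's tree acquires "for free". This is a hypothetical simplification, not the MasterProcedureScheduler code.

```java
import java.util.HashMap;
import java.util.Map;

public class InheritedRegionLockSketch {
    // region -> id of the root procedure that owns its lock
    private final Map<String, Long> ownerRootProc = new HashMap<>();

    // rootProcId identifies the root of the procedure tree requesting the lock.
    synchronized boolean tryLock(String region, long rootProcId) {
        Long owner = ownerRootProc.get(region);
        if (owner == null) {
            ownerRootProc.put(region, rootProcId);
            return true;
        }
        return owner == rootProcId; // children of the owner inherit the lock
    }

    synchronized void unlock(String region) {
        ownerRootProc.remove(region);
    }

    // Returns {parent acquires, child inherits, unrelated proc blocked}.
    static boolean[] demo() {
        InheritedRegionLockSketch locks = new InheritedRegionLockSketch();
        boolean split = locks.tryLock("r1", 10); // Split takes the region lock
        boolean child = locks.tryLock("r1", 10); // its Assign child inherits it
        boolean other = locks.tryLock("r1", 20); // an unrelated proc is blocked
        return new boolean[] {split, child, other};
    }

    public static void main(String[] args) {
        for (boolean b : demo()) System.out.println(b); // true, true, false
    }
}
```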


[jira] [Commented] (HBASE-17068) Procedure v2 - inherit region locks

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655753#comment-15655753
 ] 

Hadoop QA commented on HBASE-17068:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
32s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 7s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 41s 
{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 43s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
10s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 120m 44s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler$Queue.exclusiveLockProcIdOwner;
 locked 75% of time  Unsynchronized access at MasterProcedureScheduler.java:75% 
of time  Unsynchronized access at MasterProcedureScheduler.java:[line 1153] |
| Timed out junit tests | org.apache.hadoop.hbase.util.TestIdLock |
|   | org.apache.hadoop.hbase.util.TestHBaseFsckReplicas |
|   | org.apache.hadoop.hbase.util.TestRegionSplitter |
|   | org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel |
|   | org.apache.hadoop.hbase.util.TestIdReadWriteLock |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838456/HBASE-17068-v0.patch |
| JIRA Issue | HBASE-17068 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 7e8ca7003313 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 12eec5b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HBASE-Build/4426/artifact/patchprocess/new-findbugs-hbase-server.html
 

[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Status: Open  (was: Patch Available)

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch
>
>
> RegionSplitter throws an exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
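The stack trace above (String.compareTo reached through TreeMap.get with a ServerName argument) is the classic raw-type TreeMap failure: the map is keyed or sorted by one type but probed with another, so natural-ordering compareTo receives an incompatible argument. A simplified reproduction of the failure mode, not the RegionSplitter code itself:

```java
import java.util.TreeMap;

public class RawTreeMapSketch {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean probeWithWrongType() {
        TreeMap map = new TreeMap(); // raw type: no compile-time key check
        map.put("server-a", 1);      // String keys, natural ordering
        try {
            map.get(42);             // Integer probe against a String key
            return false;
        } catch (ClassCastException e) {
            return true;             // same failure mode as the report
        }
    }

    public static void main(String[] args) {
        System.out.println(probeWithWrongType()); // true
    }
}
```

Parameterizing the map (and the comparator in RegionSplitter's case) with the correct key type turns this runtime crash into a compile error.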


[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Status: Patch Available  (was: Open)

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch
>
>
> RegionSplitter throws an exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17062) RegionSplitter throws ClassCastException

2016-11-10 Thread Jeongdae Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HBASE-17062:
-
Attachment: HBASE-17062.002.patch

> RegionSplitter throws ClassCastException
> 
>
> Key: HBASE-17062
> URL: https://issues.apache.org/jira/browse/HBASE-17062
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Jeongdae Kim
>Priority: Minor
> Attachments: HBASE-17062.001.patch, HBASE-17062.002.patch
>
>
> RegionSplitter throws an exception as below.
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hbase.ServerName cannot be cast to java.lang.String
>   at java.lang.String.compareTo(String.java:108)
>   at java.util.TreeMap.getEntry(TreeMap.java:346)
>   at java.util.TreeMap.get(TreeMap.java:273)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:504)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter$1.compare(RegionSplitter.java:502)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>   at java.util.TimSort.sort(TimSort.java:189)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:502)
>   at 
> org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:367)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17069) RegionServer writes invalid META entries for split daughters in some circumstances

2016-11-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655722#comment-15655722
 ] 

Andrew Purtell commented on HBASE-17069:


I can't say, [~mantonov]. I haven't tested 1.3 and up; pretty busy here. I'm 
assuming you have not seen it? I will aim to get a test of the latest 1.3.0 in 
tomorrow, but can't promise it.

> RegionServer writes invalid META entries for split daughters in some 
> circumstances
> --
>
> Key: HBASE-17069
> URL: https://issues.apache.org/jira/browse/HBASE-17069
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.4
>Reporter: Andrew Purtell
>Priority: Critical
> Attachments: daughter_1_d55ef81c2f8299abbddfce0445067830.log, 
> daughter_2_08629d59564726da2497f70451aafcdb.log, logs.tar.gz, 
> parent-393d2bfd8b1c52ce08540306659624f2.log
>
>
> I have been seeing frequent ITBLL failures testing various versions of 1.2.x. 
> Over the lifetime of 1.2.x the following issues have been fixed:
> - HBASE-15315 (Remove always set super user call as high priority)
> - HBASE-16093 (Fix splits failed before creating daughter regions leave meta 
> inconsistent)
> And this one is pending:
> - HBASE-17044 (Fix merge failed before creating merged region leaves meta 
> inconsistent)
> I can apply all of the above to branch-1.2 and still see this failure: 
> *The life of stillborn region d55ef81c2f8299abbddfce0445067830*
> *Master sees SPLITTING_NEW*
> {noformat}
> 2016-11-08 04:23:21,186 INFO  [AM.ZK.Worker-pool2-t82] master.RegionStates: 
> Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW, 
> ts=1478579001186, server=node-3.cluster,16020,1478578389506}
> {noformat}
> *The RegionServer creates it*
> {noformat}
> 2016-11-08 04:23:26,035 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for GomnU: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,038 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for big: blockCache=LruBlockCache{blockCount=34, 
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320, 
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,442 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for meta: blockCache=LruBlockCache{blockCount=63, 
> currentSize=17187656, freeSize=12821524664, maxSize=12838712320, 
> heapSize=17187656, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,713 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for nwmrW: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,715 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for piwbr: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320, 
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, 
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false
> 2016-11-08 04:23:26,717 INFO  
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created 
> cacheConfig for tiny: blockCache=LruBlockCache{blockCount=96, 
> currentSize=19178440, 

[jira] [Updated] (HBASE-17060) backport HBASE-16570 to 1.3.1

2016-11-10 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-17060:

Fix Version/s: 1.3.1

> backport HBASE-16570 to 1.3.1
> -
>
> Key: HBASE-17060
> URL: https://issues.apache.org/jira/browse/HBASE-17060
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 1.3.0
>Reporter: Yu Li
>Assignee: binlijin
> Fix For: 1.3.1
>
>
> Needs a backport after 1.3.0 is released



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655708#comment-15655708
 ] 

Mikhail Antonov commented on HBASE-17020:
-

thanks for the ping! 1.3.1 backport seems good.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0, 1.4.0, 1.2.5, 0.98.24, 1.1.8
>
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch, 
> HBASE-17020.branch-0.98.patch, HBASE-17020.branch-0.98.patch, 
> HBASE-17020.branch-1.1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the local variable keyLen actually gets the total length, 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just the key length;
> the code is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
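The arithmetic behind the bug can be sketched with a toy index layout (simplified; the real constant and buffer layout live in HFileBlockIndex): each secondary-index entry contributes a fixed overhead plus the key bytes, so the key length is the offset delta minus the overhead. Using the raw delta over-reads past the key, and for the last entry of a leaf block that over-read can run off the end of the buffer, matching the ArrayIndexOutOfBoundsException above.

```java
public class MidkeyLenSketch {
    static final int OVERHEAD = 8; // stand-in for SECONDARY_INDEX_ENTRY_OVERHEAD

    // Returns {buggyLen, fixedLen} for the last entry of a toy index.
    static int[] lens() {
        int[] keyLens = {3, 5, 4};
        // offsets[i] = bytes occupied by entries before entry i
        int[] offsets = new int[keyLens.length + 1];
        for (int i = 0; i < keyLens.length; i++) {
            offsets[i + 1] = offsets[i] + OVERHEAD + keyLens[i];
        }
        int mid = 2; // last entry, where the over-read escapes the buffer
        int buggyLen = offsets[mid + 1] - offsets[mid];            // raw delta
        int fixedLen = offsets[mid + 1] - offsets[mid] - OVERHEAD; // key bytes only
        return new int[] {buggyLen, fixedLen};
    }

    public static void main(String[] args) {
        int[] r = lens();
        System.out.println(r[0]); // 12: delta includes the per-entry overhead
        System.out.println(r[1]); // 4: the actual key length
    }
}
```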

