[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry

2017-06-21 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057283#comment-16057283
 ] 

deepankar commented on HBASE-16616:
---

Ah, got it. Thanks, [~tomu.tsuruhara], for pointing me to that patch; that will 
definitely help.

> Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
> --
>
> Key: HBASE-16616
> URL: https://issues.apache.org/jira/browse/HBASE-16616
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Affects Versions: 1.2.2
>Reporter: Tomu Tsuruhara
>Assignee: Tomu Tsuruhara
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, 
> HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png
>
>
> In our HBase 1.2.2 cluster, some regionservers showed a very bad 
> "QueueCallTime_99th_percentile", exceeding 10 seconds.
> Most RPC handler threads were stuck in the ThreadLocalMap.expungeStaleEntry 
> call at that time.
> {noformat}
> "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 
> os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000]
>java.lang.Thread.State: RUNNABLE
> at 
> java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617)
> at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499)
> at 
> java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298)
> at java.lang.ThreadLocal.remove(ThreadLocal.java:222)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81)
> at 
> org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81)
> at 
> org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59)
> at 
> org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194)
> at 
> org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We were using JDK 1.8.0_92; here is a snippet from ThreadLocal.java.
> {code}
> 616:while (tab[h] != null)
> 617:h = nextIndex(h, len);
> {code}
> So I hypothesized that there were too many consecutive entries in the {{tab}} 
> array, and indeed I found them in the heap dump.
> !ScreenShot 2016-09-09 14.17.53.png|width=50%!
> Most of these entries pointed at instances of 
> {{org.apache.hadoop.hbase.util.Counter$1}}, which corresponds to the 
> {{indexHolderThreadLocal}} instance variable in the {{Counter}} class.
> Because the {{RpcServer$Connection}} class creates a {{Counter}} instance 
> {{rpcCount}} for every connection, it is possible to have lots of 
> {{Counter#indexHolderThreadLocal}} instances in the RegionServer process when a 
> client repeatedly connects and disconnects. As a result, a ThreadLocalMap can 
> have lots of consecutive entries.
> Usually, since each entry is a {{WeakReference}}, these entries are collected 
> and removed by the garbage collector soon after the connection is closed.
> But if the connection's lifetime is long enough to survive a young GC, they 
> won't be collected until the old-gen collector runs.
> Furthermore, under a G1GC deployment, they may not be collected even by the 
> old-gen GC (mixed GC) if the entries sit in a region which doesn't hold much 
> garbage.
> We were in fact using G1GC when we encountered this problem.
> We should remove the entry from the ThreadLocalMap by calling 
> ThreadLocal#remove explicitly.
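
A minimal sketch of the idea in that last sentence (hypothetical class and field names, not the actual HBASE-16616 patch): a per-connection counter whose owner calls ThreadLocal#remove explicitly instead of waiting for the GC to clear the WeakReference-keyed entry.

{code}
// Hypothetical per-connection counter; illustrates calling ThreadLocal#remove
// explicitly rather than relying on GC to clear the stale map entry.
public class ConnectionRpcCount {
  private final ThreadLocal<long[]> indexHolder = ThreadLocal.withInitial(() -> new long[1]);

  public void increment() {
    indexHolder.get()[0]++;
  }

  // Called when the connection is closed. Note that remove() only clears the
  // entry in the calling thread's ThreadLocalMap.
  public void destroy() {
    indexHolder.remove();
  }
}
{code}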



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry

2017-06-20 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056554#comment-16056554
 ] 

deepankar commented on HBASE-16616:
---

Hi, we have recently encountered this issue in our production environment, and I 
was going through the patch. I am not sure the patch solves the issue fully; the 
doubt is mainly due to the fact that Counter#destroy() will only remove the 
ThreadLocal entry from the ThreadLocalMap of the thread closing the connection, 
while the entries for this Counter's ThreadLocal remain alive in the maps of the 
rest of the handler threads. Am I missing something in this logic?
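
A small standalone example of the behaviour the question is about (illustrative only, not code from the patch): ThreadLocal#remove clears the entry only in the calling thread's ThreadLocalMap, so entries created by other handler threads remain until those threads clean up their own maps or the GC clears the weak keys.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalRemoveDemo {
  static final ThreadLocal<Object> PER_CONNECTION = new ThreadLocal<>();

  public static void main(String[] args) throws Exception {
    // Two long-lived "handler" threads each populate their own ThreadLocalMap.
    ExecutorService handlers = Executors.newFixedThreadPool(2);
    Runnable touch = () -> PER_CONNECTION.set(new Object());

    touch.run();                    // main thread's map now has an entry
    handlers.submit(touch).get();   // so do the pool threads' maps
    handlers.submit(touch).get();

    PER_CONNECTION.remove();        // clears ONLY the main thread's entry; the
                                    // pool threads' maps keep theirs until those
                                    // threads remove them (or GC intervenes)
    handlers.shutdown();
  }
}
{code}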

cc [~stack]  [~tomu.tsuruhara]

> Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
> --
>
> Key: HBASE-16616
> URL: https://issues.apache.org/jira/browse/HBASE-16616
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Affects Versions: 1.2.2
>Reporter: Tomu Tsuruhara
>Assignee: Tomu Tsuruhara
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, 
> HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png
>
>
> In our HBase 1.2.2 cluster, some regionservers showed a very bad 
> "QueueCallTime_99th_percentile", exceeding 10 seconds.
> Most RPC handler threads were stuck in the ThreadLocalMap.expungeStaleEntry 
> call at that time.
> {noformat}
> "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 
> os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000]
>java.lang.Thread.State: RUNNABLE
> at 
> java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617)
> at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499)
> at 
> java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298)
> at java.lang.ThreadLocal.remove(ThreadLocal.java:222)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113)
> at 
> com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81)
> at 
> org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81)
> at 
> org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59)
> at 
> org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194)
> at 
> org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We were using JDK 1.8.0_92; here is a snippet from ThreadLocal.java.
> {code}
> 616:while (tab[h] != null)
> 617:h = nextIndex(h, len);
> {code}
> So I hypothesized that there were too many consecutive entries in the {{tab}} 
> array, and indeed I found them in the heap dump.
> !ScreenShot 2016-09-09 14.17.53.png|width=50%!
> Most of these entries pointed at instances of 
> {{org.apache.hadoop.hbase.util.Counter$1}}, which corresponds to the 
> {{indexHolderThreadLocal}} instance variable in the {{Counter}} class.
> Because the {{RpcServer$Connection}} class creates a {{Counter}} instance 
> {{rpcCount}} for every connection, it is possible to have lots of 
> {{Counter#indexHolderThreadLocal}} instances in the RegionServer process when a 
> client repeatedly connects and disconnects. As a result, a ThreadLocalMap can 
> have lots of consecutive entries.
> Usually, since each entry is a {{WeakReference}}, these entries are collected 
> and removed by the garbage collector soon after the connection is closed.
> But if the connection's lifetime is long enough to survive a young GC, they 
> won't be collected until the old-gen collector runs.
> Furthermore, under a G1GC deployment, they may not be collected even by the 
> old-gen GC (mixed GC) if the entries sit in a region which doesn't hold much 
> garbage.
> We were in fact using G1GC when we encountered this problem.
> We should remove the entry from the ThreadLocalMap by calling 
> ThreadLocal#remove explicitly.

[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-03-05 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896860#comment-15896860
 ] 

deepankar commented on HBASE-16630:
---

Thanks, guys, for pushing this and for dealing with my errors and unresponsiveness.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.6
>
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.patch, 
> HBASE-16630-v3-branch-1.X.patch, HBASE-16630-v3.patch, 
> HBASE-16630-v4-branch-1.X.patch
>
>
> As we have been running the bucket cache for a long time in our system, we are 
> observing cases where some nodes, after some time, do not fully utilize the 
> bucket cache; in some cases it is even worse, in the sense that they get stuck 
> at a value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR, as all our 
> tables are configured in-memory for simplicity's sake).
> We took a heap dump, analyzed what is happening, and saw that it is a classic 
> case of fragmentation. The current implementation of BucketCache (mainly 
> BucketAllocator) relies on fullyFreeBuckets being available for 
> switching/adjusting cache usage between different bucketSizes. But once a 
> compaction / bulkload happens and blocks are evicted from a bucket size, they 
> are usually evicted from random places in the buckets of that bucketSize, which 
> locks up the buckets associated with that bucketSize. In the worst cases of 
> fragmentation we have seen some bucketSizes with an occupancy ratio of < 10 %, 
> yet without any completelyFreeBuckets to share with the other bucketSizes.
> Currently the existing eviction logic helps in the cases where the cache used 
> is more than the MEMORY_FACTOR or MULTI_FACTOR; once those evictions are done, 
> the eviction (freeSpace function) will not evict anything further, and the 
> cache utilization will be stuck at that value without any allocations for the 
> other required sizes.
> The fix we came up with is simple: we defragment (compact) the bucketSize, thus 
> increasing the occupancy ratio and also freeing up buckets to be fully free. 
> This logic itself is not complicated, as the bucketAllocator takes care of 
> packing the blocks into buckets; we need to evict and re-allocate the blocks 
> for all the bucketSizes that don't fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking, 
> and I'll improve it based on the comments from the community.
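
A self-contained sketch of the defragmentation idea described above, using hypothetical types rather than the real BucketAllocator API: pick the sparsely occupied buckets and evict all of their blocks, so those buckets become completely free and can be reassigned to the bucket sizes that need them.

{code}
import java.util.ArrayList;
import java.util.List;

class Bucket {
  final int sizeIndex;                            // which bucketSize this bucket serves
  final int capacity;                             // how many blocks it can hold
  final List<String> blocks = new ArrayList<>();  // keys of blocks currently cached in it

  Bucket(int sizeIndex, int capacity) {
    this.sizeIndex = sizeIndex;
    this.capacity = capacity;
  }

  double occupancy() {
    return (double) blocks.size() / capacity;
  }
}

class Defragmenter {
  /**
   * Collects the block keys that should be evicted (and later re-cached) so
   * that under-occupied buckets become fully free instead of staying pinned
   * by a handful of scattered blocks.
   */
  static List<String> blocksToEvict(List<Bucket> buckets, double occupancyThreshold) {
    List<String> toEvict = new ArrayList<>();
    for (Bucket b : buckets) {
      if (!b.blocks.isEmpty() && b.occupancy() < occupancyThreshold) {
        toEvict.addAll(b.blocks);   // free the whole bucket, not random slots
      }
    }
    return toEvict;                 // re-cached blocks get packed densely again
  }
}
{code}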



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889710#comment-15889710
 ] 

deepankar commented on HBASE-16630:
---

Sorry, guys, I forgot to check compilation and other minor stuff. I will make 
sure to do this next time.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Fix For: 2.0.0, 1.3.1, 1.2.6
>
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.patch, 
> HBASE-16630-v3-branch-1.X.patch, HBASE-16630-v3.patch, 
> HBASE-16630-v4-branch-1.X.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-28 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16630:
--
Attachment: HBASE-16630-v4-branch-1.X.patch

Updated patch fixing the compilation issue

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Fix For: 2.0.0, 1.3.1, 1.2.6
>
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.patch, 
> HBASE-16630-v3-branch-1.X.patch, HBASE-16630-v3.patch, 
> HBASE-16630-v4-branch-1.X.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889510#comment-15889510
 ] 

deepankar commented on HBASE-16630:
---

Created HBASE-17711 for tracking the addition of tests for this patch

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Fix For: 2.0.0, 1.3.1, 1.2.5
>
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.patch, 
> HBASE-16630-v3-branch-1.X.patch, HBASE-16630-v3.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17711) Add test for Bucket Cache fragmentation fix

2017-02-28 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-17711:
--
Summary: Add test for Bucket Cache fragmentation fix  (was: Add test for 
HBASE-16630 fix)

> Add test for Bucket Cache fragmentation fix
> ---
>
> Key: HBASE-17711
> URL: https://issues.apache.org/jira/browse/HBASE-17711
> Project: HBase
>  Issue Type: Task
>Reporter: deepankar
>Assignee: deepankar
>
> Add tests for the fix in HBASE-16630



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17711) Add test for HBASE-16630 fix

2017-02-28 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-17711:
--
Description: Add tests for the fix in HBASE-16630

> Add test for HBASE-16630 fix
> 
>
> Key: HBASE-17711
> URL: https://issues.apache.org/jira/browse/HBASE-17711
> Project: HBase
>  Issue Type: Task
>Reporter: deepankar
>Assignee: deepankar
>
> Add tests for the fix in HBASE-16630



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17711) Add test for HBASE-16630 fix

2017-02-28 Thread deepankar (JIRA)
deepankar created HBASE-17711:
-

 Summary: Add test for HBASE-16630 fix
 Key: HBASE-17711
 URL: https://issues.apache.org/jira/browse/HBASE-17711
 Project: HBase
  Issue Type: Task
Reporter: deepankar
Assignee: deepankar






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889149#comment-15889149
 ] 

deepankar edited comment on HBASE-16630 at 3/1/17 12:17 AM:


Uploaded the patch [^HBASE-16630-v3-branch-1.X.patch] for branch-1, branch-1.2, 
branch-1.3


was (Author: dvdreddy):
The patch for branch-1, branch-1.2, branch-1.3

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.X.patch, 
> HBASE-16630-v3.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-28 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16630:
--
Attachment: HBASE-16630-v3-branch-1.X.patch

The patch for branch-1, branch-1.2, branch-1.3

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3-branch-1.X.patch, 
> HBASE-16630-v3.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-02-10 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861696#comment-15861696
 ] 

deepankar commented on HBASE-16630:
---

About the unit test: I was busy these past weeks trying to hunt down another bug 
(will update once I confirm it) and did not get a chance to look into it.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2017-01-17 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826660#comment-15826660
 ] 

deepankar commented on HBASE-16630:
---

Sorry, I forgot about the unit test; I will see if I can hack something up. As 
for [~stack]'s suggestion, it is somewhat orthogonal to this issue. I agree that 
the eviction could be improved via the suggestions in HBASE-15560, but it would 
be hard to prevent the fragmentation from happening when we have buckets 
allocated to different sizes, the buckets are occupied very sparsely, and the 
occupants are blocks which are accessed both recently and frequently.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630.patch, HBASE-16630-v2.patch, HBASE-16630-v3.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-12-20 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765587#comment-15765587
 ] 

deepankar commented on HBASE-16630:
---

Sorry for the late reply. I looked into Stack's suggestion and tried to apply it 
to this problem; the suggestion was more about improving the LRU part of the 
eviction, and I was not sure how that helps in this case, where the size bucket 
is entirely full. There is already LRU by means of accessCount, and even though 
that might not be fully optimal compared to Stack's suggestion, in the context 
of this problem the result from both strategies will be the same.

I looked into Ted Yu's suggestion of tracking the last N blocks, but the 
metadata overhead of tracking them for this case was too high. We tried to 
pinpoint the re-eviction issue in the heuristic we use, with some debug 
statements / heap dumps on real data, but we were not successful. So we have 
productionized the attached version of the patch on more HBase clusters and are 
seeing some occasional issues, but nothing serious.



> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630-v2.patch, HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-11-04 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635468#comment-15635468
 ] 

deepankar commented on HBASE-16630:
---

Yeah, I am looking into Stack's suggestion to counter the problem I previously 
mentioned about re-evicting just-cached blocks.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630-v2.patch, HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-11-04 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635468#comment-15635468
 ] 

deepankar edited comment on HBASE-16630 at 11/4/16 6:49 AM:


Yeah, I am looking into Stack's suggestion to counter the problem I previously 
mentioned about re-eviction of just-cached blocks.


was (Author: dvdreddy):
Yeah I am looking into stacks suggestion to counter the problem I have 
previously mentioned about re-evicted just cached blocks

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630-v2.patch, HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16878) Call Queue length enforced not accounting for the queue handler factor

2016-10-18 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16878:
--
Attachment: HBASE-16878.patch

> Call Queue length enforced not accounting for the queue handler factor
> --
>
> Key: HBASE-16878
> URL: https://issues.apache.org/jira/browse/HBASE-16878
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.0.0, 0.98.4
>Reporter: deepankar
>Assignee: deepankar
>Priority: Minor
> Attachments: HBASE-16878.patch
>
>
> After HBASE-11355, we are currently not accounting for the handler factor when 
> deciding the callQueue length, which leads to some pretty large queue sizes if 
> the handler factor is high enough.
> Patch attached; the change is a one-liner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16878) Call Queue length enforced not accounting for the queue handler factor

2016-10-18 Thread deepankar (JIRA)
deepankar created HBASE-16878:
-

 Summary: Call Queue length enforced not accounting for the queue 
handler factor
 Key: HBASE-16878
 URL: https://issues.apache.org/jira/browse/HBASE-16878
 Project: HBase
  Issue Type: Bug
  Components: rpc
Affects Versions: 0.98.4, 1.0.0
Reporter: deepankar
Assignee: deepankar
Priority: Minor


After HBASE-11355, we are currently not accounting for the handler factor when 
deciding the callQueue length, which leads to some pretty large queue sizes if 
the handler factor is high enough.

Patch attached; the change is a one-liner.
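
A rough sketch of the arithmetic being described (hypothetical numbers and names; this is not the attached patch): if the configured maximum call-queue length is applied to each of the queues created by the handler factor instead of being spread across them, the aggregate queue capacity grows with the number of queues.

{code}
public class CallQueueLengthExample {
  public static void main(String[] args) {
    int handlerCount = 30;
    double handlerFactor = 0.5;   // fraction of handlers that get their own queue
    int numCallQueues = Math.max(1, (int) Math.round(handlerCount * handlerFactor));

    // Assume a budget of 10 queued calls per handler.
    int maxCallQueueLength = handlerCount * 10;

    int totalIfAppliedPerQueue = numCallQueues * maxCallQueueLength;                      // 15 * 300 = 4500
    int totalIfSharedAcrossQueues = numCallQueues * (maxCallQueueLength / numCallQueues); // ~300

    System.out.println("per-queue limit applied as-is: " + totalIfAppliedPerQueue + " calls");
    System.out.println("limit divided across queues:   " + totalIfSharedAcrossQueues + " calls");
  }
}
{code}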






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-10-16 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580415#comment-15580415
 ] 

deepankar commented on HBASE-16630:
---

Sorry for the delayed reply. The bucket cache is not stuck as before; at least 
it is evicting blocks and adding new ones. But we are occasionally facing a 
different issue, especially under peak load: the heuristic we have is repeatedly 
evicting the same blocks again and again, and that is sometimes leading to a 
very high load average. We are trying to come up with some tweaks that also take 
into consideration the total blocks allocated for a bucket. If you have any 
other suggestions, we can try them out too.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: 16630-v2-suggest.patch, 16630-v3-suggest.patch, 
> HBASE-16630-v2.patch, HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-22 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514299#comment-15514299
 ] 

deepankar commented on HBASE-16630:
---

Which map are you referring to, [~vrodionov]? I am constructing a couple of 
sets which are at most O(bucket count).

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: 16630-v2-suggest.patch, HBASE-16630-v2.patch, 
> HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-22 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16630:
--
Attachment: HBASE-16630-v3.patch

Sorry for the delay in adding the suggestion, [~tedyu]. I attached a patch now 
which, in addition to your suggestions, contains a couple of import fixes. I 
also tested the patch on a couple of machines and everything looked fine. We are 
doing a cluster-wide deploy today and will report on the results.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: 16630-v2-suggest.patch, HBASE-16630-v2.patch, 
> HBASE-16630-v3.patch, HBASE-16630.patch
>
>
> As we are running bucket cache for a long time in our system, we are 
> observing cases where some nodes after some time does not fully utilize the 
> bucket cache, in some cases it is even worse in the sense they get stuck at a 
> value < 0.25 % of the bucket cache (DEFAULT_MEMORY_FACTOR as all our tables 
> are configured in-memory for simplicity sake).
> We took a heap dump and analyzed what is happening and saw that is classic 
> case of fragmentation, current implementation of BucketCache (mainly 
> BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
> switching/adjusting cache usage between different bucketSizes . But once a 
> compaction / bulkload happens and the blocks are evicted from a bucket size , 
> these are usually evicted from random places of the buckets of a bucketSize 
> and thus locking the number of buckets associated with a bucketSize and in 
> the worst case of the fragmentation we have seen some bucketSizes with 
> occupancy ratio of <  10 % But they dont have any completelyFreeBuckets to 
> share with the other bucketSize. 
> Currently the existing eviction logic helps in the cases where cache used is 
> more the MEMORY_FACTOR or MULTI_FACTOR and once those evictions are also 
> done, the eviction (freeSpace function) will not evict anything and the cache 
> utilization will be stuck at that value without any allocations for other 
> required sizes.
> The fix for this we came up with is simple that we do deFragmentation ( 
> compaction) of the bucketSize and thus increasing the occupancy ratio and 
> also freeing up the buckets to be fullyFree, this logic itself is not 
> complicated as the bucketAllocator takes care of packing the blocks in the 
> buckets, we need evict and re-allocate the blocks for all the BucketSizes 
> that dont fit the criteria.
> I am attaching an initial patch just to give an idea of what we are thinking 
> and I'll improve it based on the comments from the community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-20 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508469#comment-15508469
 ] 

deepankar commented on HBASE-16630:
---

I used it predominantly for limiting the number of buckets we needed; if we use 
a TreeSet we might still need to do the copy.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630-v2.patch, HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-20 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16630:
--
Attachment: HBASE-16630-v2.patch

Attached v2. This patch changes the defragmentation logic to the one suggested 
by [~vrodionov] and the one used by memcached, where we just choose some slabs 
and evict them completely; the code turned out to be much cleaner and smaller. 
The only heuristics are that we avoid the buckets that are the only bucket for 
a BucketSizeInfo and the ones that have blocks currently in use, and we order 
the candidates by their occupancy ratio.

Also added the change suggested by [~zjushch] to add the memory type to the 
eviction.

If this logic looks fine to the community, I will go ahead and add a unit test 
for the freeEntireBlocks method.
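
A minimal sketch of that selection heuristic, with illustrative stand-in types rather than the patch's actual classes: skip buckets that are the sole bucket of their BucketSizeInfo, skip buckets holding blocks with in-flight reads, and order the rest by occupancy ratio so the emptiest slabs are evicted first.

{code}
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SlabEvictionCandidates {
  // Illustrative stand-in for BucketAllocator internals, not the real classes.
  record CandidateBucket(int bucketIndex, int usedCount, int itemCapacity,
                         boolean onlyBucketOfItsSize, boolean hasBlockInUse) {
    double occupancyRatio() { return (double) usedCount / itemCapacity; }
  }

  /** Buckets eligible for whole-slab eviction, emptiest first. */
  static List<CandidateBucket> pickCandidates(List<CandidateBucket> buckets) {
    return buckets.stream()
        .filter(b -> !b.onlyBucketOfItsSize())   // keep at least one bucket per size
        .filter(b -> !b.hasBlockInUse())         // never touch blocks with active readers
        .sorted(Comparator.comparingDouble(CandidateBucket::occupancyRatio))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<CandidateBucket> buckets = List.of(
        new CandidateBucket(0, 2, 64, false, false),
        new CandidateBucket(1, 60, 64, false, false),
        new CandidateBucket(2, 1, 64, true, false),    // sole bucket of its size: skipped
        new CandidateBucket(3, 5, 64, false, true));   // has an in-flight read: skipped
    pickCandidates(buckets).forEach(b -> System.out.println("evict bucket " + b.bucketIndex()));
  }
}
{code}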

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630-v2.patch, HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505115#comment-15505115
 ] 

deepankar edited comment on HBASE-16630 at 9/20/16 12:17 AM:
-

Yup, looks like this is exactly what we want. The idea is to pick a random slab 
and evict all elements from it, but we can be a little more clever here and pick 
the slabs that have the least number of elements in them, so that we do the 
least number of reads to re-cache the elements from them. This idea is simple 
and could be done easily within the current code structure, I think, with a 
small amount of code. [~vrodionov], what do you think about this heuristic? I 
think this will also address the concerns [~zjushch] raised.

[~vrodionov] Thanks for the idea.


was (Author: dvdreddy):
Yup, looks like this is exactly what we want. The idea is to pick a random slab 
and evict all elements from it, but we can be a little more clever here and pick 
the slabs that have the least number of elements in them, so that we do the 
least number of reads to re-cache the elements from them. This idea is simple 
and could be done easily within the current code structure, I think, with a 
small amount of code. [~vrodionov], what do you think? This will also address 
the concerns [~zjushch] raised.

[~vrodionov] Thanks for the idea.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505115#comment-15505115
 ] 

deepankar commented on HBASE-16630:
---

Yup, looks like this is exactly what we want. The idea is to pick a random slab 
and evict all elements from it, but we can be a little more clever here and pick 
the slabs that have the least number of elements in them, so that we do the 
least number of reads to re-cache the elements from them. This idea is simple 
and could be done easily within the current code structure, I think, with a 
small amount of code. [~vrodionov], what do you think? This will also address 
the concerns [~zjushch] raised.

[~vrodionov] Thanks for the idea.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504536#comment-15504536
 ] 

deepankar commented on HBASE-16630:
---

This is a very good idea, to make sure all the buckets have a freeCount. When 
there are no free counts, one way to enforce this would be to force eviction 
from that BucketSizeInfo, which would also lead to eviction of hot blocks if 
the current number of buckets of that BucketSize is small; it would also lock 
up the unused and fragmented space in the other bucketSizes until a compaction 
or something else frees them up.
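
To make that trade-off concrete, a toy check (hypothetical names, not the real BucketSizeInfo API) of the condition under which such a forced eviction would even be needed: a size that has no free slots at all and no completely free bucket to borrow.

{code}
public class ForcedEvictionGuard {
  // Illustrative summary of one bucket size, not the real BucketSizeInfo.
  record SizeInfo(int freeItemSlots, int completelyFreeBuckets, int totalBuckets) {}

  /**
   * Forced eviction (with the hot-block risk discussed above) is only needed when a
   * size has no free slots and cannot take over a completely free bucket either.
   */
  static boolean needsForcedEviction(SizeInfo info) {
    return info.freeItemSlots() == 0 && info.completelyFreeBuckets() == 0;
  }

  public static void main(String[] args) {
    System.out.println(needsForcedEviction(new SizeInfo(0, 0, 4)));  // true: must evict
    System.out.println(needsForcedEviction(new SizeInfo(0, 1, 4)));  // false: can grab a free bucket
  }
}
{code}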

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504518#comment-15504518
 ] 

deepankar commented on HBASE-16630:
---

Ha, nice point, I missed this. But this would still not guarantee that we free 
up the buckets needed for the given BucketSize in a single run; it might need 
several runs to free up a completely free bucket that could then transfer to 
our hot BucketSize. One other reason it might be slow is that 
bytesToFreeWithExtra depends on the size of the currently hot BucketSize (for 
which there are no free blocks), and that is usually small, so the transition 
might be slow. Another problem is that, due to the sparsely occupied buckets, 
we might keep ending up in the freeSpace method again and again.
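
A simplified sketch of the idea behind that free target (the real computation lives in BucketCache.freeSpace and differs in detail; the constants and names here are illustrative), showing why a single starved, small bucket size yields a small target and therefore slow convergence:

{code}
public class FreeSpaceTarget {
  /**
   * Simplified sketch: sizes with no free slots ask for roughly one bucket's worth of
   * space, padded by an extra factor. When only one small "hot" size is starved, the
   * resulting target is tiny compared to the cache, so each freeSpace run frees little
   * and the fragmented space elsewhere stays locked up for many runs.
   */
  static long bytesToFreeWithExtra(long[] itemSizes, int[] freeSlotsPerSize,
                                   long bucketCapacityBytes, double extraFreeFactor) {
    long withoutExtra = 0;
    for (int i = 0; i < itemSizes.length; i++) {
      if (freeSlotsPerSize[i] == 0) {        // this size is starved
        withoutExtra += bucketCapacityBytes; // ask for roughly one bucket for it
      }
    }
    return (long) Math.floor(withoutExtra * (1.0 + extraFreeFactor));
  }

  public static void main(String[] args) {
    // Only the 16 KB size is starved: the target is one padded bucket, a tiny
    // fraction of a multi-GB cache.
    long target = bytesToFreeWithExtra(new long[] {9216, 17408, 66560},
        new int[] {12, 0, 40}, 4L * 1024 * 1024, 0.10);
    System.out.println("bytes to free ~= " + target);
  }
}
{code}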

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489482#comment-15489482
 ] 

deepankar commented on HBASE-16630:
---

It is moving data, but the storage for the data (i.e., the byte buffers) is 
pre-allocated at the start of the HRegionServer, so it is more like a copy of 
the data from one location to another without any new allocation. The old 
allocation goes onto a free list to be reused for another block of data.
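
A self-contained sketch of that kind of relocation (plain NIO with hypothetical slot management, not the IOEngine code): the copy happens inside memory allocated once at startup, and the old slot simply returns to a free list.

{code}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class RelocationSketch {
  // One big pre-allocated region standing in for the cache's buffers.
  private final ByteBuffer backing;
  private final Deque<Integer> freeOffsets = new ArrayDeque<>();

  RelocationSketch(int capacity, int slotSize) {
    backing = ByteBuffer.allocateDirect(capacity);
    for (int off = 0; off + slotSize <= capacity; off += slotSize) {
      freeOffsets.push(off);
    }
  }

  /**
   * "Moving" a block is just a copy between two offsets of memory allocated once at
   * startup; the old slot goes back on the free list for reuse. Nothing is allocated
   * or freed at the native level.
   */
  int relocate(int oldOffset, int length) {
    int newOffset = freeOffsets.pop();
    ByteBuffer src = backing.duplicate();
    src.position(oldOffset);
    src.limit(oldOffset + length);
    ByteBuffer dst = backing.duplicate();
    dst.position(newOffset);
    dst.put(src);                  // copy the block's bytes
    freeOffsets.push(oldOffset);   // the old slot is reusable immediately
    return newOffset;
  }

  public static void main(String[] args) {
    RelocationSketch sketch = new RelocationSketch(4 * 1024, 1024);
    int oldOff = sketch.freeOffsets.pop();   // pretend a block was written here earlier
    int newOff = sketch.relocate(oldOff, 100);
    System.out.println("block moved from " + oldOff + " to " + newOff);
  }
}
{code}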

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489296#comment-15489296
 ] 

deepankar commented on HBASE-16630:
---

Commented on the comment above: SIGSEGV should not be an issue, as we are not 
actually deallocating memory, and we also don't relocate blocks for which there 
is an ongoing read (we check the refCount for that).
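
A minimal illustration of that refCount guard (using a plain AtomicInteger rather than the actual BucketEntry reference counting); real code would also have to keep a new reader from arriving between the check and the copy, so this only shows the shape of the check.

{code}
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountGuard {
  // Illustrative block handle; HBase tracks this on the cached entry itself.
  static final class CachedBlock {
    final AtomicInteger refCount = new AtomicInteger(0); // active readers
  }

  /**
   * The defragmenter only relocates blocks nobody is reading: a block with
   * refCount > 0 is skipped and retried in a later pass, so an in-flight read
   * never sees its backing memory overwritten.
   */
  static boolean tryRelocate(CachedBlock block, Runnable doCopy) {
    if (block.refCount.get() > 0) {
      return false;          // reader in flight, leave the block where it is
    }
    doCopy.run();            // no reader holds a reference to the old location
    return true;
  }

  public static void main(String[] args) {
    CachedBlock idle = new CachedBlock();
    CachedBlock beingRead = new CachedBlock();
    beingRead.refCount.incrementAndGet();
    System.out.println(tryRelocate(idle, () -> {}));      // true
    System.out.println(tryRelocate(beingRead, () -> {})); // false
  }
}
{code}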

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489291#comment-15489291
 ] 

deepankar commented on HBASE-16630:
---

The only problem is that currently some IOEngines (e.g., ByteBufferIOEngine) 
have a hard requirement that the ByteBuffer passed to them for writing be 
backed by an array; this is the sole reason for the copy, and if we can remove 
that requirement we can skip the copy. I think the main reason for it is that 
ByteBuff does not have the API methods for copying from a ByteBuffer. I will 
take a look and see if this can be avoided.
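
For reference, the shape of the copy being discussed, in plain NIO (this is not the ByteBufferIOEngine code, just an illustration of why an array-backed requirement forces an extra copy for off-heap sources):

{code}
import java.nio.ByteBuffer;

public class ArrayBackedCopy {
  /**
   * If the destination write path only accepts array-backed (heap) buffers, an
   * off-heap source has to be copied into a heap buffer first. If the engine could
   * accept any ByteBuffer (or a ByteBuff with a suitable copy API), this
   * intermediate copy could be skipped.
   */
  static ByteBuffer toArrayBacked(ByteBuffer src) {
    if (src.hasArray()) {
      return src;                 // already heap-backed, nothing to do
    }
    ByteBuffer heap = ByteBuffer.allocate(src.remaining());
    heap.put(src.duplicate());    // the extra copy we would like to avoid
    heap.flip();
    return heap;
  }

  public static void main(String[] args) {
    ByteBuffer offHeap = ByteBuffer.allocateDirect(16);
    offHeap.putLong(42L).flip();
    System.out.println(toArrayBacked(offHeap).hasArray()); // true, after one copy
  }
}
{code}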

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489287#comment-15489287
 ] 

deepankar commented on HBASE-16630:
---

This is correct: defragmentation does not affect anything in the RPC response 
path; it happens in the aforementioned background WriterThread.

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489282#comment-15489282
 ] 

deepankar commented on HBASE-16630:
---

I'll remove the full lock; essentially the lock is there to make sure only one 
write is happening. Reads are unaffected, as we skip the blocks that are 
currently being used in a read by checking their refCount. SIGSEGV will also 
not be an issue, as this defragmentation is a software-level one: underneath, 
everything is backed by a large array of ByteBuffers, and we are not allocating 
/ deallocating anything; we are just overwriting the data and releasing the 
Java-level metadata.
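
A small sketch of that narrower locking (illustrative, not the patch): a single-writer guard so only one defragmentation pass runs at a time, while reads never take the lock and rely solely on the per-block refCount check.

{code}
import java.util.concurrent.locks.ReentrantLock;

public class SingleWriterDefrag {
  private final ReentrantLock defragLock = new ReentrantLock();

  /** Only one defragmentation pass at a time; readers never take this lock. */
  void runDefragPass(Runnable relocateCandidates) {
    if (!defragLock.tryLock()) {
      return;                    // another writer is already compacting, skip this round
    }
    try {
      relocateCandidates.run();  // per-block safety comes from the refCount check
    } finally {
      defragLock.unlock();
    }
  }

  public static void main(String[] args) {
    new SingleWriterDefrag().runDefragPass(() -> System.out.println("defrag pass ran"));
  }
}
{code}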

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488995#comment-15488995
 ] 

deepankar commented on HBASE-16630:
---

ping [~stack], [~anoop.hbase], [~ram_krish], any suggestions?

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-16630:
--
Attachment: HBASE-16630.patch

> Fragmentation in long running Bucket Cache
> --
>
> Key: HBASE-16630
> URL: https://issues.apache.org/jira/browse/HBASE-16630
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-16630.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16630) Fragmentation in long running Bucket Cache

2016-09-13 Thread deepankar (JIRA)
deepankar created HBASE-16630:
-

 Summary: Fragmentation in long running Bucket Cache
 Key: HBASE-16630
 URL: https://issues.apache.org/jira/browse/HBASE-16630
 Project: HBase
  Issue Type: Bug
  Components: BucketCache
Affects Versions: 1.2.3, 1.1.6, 2.0.0, 1.3.1
Reporter: deepankar
Assignee: deepankar


As we have been running the bucket cache for a long time in our system, we are 
observing cases where some nodes, after some time, do not fully utilize the 
bucket cache; in some cases it is even worse, in the sense that they get stuck 
at a value < 0.25 of the bucket cache (DEFAULT_MEMORY_FACTOR, as all our tables 
are configured in-memory for simplicity's sake).

We took a heap dump, analyzed what is happening, and saw that it is a classic 
case of fragmentation. The current implementation of BucketCache (mainly 
BucketAllocator) relies on the logic that fullyFreeBuckets are available for 
switching/adjusting cache usage between different bucketSizes. But once a 
compaction / bulkload happens and the blocks are evicted from a bucket size, 
they are usually evicted from random places in the buckets of that bucketSize, 
thus locking up the buckets associated with that bucketSize; in the worst cases 
of fragmentation we have seen some bucketSizes with an occupancy ratio of 
< 10%, yet without any completelyFreeBuckets to share with the other 
bucketSizes.

Currently the existing eviction logic helps in the cases where the cache used 
is more than the MEMORY_FACTOR or MULTI_FACTOR, and once those evictions are 
done, the eviction (the freeSpace function) will not evict anything, and the 
cache utilization will be stuck at that value without any allocations for the 
other required sizes.

The fix we came up with is simple: we do deFragmentation (compaction) of the 
bucketSize, thus increasing the occupancy ratio and also freeing up buckets to 
be fullyFree. This logic itself is not complicated, as the bucketAllocator 
takes care of packing the blocks into the buckets; we need to evict and 
re-allocate the blocks for all the BucketSizes that don't fit the criteria.

I am attaching an initial patch just to give an idea of what we are thinking, 
and I'll improve it based on comments from the community.
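
For illustration, a standalone sketch of the overall pass described above (the SizeStats/Cache types are hypothetical stand-ins, not BucketCache classes): for every bucket size that is sparsely occupied yet has no completely free bucket to give back, evict its blocks so re-allocation can repack them into fewer buckets.

{code}
import java.util.List;

public class DefragPass {
  // Illustrative per-size summary; the real state lives in BucketAllocator.
  record SizeStats(int bucketSizeBytes, double occupancyRatio, int completelyFreeBuckets,
                   List<String> blockKeys) {}

  interface Cache {
    void evict(String blockKey);      // eviction frees the slot in the allocator
    void recache(String blockKey);    // a later re-insert packs the block more tightly
  }

  /** Defragment every size that is sparsely occupied yet has nothing to give back. */
  static void defragment(List<SizeStats> stats, Cache cache, double minOccupancy) {
    for (SizeStats s : stats) {
      if (s.occupancyRatio() < minOccupancy && s.completelyFreeBuckets() == 0) {
        for (String key : s.blockKeys()) {
          cache.evict(key);     // free the scattered slots
          cache.recache(key);   // allocator packs the block into as few buckets as possible
        }
      }
    }
  }

  public static void main(String[] args) {
    Cache cache = new Cache() {
      public void evict(String k)   { System.out.println("evict "   + k); }
      public void recache(String k) { System.out.println("recache " + k); }
    };
    defragment(List.of(new SizeStats(65536, 0.08, 0, List.of("blk-1", "blk-2"))), cache, 0.25);
  }
}
{code}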




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16624) MVCC DeSerialization bug in the HFileScannerImpl

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486470#comment-15486470
 ] 

deepankar commented on HBASE-16624:
---

In order to test this we may need to refactor and isolate this function. The 
reason I think this issue came up is that this is a very hot method that was 
optimized to make sure there are no unnecessary method redirections, so that 
the JIT inlines it.
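
Independent of where the method eventually lives, the sign-extension problem itself can be demonstrated in isolation:

{code}
public class UnsignedWideningDemo {
  public static void main(String[] args) {
    long mvcc = 3085788160L;              // a sequenceId above Integer.MAX_VALUE
    int readAsInt = (int) mvcc;           // what a 4-byte read hands back: -1209179136

    long wrong = readAsInt;               // plain widening sign-extends: a negative value
    long right = readAsInt & 0xFFFFFFFFL; // mask to treat the 4 bytes as unsigned

    System.out.println(wrong);            // -1209179136
    System.out.println(right);            // 3085788160
  }
}
{code}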

> MVCC DeSerialization bug in the HFileScannerImpl
> 
>
> Key: HBASE-16624
> URL: https://issues.apache.org/jira/browse/HBASE-16624
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Blocker
> Attachments: HBASE-16624.patch
>
>
> My colleague [~naggarwal] found a bug in the deserialization of mvcc from 
> HFile: as part of the optimization of VLong deserialization, we read an int 
> at once but forgot to convert it to an unsigned value. 
> This causes issues because once we cross the integer threshold in sequenceId 
> and a compaction happens, we write MAX_MEMSTORE_TS in the trailer as 0 
> (because we will be reading negative values from the file that got flushed 
> with sequenceId > Integer.MAX_VALUE). And once we have MAX_MEMSTORE_TS as 0, 
> and there are sequenceId values present alongside the KeyValues, the 
> regionserver will start failing to read the compacted file, and thus 
> corruption. 
> Interestingly, this happens only on tables that don't have DataBlockEncoding 
> enabled, and unfortunately in our case that turned out to be META and another 
> small table.
> The fix is small (~20 chars) and attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16624) MVCC DeSerialization bug in the HFileScannerImpl

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486459#comment-15486459
 ] 

deepankar edited comment on HBASE-16624 at 9/13/16 6:46 AM:


The function is nested deep inside HFileScannerImpl, so it was hard to come up 
with a UT. But if you push the sequenceId above Integer.MAX_VALUE you will 
immediately see the issue; to confirm it, I changed the function as below and 
tested it with the values passed below.

{code}
/**
 * Actually do the mvcc read. Does no checks.
 * @param offsetFromPos
 */
private long _readMvccVersion(ByteBuff blockBuffer) {
  long currMemstoreTS = 0;
  int offsetFromPos = 0;
  // This is Bytes#bytesToVint inlined so can save a few instructions in this hot method; i.e.
  // previous if one-byte vint, we'd redo the vint call to find int size.
  // Also the method is kept small so can be inlined.
  byte firstByte = blockBuffer.getByteAfterPosition(offsetFromPos);
  int len = WritableUtils.decodeVIntSize(firstByte);
  if (len == 1) {
    currMemstoreTS = firstByte;
  } else {
    int remaining = len - 1;
    long i = 0;
    offsetFromPos++;
    if (remaining >= Bytes.SIZEOF_INT) {
      // The int read has to be converted to unsigned long so the & op
      i = (blockBuffer.getIntAfterPosition(offsetFromPos) & 0xFFFFFFFFL);
      remaining -= Bytes.SIZEOF_INT;
      offsetFromPos += Bytes.SIZEOF_INT;
    }
    if (remaining >= Bytes.SIZEOF_SHORT) {
      short s = blockBuffer.getShortAfterPosition(offsetFromPos);
      i = i << 16;
      i = i | (s & 0xFFFF);
      remaining -= Bytes.SIZEOF_SHORT;
      offsetFromPos += Bytes.SIZEOF_SHORT;
    }
    for (int idx = 0; idx < remaining; idx++) {
      byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
      i = i << 8;
      i = i | (b & 0xFF);
    }
    currMemstoreTS = (WritableUtils.isNegativeVInt(firstByte) ? ~i : i);
  }
  return currMemstoreTS;
}
{code}

And I passed the byteBuff I got from

{code}
ByteArrayDataOutput out = ByteStreams.newDataOutput();
WritableUtils.writeVLong(out, 3085788160L);
new SingleByteBuff(ByteBuffer.wrap(out.toByteArray()))
{code}


was (Author: dvdreddy):
The function is nested deep inside HFileScannerImpl, so it was hard to come up 
with a UT. But if you cross the sequenceId above Integer.MAX_VALUE you will 
immediately see the issue; to confirm it, I changed the function as follows.

{code}
/**
 * Actually do the mvcc read. Does no checks.
 * @param offsetFromPos
 */
private long _readMvccVersion(ByteBuff blockBuffer) {
  long currMemstoreTS = 0;
  int offsetFromPos = 0;
  // This is Bytes#bytesToVint inlined so can save a few instructions in this hot method; i.e.
  // previous if one-byte vint, we'd redo the vint call to find int size.
  // Also the method is kept small so can be inlined.
  byte firstByte = blockBuffer.getByteAfterPosition(offsetFromPos);
  int len = WritableUtils.decodeVIntSize(firstByte);
  if (len == 1) {
    currMemstoreTS = firstByte;
  } else {
    int remaining = len - 1;
    long i = 0;
    offsetFromPos++;
    if (remaining >= Bytes.SIZEOF_INT) {
      // The int read has to be converted to unsigned long so the & op
      i = (blockBuffer.getIntAfterPosition(offsetFromPos) & 0xFFFFFFFFL);
      remaining -= Bytes.SIZEOF_INT;
      offsetFromPos += Bytes.SIZEOF_INT;
    }
    if (remaining >= Bytes.SIZEOF_SHORT) {
      short s = blockBuffer.getShortAfterPosition(offsetFromPos);
      i = i << 16;
      i = i | (s & 0xFFFF);
      remaining -= Bytes.SIZEOF_SHORT;
      offsetFromPos += Bytes.SIZEOF_SHORT;
    }
    for (int idx = 0; idx < remaining; idx++) {
      byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
      i = i << 8;
      i = i | (b & 0xFF);
    }
    currMemstoreTS = (WritableUtils.isNegativeVInt(firstByte) ? ~i : i);
  }
  return currMemstoreTS;
}
{code}

And I passed the byteBuff I got from

{code}
ByteArrayDataOutput out = ByteStreams.newDataOutput();
WritableUtils.writeVLong(out, 3085788160L);
new SingleByteBuff(ByteBuffer.wrap(out.toByteArray()))
{code}

> MVCC DeSerialization bug in the HFileScannerImpl
> 
>
> Key: HBASE-16624
> URL: 

[jira] [Commented] (HBASE-16624) MVCC DeSerialization bug in the HFileScannerImpl

2016-09-13 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486459#comment-15486459
 ] 

deepankar commented on HBASE-16624:
---

The function is nested deep inside HFileScannerImpl, so it was hard to come up 
with a UT. But if you cross the sequenceId above Integer.MAX_VALUE you will 
immediately see the issue; to confirm it, I changed the function as follows.

{code}
/**
 * Actually do the mvcc read. Does no checks.
 * @param offsetFromPos
 */
private long _readMvccVersion(ByteBuff blockBuffer) {
  long currMemstoreTS = 0;
  int offsetFromPos = 0;
  // This is Bytes#bytesToVint inlined so can save a few instructions in this hot method; i.e.
  // previous if one-byte vint, we'd redo the vint call to find int size.
  // Also the method is kept small so can be inlined.
  byte firstByte = blockBuffer.getByteAfterPosition(offsetFromPos);
  int len = WritableUtils.decodeVIntSize(firstByte);
  if (len == 1) {
    currMemstoreTS = firstByte;
  } else {
    int remaining = len - 1;
    long i = 0;
    offsetFromPos++;
    if (remaining >= Bytes.SIZEOF_INT) {
      // The int read has to be converted to unsigned long so the & op
      i = (blockBuffer.getIntAfterPosition(offsetFromPos) & 0xFFFFFFFFL);
      remaining -= Bytes.SIZEOF_INT;
      offsetFromPos += Bytes.SIZEOF_INT;
    }
    if (remaining >= Bytes.SIZEOF_SHORT) {
      short s = blockBuffer.getShortAfterPosition(offsetFromPos);
      i = i << 16;
      i = i | (s & 0xFFFF);
      remaining -= Bytes.SIZEOF_SHORT;
      offsetFromPos += Bytes.SIZEOF_SHORT;
    }
    for (int idx = 0; idx < remaining; idx++) {
      byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
      i = i << 8;
      i = i | (b & 0xFF);
    }
    currMemstoreTS = (WritableUtils.isNegativeVInt(firstByte) ? ~i : i);
  }
  return currMemstoreTS;
}
{code}

And I passed the byteBuff I got from

{code}
ByteArrayDataOutput out = ByteStreams.newDataOutput();
WritableUtils.writeVLong(out, 3085788160L);
new SingleByteBuff(ByteBuffer.wrap(out.toByteArray()))
{code}

> MVCC DeSerialization bug in the HFileScannerImpl
> 
>
> Key: HBASE-16624
> URL: https://issues.apache.org/jira/browse/HBASE-16624
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Blocker
> Attachments: HBASE-16624.patch
>
>
> My colleague [~naggarwal] found a bug in the deserialization of mvcc from 
> HFile: as part of the optimization of VLong deserialization, we read an int 
> at once but forgot to convert it to an unsigned value. 
> This causes issues because once we cross the integer threshold in sequenceId 
> and a compaction happens, we write MAX_MEMSTORE_TS in the trailer as 0 
> (because we will be reading negative values from the file that got flushed 
> with sequenceId > Integer.MAX_VALUE). And once we have MAX_MEMSTORE_TS as 0, 
> and there are sequenceId values present alongside the KeyValues, the 
> regionserver will start failing to read the compacted file, and thus 
> corruption. 
> Interestingly, this happens only on tables that don't have DataBlockEncoding 
> enabled, and unfortunately in our case that turned out to be META and another 
> small table.
> The fix is small (~20 chars) and attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-06-12 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326841#comment-15326841
 ] 

deepankar commented on HBASE-15525:
---

Pulled in the latest patch and deployed it on one machine; working as expected, 
no anomalies seen. Thanks for the patch.

 +1

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15525_V1.patch, HBASE-15525_V2.patch, 
> HBASE-15525_V3.patch, HBASE-15525_V4.patch, HBASE-15525_WIP.patch, WIP.patch
>
>
> After HBASE-13819 the system sometimes runs out of direct memory whenever 
> there is some network congestion or a client-side issue.
> This is because of pending RPCs in the RPCServer$Connection.responseQueue: 
> since all the responses in this queue hold a buffer for the cellblock from 
> the BoundedByteBufferPool, this can take up a lot of memory if the 
> BoundedByteBufferPool's moving average settles towards a higher value. 
> See the discussion here: 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-05-06 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273688#comment-15273688
 ] 

deepankar commented on HBASE-15525:
---

We pulled in this patch and ran it on one of our production systems; except for 
a couple of places where we missed returning the buffers, the patch solves this 
issue in an excellent way, and it only allocates the optimal number of buffers 
needed (with a limit of 4096, only ~300 are actually allocated). This is the 
diff we had to add on top of the patch to fix the minor leaks:

{code}
diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/IPCUtil.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/IPCUtil.java
index 1584a40..378459c 100644
--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/IPCUtil.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/IPCUtil.java
@@ -158,7 +158,13 @@ public class IPCUtil {
     assert pool != null;
     PoolAwareByteBuffersOutputStream pbbos = new PoolAwareByteBuffersOutputStream(pool);
     encodeCellsTo(pbbos, cellScanner, codec, compressor);
-    if (pbbos.size() == 0) return null;
+    if (pbbos.size() == 0) {
+      pbbos.closeAndPutbackBuffers();
+      return null;
+    }
     return pbbos;
   }

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
index 517339a..93fac4e 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
@@ -1184,6 +1184,7 @@ public class RpcServer implements RpcServerInterface {
     if (error) {
       LOG.debug(getName() + call.toShortString() + ": output error -- closing");
       closeConnection(call.connection);
+      call.done();
     }
   }
{code}

Thanks [~anoop.hbase] for the excellent patch and help in debugging the above 
issue.

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15525_V1.patch, HBASE-15525_V2.patch, 
> HBASE-15525_WIP.patch, WIP.patch
>
>
> After HBASE-13819 the system sometimes runs out of direct memory whenever 
> there is some network congestion or a client-side issue.
> This is because of pending RPCs in the RPCServer$Connection.responseQueue: 
> since all the responses in this queue hold a buffer for the cellblock from 
> the BoundedByteBufferPool, this can take up a lot of memory if the 
> BoundedByteBufferPool's moving average settles towards a higher value. 
> See the discussion here: 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15691) Port HBASE-10205 (ConcurrentModificationException in BucketAllocator) to branch-1

2016-04-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262987#comment-15262987
 ] 

deepankar commented on HBASE-15691:
---

Oh ok thanks for clarifying.

> Port HBASE-10205 (ConcurrentModificationException in BucketAllocator) to 
> branch-1
> -
>
> Key: HBASE-15691
> URL: https://issues.apache.org/jira/browse/HBASE-15691
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 1.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 1.3.0, 1.2.2
>
> Attachments: HBASE-15691-branch-1.patch
>
>
> HBASE-10205 was committed to trunk and 0.98 branches only. To preserve 
> continuity we should commit it to branch-1. The change requires more than 
> nontrivial fixups so I will attach a backport of the change from trunk to 
> current branch-1 here. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15691) Port HBASE-10205 (ConcurrentModificationException in BucketAllocator) to branch-1

2016-04-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262800#comment-15262800
 ] 

deepankar commented on HBASE-15691:
---

Took a look at the patch. Is there a reason why we are changing back to linked 
lists? I am still seeing LinkedMaps on master; moving back to LinkedLists will 
probably undo the optimizations done in HBASE-14624.

> Port HBASE-10205 (ConcurrentModificationException in BucketAllocator) to 
> branch-1
> -
>
> Key: HBASE-15691
> URL: https://issues.apache.org/jira/browse/HBASE-15691
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 1.3.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 1.3.0, 1.2.2
>
> Attachments: HBASE-15691-branch-1.patch
>
>
> HBASE-10205 was committed to trunk and 0.98 branches only. To preserve 
> continuity we should commit it to branch-1. The change requires more than 
> nontrivial fixups so I will attach a backport of the change from trunk to 
> current branch-1 here. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10205) ConcurrentModificationException in BucketAllocator

2016-04-17 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245080#comment-15245080
 ] 

deepankar commented on HBASE-10205:
---

[~stack] and [~enis] Looks like this patch has not been committed on branch-1; is 
there something I am missing, or was it missed by mistake? I checked by 
running {{git log --pretty=format:"%ad  %h  %s %an" --date=short | 
grep "HBASE-10205"}}.

Thanks

> ConcurrentModificationException in BucketAllocator
> --
>
> Key: HBASE-10205
> URL: https://issues.apache.org/jira/browse/HBASE-10205
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.89-fb
>Reporter: Arjen Roodselaar
>Assignee: Arjen Roodselaar
>Priority: Minor
> Fix For: 0.89-fb, 0.99.0, 2.0.0, 0.98.6
>
> Attachments: hbase-10205-trunk.patch
>
>
> The BucketCache WriterThread calls BucketCache.freeSpace() upon draining the 
> RAM queue containing entries to be cached. freeSpace() in turn calls 
> BucketSizeInfo.statistics() through BucketAllocator.getIndexStatistics(), 
> which iterates over 'bucketList'. At the same time another WriterThread might 
> call BucketAllocator.allocateBlock(), which may call 
> BucketSizeInfo.allocateBlock(), add a bucket to 'bucketList' and consequently 
> cause a ConcurrentModificationException. Calls to 
> BucketAllocator.allocateBlock() are synchronized, but calls to 
> BucketAllocator.getIndexStatistics() are not, which allows this race to occur.
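
For context, a standalone illustration of the race described in the quoted summary 
(plain Java collections, not the HBase classes): one thread iterates an unsynchronized 
list while another keeps appending to it.
{code}
// Standalone reproduction of the failure mode, not HBase code: one thread
// iterates a plain ArrayList ("statistics") while another keeps appending
// to it ("allocateBlock"), without synchronization.
import java.util.ArrayList;
import java.util.List;

public final class CmeDemo {
  public static void main(String[] args) throws InterruptedException {
    List<Integer> bucketList = new ArrayList<>();
    for (int i = 0; i < 1000; i++) bucketList.add(i);

    Thread allocator = new Thread(() -> {
      for (int i = 0; i < 1_000_000; i++) bucketList.add(i);  // concurrent structural change
    });
    allocator.start();

    try {
      while (allocator.isAlive()) {
        long sum = 0;
        for (int v : bucketList) sum += v;                    // unsynchronized iteration
      }
      System.out.println("no exception this run (timing dependent)");
    } catch (RuntimeException e) {
      // typically java.util.ConcurrentModificationException
      System.out.println("race reproduced: " + e);
    }
    allocator.join();
  }
}
{code}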



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-04 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223789#comment-15223789
 ] 

deepankar commented on HBASE-15437:
---

Sorry for the mistake, fixed it now.

bq. It's a pity that we don't pass Call to this method, so we need to get it 
from ThreadLocal, which will have a slight perf overhead. And this method is marked 
non-private.

Yeah, I tried to somehow get the Call to this method (I looked at how all the 
callers of this method actually send the params by taking them from the call 
object), but it felt like it would turn into a bigger refactoring that 
might also break some externally facing pluggable APIs, so I decided to take 
this route.
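
A minimal sketch of the ThreadLocal route discussed above, assuming the server keeps 
the in-progress call in a ThreadLocal as the quoted comment implies; the helper and 
method names here are illustrative assumptions, not the patch itself:
{code}
// Minimal sketch of the ThreadLocal route; names are illustrative
// assumptions, not the actual patch or RpcServer API.
final class RpcSizeHelper {
  /** Assumed equivalent of the server's "current call" ThreadLocal. */
  static final ThreadLocal<RpcCall> CUR_CALL = new ThreadLocal<>();

  /** Only the bit of the call we need here: the cellblock payload size. */
  interface RpcCall {
    long getResponseCellSize();
  }

  /** Protobuf size plus the cellblock payload, if a call is registered on this thread. */
  static long totalResponseSize(long protobufSize) {
    RpcCall call = CUR_CALL.get();   // one ThreadLocal lookup per invocation
    return call == null ? protobufSize : protobufSize + call.getResponseCellSize();
  }
}
{code}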

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437-v1.patch, HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-04 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15437:
--
Attachment: HBASE-15437-v1.patch

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437-v1.patch, HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-04 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15437:
--
Status: Patch Available  (was: Open)

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437-v1.patch, HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-04 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15437:
--
Status: Open  (was: Patch Available)

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437-v1.patch, HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-03 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223672#comment-15223672
 ] 

deepankar commented on HBASE-15437:
---

Attached patch following suggestions from Anoop and Enis

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-03 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15437:
--
Attachment: HBASE-15437.patch

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
> Attachments: HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-04-03 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15437:
--
Assignee: deepankar
  Status: Patch Available  (was: Open)

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15437.patch
>
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-03-30 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218608#comment-15218608
 ] 

deepankar commented on HBASE-15525:
---

Sure happy to help

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: WIP.patch
>
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does NOT count CellScanner payload

2016-03-30 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218605#comment-15218605
 ] 

deepankar commented on HBASE-15437:
---

This means that the responseTime warning will not contain the responseSize, and the 
responseSize warning will not contain information about responseTimes, right?

> Response size calculated in RPCServer for warning tooLarge responses does NOT 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-03-29 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217435#comment-15217435
 ] 

deepankar commented on HBASE-15525:
---

Oh ok, sorry for the confusion. Thanks.

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: WIP.patch
>
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-03-29 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217417#comment-15217417
 ] 

deepankar commented on HBASE-15525:
---

bq. Sure I will correct that.. One issue is that when we have curBuf with 
remaining size 100 and there is a need for 110 bytes for a cell. So we will not 
use remaining 100 bytes from this buf but go to next BB. Will try to solve this 
also.. That will auto solve the one u found

Won't that still have the same issue, that the just-acquired buffer will stay empty 
and so will the next buffer, since the buffers we are using are of fixed size?

Side note: I also verified this against sample internal production data; there were some 
outliers that could trigger this (we thought our data was clean enough).

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: WIP.patch
>
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-03-29 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217378#comment-15217378
 ] 

deepankar commented on HBASE-15525:
---

A minor comment on the patch: in PoolAwareByteBufferOutputStream, checkSizeAndGrow 
is not taking into account the extra bytes needed. This could lead to an exception 
if we are copying a byte array that is bigger than the byte buffer we get from the 
pool (which is possible in the case of KVs, as there is no limit on their sizes). 
We should have a loop in the copy methods for byte arrays / buffers so that 
expansion can happen multiple times in the case of a larger copy.
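
A minimal sketch of the loop suggested above, under the assumption of fixed-size pool 
buffers; names are illustrative and not the actual PoolAwareByteBufferOutputStream:
{code}
// Sketch of the "grow in a loop" idea with fixed-size buffers; a plain
// allocate stands in for "get a buffer from the pool".
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

final class PooledBufferSketch {
  private final Deque<ByteBuffer> filled = new ArrayDeque<>();
  private ByteBuffer cur;
  private final int bufSize;

  PooledBufferSketch(int bufSize) {
    this.bufSize = bufSize;
    this.cur = ByteBuffer.allocate(bufSize);   // stand-in for a pool acquire
  }

  /** Copy an arbitrarily large array by moving to a fresh buffer as many times as needed. */
  void write(byte[] src, int off, int len) {
    while (len > 0) {
      if (!cur.hasRemaining()) {               // grow inside the loop, so a copy larger
        filled.add(cur);                       // than one buffer cannot overflow a
        cur = ByteBuffer.allocate(bufSize);    // single grow step
      }
      int n = Math.min(len, cur.remaining());
      cur.put(src, off, n);
      off += n;
      len -= n;
    }
  }
}
{code}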

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Attachments: WIP.patch
>
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-03-29 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216605#comment-15216605
 ] 

deepankar commented on HBASE-15362:
---

Sure will create a jira

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Sub-task
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15545) org.apache.hadoop.io.compress.DecompressorStream allocates too much memory

2016-03-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215126#comment-15215126
 ] 

deepankar commented on HBASE-15545:
---

We also hit a similar issue; it allocates a lot of memory in the RPC path as well. We 
fixed it in a hacky way; it would be good if we can come up with a better fix.

Relevant Comment :
https://issues.apache.org/jira/browse/HBASE-15362?focusedCommentId=15174292=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15174292

> org.apache.hadoop.io.compress.DecompressorStream allocates too much memory
> --
>
> Key: HBASE-15545
> URL: https://issues.apache.org/jira/browse/HBASE-15545
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> It accounts for ~ 11% of overall memory allocation during compaction when 
> compression (GZ) is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-03-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215056#comment-15215056
 ] 

deepankar commented on HBASE-15362:
---

But this change is very specific to LZO and specialized (hacky); is it ok to 
include it? We hacked it in as a temporary solution.

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-03-28 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215047#comment-15215047
 ] 

deepankar commented on HBASE-15362:
---

It depends on the block size: we have a default block size of 16 kb and this 
allocates a buffer of 64 kb, so per read we are saving 64 - 16 = 48 kb of 
buffer, I think. 

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15525) OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts

2016-03-23 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15525:
--
Summary: OutOfMemory could occur when using BoundedByteBufferPool during 
RPC bursts  (was: Fix OutOfMemory that could occur when using 
BoundedByteBufferPool during RPC bursts)

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209749#comment-15209749
 ] 

deepankar commented on HBASE-13819:
---

I totally agree with your idea, but I think practically it would be hard to 
exhaust the offheap space if the allocation were proportional to the expected 
response size. One reason why we did not decrease the buffer size from 
BoundedByteBufferPool is that these large requests are not that uncommon: they do 
occur occasionally, and we did not want them to have to allocate a full-size 
buffer at that time, since our initial assumption was that with 
BoundedByteBufferPool there might not be any more allocations at all. Maybe we 
should try out internally restricting the max size from BBBP to a much lower 
value; maybe it is ok for large RPCs to have to create their own buffers.

bq. And one more thing for GC is that the full GC only can clean the off heap 
area?

I think a young GC could also clear them; the problem is that the Bits class, when 
there is not enough offheap space, calls System.gc(), which triggers a full GC I 
think. But anyway, I agree this GC is wasteful.

bq. We need to make pool such that we will give a BB back if it is having a 
free one. When it is not having a free one and capacity is not reached, it 
makes a new DBB and return

This would be nice. We had a hot patch internally that just fails the request 
when it sees Bits is going to call System.gc(); that was just a temporary measure 
to stop the RegionServer from crashing.
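
A minimal sketch of the pool behaviour quoted above (hand out a free BB if there is one, 
allocate a new DBB only while under the cap), with the fallback left to the caller; the 
names and the null-return policy are assumptions, not the actual implementation:
{code}
// Sketch only: hand out a free buffer if one exists, allocate a new direct
// buffer while under a hard cap, and otherwise return null so the caller can
// fall back (e.g. to an onheap buffer) instead of letting java.nio.Bits end up
// calling System.gc(). Not the actual HBase pool.
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class CappedDirectBufferPool {
  private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();
  private final AtomicInteger allocated = new AtomicInteger();
  private final int maxBuffers;
  private final int bufSize;

  CappedDirectBufferPool(int maxBuffers, int bufSize) {
    this.maxBuffers = maxBuffers;
    this.bufSize = bufSize;
  }

  /** A pooled buffer if available, a new DBB while under the cap, else null. */
  ByteBuffer getBuffer() {
    ByteBuffer b = free.poll();
    if (b != null) {
      b.clear();
      return b;
    }
    if (allocated.incrementAndGet() <= maxBuffers) {
      return ByteBuffer.allocateDirect(bufSize);
    }
    allocated.decrementAndGet();
    return null;   // caller decides the fallback; direct memory is never overcommitted
  }

  void putBuffer(ByteBuffer b) {
    free.offer(b);
  }
}
{code}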

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch, q.png
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209739#comment-15209739
 ] 

deepankar commented on HBASE-13819:
---

I agree with you, but what I feel is that in the ideal scenario the increase in 
memory usage should be proportional to the number of RPCs coming in, right? When we 
initially sized the BBBP we accounted for the anticipated burst, but the issue came 
when the usage on the server was a couple of orders of magnitude more than this 
(the actual responses waiting to be sent were around 80 MB while the total usage 
was around 3.1 GB). What do you think?

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch, q.png
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209736#comment-15209736
 ] 

deepankar commented on HBASE-13819:
---

bq. What you think of Anoop's idea of the BB being allocated onheap rather than 
offheap if we can't get it from the pool? The allocation would be faster...


I feel both onheap and offheap allocation could suffer from the same problem as 
long as we are not tight in the allocation (with little wastage compared to the 
anticipated response). As an example, we made this error somewhat rare by 
increasing MaxDirectMemory to a higher value, but in the onheap case too, if a 
user sized his memory by tightly accounting for everything, he could just as well 
face the issue of unnecessary GCs, I think. 


bq. deepankar Would you mind opening a new issue describing how you would like 
this to work?

Created a jira, HBASE-15525; we would definitely help in whatever way we can on 
this.

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch, q.png
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15525) Fix OutOfMemory that could occur when using BoundedByteBufferPool during RPC bursts

2016-03-23 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15525:
--
Description: 
After HBASE-13819 the system some times run out of direct memory whenever there 
is some network congestion or some client side issues.
This was because of pending RPCs in the RPCServer$Connection.responseQueue and 
since all the responses in this queue hold a buffer for cellblock from 
BoundedByteBufferPool this could takeup a lot of memory if the 
BoundedByteBufferPool's moving average settles down towards a higher value 

See the discussion here 
[HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]


  was:
After HBASE-13819 the system some times run out of direct memory whenever there 
is some network congestion or some client side issues.
This was because of pending RPCs in the RPCServer$Connection.responseQueue and 
since all the responses in this queue hold a buffer for cellblock from 
BoundedByteBufferPool this could takeup a lot of memory if the 
BoundedByteBufferPool's moving average settles down towards a higher value 

See the discussion here 
https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822



> Fix OutOfMemory that could occur when using BoundedByteBufferPool during RPC 
> bursts
> ---
>
> Key: HBASE-15525
> URL: https://issues.apache.org/jira/browse/HBASE-15525
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>
> After HBASE-13819 the system some times run out of direct memory whenever 
> there is some network congestion or some client side issues.
> This was because of pending RPCs in the RPCServer$Connection.responseQueue 
> and since all the responses in this queue hold a buffer for cellblock from 
> BoundedByteBufferPool this could takeup a lot of memory if the 
> BoundedByteBufferPool's moving average settles down towards a higher value 
> See the discussion here 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15525) Fix OutOfMemory that could occur when using BoundedByteBufferPool during RPC bursts

2016-03-23 Thread deepankar (JIRA)
deepankar created HBASE-15525:
-

 Summary: Fix OutOfMemory that could occur when using 
BoundedByteBufferPool during RPC bursts
 Key: HBASE-15525
 URL: https://issues.apache.org/jira/browse/HBASE-15525
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC
Reporter: deepankar


After HBASE-13819 the system some times run out of direct memory whenever there 
is some network congestion or some client side issues.
This was because of pending RPCs in the RPCServer$Connection.responseQueue and 
since all the responses in this queue hold a buffer for cellblock from 
BoundedByteBufferPool this could takeup a lot of memory if the 
BoundedByteBufferPool's moving average settles down towards a higher value 

See the discussion here 
https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209425#comment-15209425
 ] 

deepankar commented on HBASE-13819:
---

Attached an image of the metric over time. We are running a bucket cache of size 
around 69427 MB, and the parameters of BBBP are 2048 max entries and 1 MB max size; 
a trace showed the current moving avg is 512 kb.

!q.png!

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch, q.png
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-13819:
--
Attachment: q.png

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch, q.png
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209414#comment-15209414
 ] 

deepankar commented on HBASE-13819:
---

bq. In description above though, talk is of Responder queue backed up which 
seems to say we are not clearing the server of finished responses fast 
enough

Is it possible that, due to issues on the client side, we are not able to push the 
responses, rather than the Responder being slow? Because we do try sending the 
responses from the handler itself when possible, which somewhat suggests the issue 
could be on the client side.

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209409#comment-15209409
 ] 

deepankar commented on HBASE-13819:
---

bq. Which Q we talking? The is config to put bounds on the call queue where we 
will reject calls...

The one that rejects calls only accounts for incoming request sizes, and there is a 
default 2 GB limit I think. The response sizes are not accounted for in this, I think.
The queue in question is RpcServer$Connection.responseQueue.

bq. Yeah, there is heuristic and we grow till we hit an average. Are we saying 
we grew to 512k and then afterward, all calls were 16k only? Is this a problem?

In our use case the worst-case response size is 512 kb (hard capped from the client 
side) and our avg response size is around 12 kb. What we observe is that after 3 - 4 
days of running, the moving avg is almost always at 512 kb, and in the heap dump 
all the response buffers are of size 512 kb.
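
A toy simulation of that drift, under the assumptions that new buffers are sized by a 
running average of the capacities of buffers handed back and that a buffer's full 
capacity (not the payload size) feeds the average; illustrative only, not the exact 
BoundedByteBufferPool arithmetic:
{code}
// Toy drift simulation under the stated assumptions; illustrative only.
import java.util.Random;

public final class AvgDriftSim {
  public static void main(String[] args) {
    Random rnd = new Random(42);
    double runningAvg = 16 * 1024;                                   // start at 16 kb
    for (int i = 0; i < 1_000_000; i++) {
      int payload = rnd.nextInt(100) < 2 ? 512 * 1024 : 12 * 1024;   // ~2% large responses
      double capacity = Math.max(payload, runningAvg);               // buffers never shrink below the average
      runningAvg += (capacity - runningAvg) / 100.0;                 // capacity, not payload, feeds the average
    }
    System.out.printf("running average settles near %.0f kb%n", runningAvg / 1024);
  }
}
{code}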

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209401#comment-15209401
 ] 

deepankar commented on HBASE-13819:
---

bq. Because we are not returning BB to the pool? The pool is growing w/o bound?

I think there are no BB leaks; from the analysis of the heap dump, all the objects 
were accounted for.

bq. We should add these if not present at TRACE level.

Sorry, my comment was misleading; by debug statements I meant enabling TRACE 
(those logs were the most useful thing in many debugging scenarios).

bq. So 2G instead of 1G? But the pool is bounded?
bq. The responder is not keeping up? It is not moving stuff out of the server 
fast enough?
bq. I am interested in leaks; how they are happening.

No concrete evidence that the responder is not able to keep up, but the bound on the 
pool does not help this case, because we create a new BB when one is not present in 
the pool, and occasionally (we observe it once in 2 - 3 days) there will be a spew 
when what is returned to the pool grows above the configured threshold.
From the analysis we did there are no leaks.

bq. Where is pendingCallsQueue?

The per-connection queue, RpcServer$Connection.responseQueue.

bq. Did you observe the offheap size used growing? There s a metric IIRC.

Yes we saw this in the metric (hbase.regionserver.direct.MemoryUsed) 

bq. Where would the fixed size be? In BBBP they eventually reach fixed size?

Yes, they eventually reach a fixed size, but that size is that of the larger 
responses rather than the median or some smaller number.
 


> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-23 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208009#comment-15208009
 ] 

deepankar commented on HBASE-13819:
---

What [~anoop.hbase] said is correct: in our case avg response sizes are around 
10 - 16 kb but the max size is 512 kb, and if you run for a significantly longer 
time the running avg drifts toward 512 kb. Another point: it is not that there are 
a lot of requests. If you have around 200 clients and, let's say, around 5% of them 
are GCing, then 10 clients with around 400 pending reqs each will lead to 4000 
pending requests, and this leads to exhaustion of the direct memory we allocated 
(with buffer sizes of 512 kb), even though the overall pending response size for 
these 4000 requests is only about 82 MB; at least 1 - 2 orders of magnitude of 
extra space is getting occupied.

I think fixed-size BBs are an excellent idea that will definitely be useful to 
get out of this issue.
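
A quick back-of-the-envelope check of those numbers (all figures are the ones quoted in 
this comment; purely illustrative):
{code}
// Rough check of the figures above; illustration, not measurement code.
public final class BurstMath {
  public static void main(String[] args) {
    int stalledClients = 10;                           // ~5% of ~200 clients GCing
    int pendingPerClient = 400;
    int pending = stalledClients * pendingPerClient;   // 4000 queued responses
    long pooledBytes = pending * 512L * 1024;          // each holds a 512 kb pooled buffer
    long actualBytes = 82L * 1024 * 1024;              // real payload ~82 MB
    System.out.printf("reserved=%d MB actual=%d MB overhead=%.0fx%n",
        pooledBytes >> 20, actualBytes >> 20, (double) pooledBytes / actualBytes);
  }
}
{code}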

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2016-03-22 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207822#comment-15207822
 ] 

deepankar commented on HBASE-13819:
---

Hi, 

We recently pulled this patch in internally and are seeing a significant side
effect of BoundedByteBufferPool: the system runs out of direct memory whenever
there is network congestion or a client-side issue.

System props used:
RPC handlers = 300
Number of clients ~200, threads in each client ~20; we use asynchbase, so all
requests are multiplexed over single connections
Initial buffer size is 16 KB
Buffer size settles to 512 KB over time (from debug statements we put in)
Max number to cache (reservoir.initial.max) is around 2048 (also tried with 4096)
Direct memory was sized to account for this (2048 * 512 KB + 1 GB)

We took a heap dump and analysed it with VisualVM OQL. The number of RPCs queued
in the responder was around ~4000, which exhausts the direct buffer space. Digging
a little deeper, the response buffers in the connections' pendingCallsQueue
accounted for 3181117440 bytes, even though the real response data (buffers are
allocated at 512 KB even when the response is small) accounted for only
84451734 bytes.

[~anoop.hbase] suggested that, since we already use a buffer chain to create a
cell block, it would be better to have a new ByteBufferOutputStream that acquires
buffers from the pool instead of allocating one sized by the (very high) moving
average; i.e. drop the moving average altogether and use fixed-size buffers
instead.
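
To make the shape of that suggestion concrete, here is a minimal, hypothetical
sketch; the pool interface and class names are invented for illustration and are
not the eventual HBase API:
{code}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: an output stream that grows by borrowing fixed-size
// buffers from a pool instead of allocating one buffer sized by a moving average.
class PooledChainOutputStream extends OutputStream {
  interface FixedBufferPool {            // hypothetical pool contract
    ByteBuffer take();                   // hands out a fixed-size direct buffer
    void give(ByteBuffer bb);            // takes it back for reuse
  }

  private final FixedBufferPool pool;
  private final Deque<ByteBuffer> chain = new ArrayDeque<>();

  PooledChainOutputStream(FixedBufferPool pool) {
    this.pool = pool;
    chain.add(pool.take());
  }

  @Override
  public void write(int b) throws IOException {
    ByteBuffer cur = chain.peekLast();
    if (!cur.hasRemaining()) {           // borrow another fixed-size buffer, never resize
      cur = pool.take();
      chain.add(cur);
    }
    cur.put((byte) b);
  }

  /** Buffers making up the cell block, ready to be chained into the RPC response. */
  Deque<ByteBuffer> buffers() {
    return chain;
  }

  @Override
  public void close() {
    for (ByteBuffer bb : chain) {
      pool.give(bb);                     // return every borrowed buffer to the pool
    }
    chain.clear();
  }
}
{code}
The point is that growth happens by chaining more fixed-size buffers rather than
by resizing, so the pool never learns an inflated average size.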

Here is the VisualVM query used:
{code}
select sum(map(heap.objects('java.nio.DirectByteBuffer'), function (a) {
  // For each DirectByteBuffer, check whether it is referenced by an
  // RpcServer$Call that is still queued (i.e. reachable from a
  // ConcurrentLinkedDeque$Node of the responder's pending-calls queue).
  var x = 0;
  var callx = null;
  forEachReferrer(function (y) {
    if (classof(y).name == 'org.apache.hadoop.hbase.ipc.RpcServer$Call') {
      x = -1;
      forEachReferrer(function (px) {
        if (classof(px).name == 'java.util.concurrent.ConcurrentLinkedDeque$Node') {
          callx = y;
          x = 1;
        }
      }, y);
    }
  }, a);

  // Count only buffers whose response has not been written yet: no attachment,
  // a queued call was found, and the response still has all its bytes remaining.
  if (a.att == null && x == 1 && callx.response.bufferOffset == 0 &&
      callx.response.remaining != 0) {
    // return callx.response.remaining   // use this instead to sum the real data size
    return a.capacity;
  } else {
    return 0;
  }
}))
{code}

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In RPC layer, when we make a cellBlock to put as RPC payload, we will make an 
> on heap byte buffer (via BoundedByteBufferPool). The pool will keep upto 
> certain number of buffers. This jira aims at testing possibility for making 
> this buffers off heap ones. (DBB)  The advantages
> 1. Unsafe based writes to off heap is faster than that to on heap. Now we are 
> not using unsafe based writes at all. Even if we add, DBB will be better
> 2. When Cells are backed by off heap (HBASE-11425) off heap to off heap 
> writes will be better
> 3. When checked the code in SocketChannel impl, if we pass a HeapByteBuffer 
> to the socket channel, it will create a temp DBB and copy data to there and 
> only DBBs will be moved to Sockets. If we make DBB 1st hand itself, we can  
> avoid this one more level of copying.
> Will do different perf testing with changed and report back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-22 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207082#comment-15207082
 ] 

deepankar commented on HBASE-15064:
---

Yeah I think this will fix this issue.

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch, MBB_hasRemaining.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198794#comment-15198794
 ] 

deepankar commented on HBASE-15064:
---

Pressed "Add" too early; I meant like this:
{code}
  public MultiByteBuff limit(int limit) {
    this.limit = limit;
    // Normally the limit will try to limit within the last BB item
    int limitedIndexBegin = this.itemBeginPos[this.limitedItemIndex];
    if (limit > limitedIndexBegin && limit < this.itemBeginPos[this.limitedItemIndex + 1]) {
      this.items[this.limitedItemIndex].limit(limit - limitedIndexBegin);
      return this;
    }
    int itemIndex = getItemIndex(limit - 1);
{code}

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198793#comment-15198793
 ] 

deepankar commented on HBASE-15064:
---

What if we remove the equality in the {{if}} clause in {{limit()}}, and in the
limit method compute the new limitIndex by searching for limit - 1? Would that
not work seamlessly?

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-19 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar reopened HBASE-15064:
---

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198752#comment-15198752
 ] 

deepankar commented on HBASE-15064:
---

Is there something wrong with my reasoning there, or am I missing something?

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199100#comment-15199100
 ] 

deepankar commented on HBASE-15064:
---

Yeah, something like that is what [~anoop.hbase] was suggesting; it's just that
the limitedItemIndex computation was wrong in the first place.

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch, MBB_hasRemaining.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-18 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198858#comment-15198858
 ] 

deepankar commented on HBASE-15064:
---

No, what I was saying is: once you have called mbb1.get() 4 times and then check
hasRemaining(), it will return false, so I think the following test will fail:
{code}
bb1 = ByteBuffer.wrap(b);
bb2 = ByteBuffer.wrap(b1);
bb3 = ByteBuffer.allocate(4);
mbb1 = new MultiByteBuff(bb1, bb2, bb3);
mbb1.limit(12);
for (int i = 0; i < 12; i++) {
  assertTrue(mbb1.hasRemaining());
  mbb1.get();
}
assertFalse(mbb1.hasRemaining());
{code}

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch, MBB_hasRemaining.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-18 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198821#comment-15198821
 ] 

deepankar commented on HBASE-15064:
---

I think the test in your patch should fail at Mbb.limit(12), because once you
cross the 4th byte, hasRemaining() will start returning false since you are
checking only the limit index. I think [~anoop.hbase]'s suggestion alone will not
work; coupled with a change to the way we calculate limitedItemIndex, it should
work, I think.

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch, MBB_hasRemaining.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-18 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198751#comment-15198751
 ] 

deepankar commented on HBASE-15064:
---

Yeah, I think that will work, but what about the other bug in calculating
limitedItemIndex?

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-18 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198796#comment-15198796
 ] 

deepankar commented on HBASE-15064:
---

Also, the hasRemaining() method [~anoop.hbase] suggested should work seamlessly,
similar to the generic API, right?

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15064) BufferUnderflowException after last Cell fetched from an HFile Block served from L2 offheap cache

2016-03-18 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198473#comment-15198473
 ] 

deepankar commented on HBASE-15064:
---

I am still seeing this exception on our servers, and I think I found a couple of
things. In the normal byte buffers (java.nio), the hasRemaining() function uses
the current position and the limit:
{code}
/**
 * Tells whether there are any elements between the current position and
 * the limit. 
 *
 * @return  true if, and only if, there is at least one element
 *  remaining in this buffer
 */
public final boolean hasRemaining() {
return position < limit;
}
{code}

But in MultiByteBuff, hasRemaining() does not take the limit into account:
{code}
  /**
   * Returns true if there are elements between the current position and the limit.
   * @return true if there are elements, false otherwise
   */
  @Override
  public final boolean hasRemaining() {
    return this.curItem.hasRemaining() || this.curItemIndex < this.items.length - 1;
  }
{code}

Also, the items array is not changed in the limit(int) function. This means there
could be a scenario where the user has asked to limit at the end of the first
buffer, but hasRemaining() will still return true. Is there any flaw in my logic
here?
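
For illustration, the shape of a limit-aware check I have in mind; this is a
minimal sketch, not a tested patch, and it assumes position() reports the overall
position across all items and this.limit is the value set by limit(int):
{code}
// Minimal sketch only: compare the overall position against the overall limit,
// mirroring java.nio.Buffer#hasRemaining, instead of looking at item indexes.
@Override
public final boolean hasRemaining() {
  return position() < this.limit;
}
{code}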

Also, in the limit(int) function of MultiByteBuff we are doing:
{code}
    // Normally the limit will try to limit within the last BB item
    int limitedIndexBegin = this.itemBeginPos[this.limitedItemIndex];
    if (limit >= limitedIndexBegin && limit < this.itemBeginPos[this.limitedItemIndex + 1]) {
      this.items[this.limitedItemIndex].limit(limit - limitedIndexBegin);
      return this;
    }
{code}
Here I think the {{if}} condition should really be
{noformat}
if (limit > limitedIndexBegin && limit < this.itemBeginPos[this.limitedItemIndex + 1])
{noformat}
because if somebody limits at a place that is exactly at the boundary of the
limited-item buffer, we also include the last item, which holds no data since it
gets limited at 0 (limit == limitedIndexBegin, i.e. the boundary). Then, once the
client has read everything in the previous buffer and consults hasRemaining(), it
still returns true (as curItem < number of items in the array), but any actual
read throws BufferUnderflowException because that last element has no data. There
is a similar issue with the {{getItemIndex}} function when {{elemIndex}} matches
the boundary.
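
A hedged sketch of that boundary case, in the same style as the test pasted
earlier in this thread (assertFalse is from org.junit.Assert; this is not taken
from the attached patch):
{code}
// Limit exactly at the boundary of the first item, drain it, and observe
// hasRemaining()/get() disagreeing.
ByteBuffer bb1 = ByteBuffer.allocate(4);
ByteBuffer bb2 = ByteBuffer.allocate(4);
MultiByteBuff mbb = new MultiByteBuff(bb1, bb2);
mbb.limit(4);                 // boundary of the first item
for (int i = 0; i < 4; i++) {
  mbb.get();                  // drains the only item that actually has data
}
// Expected false; with the current code it is true because curItemIndex is still
// less than items.length - 1, and a further get() would throw
// BufferUnderflowException from the zero-length last item.
assertFalse(mbb.hasRemaining());
{code}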

> BufferUnderflowException after last Cell fetched from an HFile Block served 
> from L2 offheap cache
> -
>
> Key: HBASE-15064
> URL: https://issues.apache.org/jira/browse/HBASE-15064
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15064.patch
>
>
> While running the newer patches on our production system, I saw this error 
> come couple of times 
> {noformat}
> ipc.RpcServer: Unexpected throwable object 
> 2016-01-01 16:42:56,090 ERROR 
> [B.defaultRpcServer.handler=20,queue=20,port=60020] ipc.RpcServer: Unexpected 
> throwable object 
> java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:249)
> at org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:494)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:402)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:517)
>  
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:815)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:138)
> {noformat}
> Looking at the get code 
> {code}
> if (this.curItem.remaining() == 0) {
>   if (items.length - 1 == this.curItemIndex) {
> // means cur item is the last one and we wont be able to read a long. 
> Throw exception
> throw new BufferUnderflowException();
>   }
>   this.curItemIndex++;
>   this.curItem = this.items[this.curItemIndex];
> }
> return this.curItem.get();
> {code}
> Can the new currentItem have zero elements (position == limit), does it make 
> sense to change the {{if}} to {{while}} ? {{while (this.curItem.remaining() 
> == 0)}}. This logic is repeated may make sense abstract to a new function if 
> we plan to change to  {{if}} to {{while}}



--
This message was sent by 

[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload

2016-03-09 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188500#comment-15188500
 ] 

deepankar commented on HBASE-15437:
---

In that case, should the values of queueTime and processingTime be stored inside
the call, or should their calculation also be moved to setResponse()?

> Response size calculated in RPCServer for warning tooLarge responses does 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload

2016-03-09 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188288#comment-15188288
 ] 

deepankar commented on HBASE-15437:
---

ping [~stack]

> Response size calculated in RPCServer for warning tooLarge responses does 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload

2016-03-09 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188285#comment-15188285
 ] 

deepankar commented on HBASE-15437:
---

ping [~saint@gmail.com] [~anoopsamjohn] [~ram_krish]

> Response size calculated in RPCServer for warning tooLarge responses does 
> count CellScanner payload
> ---
>
> Key: HBASE-15437
> URL: https://issues.apache.org/jira/browse/HBASE-15437
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Reporter: deepankar
>
> After HBASE-13158 where we respond back to RPCs with cells in the payload , 
> the protobuf response will just have the count the cells to read from 
> payload, but there are set of features where we log warn in RPCServer 
> whenever the response is tooLarge, but this size now is not considering the 
> sizes of the cells in the PayloadCellScanner. Code form RPCServer
> {code}
>   long responseSize = result.getSerializedSize();
>   // log any RPC responses that are slower than the configured warn
>   // response time or larger than configured warning size
>   boolean tooSlow = (processingTime > warnResponseTime && 
> warnResponseTime > -1);
>   boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize 
> > -1);
>   if (tooSlow || tooLarge) {
> // when tagging, we let TooLarge trump TooSmall to keep output simple
> // note that large responses will often also be slow.
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
>   }
> {code}
> Should this feature be not supported any more or should we add a method to 
> CellScanner or a new interface which returns the serialized size (but this 
> might not include the compression codecs which might be used during response 
> ?) Any other Idea this could be fixed ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload

2016-03-09 Thread deepankar (JIRA)
deepankar created HBASE-15437:
-

 Summary: Response size calculated in RPCServer for warning 
tooLarge responses does count CellScanner payload
 Key: HBASE-15437
 URL: https://issues.apache.org/jira/browse/HBASE-15437
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC
Reporter: deepankar


After HBASE-13158, where we respond to RPCs with cells in the payload, the
protobuf response only carries the count of cells to read from the payload.
However, there is a feature where RPCServer logs a warning whenever a response is
tooLarge, and that size no longer considers the sizes of the cells in the payload
CellScanner. Code from RPCServer:
{code}
      long responseSize = result.getSerializedSize();
      // log any RPC responses that are slower than the configured warn
      // response time or larger than configured warning size
      boolean tooSlow = (processingTime > warnResponseTime && warnResponseTime > -1);
      boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > -1);
      if (tooSlow || tooLarge) {
        // when tagging, we let TooLarge trump TooSmall to keep output simple
        // note that large responses will often also be slow.
        logResponse(new Object[]{param},
            md.getName(), md.getName() + "(" + param.getClass().getName() + ")",
            (tooLarge ? "TooLarge" : "TooSlow"),
            status.getClient(), startTime, processingTime, qTime,
            responseSize);
      }
{code}

Should this feature no longer be supported, or should we add a method to
CellScanner (or a new interface) that returns the serialized size (though this
might not account for compression codecs that may be applied while writing the
response)? Any other idea how this could be fixed?
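
For discussion, a hedged sketch of the second option; the SizedCellScanner/heapSize
names are an assumption about how the payload size could be exposed, and
cellScanner/result are the local variables from the code above:
{code}
// Fold an estimated cell-block size into the tooLarge accounting (sketch only).
long cellBlockSize = 0;
if (cellScanner instanceof SizedCellScanner) {
  cellBlockSize = ((SizedCellScanner) cellScanner).heapSize();  // assumed accessor
}
long responseSize = result.getSerializedSize() + cellBlockSize;
boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > -1);
{code}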



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-03-01 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174292#comment-15174292
 ] 

deepankar commented on HBASE-15362:
---

I was trying to set {{io.compression.codec.lzo.buffersize}}. The reason for
setting it is to decrease the buffer size allocated in BlockDecompressorStream;
it defaults to 64 KB, and if you miss the cache frequently this causes a
significant amount of garbage.
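
For reference, this is what setting it looks like (illustrative only; the 4 KB
value is an arbitrary example, and it only takes effect once Compression reads
the HBase configuration, which is what this issue is about):
{code}
// Lower the BlockDecompressorStream buffer for LZO via the HBase conf.
org.apache.hadoop.conf.Configuration conf =
    org.apache.hadoop.hbase.HBaseConfiguration.create();
conf.setInt("io.compression.codec.lzo.buffersize", 4 * 1024);
{code}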

Since you brought this up: does it make sense to recognize that some (almost all,
in the current case) of the codecs do not need to create the above stream, and to
send the calls directly to the codec or to the Decompressor class it exposes?
That would avoid these 64 KB allocations across the board, and the decompressors
are recycled, so the buffers inside them are not a problem.

The only problem with emulating the code inside BlockDecompressorStream is that
it cannot work with byte arrays; we special-cased this internally and the code
turned out to be very small. Pasting that part here:

{code}
// Code similar to the code in BlockDecompressionStream#decompress
Decompressor decompressor = null;
try {
  decompressor = compressAlgo.getDecompressor();
  int curInputOffset = inputOffset, curOutputOffset = destOffset;
  int originalBlockSize = Bytes.toInt(inputArray, curInputOffset, 4);
  int noUncompressedBytes = 0;
  curInputOffset += 4;
  // Iterate until you finish the input
  while (curInputOffset < (inputOffset + compressedSize)
  && noUncompressedBytes < uncompressedSize) {
if (decompressor.needsInput()) {
  int blockSize = Bytes.toInt(inputArray, curInputOffset, 4);
  curInputOffset += 4;
  decompressor.setInput(inputArray, curInputOffset, blockSize);
  curInputOffset += blockSize;
}
int n = decompressor.decompress(dest, curOutputOffset,
uncompressedSize - noUncompressedBytes);
noUncompressedBytes += n;
curOutputOffset += n;
if (n == 0) {
  if (decompressor.finished() || decompressor.needsDictionary()) {
if (noUncompressedBytes >= originalBlockSize) {
  throw new IOException("uncompressed bytes >= original size");
}
  }
}
  }
  if (noUncompressedBytes != uncompressedSize) {
throw new IOException("Premature EOF from the input bytes");
  }
} finally {
  if (decompressor != null) {
compressAlgo.returnDecompressor(decompressor);
  }
}
{code}

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-02-29 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15362:
--
Attachment: HBASE-15362.patch

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-02-29 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15362:
--
Status: Patch Available  (was: In Progress)

> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
> Attachments: HBASE-15362.patch
>
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-02-29 Thread deepankar (JIRA)
deepankar created HBASE-15362:
-

 Summary: Compression Algorithm does not respect config params from 
hbase-site
 Key: HBASE-15362
 URL: https://issues.apache.org/jira/browse/HBASE-15362
 Project: HBase
  Issue Type: Bug
Reporter: deepankar
Assignee: deepankar
Priority: Trivial


Compression creates its conf using new Configuration(), which leads to it not
respecting the configs set in hbase-site; fixing it is trivial using
HBaseConfiguration.create().
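
A minimal sketch of the shape of the change (the attached patch is the
authoritative version; this just restates the one-line idea above):
{code}
// Before (illustrative): codec configuration ignores hbase-site.xml
// Configuration conf = new Configuration();

// After: pick up HBase defaults plus hbase-site.xml overrides
Configuration conf = HBaseConfiguration.create();
{code}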



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HBASE-15362) Compression Algorithm does not respect config params from hbase-site

2016-02-29 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-15362 started by deepankar.
-
> Compression Algorithm does not respect config params from hbase-site
> 
>
> Key: HBASE-15362
> URL: https://issues.apache.org/jira/browse/HBASE-15362
> Project: HBase
>  Issue Type: Bug
>Reporter: deepankar
>Assignee: deepankar
>Priority: Trivial
>
> Compression creates conf using new Configuration() and this leads to it not 
> respecting the confs set in hbase-site, fixing it is trivial using 
> HBaseConfiguration.create()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15361) Remove unnecessary or Document constraints on BucketCache possible bucket sizes

2016-02-29 Thread deepankar (JIRA)
deepankar created HBASE-15361:
-

 Summary: Remove unnecessary or Document  constraints on 
BucketCache possible bucket sizes
 Key: HBASE-15361
 URL: https://issues.apache.org/jira/browse/HBASE-15361
 Project: HBase
  Issue Type: Sub-task
  Components: BucketCache
Reporter: deepankar
Priority: Minor


When we were trying to tune the bucket sizes ({{hbase.bucketcache.bucket.sizes}})
to our workload, we hit an issue due to the way the offset is stored in the
bucket entry. We split the offset into an integer base and a byte, which assumes
that all bucket offsets are a multiple of 256 (left-shifting by 8). See the code
below:
{code}

long offset() { // Java has no unsigned numbers
  long o = ((long) offsetBase) & 0xFFFFFFFFL;
  o += (((long) (offset1)) & 0xFF) << 32;
  return o << 8;
}

private void setOffset(long value) {
  assert (value & 0xFF) == 0;
  value >>= 8;
  offsetBase = (int) value;
  offset1 = (byte) (value >> 32);
}
{code}

This was done to save 3 bytes per BucketEntry (instead of using a long) back when
the BucketEntry had no other fields, but now the bucket entry has a lot of
fields. This is not documented, so we could either document the constraint that
bucket sizes must be a strict multiple of 256 bytes or just do away with the
constraint.
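
To make the constraint concrete, a small illustration with made-up numbers: with
a bucket size that is not a multiple of 256, setOffset() is handed an offset
whose low 8 bits are non-zero, so the assert trips (or, with asserts disabled,
the low bits are silently dropped by the shift).
{code}
// Hypothetical example: a 1000-byte bucket size produces offsets like 5 * 1000.
long offset = 5 * 1000;            // 5000, not a multiple of 256
assert (offset & 0xFF) == 0;       // fails: 5000 & 0xFF == 136
// With asserts off, value >>= 8 would round the stored offset down to 4864.
{code}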




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15361) Remove unnecessary or Document constraints on BucketCache possible bucket sizes

2016-02-29 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172708#comment-15172708
 ] 

deepankar commented on HBASE-15361:
---

I can put up a patch based on whichever direction the community suggests.

> Remove unnecessary or Document  constraints on BucketCache possible bucket 
> sizes
> 
>
> Key: HBASE-15361
> URL: https://issues.apache.org/jira/browse/HBASE-15361
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: deepankar
>Priority: Minor
>
> When we were trying to tune the bucket sizes 
> {{hbase.bucketcache.bucket.sizes}} according to our workload, we encountered 
> an issue due to the way offset is stored in the bucket entry. We divide the 
> offset into integer base and byte value and it assumes that all bucket 
> offsets  will be a multiple of 256 (left shifting by 8). See the code below
> {code}
> long offset() { // Java has no unsigned numbers
> long o = ((long) offsetBase) & 0xFFFFFFFFL;
>   o += (((long) (offset1)) & 0xFF) << 32;
>   return o << 8;
> }
> private void setOffset(long value) {
>   assert (value & 0xFF) == 0;
>   value >>= 8;
>   offsetBase = (int) value;
>   offset1 = (byte) (value >> 32);
> }
> {code}
> This was there to save 3 bytes per BucketEntry instead of using long and when 
> there are no other fields in the Bucket Entry, but now there are lot of 
> fields in the bucket entry , This not documented so we could either document 
> the constraint that it should be a strict 256 bytes multiple of just go away 
> with this constraint.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-20 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110224#comment-15110224
 ] 

deepankar commented on HBASE-15101:
---

It is on branch-1 only. (But I have ported other patches as well, so when I
pulled this in there were no significant merge conflicts.)

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101-v4.patch, HBASE-15101.patch
>
>
> We observed this production that after a region server dies there are huge 
> number of hfiles in that region for the region server running the version 
> with HBASE-13082, In the doc it is given that it is expected to happen, but 
> we found a one place where scanners are not being closed. If the scanners are 
> not closed their references are not decremented and that is leading to the 
> issue of huge number of store files not being finalized
> All I was able to find is in the selectScannersFrom, where we discard some of 
> the scanners and we are not closing them. I am attaching a patch for that.
> Also to avoid these issues should the files that are done be logged and 
> finalized (moved to archive) as a part of region close operation. This will 
> solve any leaks that can happen and does not cause any dire consequences?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106413#comment-15106413
 ] 

deepankar commented on HBASE-15101:
---

My thinking was: before HBASE-13082, while a compaction runs the new files live
in the .tmp directory (of the region folder) and are finalized once it completes,
giving only a very small window (between moving the files in from .tmp and moving
the old files out) in which both sets of files are present. That is not the case
after HBASE-13082, because both sets of files sit in the folder for a much longer
period, and if there is any leak in the reference counting then all the files
coexist, which can lead to a region size explosion.

This is exactly what happened to us. Without this patch we ran one regionserver
with HBASE-13082, and almost all the regions on that server had kept every file
since the regionserver started, or since the region moved to that server
(movement rarely happens). Worse, we force-major-compact regions daily, so the
region data ended up repeated over 7 times, and when we shut the server down
(gracefully) in a panic, the other regionservers that picked up these regions
kept compacting for the whole next day (each region contained 5-7x the data of a
normal region).

We then applied this patch and hosted only two regions on this experimental
regionserver for 2 days; the same thing repeated, and when we again shut down
(again gracefully) the regionserver, all the files did remain in the directory
and it did lead to longer compaction the next time.

If we can come up with a patch for the leak, maybe I could take a stab at testing
it again; I will also go through close() to see if I am missing anything.

Thanks




> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies, there are a huge 
> number of hfiles left in the regions of the region server running the 
> version with HBASE-13082. The doc says this is expected to happen, but we 
> found one place where scanners are not being closed. If the scanners are not 
> closed, their references are not decremented, and that leads to a huge 
> number of store files never being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of 
> the scanners without closing them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and 
> finalized (moved to the archive) as part of the region close operation? This 
> would solve any leaks that can happen and would not cause any dire 
> consequences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-19 Thread deepankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

deepankar updated HBASE-15101:
--
Attachment: HBASE-15101-v4.patch

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101-v4.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies, there are a huge 
> number of hfiles left in the regions of the region server running the 
> version with HBASE-13082. The doc says this is expected to happen, but we 
> found one place where scanners are not being closed. If the scanners are not 
> closed, their references are not decremented, and that leads to a huge 
> number of store files never being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of 
> the scanners without closing them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and 
> finalized (moved to the archive) as part of the region close operation? This 
> would solve any leaks that can happen and would not cause any dire 
> consequences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108017#comment-15108017
 ] 

deepankar commented on HBASE-15101:
---

Attached a patch with the close calls added as well.

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101-v4.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies, there are a huge 
> number of hfiles left in the regions of the region server running the 
> version with HBASE-13082. The doc says this is expected to happen, but we 
> found one place where scanners are not being closed. If the scanners are not 
> closed, their references are not decremented, and that leads to a huge 
> number of store files never being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of 
> the scanners without closing them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and 
> finalized (moved to the archive) as part of the region close operation? This 
> would solve any leaks that can happen and would not cause any dire 
> consequences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108013#comment-15108013
 ] 

deepankar commented on HBASE-15101:
---

Should I add the close calls before the return statements, as I mentioned above?

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies, there are a huge 
> number of hfiles left in the regions of the region server running the 
> version with HBASE-13082. The doc says this is expected to happen, but we 
> found one place where scanners are not being closed. If the scanners are not 
> closed, their references are not decremented, and that leads to a huge 
> number of store files never being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of 
> the scanners without closing them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and 
> finalized (moved to the archive) as part of the region close operation? This 
> would solve any leaks that can happen and would not cause any dire 
> consequences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082

2016-01-19 Thread deepankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108036#comment-15108036
 ] 

deepankar commented on HBASE-15101:
---

Adding the close did not help before, but I thought it should follow the 
convention. I added it in all the places where we return NO_MORE_VALUES; is 
there any other place I am missing?
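
A minimal sketch of the convention being discussed, using simplified stand-ins 
(NextState, the delegate list) rather than the actual StoreScanner internals: 
whenever the scan reports that there are no more values, it first closes the 
underlying scanners so their reader references are dropped.

{code}
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the scanner being discussed; not the real StoreScanner. */
class NoMoreValuesSketch implements AutoCloseable {
  enum NextState { MORE_VALUES, NO_MORE_VALUES }

  private final List<AutoCloseable> delegates = new ArrayList<>();
  private boolean closed;

  NextState next(List<String> results) throws Exception {
    boolean hasMore = fetchInto(results);
    if (!hasMore) {
      close();                         // the convention: close before every
      return NextState.NO_MORE_VALUES; // NO_MORE_VALUES return
    }
    return NextState.MORE_VALUES;
  }

  private boolean fetchInto(List<String> results) {
    return false; // toy implementation: nothing left to scan
  }

  @Override
  public void close() throws Exception {
    if (closed) {
      return; // keep close() idempotent so repeated next() calls are safe
    }
    closed = true;
    for (AutoCloseable delegate : delegates) {
      delegate.close(); // decrements each underlying reader's reference count
    }
  }
}
{code}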

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, io
>Affects Versions: 2.0.0
>Reporter: deepankar
>Assignee: deepankar
>Priority: Critical
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, 
> HBASE-15101-v3.patch, HBASE-15101-v4.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies, there are a huge 
> number of hfiles left in the regions of the region server running the 
> version with HBASE-13082. The doc says this is expected to happen, but we 
> found one place where scanners are not being closed. If the scanners are not 
> closed, their references are not decremented, and that leads to a huge 
> number of store files never being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of 
> the scanners without closing them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and 
> finalized (moved to the archive) as part of the region close operation? This 
> would solve any leaks that can happen and would not cause any dire 
> consequences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

