[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2018-03-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11425:
--
Release Note: 
For an end-to-end (E2E) off-heap read path, first configure an off-heap backed 
BucketCache (BC): set 'hbase.bucketcache.ioengine' to offheap in 
hbase-site.xml, and specify the total capacity of the BC using the 
hbase.bucketcache.size config.  Remember to adjust the value of 
'HBASE_OFFHEAPSIZE' in hbase-env.sh to match this capacity. That setting 
specifies the maximum off-heap memory the RS java process may allocate, so it 
must be bigger than the off-heap BC size. Keep in mind that 
hbase.bucketcache.ioengine has no default, which means the BC is turned OFF by 
default.

The next thing to tune is the ByteBuffer pool on the RPC server side. The 
buffers from this pool are used to accumulate the cell bytes and create a 
result cell block to send back to the client. 
'hbase.ipc.server.reservoir.enabled' turns this pool ON or OFF. 
By default the pool is ON: HBase creates off-heap ByteBuffers and pools them. 
Do not turn it OFF if you want an E2E off-heap read path. If the pool is 
turned off, the server creates temporary on-heap buffers to accumulate the cell 
bytes and make a result cell block, which can impact GC on a server under heavy 
read load.  You can tune both the number of buffers in the pool and the size of 
each ByteBuffer.
Use the config 'hbase.ipc.server.reservoir.initial.buffer.size' to set the size 
of each buffer. The default is 64 KB.

When the read pattern is random row reads and each row is smaller than this 
64 KB, try reducing the buffer size. When the result is larger than one 
ByteBuffer, the server will try to grab more than one buffer and make a result 
cell block out of them.  When the pool runs out of buffers, the server ends up 
creating temporary on-heap buffers.
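
The behaviour described above can be illustrated with a toy pool. This is only 
a sketch under stated assumptions, not HBase's actual pool implementation: the 
class ToyReservoir and its methods are hypothetical names. It shows a result 
larger than one buffer taking several buffers, and the fall back to temporary 
on-heap buffers once the pool's maximum is reached.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Toy bounded buffer pool. Hypothetical sketch, not HBase's real pool class. */
final class ToyReservoir {
    private final int bufferSize;
    private final int maxPooled;
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private int created = 0; // direct buffers created so far (created lazily)

    ToyReservoir(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Returns a pooled direct buffer, or a temporary on-heap one if the pool is exhausted. */
    synchronized ByteBuffer take() {
        ByteBuffer b = free.poll();
        if (b != null) {
            b.clear();
            return b;
        }
        if (created < maxPooled) {
            created++;
            return ByteBuffer.allocateDirect(bufferSize);
        }
        return ByteBuffer.allocate(bufferSize); // on-heap fallback: adds GC pressure
    }

    /** Only direct (pooled) buffers go back; temporary heap buffers are left to the GC. */
    synchronized void giveBack(ByteBuffer b) {
        if (b.isDirect()) {
            free.add(b);
        }
    }

    /** Takes enough buffers to hold a result of the given size (ceiling division). */
    List<ByteBuffer> takeFor(long resultBytes) {
        int n = (int) ((resultBytes + bufferSize - 1) / bufferSize);
        List<ByteBuffer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(take());
        }
        return out;
    }

    public static void main(String[] args) {
        // 64 KB buffers, but at most 2 pooled: a 150 KB result needs 3 buffers.
        ToyReservoir pool = new ToyReservoir(64 * 1024, 2);
        for (ByteBuffer b : pool.takeFor(150 * 1024)) {
            System.out.println(b.isDirect() ? "pooled direct" : "temporary on-heap");
        }
    }
}
```

With the values in main, the first two buffers come from the pool and the third 
is a temporary on-heap buffer, which is the situation the pool tuning is meant 
to avoid.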

The maximum number of ByteBuffers in the pool can be tuned using the config 
'hbase.ipc.server.reservoir.initial.max'. Its value defaults to 64 * the number 
of region server handlers configured (see the config 
'hbase.regionserver.handler.count'). The math is as follows: by default we 
assume a result cell block of 2 MB per read result, with each handler handling 
one read. A 2 MB block needs 32 buffers of 64 KB each (the default buffer size 
in the pool), so 32 ByteBuffers (BBs) per handler. We allocate twice this count 
as the max BB count so that a handler can create a response, hand it to the RPC 
Responder thread, and then handle a new request, creating a new response cell 
block (using pooled buffers). Even if the responder cannot send back the first 
TCP reply immediately, the pool should still have enough buffers without having 
to make temporary buffers on the heap.  Again, for smaller random row reads, 
tune this max count. The buffers are created lazily; the count is only the 
maximum number to be pooled.
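
The default-sizing arithmetic above can be written out as a small calculation. 
This is a sketch only: the class and method names are made up for illustration, 
and the handler count of 30 is just an example value for 
'hbase.regionserver.handler.count'.

```java
/** Works through the pool sizing arithmetic from the release note. Names are illustrative. */
final class ReservoirSizing {
    static final long DEFAULT_BUFFER_SIZE = 64L * 1024;       // 64 KB per pooled buffer
    static final long ASSUMED_RESULT_SIZE = 2L * 1024 * 1024; // 2 MB result cell block

    /** Buffers needed to hold one result (ceiling division). */
    static long buffersPerResult(long resultBytes, long bufferBytes) {
        return (resultBytes + bufferBytes - 1) / bufferBytes;
    }

    /** Twice the per-result count for every handler, as described in the note. */
    static long defaultMaxBuffers(int handlerCount) {
        return 2 * buffersPerResult(ASSUMED_RESULT_SIZE, DEFAULT_BUFFER_SIZE) * handlerCount;
    }

    /** Off-heap bytes used if the pool ever fills to its maximum. */
    static long maxPoolBytes(long maxBuffers, long bufferBytes) {
        return maxBuffers * bufferBytes;
    }

    public static void main(String[] args) {
        int handlers = 30; // example value only
        long perResult = buffersPerResult(ASSUMED_RESULT_SIZE, DEFAULT_BUFFER_SIZE);
        long max = defaultMaxBuffers(handlers);
        System.out.println("buffers per 2 MB result: " + perResult);  // 32
        System.out.println("max pooled buffers: " + max);             // 1920
        System.out.println("max pool size (MB): "
            + maxPoolBytes(max, DEFAULT_BUFFER_SIZE) / (1024 * 1024)); // 120
    }
}
```

For small random row reads, the same calculation with a smaller result size and 
buffer size gives a starting point for a lower max count.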

The setting for HBASE_OFFHEAPSIZE in hbase-env.sh should also account for this 
off-heap buffer pool on the RPC side.  Configure the max off-heap size for the 
RS to be a bit higher than the sum of the max pool size and the off-heap cache 
size. The TCP layer also needs to create direct ByteBuffers for TCP 
communication, and the DFS client needs some off-heap memory of its own, 
especially if short-circuit reads are configured. Allocating an extra 1 - 2 
GB for the max direct memory size has worked in tests.
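
As a rough worked example of this budget, the sketch below adds up the three 
parts. The names are hypothetical, the inputs (a 4096 MB bucket cache, 1920 
pooled buffers of 64 KB, 2048 MB of extra headroom) are example values only, 
and it assumes the bucket cache size is expressed in MB.

```java
/** Rough off-heap budget: bucket cache + RPC buffer pool + extra headroom. Illustrative only. */
final class OffheapBudget {
    private static final long MB = 1024L * 1024;

    /**
     * @param bucketCacheMb    off-heap BucketCache capacity in MB (assumed unit)
     * @param maxPooledBuffers max buffer count of the RPC-side pool
     * @param bufferSizeBytes  size of each pooled buffer in bytes
     * @param extraMb          headroom for TCP and DFS client direct buffers
     * @return suggested max off-heap size in MB
     */
    static long suggestedOffheapMb(long bucketCacheMb, long maxPooledBuffers,
                                   long bufferSizeBytes, long extraMb) {
        long poolBytes = maxPooledBuffers * bufferSizeBytes;
        long poolMb = (poolBytes + MB - 1) / MB; // round up to whole MB
        return bucketCacheMb + poolMb + extraMb;
    }

    public static void main(String[] args) {
        // 4096 MB cache + 120 MB pool + 2048 MB extra
        System.out.println(suggestedOffheapMb(4096, 1920, 64L * 1024, 2048)); // 6264
    }
}
```

The result is a lower bound to compare against the value chosen for 
HBASE_OFFHEAPSIZE, not a replacement for testing under the real workload.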

If you still see GC issues even after making the E2E read path off-heap, check 
whether the buffer pool is being exhausted. Look for the following message in 
the RS log at INFO level:

  "Pool already reached its max capacity : XXX and no free buffers now. 
Consider increasing the value for 'hbase.ipc.server.reservoir.initial.max' ?"

If you are using coprocessors (CPs) and refer to the Cells in the read results, 
DO NOT store references to these Cells beyond the scope of the CP hook methods. 
Sometimes a CP needs to store info about a cell (like its row key) for use in 
the next CP hook call. In such cases, clone the required fields or the entire 
Cell, as the use case requires.  [ See CellUtil#cloneXXX(Cell) APIs ]
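
Why the clone matters can be shown without any HBase classes. The sketch below 
uses a plain direct ByteBuffer to stand in for a shared off-heap buffer whose 
bytes get reused; the class and helper names are hypothetical. In a real CP the 
copy would be made with the CellUtil clone methods mentioned above (for example 
CellUtil.cloneRow(Cell)) rather than with this helper.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Shows why bytes must be copied out of a shared buffer before being kept. Illustrative only. */
final class CloneBeforeKeeping {
    /** Copies length bytes starting at offset into a new on-heap array. */
    static byte[] cloneBytes(ByteBuffer buf, int offset, int length) {
        byte[] copy = new byte[length];
        ByteBuffer dup = buf.duplicate(); // own position/limit, same backing memory
        dup.position(offset);
        dup.get(copy, 0, length);
        return copy;
    }

    /** Overwrites the start of the buffer, as a reused shared buffer would be. */
    static void write(ByteBuffer buf, String s) {
        ByteBuffer dup = buf.duplicate();
        dup.clear();
        dup.put(s.getBytes(StandardCharsets.US_ASCII));
    }

    /** Reads the buffer's current content at the given range as text. */
    static String read(ByteBuffer buf, int offset, int length) {
        return new String(cloneBytes(buf, offset, length), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocateDirect(16); // stands in for shared off-heap memory
        write(shared, "row-1");

        ByteBuffer retained = shared.duplicate();          // a kept reference: no copy made
        byte[] cloned = cloneBytes(shared, 0, 5);          // a safe copy of the row key

        write(shared, "row-2");                            // the memory is reused for other data

        System.out.println("retained reference: " + read(retained, 0, 5)); // row-2
        System.out.println("cloned copy: "
            + new String(cloned, StandardCharsets.US_ASCII));              // row-1
    }
}
```

The retained reference silently changes when the underlying memory is reused, 
while the cloned bytes keep the value that was seen inside the hook.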

  was:
For E2E off heaped read path, first of all there should be an off heap backed 
BucketCache(BC). Configure 'hbase.bucketcache.ioengine' to offheap in 
hbase-site.xml. Also to specify the total capacity of the BC using 
hbase.bucketcache.size config.  Please remember to adjust value of 
'HBASE_OFFHEAPSIZE' in hbase-env.sh as per this capacity. Here by we specify 
the max possible off heap memory allocation by the RS java process. So this 
should be bigger than the off heap BC size. Please keep in mind that there is 
no default for hbase.bucketcache.ioengine means the BC is 

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2018-03-13 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Affects Version/s: (was: 0.99.0)
 Release Note: 
For E2E off heaped read path, first of all there should be an off heap backed 
BucketCache(BC). Configure 'hbase.bucketcache.ioengine' to offheap in 
hbase-site.xml. Also specify the total capacity of the BC using 
hbase.bucketcache.size config.  Please remember to adjust value of 
'HBASE_OFFHEAPSIZE' in hbase-env.sh as per this capacity. Here we specify 
the max possible off heap memory allocation by the RS java process. So this 
should be bigger than the off heap BC size. Please keep in mind that there is 
no default for hbase.bucketcache.ioengine, which means the BC is turned OFF by 
default.
Next thing to tune is the ByteBuffer pool in the RPC server side. The buffers 
from this pool will be used to accumulate the cell bytes and create a result 
cell block to be sent back to the client side. 
'hbase.ipc.server.reservoir.enabled' can be used to turn this pool ON or OFF. 
By default this pool is ON and available. HBase will create off heap 
ByteBuffers and pool them. Please make sure not to turn this OFF if you want 
E2E off heaping in read path. If this pool is turned off, the server will 
create temp buffers on heap to accumulate the cell bytes and make a result cell 
block. This can impact the GC on a highly read loaded server.  The user can 
tune this pool with respect to how many buffers are in the pool and what the 
size of each ByteBuffer should be.
Use the config 'hbase.ipc.server.reservoir.initial.buffer.size' to tune the 
size of each buffer. This defaults to 64 KB. When the read pattern is a random 
row read and each of the rows is smaller in size compared to this 64 KB, try 
reducing this. When the result size is larger than one ByteBuffer size, the 
server will try to grab more than one buffer and make the result cell block out 
of those.  When the pool is running out of buffers, the server will end up 
creating temp on heap buffers to return the cell block result. The max number 
of ByteBuffers in the pool can be tuned using the config 
'hbase.ipc.server.reservoir.initial.max'. Its value defaults to 64 * region 
server handlers configured (See also the config 
'hbase.regionserver.handler.count'). The math is such that by default we 
consider 2 MB for result cell block size per read result and each handler will 
be handling one read. For 2 MB size, we need 32 buffers each of size 64 KB 
(See default buffer size in pool).  So per handler 32 BBs. We allocate twice 
this size as the max BBs count such that one handler can create the response, 
hand it over to the RPC Responder thread, and then handle one more request 
and create the response cell block (using pooled buffers). Even if the 
responder could not send back the first TCP reply, the pool can handle it.  
Again for smaller sized random row reads, tune this max count. These are lazily 
created buffers anyway, and this is the max count to be pooled.  The setting 
for HBASE_OFFHEAPSIZE in hbase-env.sh should consider this off heap buffer pool 
at RPC side also.  We need to config this max off heap size for RS as a bit 
higher than the sum of this max pool size and the off heap cache size. The TCP 
layer will also need to create direct bytebuffers for TCP communication. Also 
the DFS client will need some more. An extra of 1 - 2 GB for the max direct 
memory size would work as per tests.

If you still see GC issues even after making E2E read path off heap, look for 
possible exhaustion of the buffer pool. Check for the below RS log message at 
INFO level:
"Pool already reached its max capacity : XXX and no free buffers now. Consider 
increasing the value for 'hbase.ipc.server.reservoir.initial.max' ?"

If you are using coprocessors (CPs) and refer to the Cells in the read results, 
DO NOT store references to these Cells out of the scope of the CP hook methods. 
Sometimes the CPs need to store info about the cell (like its row key) for use 
in the next CP hook call etc. For such cases, please clone the required fields 
or the entire Cell as per the use case.  [ See CellUtil#cloneXXX(Cell) APIs ]

> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics 
> with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, 
> HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in 
> HBase using BBs_final.pdf, Screen Shot 

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2016-01-27 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11425:
---
Fix Version/s: 2.0.0

> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics 
> with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, 
> HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in 
> HBase using BBs_final.pdf, Screen Shot 2015-10-16 at 5.13.22 PM.png, gc.png, 
> gets.png, heap.png, load.png, median.png, ram.log
>
>
> Umbrella jira to make sure we can have blocks cached in offheap backed cache. 
> In the entire read path, we can refer to this offheap buffer and avoid onheap 
> copying.
> The high level items I can identify as of now are
> 1. Avoid the array() call on BB in read path.. (This is there in many 
> classes. We can handle class by class)
> 2. Support Buffer based getter APIs in cell.  In read path we will create a 
> new Cell with backed by BB. Will need in CellComparator, Filter (like SCVF), 
> CPs etc.
> 3. Avoid KeyValue.ensureKeyValue() calls in read path - This make byte copy.
> 4. Remove all CP hooks (which are already deprecated) which deal with KVs.  
> (In read path)
> Will add subtasks under this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-10-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11425:
--
Attachment: ram.log

The log is not that interesting. It does not even have the eviction issue.  
Just shows us struggling w/ GC.  Finally we OOME here:

2015-10-16T13:16:42.032-0700: [Full GC (Allocation Failure) 
2015-10-16T13:16:42.032-0700: [CMS: 15276447K->15276422K(15276480K), 4.9700400 
secs] 16273215K->16273119K(16273280K), [Metaspace: 48934K->48934K(1093632K)], 
4.9774537 secs] [Times: user=4.97 sys=0.00, real=4.98 secs]
2015-10-16T13:16:47.012-0700: [Full GC (Allocation Failure) 
2015-10-16T13:16:47.012-0700: [CMS: 15276422K->15276422K(15276480K), 1.2150090 
secs] 16273204K->16273186K(16273280K), [Metaspace: 48901K->48901K(1093632K)], 
1.2151393 secs] [Times: user=1.21 sys=0.00, real=1.22 secs]
2015-10-16T13:16:48.227-0700: [Full GC (Allocation Failure) 
2015-10-16T13:16:48.227-0700: [CMS: 15276422K->15276422K(15276480K), 1.2185531 
secs] 16273186K->16273186K(16273280K), [Metaspace: 48901K->48901K(1093632K)], 
1.2186671 secs] [Times: user=1.22 sys=0.00, real=1.21 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 16941"...

Configs are these:

# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=16000


export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=16g"



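<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>8196</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.1</value>
</property>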


Looking at UI, hardly any meta blocks in L1... a couple of hundred.

> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics 
> with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, 
> HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in 
> HBase using BBs_final.pdf, gc.png, gets.png, heap.png, load.png, median.png, 
> ram.log

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-10-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11425:
--
Attachment: Screen Shot 2015-10-16 at 5.13.22 PM.png

Does this help? Looks like 7.5G retained by HFileScannerImpl in arraylist.

> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics 
> with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, 
> HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in 
> HBase using BBs_final.pdf, Screen Shot 2015-10-16 at 5.13.22 PM.png, gc.png, 
> gets.png, heap.png, load.png, median.png, ram.log

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-10-15 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11425:
---
Attachment: GC pics with evictions_4G heap.png

Tried out the experiments in a new single-node cluster. 
Loaded around 15G of data with 10 regions. 
Initially configured 4G of heap space, 5G of offheap space and a bucket cache 
size of 4G.
Ran a pure read workload c for 30 mins with 50 threads.  I am not running 
into OOME and also the block eviction part is fine.  The bigger GCs are around 
450ms to 500ms.  Repeated the same experiment with 10G of heap space also.  
With this configuration we are sure that evictions are happening from the 
bucket cache. Attaching a GC snapshot for 5 mins captured during the workload 
test. 
Stack, 
Also in your experiment I think your data does not fit into the bucket cache 
and hence it is trying to evict. Or, if it does fit into the bucket cache, 
probably a file was being compacted while a reader was still referencing it, 
and due to the OOME the ref count decrement did not happen and the forceful 
eviction was failing. Will keep checking this.  Can you attach any logs from 
when this happened?  Easier to debug (hopefully).

> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics 
> with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, 
> HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in 
> HBase using BBs_final.pdf, gc.png, gets.png, heap.png, load.png, median.png

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-10-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11425:
--
Attachment: gets.png
load.png
gc.png
median.png

Some coarse graphs from running YCSB workload c (total random read) for an 
hour with 100 clients against a dataset that is totally cached and hosted on 
one server. The first run is against a RS that is using the default, onheap 
memcache. The second is using bucketcache.

I see that the work here makes it so using the bucketcache has the same latency 
and throughput (perhaps a little less throughput) as serving all from onheap 
(recall that in tests, bucketcache was best if there were cache misses... if 
you could serve all from heap, onheap had a much nicer profile). To me, this 
makes it possible to run with the bucketcache all the time, whether serving all 
from heap or when there are cache misses (recall, bucketcache did better when 
there were cache misses -- I have not looked to see if this work improves on 
what we saw previously).

More testing to follow (a redo of our block cache comparisons post might be in 
order).

The graphs are gc basic profile (this is CMS), gets per second, the median (the 
75th and 95th percentiles weren't showing up for some reason... need to dig 
in... hopefully it's because their incidence was low...), and overall loading 
and seeks.

Offheap puts a little more load on the system, has a better GC profile, and has 
slightly less throughput.



> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
> HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
> using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf, gc.png, 
> gets.png, load.png, median.png

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-10-14 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11425:
--
Attachment: heap.png

Have you fellas run this for a while? I seem to OOME easily enough. I tried 
with a small heap, 4G, but it OOME'd, so I had to keep going back up, to 16G 
again, though I had a big offheap.  I tried to capture the increasing use of 
heap but this diagram is the best I got... I've not done heap analysis.. but in 
the diagram you can see heap use start to rise and then plummet... now I am 
crawling, doing Full GCs every couple of seconds.

The last meaningful log was this in regionserver log:

2015-10-14 16:51:10,058 DEBUG [main-BucketCacheWriter-2] bucket.BucketCache: 
This block 3f0157e7daee45fdb25202c496c95c46_1649898813 is still referred by 1 
readers. Can not be freed now

It started complaining 50 minutes ago...





> Cell/DBB end-to-end on the read-path
> 
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
>  Issue Type: Umbrella
>  Components: regionserver, Scanners
>Affects Versions: 0.99.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
> HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
> using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf, gc.png, 
> gets.png, heap.png, load.png, median.png

[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-05 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: (was: Benchmarks_Tests.docx)

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
 HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
 using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf



[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-05 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
 HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
 using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf



[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-05 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: (was: Benchmarks_Tests.docx)

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
 HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
 using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf



[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-02 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
 Benchmarks_Tests.docx, HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, 
 Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using 
 BBs_final.pdf



[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-02 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: BenchmarkTestCode.zip

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, 
 HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
 using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-01 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: Benchmarks_Tests.docx, 
 HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase 
 using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-05-01 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Status: Open  (was: Patch Available)

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, 
 Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using 
 BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-03-31 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11425:
---
Attachment: HBASE-11425.patch

Updated patch handling the ByteBuffer and byte[] cases throughout the read 
path, with some refactoring also done. This may still not be the final patch, 
but it is more suitable for review.

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, 
 Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using 
 BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-03-31 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11425:
---
Status: Patch Available  (was: Open)

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, 
 Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using 
 BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-03-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: HBASE-11425-E2E-NotComplete.patch

Attaching an E2E patch for reference. We are still doing some more cleanups, 
and we are also working to remove some code duplication that is still in the patch.

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-11425-E2E-NotComplete.patch, Offheap reads in 
 HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-03-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11425:
---
Attachment: Offheap reads in HBase using BBs_V2.pdf

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: Offheap reads in HBase using BBs_V2.pdf, Offheap reads 
 in HBase using BBs_final.pdf







[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path

2015-03-09 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11425:
---
Attachment: Offheap reads in HBase using BBs_final.pdf

A document explaining the motive, the design considerations, the reason for 
arriving at BB, and the Cell-level API changes required to support offheap 
memory in HBase's read path.
We will be uploading a patch ported to trunk by the end of this week or 
early next week, along with some perf results. 
Feedback and comments on the doc and the approach are requested.

 Cell/DBB end-to-end on the read-path
 

 Key: HBASE-11425
 URL: https://issues.apache.org/jira/browse/HBASE-11425
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Affects Versions: 0.99.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: Offheap reads in HBase using BBs_final.pdf




