[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-11425:
--
Release Note:
For an end-to-end (E2E) off-heap read path, first configure an off-heap backed BucketCache (BC). Set 'hbase.bucketcache.ioengine' to offheap in hbase-site.xml, and specify the total capacity of the BC with the 'hbase.bucketcache.size' config. Remember to adjust the value of 'HBASE_OFFHEAPSIZE' in hbase-env.sh to match this capacity. This setting specifies the maximum off-heap memory the RS java process may allocate, so it should be bigger than the off-heap BC size. Keep in mind that there is no default for 'hbase.bucketcache.ioengine', which means the BC is turned OFF by default.

The next thing to tune is the ByteBuffer pool on the RPC server side. The buffers from this pool are used to accumulate the cell bytes and create the result cell block that is sent back to the client. 'hbase.ipc.server.reservoir.enabled' turns this pool ON or OFF. By default the pool is ON and available: HBase creates off-heap ByteBuffers and pools them. Do not turn it OFF if you want an E2E off-heap read path. If the pool is turned off, the server creates temporary on-heap buffers to accumulate the cell bytes and build the result cell block, which can impact GC on a server under heavy read load.

The pool can be tuned in two ways: how many buffers it holds, and how large each ByteBuffer is. Use the config 'hbase.ipc.server.reservoir.initial.buffer.size' to set the size of each buffer. The default is 64 KB. When the read pattern is random row reads and each row is small compared to 64 KB, try reducing this value. When a result is larger than one ByteBuffer, the server tries to grab more than one buffer and builds the result cell block out of them.
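The buffer-size advice in the release note comes down to a ceiling division. The sketch below is plain Java arithmetic, not HBase code: the class and method names are hypothetical, the 64 KB figure is the default quoted in the note, and the two result sizes are assumptions chosen only for illustration.

```java
final class BuffersPerResult {
    // Ceiling division: number of fixed-size buffers needed to hold resultBytes.
    static long buffersNeeded(long resultBytes, long bufferBytes) {
        if (bufferBytes <= 0) {
            throw new IllegalArgumentException("bufferBytes must be positive");
        }
        if (resultBytes <= 0) {
            return 0;
        }
        return (resultBytes + bufferBytes - 1) / bufferBytes;
    }

    // Bytes left unused across the buffers grabbed for one response.
    static long unusedBytes(long resultBytes, long bufferBytes) {
        return buffersNeeded(resultBytes, bufferBytes) * bufferBytes - Math.max(resultBytes, 0);
    }

    public static void main(String[] args) {
        long defaultBuffer = 64 * 1024; // default buffer size quoted in the release note
        long smallRow = 1024;           // assumed 1 KB random-read result
        long bigResult = 100 * 1024;    // assumed 100 KB result

        System.out.println(buffersNeeded(smallRow, defaultBuffer));  // 1
        System.out.println(unusedBytes(smallRow, defaultBuffer));    // 64512
        System.out.println(buffersNeeded(bigResult, defaultBuffer)); // 2
    }
}
```

With 1 KB rows, almost all of each 64 KB buffer sits idle while the response is in flight, which is why the note suggests a smaller buffer size for small random reads.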
When the pool runs out of buffers, the server ends up creating temporary on-heap buffers. The maximum number of ByteBuffers in the pool can be tuned using the config 'hbase.ipc.server.reservoir.initial.max'. Its value defaults to 64 * the number of region server handlers configured (see the config 'hbase.regionserver.handler.count'). The math is as follows: by default we assume a result cell block of 2 MB per read result, with each handler handling one read. A 2 MB block needs 32 buffers of 64 KB each (the default buffer size in the pool), so 32 ByteBuffers (BBs) per handler. We allocate twice this number as the max BB count so that a handler can create a response, hand it to the RPC Responder thread, and then take a new request and build a new response cell block (using pooled buffers). Even if the Responder could not send back the first TCP reply immediately, this count should leave enough buffers in the pool without having to create temporary buffers on the heap. Again, for smaller random row reads, tune this max count. The buffers are created lazily; the count is only the maximum number that will be pooled.

The setting for HBASE_OFFHEAPSIZE in hbase-env.sh should also account for this off-heap buffer pool on the RPC side. The max off-heap size for the RS needs to be a bit higher than the sum of this max pool size and the off-heap cache size. The TCP layer also needs to create direct ByteBuffers for TCP communication, and the DFS client needs some off-heap memory to do its work, especially if short-circuit reads are configured. Allocating an extra 1 - 2 GB for the max direct memory size has worked in tests.

If you still see GC issues even after making the E2E read path off heap, check whether the buffer pool is running out. Look for the following message in the RS log at INFO level:
"Pool already reached its max capacity : XXX and no free buffers now. Consider increasing the value for 'hbase.ipc.server.reservoir.initial.max' ?"
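The sizing math above can be worked through as a short calculation. This is a sketch in plain Java, not HBase code: the class and method names are hypothetical, and the handler count (30), bucket cache size (8 GB) and headroom (2 GB, the upper end of the "1 - 2 GB" mentioned) are assumed inputs rather than recommendations.

```java
final class OffHeapSizing {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;
    static final long GB = 1024L * MB;

    // Per-handler buffer count from the note: twice the buffers needed for one result block.
    static long buffersPerHandler(long resultBlockBytes, long bufferBytes) {
        long perResult = (resultBlockBytes + bufferBytes - 1) / bufferBytes;
        return 2 * perResult;
    }

    // Default max pooled buffers per the release note: 64 * handler count.
    static long defaultMaxBuffers(int handlerCount) {
        return 64L * handlerCount;
    }

    // Upper bound on memory held by the pool once every buffer has been created.
    static long maxPoolBytes(long maxBuffers, long bufferBytes) {
        return maxBuffers * bufferBytes;
    }

    // Floor for the max direct memory setting: cache + pool + headroom.
    static long minOffHeapBytes(long bucketCacheBytes, long poolBytes, long headroomBytes) {
        return bucketCacheBytes + poolBytes + headroomBytes;
    }

    public static void main(String[] args) {
        int handlers = 30;           // assumed hbase.regionserver.handler.count
        long bufferBytes = 64 * KB;  // default buffer size from the note
        long bucketCache = 8 * GB;   // assumed off-heap bucket cache capacity
        long headroom = 2 * GB;      // assumed headroom for TCP and DFS client buffers

        long maxBuffers = defaultMaxBuffers(handlers);
        long pool = maxPoolBytes(maxBuffers, bufferBytes);
        long floor = minOffHeapBytes(bucketCache, pool, headroom);

        System.out.println(buffersPerHandler(2 * MB, bufferBytes)); // 64
        System.out.println(maxBuffers);                             // 1920
        System.out.println(pool / MB);                              // 120
        System.out.println(floor / MB);                             // 10360
    }
}
```

Under these assumptions the pool itself is small (120 MB) next to the cache, so the headroom term dominates what has to be added on top of the bucket cache size.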
If you are using coprocessors (CPs) and refer to the Cells in the read results, DO NOT store references to these Cells outside the scope of the CP hook methods. Sometimes a CP needs to store information about a cell (such as its row key) for use in the next CP hook call. In such cases, clone the required fields of the Cell as the use case requires. [ See the CellUtil#cloneXXX(Cell) APIs ]

was:
For E2E off heaped read path, first of all there should be an off heap backed BucketCache(BC). Configure 'hbase.bucketcache.ioengine' to offheap in hbase-site.xml. Also to specify the total capacity of the BC using hbase.bucketcache.size config. Please remember to adjust value of 'HBASE_OFFHEAPSIZE' in hbase-env.sh as per this capacity. Here by we specify the max possible off heap memory allocation by the RS java process. So this should be bigger than the off heap BC size. Please keep in mind that there is no default for hbase.bucketcache.ioengine means the BC is
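The coprocessor guidance in the release note (clone what you need rather than keep a reference) can be illustrated with plain java.nio buffers. This is an analogy, not HBase code: a direct ByteBuffer stands in for server-owned memory backing a Cell, copyOut plays the role the note assigns to the CellUtil#cloneXXX(Cell) APIs, and every name in the sketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CloneBeforeKeeping {
    // Copies length bytes starting at offset into a new heap array the caller owns.
    static byte[] copyOut(ByteBuffer buf, int offset, int length) {
        byte[] copy = new byte[length];
        for (int i = 0; i < length; i++) {
            copy[i] = buf.get(offset + i); // absolute get, leaves position untouched
        }
        return copy;
    }

    // Overwrites the start of the buffer, as reuse of the memory for later data would.
    static void fill(ByteBuffer buf, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < bytes.length; i++) {
            buf.put(i, bytes[i]);
        }
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocateDirect(16); // stands in for server-owned memory
        fill(shared, "row-1");

        ByteBuffer keptReference = shared.duplicate(); // shares the same memory: unsafe to keep
        byte[] keptCopy = copyOut(shared, 0, 5);        // private copy: safe to keep

        fill(shared, "row-2"); // the memory is reused after the hook returns

        System.out.println(new String(copyOut(keptReference, 0, 5), StandardCharsets.US_ASCII)); // row-2
        System.out.println(new String(keptCopy, StandardCharsets.US_ASCII));                     // row-1
    }
}
```

The kept reference silently changes to the new contents, while the copy still holds the row key that was read inside the hook.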
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Affects Version/s: (was: 0.99.0)
Release Note:
For an E2E off-heap read path, first of all there should be an off-heap backed BucketCache (BC). Configure 'hbase.bucketcache.ioengine' to offheap in hbase-site.xml. Also specify the total capacity of the BC using the hbase.bucketcache.size config. Please remember to adjust the value of 'HBASE_OFFHEAPSIZE' in hbase-env.sh as per this capacity. Here we specify the max possible off-heap memory allocation by the RS java process, so this should be bigger than the off-heap BC size. Please keep in mind that there is no default for hbase.bucketcache.ioengine, which means the BC is turned OFF by default.

The next thing to tune is the ByteBuffer pool on the RPC server side. The buffers from this pool will be used to accumulate the cell bytes and create a result cell block to be sent back to the client side. 'hbase.ipc.server.reservoir.enabled' can be used to turn this pool ON or OFF. By default this pool is ON and available. HBase will create off-heap ByteBuffers and pool them. Please make sure not to turn this OFF if you want E2E off heaping in the read path. If this pool is turned off, the server will create temp buffers on heap to accumulate the cell bytes and make a result cell block. This can impact the GC on a highly read loaded server.

The user can tune this pool with respect to how many buffers are in the pool and what the size of each ByteBuffer should be. Use the config 'hbase.ipc.server.reservoir.initial.buffer.size' to tune the size of each buffer. This defaults to 64 KB. When the read pattern is a random row read and each of the rows is smaller in size compared to this 64 KB, try reducing this. When the result size is larger than one ByteBuffer size, the server will try to grab more than one buffer and make the result cell block out of those.
When the pool is running out of buffers, the server will end up creating temp on-heap buffers to return the cell block result. The max number of ByteBuffers in the pool can be tuned using the config 'hbase.ipc.server.reservoir.initial.max'. Its value defaults to 64 * region server handlers configured (see also the config 'hbase.regionserver.handler.count'). The math is such that by default we consider 2 MB as the result cell block size per read result, and each handler will be handling one such read. For 2 MB size, we need 32 buffers each of size 64 KB (see the default buffer size in the pool). So per handler 32 BBs. We allocate twice this number as the max BB count so that one handler can create the response, hand it over to the RPC Responder thread, and then handle one more request and create its response cell block (using pooled buffers). Even if the responder could not send back the first TCP reply, the pool can handle it. Again, for smaller sized random row reads, tune this max count. The buffers are created lazily anyway, and this is only the max count to be pooled.

The setting for HBASE_OFFHEAPSIZE in hbase-env.sh should consider this off-heap buffer pool on the RPC side also. We need to configure this max off-heap size for the RS as a bit higher than the sum of this max pool size and the off-heap cache size. The TCP layer will also need to create direct ByteBuffers for TCP communication. Also the DFS client will need some more. An extra 1 - 2 GB for the max direct memory size worked in tests.

If you still see GC issues even after making the E2E read path off heap, look for the possibility of such issues in the appropriate buffer pool. Check for the below RS log at INFO level:
"Pool already reached its max capacity : XXX and no free buffers now. Consider increasing the value for 'hbase.ipc.server.reservoir.initial.max' ?"

If you are using coprocessors and refer to the Cells in the read results, DO NOT store references to these Cells outside the scope of the CP hook methods.
Sometimes the CPs need to store info about the cell (like its row key) for use in the next CP hook call etc. In such cases, please clone the required fields of the Cell as per the use case. [ See the CellUtil#cloneXXX(Cell) APIs ]

> Cell/DBB end-to-end on the read-path
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
> Issue Type: Umbrella
> Components: regionserver, Scanners
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf, Screen Shot
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
ramkrishna.s.vasudevan updated HBASE-11425:
---
Fix Version/s: 2.0.0

> Cell/DBB end-to-end on the read-path
>
> Key: HBASE-11425
> URL: https://issues.apache.org/jira/browse/HBASE-11425
> Project: HBase
> Issue Type: Umbrella
> Components: regionserver, Scanners
> Affects Versions: 0.99.0
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: BenchmarkTestCode.zip, Benchmarks_Tests.docx, GC pics with evictions_4G heap.png, HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf, Screen Shot 2015-10-16 at 5.13.22 PM.png, gc.png, gets.png, heap.png, load.png, median.png, ram.log
>
> Umbrella jira to make sure we can have blocks cached in an offheap backed cache. In the entire read path, we can refer to this offheap buffer and avoid onheap copying.
> The high level items I can identify as of now are
> 1. Avoid the array() call on BB in the read path. (This is there in many classes. We can handle class by class)
> 2. Support Buffer based getter APIs in Cell. In the read path we will create a new Cell backed by BB. Will be needed in CellComparator, Filter (like SCVF), CPs etc.
> 3. Avoid KeyValue.ensureKeyValue() calls in the read path - this makes a byte copy.
> 4. Remove all CP hooks (which are already deprecated) which deal with KVs. (In read path)
> Will add subtasks under this.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
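Item 1 of the description (avoid array() on a BB) follows from how java.nio works: a direct ByteBuffer has no accessible backing array, so code that assumes one fails once blocks live off heap. The sketch below uses only the standard java.nio API; the class and helper names are hypothetical and are not taken from the HBase patch.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class DirectBufferAccess {
    // Reads length bytes at offset without assuming the buffer has a backing array.
    static byte[] read(ByteBuffer buf, int offset, int length) {
        byte[] out = new byte[length];
        if (buf.hasArray()) {
            // Heap buffer: copy straight from the backing array.
            System.arraycopy(buf.array(), buf.arrayOffset() + offset, out, 0, length);
        } else {
            // Direct buffer: array() is unsupported, so use absolute gets.
            for (int i = 0; i < length; i++) {
                out[i] = buf.get(offset + i);
            }
        }
        return out;
    }

    // Returns true if calling array() on the buffer is rejected.
    static boolean arrayCallFails(ByteBuffer buf) {
        try {
            buf.array();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        for (int i = 0; i < 4; i++) {
            direct.put(i, (byte) (i + 1));
        }

        System.out.println(heap.hasArray());        // true
        System.out.println(arrayCallFails(direct)); // true on the standard JDK
        System.out.println(Arrays.equals(read(heap, 1, 2), read(direct, 1, 2))); // true
    }
}
```

Branching on hasArray() keeps the fast path for heap buffers while remaining correct for off-heap ones, which is the kind of class-by-class change item 1 calls for.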
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
stack updated HBASE-11425:
--
Attachment: ram.log

The log is not that interesting. It does not even have the eviction issue. Just shows us struggling w/ GC. Finally we OOME here:

2015-10-16T13:16:42.032-0700: [Full GC (Allocation Failure) 2015-10-16T13:16:42.032-0700: [CMS: 15276447K->15276422K(15276480K), 4.9700400 secs] 16273215K->16273119K(16273280K), [Metaspace: 48934K->48934K(1093632K)], 4.9774537 secs] [Times: user=4.97 sys=0.00, real=4.98 secs]
2015-10-16T13:16:47.012-0700: [Full GC (Allocation Failure) 2015-10-16T13:16:47.012-0700: [CMS: 15276422K->15276422K(15276480K), 1.2150090 secs] 16273204K->16273186K(16273280K), [Metaspace: 48901K->48901K(1093632K)], 1.2151393 secs] [Times: user=1.21 sys=0.00, real=1.22 secs]
2015-10-16T13:16:48.227-0700: [Full GC (Allocation Failure) 2015-10-16T13:16:48.227-0700: [CMS: 15276422K->15276422K(15276480K), 1.2185531 secs] 16273186K->16273186K(16273280K), [Metaspace: 48901K->48901K(1093632K)], 1.2186671 secs] [Times: user=1.22 sys=0.00, real=1.21 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 16941"...

Configs are this:

# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=16000
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=16g"

hbase.bucketcache.ioengine    offheap
hbase.bucketcache.size        8196
hfile.block.cache.size        0.1

Looking at UI, hardly any meta blocks in L1... a couple of hundred.
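To read the Full GC lines above at a glance, it helps to pull out the old-generation numbers: occupancy before, occupancy after, and capacity. The sketch below is a hypothetical helper, not part of HBase or the JDK; the regular expression only targets the "CMS: before->after(capacity)" fragment as it appears in this log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CmsLineCheck {
    // Matches the old-generation section of a CMS Full GC line, all values in KB.
    private static final Pattern CMS =
            Pattern.compile("CMS: (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns {beforeKb, afterKb, capacityKb}, or null if the line has no CMS section.
    static long[] parse(String line) {
        Matcher m = CMS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    // Old-generation occupancy after the collection, as a percentage of capacity.
    static double occupancyPercent(long afterKb, long capacityKb) {
        return 100.0 * afterKb / capacityKb;
    }

    public static void main(String[] args) {
        String line = "[CMS: 15276447K->15276422K(15276480K), 4.9700400 secs]";
        long[] v = parse(line);
        System.out.println("freed KB: " + (v[0] - v[1])); // freed KB: 25
        System.out.printf("old gen occupancy after GC: %.4f%%%n", occupancyPercent(v[1], v[2]));
    }
}
```

For the first line of the log, a collection of almost five seconds frees 25 KB and leaves the old generation essentially full, which matches the OutOfMemoryError that follows.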
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
stack updated HBASE-11425:
--
Attachment: Screen Shot 2015-10-16 at 5.13.22 PM.png

Does this help? Looks like 7.5G retained by HFileScannerImpl in arraylist.
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
ramkrishna.s.vasudevan updated HBASE-11425:
---
Attachment: GC pics with evictions_4G heap.png

Tried out the experiments on a new single-node cluster. Loaded around 15G of data with 10 regions. Initially configured 4G of heap space, 5G of offheap space and a bucket cache size of 4G. Ran the pure read workload c for 30 mins with 50 threads. I am not running into OOME, and the block eviction part is also fine. The bigger GCs are around 450ms to 500ms.
Repeated the same experiment with 10G of heap space as well. With this configuration we are sure that evictions are happening from the bucket cache. Attaching a GC snapshot covering 5 mins, captured during the workload test.
Stack,
Also, in your experiment I think your data does not fit into the bucket cache and hence it is trying to evict. Or, if it does fit into the bucket cache, probably a file was being compacted while a reader was still referencing it, and due to the OOME the ref count decrement did not happen and the forceful eviction was failing. Will keep checking this. Can you attach any logs from when this happened? That would make it easier to debug (hopefully).
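The hypothesis in the comment above, a reference count that is never decremented because an error interrupts the reader, can be shown with a toy counter. This sketch is not the BucketCache implementation: the class, its methods, and the simplified eviction check are all hypothetical, and a RuntimeException stands in for the OOME.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountedBlock {
    private final AtomicInteger readers = new AtomicInteger();

    void retain() {
        readers.incrementAndGet();
    }

    void release() {
        readers.decrementAndGet();
    }

    // Toy check: the block may be freed only when no reader still refers to it.
    boolean tryEvict() {
        return readers.get() == 0;
    }

    // Leaky pattern: if work throws, release() is skipped and the count never drops.
    static void readLeaky(RefCountedBlock block, Runnable work) {
        block.retain();
        work.run();
        block.release();
    }

    // Safe pattern: release in finally so a failure mid-read cannot leak the reference.
    static void readSafely(RefCountedBlock block, Runnable work) {
        block.retain();
        try {
            work.run();
        } finally {
            block.release();
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new IllegalStateException("simulated failure"); };

        RefCountedBlock leaked = new RefCountedBlock();
        try { readLeaky(leaked, failing); } catch (IllegalStateException expected) { }
        System.out.println(leaked.tryEvict()); // false

        RefCountedBlock safe = new RefCountedBlock();
        try { readSafely(safe, failing); } catch (IllegalStateException expected) { }
        System.out.println(safe.tryEvict()); // true
    }
}
```

In the leaky case the block stays pinned for good, so repeated eviction attempts on it would keep failing.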
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
stack updated HBASE-11425:
--
Attachment: gets.png
load.png
gc.png
median.png

Some coarse graphs from running YCSB workload c (total random read) for an hour with 100 clients against a totally cached dataset hosted on one server. The first run is against a RS that is using the default, onheap memcache. The second is using bucketcache.

I see that the work here makes it so that using the bucketcache has the same latency and throughput (perhaps a little less throughput) as serving all from onheap (recall that in tests, bucketcache was best if there were cache misses... if you could serve all from heap, onheap had a much nicer profile). To me, this makes it possible to run with the bucketcache all the time, whether serving all from heap or when there are cache misses (recall, bucketcache did better when there were cache misses -- I have not looked to see if this work improves on what we saw previously). More testing to follow (a redo of our block cache comparisons post might be in order).

The graphs are the basic gc profile (this is CMS), gets per second, the median (the 75th and 95th percentiles weren't showing up for some reason... need to dig in... hopefully it's because their incidence was low...), and overall loading and seeks. Offheap puts a little more load on the system, has a better GC profile, and has slightly less throughput.
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
stack updated HBASE-11425:
--
Attachment: heap.png

Have you fellas run this for a while? I seem to OOME easy enough. I tried with a small heap, 4G, but it OOME'd, then had to keep going back up to 16G again though I had a big offheap. I tried to capture the increasing use of heap but this diagram is the best I got... I've not done heap analysis... but in the diagram you can see heap use start to rise and then plummet... now I am crawling, doing Full GCs every couple of seconds.

The last meaningful log was this in the regionserver log:

2015-10-14 16:51:10,058 DEBUG [main-BucketCacheWriter-2] bucket.BucketCache: This block 3f0157e7daee45fdb25202c496c95c46_1649898813 is still referred by 1 readers. Can not be freed now

It started complaining 50 minutes ago...
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: (was: Benchmarks_Tests.docx)
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: (was: Benchmarks_Tests.docx)
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: BenchmarkTestCode.zip
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
Anoop Sam John updated HBASE-11425:
---
Attachment: Benchmarks_Tests.docx
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-11425:
-----------------------------------
    Status: Open  (was: Patch Available)

Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-11425:
-------------------------------------------
    Attachment: HBASE-11425.patch

Updated patch that handles both ByteBuffers and byte[] throughout the read path, with some refactoring. This may still not be the final patch, but it is more suitable for review.

Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf
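To make "handling both ByteBuffers and byte[]" concrete, here is a rough sketch of an unsigned lexicographic comparison that accepts either representation and never copies the buffer contents onto the heap. It is not taken from HBASE-11425.patch; the class and method names are hypothetical, and it uses only java.nio calls.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; not the comparator from the attached patch.
final class MixedBytesComparatorSketch {

  /** Compares a range of a (possibly direct) ByteBuffer with a range of a byte[]. */
  static int compare(ByteBuffer left, int lOff, int lLen, byte[] right, int rOff, int rLen) {
    int n = Math.min(lLen, rLen);
    for (int i = 0; i < n; i++) {
      int a = left.get(lOff + i) & 0xFF; // absolute get: no copy, position untouched
      int b = right[rOff + i] & 0xFF;    // mask so bytes compare as unsigned values
      if (a != b) {
        return a - b;
      }
    }
    return lLen - rLen; // equal prefix: the shorter range sorts first
  }

  /** Same comparison when both sides are ByteBuffers (heap or direct). */
  static int compare(ByteBuffer left, int lOff, int lLen, ByteBuffer right, int rOff, int rLen) {
    int n = Math.min(lLen, rLen);
    for (int i = 0; i < n; i++) {
      int a = left.get(lOff + i) & 0xFF;
      int b = right.get(rOff + i) & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return lLen - rLen;
  }
}
```

A byte-at-a-time loop is the simplest correct form; a real implementation would likely compare several bytes per step for speed.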
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-11425:
-------------------------------------------
    Status: Patch Available  (was: Open)

Attachments: HBASE-11425-E2E-NotComplete.patch, HBASE-11425.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-11425:
-----------------------------------
    Attachment: HBASE-11425-E2E-NotComplete.patch

Attaching an E2E patch for reference. We are still doing some more cleanup and removing some code duplication that remains in the patch.

Attachments: HBASE-11425-E2E-NotComplete.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-11425:
-----------------------------------
    Attachment: Offheap reads in HBase using BBs_V2.pdf

Attachments: Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-11425:
-------------------------------------------
    Attachment: Offheap reads in HBase using BBs_final.pdf

A document explaining the motivation, the design considerations, the reasons for settling on ByteBuffer (BB), and the Cell-level API changes required to support off-heap memory in HBase's read path.
We will upload a patch ported to trunk, along with some perf results, by the end of this week or early next week.
Feedback and comments on the doc and the approach are welcome.

Attachments: Offheap reads in HBase using BBs_final.pdf