[
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068240#comment-13068240
]
Li Pi commented on HBASE-4027:
------------------------------
-And when we cache a block, how we know where to put it? On or off heap? How
you decide where to put it (Looks like you put it off heap always here).
When we cache a block, we put it in the on-heap cache. Originally, I planned
for it to go off heap, but since the cache partitions in LruBlockCache already
give it scan resistance, that seemed unnecessary.
-Any chance of some offheap stats (is this getStats used?)
getStats now works for the offheap cache, at least for eviction count/cache hit
rates.
-We only return heapsize of onheap cache. You think we should not include
offheap?
Since the off-heap cache isn't really in the heap, I didn't include it. I'm
not sure this was the best option.
-Yeah, a bunch of these state methods go unimplemented. Can we do any of them?
Or is it that they just don't make sense in offheap context?
I took a look and implemented some of them. SlabCache now implements everything
except getFreeSize. Since all the space it's given is allocated immediately as
direct ByteBuffers, there's never any free size, so I have it return 0.
-What is this limit? int blocksPerSlab = Integer.MAX_VALUE / blockSize; Max of
4G per slab?
ByteBuffer positions are addressed using ints, so a single buffer allocated
with ByteBuffer.allocateDirect() can be at most Integer.MAX_VALUE bytes, i.e.
just under 2 GB rather than 4 GB.
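
To illustrate the limit, here is a minimal sketch (not the code from the patch;
the class and method names are hypothetical) that computes how many blocks fit
in one int-indexed buffer and carves a small direct slab into fixed-size views:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the SlabCache patch.
final class SlabSlicing {
    /** Largest number of blockSize-byte blocks one int-indexed buffer can hold. */
    static int maxBlocksPerSlab(int blockSize) {
        return Integer.MAX_VALUE / blockSize;
    }

    /** Allocates one direct slab and carves it into `blocks` views of blockSize bytes. */
    static List<ByteBuffer> carve(int blockSize, int blocks) {
        // multiplyExact throws ArithmeticException if the slab would exceed the int range.
        ByteBuffer slab = ByteBuffer.allocateDirect(Math.multiplyExact(blockSize, blocks));
        List<ByteBuffer> out = new ArrayList<>(blocks);
        for (int i = 0; i < blocks; i++) {
            // Narrow the window to block i, then take a view that shares the slab's memory.
            slab.limit((i + 1) * blockSize);
            slab.position(i * blockSize);
            out.add(slab.slice());
        }
        return out;
    }
}
```

With a 64 KB block size, maxBlocksPerSlab(65536) is 32767, so a full slab is
slightly smaller than 2 GB.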
-Should slabsize just be hardcoded as Integer.MAX_VALUE?
Sometimes we want a slab that's smaller than 2 GB, such as when the size of
our entire cache is smaller than that.
-The kung fu going on in the shutdown of metaslab needs a comment. I think I
know whats going on. Explain what 'cleaner' is.
Yeah, this is how you deallocate direct ByteBuffers. Added comments, but it's
still pretty voodoo. Basically, the cleaner is a destructor for a direct byte
buffer.
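
For readers unfamiliar with the trick, the following is a rough sketch of how
a direct buffer's cleaner can be invoked through reflection. It is not the code
from the patch: the class name is hypothetical, and it relies on JDK-internal
hooks (the cleaner() accessor on Java 8, sun.misc.Unsafe.invokeCleaner on
Java 9 and later) that are unsupported and may change between releases.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; relies on unsupported JDK internals.
final class DirectBufferFreer {
    /**
     * Tries to release a direct buffer's native memory right away.
     * Returns false if the buffer is not direct or no known hook worked.
     * The buffer must never be touched again after a successful call.
     */
    static boolean free(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return false;
        }
        try {
            // Java 9+: sun.misc.Unsafe.invokeCleaner(ByteBuffer).
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Not available (e.g. Java 8) or rejected (e.g. a slice); try the older hook.
        }
        try {
            // Java 8: the direct buffer exposes cleaner(), whose clean() frees the memory.
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner == null) {
                return false; // slices and duplicates do not own a cleaner
            }
            Method clean = cleaner.getClass().getMethod("clean");
            clean.setAccessible(true);
            clean.invoke(cleaner);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }
}
```

Only the buffer returned by ByteBuffer.allocateDirect() should be passed in,
not a slice of it, and only once nothing else can still read or write it.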
-Is there something up w/ the naming in MetaSlab or is it me misreading? We are
iterating up to blocksPerSlab but we are creating slabs, not blocks. I'd think
we'd be making blocks in a slab, not slabs.
If we want to create N blocks of size X, and a slab can contain at most B
blocks, we decrement N by B until N is below B. We create a slab each time and
divide that slab into blocks. If 0 < N < B at the end, we create a final slab
to hold the remaining blocks.
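
The loop described above can be sketched as follows. This is an illustration
under assumed names (SlabPlanner, planSlabs, maxSlabBytes are all hypothetical),
not the MetaSlab code itself; it only computes how many blocks each slab would
hold and allocates nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the MetaSlab implementation.
final class SlabPlanner {
    /**
     * Splits numBlocks (N) blocks of blockSize (X) bytes across slabs of at most
     * maxSlabBytes each, returning the number of blocks assigned to each slab.
     */
    static List<Integer> planSlabs(long numBlocks, int blockSize, int maxSlabBytes) {
        if (numBlocks < 0 || blockSize <= 0 || maxSlabBytes < blockSize) {
            throw new IllegalArgumentException("invalid block or slab size");
        }
        int maxBlocksPerSlab = maxSlabBytes / blockSize; // B
        List<Integer> plan = new ArrayList<>();
        long remaining = numBlocks; // N
        // Full slabs: take B blocks at a time until fewer than B remain.
        while (remaining >= maxBlocksPerSlab) {
            plan.add(maxBlocksPerSlab);
            remaining -= maxBlocksPerSlab;
        }
        // Final, partially filled slab for the leftover blocks, if any.
        if (remaining > 0) {
            plan.add((int) remaining);
        }
        return plan;
    }
}
```

For example, 10 blocks of 4 bytes with a 16-byte slab limit gives B = 4 and a
plan of two full slabs plus one slab holding the last 2 blocks.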
-You think this is going to happen or how could it happen?
This shouldn't happen unless we have a race condition. But we might, in which
case we should throw an exception. I'll add some comments to this portion.
-Whats going on here (I see this in a few places):
If we get a NullPointerException, that means the cache missed. So we return
null and increment the miss counter.
-The get from backingStore will never return a null here?
As an invariant, it never should, because if we are running out of buffers, one
should be evicted by the ConcurrentLinkedHashMap when we do a read. On closer
inspection, though, this can happen in a multithreaded environment. I'll figure
out a way to fix this (probably by synchronizing it).
-Is this a good name for the default blocksize config?
"hbase.mapreduce.hfileoutputformat.blocksize" The 'outputformat' would seem to
come from mapreduce world (Which seems to be where you found it if I grep src).
Should we be using DEFAULT_BLOCKSIZE from hfile here instead?
Switched it to DEFAULT_BLOCKSIZE.
Apologies for the first few patches. Apparently I diffed against a different
branch than I had intended, hence the pom.xml edits and commented-out code,
which were there for speed testing and benchmarking. I fixed a few more things,
and will continue to search for bugs and add documentation. I'm also
implementing better metrics.
> Enable direct byte buffers LruBlockCache
> ----------------------------------------
>
> Key: HBASE-4027
> URL: https://issues.apache.org/jira/browse/HBASE-4027
> Project: HBase
> Issue Type: Improvement
> Reporter: Jason Rutherglen
> Assignee: Li Pi
> Priority: Minor
> Attachments: slabcachepatch.diff, slabcachepatchv2.diff,
> slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff
>
>
> Java offers the creation of direct byte buffers which are allocated outside
> of the heap.
> They need to be manually free'd, which can be accomplished using an
> undocumented {{clean}} method.
> The feature will be optional. After implementing, we can benchmark for
> differences in speed and garbage collection observances.