[
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564311#comment-14564311
]
Anoop Sam John commented on HBASE-12295:
----------------------------------------
bq. We'll have to dig in on why. You'd think w/ less intermediaries that it
would be faster.
It should be the cost at the socket layer: we would need N transfers instead
of one. The single transfer looked better even though it needs a temp copy.
Regarding knowing whether a block is from L1 or L2 by looking at the key:
this info is actually a state of HFileBlock. We have added it as an enum,
L1/L2/NOT_CACHED. Based on this type, we decide at the HFileScanner layer (on
close) whether to call return on the BlockCache. The BlockCache impl might
also need to know the type. This is for CombinedBC: if the block is from L2,
we call the BucketCache return, else we call the LRU cache return. So if we
also add the L1/L2 info to BlockCacheKey, I am not sure whether that looks
clean. BlockCacheKey is something we create while fetching the block from the
BC. On return, we could pass the info by setting it in BlockCacheKey, but then
it just acts as a carrier. Or maybe we can use the HFileBlock object alone in
the return API? Using a key we got an object from a cache, and we return
*that* object back to the cache. It is always possible to make the
BlockCacheKey from the HFileBlock.
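A rough sketch of the "return the HFileBlock itself" option, to make the idea concrete. All names here (CacheSource, SketchBlock, SketchCombinedCache, returnBlock) are hypothetical stand-ins, not the actual HBase classes or the API in the patch; the two maps only stand in for the LRU cache and the BucketCache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the L1/L2/NOT_CACHED state kept on the block.
enum CacheSource { L1, L2, NOT_CACHED }

// Hypothetical stand-in for HFileBlock; only the fields needed for the sketch.
final class SketchBlock {
    final String fileName;
    final long offset;
    CacheSource source = CacheSource.NOT_CACHED;

    SketchBlock(String fileName, long offset) {
        this.fileName = fileName;
        this.offset = offset;
    }

    // The cache key can always be rebuilt from the block itself.
    String cacheKey() {
        return fileName + "_" + offset;
    }
}

// Hypothetical stand-in for a combined L1 + L2 cache.
final class SketchCombinedCache {
    private final Map<String, SketchBlock> l1 = new HashMap<>();
    private final Map<String, SketchBlock> l2 = new HashMap<>();
    int l1Returns;
    int l2Returns;

    void cacheInL1(SketchBlock b) { l1.put(b.cacheKey(), b); }
    void cacheInL2(SketchBlock b) { l2.put(b.cacheKey(), b); }

    // The caller cannot tell which tier served the block, so tag it here.
    SketchBlock getBlock(String key) {
        SketchBlock b = l1.get(key);
        if (b != null) {
            b.source = CacheSource.L1;
            return b;
        }
        b = l2.get(key);
        if (b != null) {
            b.source = CacheSource.L2;
        }
        return b;
    }

    // Return takes the block alone; the tier is read from the block's state.
    void returnBlock(SketchBlock b) {
        switch (b.source) {
            case L2: l2Returns++; break;   // would call the BucketCache return
            case L1: l1Returns++; break;   // would call the LRU cache return
            default: break;                // NOT_CACHED: nothing to return
        }
    }
}
```

With this shape the key never has to carry tier information; the scanner just hands back the object it was given.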
bq. You going to mark the object as from L2 or something
Yes. HFileBlock will contain state info on whether it is from L1, from L2, or
NOT_CACHED. With CombinedBC, HFileReader asks the cache for a block and it
returns the HFileBlock, so we cannot tell where it came from, L1 or L2. So it
is better to set this as state info in HFileBlock.
On carrying the cellBlock in Result, I am not sure. At the HRegion level,
get() returns a Result but the scanner returns a List of Cells. Then at the
RsRpcServer level, we call it in a loop to make as many rows/results as the
caching/max size limit allows. Even if we make the scan path return a Result
too, it adds the overhead of creating a smaller cellBlock buffer for each
row, so we would end up dealing with many small block buffers. It would be
better to collect all the rows and then make a single cellBlock at once for
the scan case. Does that make sense? I agree with your point about not passing
RPC stuff down to the HRegion level. We have to see what else we can do to
return this payload.
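To illustrate the buffer-count concern, here is a minimal sketch comparing one buffer per row against a single buffer built after all rows are collected. The length-prefixed encoding and the names (CellBlockSketch, encodeAll, encodePerRow) are assumptions for illustration only; the real cellBlock codec is different.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; cells are plain byte arrays for the purpose of the sketch.
final class CellBlockSketch {

    // Collect everything first, then allocate and fill one buffer.
    static ByteBuffer encodeAll(List<List<byte[]>> rows) {
        int size = 0;
        for (List<byte[]> row : rows) {
            for (byte[] cell : row) {
                size += Integer.BYTES + cell.length;
            }
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (List<byte[]> row : rows) {
            for (byte[] cell : row) {
                out.putInt(cell.length).put(cell);
            }
        }
        out.flip();
        return out;
    }

    // The per-Result alternative: one small buffer for every row.
    static List<ByteBuffer> encodePerRow(List<List<byte[]>> rows) {
        List<ByteBuffer> buffers = new ArrayList<>();
        for (List<byte[]> row : rows) {
            buffers.add(encodeAll(Collections.singletonList(row)));
        }
        return buffers;
    }
}
```

Both paths carry the same bytes; the difference is one allocation and one payload for the scan versus one per row.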
I think I now get what you have in mind by saying finalize/close on Result and
handling things that way. Right now, when we get a block from the BC, we
increase its ref count by 1, meaning one scanner is working on it. To follow
this suggestion, whenever we create a cell from this block we would have to
increment the ref count again, something like Java-style ref counting. The
only question is that Result/Cell is a client-side thing, and I am not sure
how we can add server-only BlockCache/HFileBlock... But this would have
avoided the most copying. Thinking more...
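A minimal sketch of the ref counting idea above, assuming a counter on the block and a close() on whatever holds a cell backed by it. RefCountedBlockSketch and CellViewSketch are hypothetical names; this is not how HFileBlock, Cell or Result are actually defined.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical block with a reference count; eviction is allowed only at zero.
final class RefCountedBlockSketch {
    private final AtomicInteger refCount = new AtomicInteger();

    // Called when a scanner gets the block from the cache, or a cell is created on it.
    void retain() {
        refCount.incrementAndGet();
    }

    // Called when the scanner closes or a cell is closed.
    void release() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("release without matching retain");
        }
    }

    boolean isEvictable() {
        return refCount.get() == 0;
    }
}

// Hypothetical cell that reads directly from the block's memory without a copy.
final class CellViewSketch implements AutoCloseable {
    private final RefCountedBlockSketch block;
    private boolean closed;

    CellViewSketch(RefCountedBlockSketch block) {
        this.block = block;
        block.retain();   // every cell created from the block adds a reference
    }

    @Override
    public void close() {
        if (!closed) {    // closing twice must not release twice
            closed = true;
            block.release();
        }
    }
}
```

The block stays pinned after the scanner releases it as long as any cell created from it is still open, which is what would let the copy be skipped.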
> Prevent block eviction under us if reads are in progress from the BBs
> ---------------------------------------------------------------------
>
> Key: HBASE-12295
> URL: https://issues.apache.org/jira/browse/HBASE-12295
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-12295.pdf, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache,
> we need to ensure that the blocks do not get evicted under us while
> reading. This JIRA is to discuss and implement a strategy for the same.