[ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583177#comment-14583177
 ] 

ramkrishna.s.vasudevan commented on HBASE-12295:
------------------------------------------------

We had more discussions on the cache type and memory type.
Let me try to explain the use of each. (Part of this has already been said by 
Ram as well as me in RB. Pardon the repetition.)
CacheType says which cache the block came from, or that it did not come from a 
cache at all. This is useful when returning the block. Today the return is what 
does the ref count decrement, but it could do any other kind of cleanup; we 
tried to keep it general. So on return we have to know which cache the block 
goes back to (L1 or L2). As of today the L1 return is a noop, but we still do 
it. If we don't have the CacheType in the block, we have to return it to both 
caches and search in both places. This is an overhead. 
There is also another problem with CombinedCache. Consider a block that is 
demoted from L1 to L2. While it was in L1, it was served from there to a 
scanner. But before that scanner returned it, the block got moved to L2. Then 
another scanner gets the same block, this time from L2. So the ref count for 
this block (block key) is incremented and is now 1. (Remember the old scanner 
got it from L1, so there was no ref count increment at that time.) Now the old 
scanner returns the block and we return it to both L1 and L2. L2 has an entry, 
so its ref count becomes zero while an active scanner is still referring to the 
block. This can cause issues. If an L3 cache comes along tomorrow and we mark 
each block with where it came from, we can do the correct return and the 
correct action.
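
The demotion scenario above can be replayed in a few lines. This is only a 
minimal sketch: every class and method name in it is hypothetical and none of 
it is the HBase API. It assumes L1 keeps no ref counts (its return is a noop) 
and L2 keeps one count per block key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; a toy model, not the HBase block cache.
final class CacheTypeReturnSketch {

  enum CacheType { L1, L2, NOT_CACHED }

  /** A cache holding one reference count per block key. */
  static final class Cache {
    private final boolean tracksRefs; // assumed false for L1, true for L2
    private final Map<String, Integer> refCounts = new HashMap<>();

    Cache(boolean tracksRefs) { this.tracksRefs = tracksRefs; }

    void cache(String key) { refCounts.putIfAbsent(key, 0); }

    void evict(String key) { refCounts.remove(key); }

    /** Serves the block to a reader; counts the reference if this cache tracks refs. */
    void get(String key) {
      if (tracksRefs) refCounts.computeIfPresent(key, (k, c) -> c + 1);
    }

    /** Takes the block back from a reader; a noop if this cache does not track refs. */
    void returnBlock(String key) {
      if (tracksRefs) refCounts.computeIfPresent(key, (k, c) -> Math.max(0, c - 1));
    }

    int refCount(String key) { return refCounts.getOrDefault(key, 0); }
  }

  /** Replays the scenario and reports the L2 ref count after the old scanner returns. */
  static int replay(boolean useCacheTypeTag) {
    Cache l1 = new Cache(false);
    Cache l2 = new Cache(true);
    String key = "block-1";

    l1.cache(key);
    l1.get(key);                       // old scanner reads from L1, nothing counted
    CacheType servedFrom = CacheType.L1;

    l1.evict(key);                     // block is demoted from L1 ...
    l2.cache(key);                     // ... to L2
    l2.get(key);                       // new scanner reads from L2, count is now 1

    if (useCacheTypeTag) {
      // Return only to the cache the block was actually served from.
      (servedFrom == CacheType.L1 ? l1 : l2).returnBlock(key);
    } else {
      // No tag: return to both, which releases the new scanner's reference.
      l1.returnBlock(key);
      l2.returnBlock(key);
    }
    return l2.refCount(key);
  }

  public static void main(String[] args) {
    System.out.println("return to both caches: L2 refCount = " + replay(false));
    System.out.println("return by CacheType:   L2 refCount = " + replay(true));
  }
}
```

Without the tag the L2 count drops to zero while the second scanner is still 
reading; with the tag it stays at 1.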
MemType (as of now Shared or NonShared) is used for cell creation. Cells backed 
by a shared cache memory location have to be marked as SharedMemoryCell. Cells 
that do not come from shared-memory-backed blocks need not be marked. A 
CacheType of L2 does not always mean that the block is backed by shared memory. 
An example is FileIOEngine: while reading blocks there, we have to read the 
data from files into a heap memory area (byte[]).
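
To illustrate why the two enums are kept independent, here is a small sketch of 
cell creation keyed on MemType rather than CacheType. The types are stand-ins 
invented for this example (including the SharedMemoryCell marker interface and 
the Block class); the real classes may look different.

```java
// Hypothetical stand-in types; not the HBase classes of the same names.
final class MemTypeCellSketch {

  enum CacheType { L1, L2, NOT_CACHED }

  enum MemType { SHARED, NON_SHARED }

  interface Cell { byte[] value(); }

  /** Marker for cells whose bytes live in shared cache memory. */
  interface SharedMemoryCell extends Cell { }

  static final class Block {
    final CacheType cacheType;
    final MemType memType;
    final byte[] data;

    Block(CacheType cacheType, MemType memType, byte[] data) {
      this.cacheType = cacheType;
      this.memType = memType;
      this.data = data;
    }
  }

  /** Chooses the cell flavour from MemType only; CacheType plays no part here. */
  static Cell createCell(Block block) {
    if (block.memType == MemType.SHARED) {
      SharedMemoryCell shared = () -> block.data;
      return shared;
    }
    return () -> block.data;
  }

  public static void main(String[] args) {
    // L2 block read by a file-backed engine: the data was copied onto the heap.
    Block fileBacked = new Block(CacheType.L2, MemType.NON_SHARED, new byte[] {1});
    // L2 block that still points into shared cache memory.
    Block sharedBacked = new Block(CacheType.L2, MemType.SHARED, new byte[] {2});

    System.out.println(createCell(fileBacked) instanceof SharedMemoryCell);
    System.out.println(createCell(sharedBacked) instanceof SharedMemoryCell);
  }
}
```

Both blocks carry CacheType L2, yet only the second one produces a cell marked 
as SharedMemoryCell, which is the FileIOEngine case described above.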

Maybe we can say that, as of today, the return is needed only for the SharedMem 
type. Still, we wanted these two to be independent, general things and 
framework. IMHO, this looks cleaner and more extensible for the future.

To which Stack replied

One more Q from Stack was why we need two enums, CacheType and MemType. Will 
one be enough?
Let me try to explain the use of each. (Part of this has already been said by 
Ram as well as me in RB. Pardon the repetition.)
CacheType says which cache the block came from, or that it did not come from a 
cache at all. This is useful when returning the block. Today the return is what 
does the ref count decrement, but it could do any other kind of cleanup; we 
tried to keep it general. So on return we have to know which cache the block 
goes back to (L1 or L2). As of today the L1 return is a noop, but we still do 
it. If we don't have the CacheType in the block, we have to return it to both 
caches and search in both places.


>>Can you mark the HFileBlock with where it came from when you cache it rather 
>>than write out the type with the data?
 
This is an overhead. Also there is another problem with CombinedCache.

>>CombinedCache is one possible combination. Change it if it is making your 
>>life more difficult.

 
Consider a block that is demoted from L1 to L2.

For CombinedCache, this would be an index or bloom block only.

 
While it was in L1, it was served from there to a scanner. But before that 
scanner returned it, the block got moved to L2. Then another scanner gets the 
same block, this time from L2. So the ref count for this block (block key) is 
incremented and is now 1. (Remember the old scanner got it from L1, so there 
was no ref count increment at that time.) Now the old scanner returns the block 
and we return it to both L1 and L2. L2 has an entry, so its ref count becomes 
zero while an active scanner is still referring to the block. This can cause 
issues. If an L3 cache comes along tomorrow and we mark each block with where 
it came from, we can do the correct return and the correct action.

>>They'd be the same HFileBlock instance? The same item in cache?  We're 
>>talking now of moving between caches while something is being used. That'd be 
>>a no-no, right? If it's referenced you can't move it, not unless it's L1 
>>where it is safe to move it (you'd just scrub refcounts)

>>Can you keep type-specific refcounts if that is an issue?
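
The "if it's referenced you can't move it" rule from the reply above could look 
roughly like the sketch below. It is a single-threaded illustration with 
made-up names (CachedBlock, tryMoveTo), it ignores the exemption for L1 that 
the reply mentions, and a real cache would need the check and the move to be 
atomic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, single-threaded sketch; not the HBase block cache.
final class PinnedBlockSketch {

  static final class CachedBlock {
    private int refCount;

    void retain() { refCount++; }   // a reader starts using the block

    void release() { refCount--; }  // the reader returns the block

    boolean inUse() { return refCount > 0; }
  }

  static final class Cache {
    private final Map<String, CachedBlock> blocks = new HashMap<>();

    void put(String key, CachedBlock block) { blocks.put(key, block); }

    boolean contains(String key) { return blocks.containsKey(key); }

    /** Moves the block to target only when no reader holds it; reports whether it moved. */
    boolean tryMoveTo(Cache target, String key) {
      CachedBlock block = blocks.get(key);
      if (block == null || block.inUse()) {
        return false;               // missing or still referenced: leave it alone
      }
      blocks.remove(key);
      target.put(key, block);
      return true;
    }
  }

  public static void main(String[] args) {
    Cache l1 = new Cache();
    Cache l2 = new Cache();
    CachedBlock block = new CachedBlock();
    l1.put("block-1", block);

    block.retain();
    System.out.println("moved while in use: " + l1.tryMoveTo(l2, "block-1"));
    block.release();
    System.out.println("moved after return: " + l1.tryMoveTo(l2, "block-1"));
  }
}
```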

 
MemType (as of now Shared or NonShared) is used for cell creation. Cells backed 
by a shared cache memory location have to be marked as SharedMemoryCell. Cells 
that do not come from shared-memory-backed blocks need not be marked. A 
CacheType of L2 does not always mean that the block is backed by shared memory. 
An example is FileIOEngine: while reading blocks there, we have to read the 
data from files into a heap memory area (byte[]).


>>Ok

 
Maybe we can say that, as of today, the return is needed only for the SharedMem 
type. Still, we wanted these two to be independent, general things and 
framework. IMHO, this looks cleaner and more extensible for the future.


>>Ok.

> Prevent block eviction under us if reads are in progress from the BBs
> ---------------------------------------------------------------------
>
>                 Key: HBASE-12295
>                 URL: https://issues.apache.org/jira/browse/HBASE-12295
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, 
> HBASE-12295_2.patch, HBASE-12295_4.patch, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache, 
> we need to ensure that the blocks do not get evicted under us while 
> reading.  This JIRA is to discuss and implement a strategy for the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
