[
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583173#comment-14583173
]
ramkrishna.s.vasudevan commented on HBASE-12295:
------------------------------------------------
Had some internal discussions on some of the questions raised
Thanks for the reviews so far over in RB. We discussed some of your comments
and thought of finalizing things and also thought it would help to clarify
certain aspects as why it was done.
-> CellBlock creation
The first area is the cellblock creation. As per the previous mail discussion
now we will have now create the cellBlock in the RsRpcServices layer both for
scan and gets.
The scans already were creating the cellScanner in the RsRpcServices layer -
now that we have added the cellBlock creation here.
But in order to do this we are creating a new getScanner() API in HRegion -
(note that it is not in Region.java) so it is not exposed to CPs.
This new API would indicate if the client is java based or not. For java based
client we will allow the cellblock creation in the RsRpcServices layer and then
the blocks are returned back (ref count decremented).
But for the non-java cases we will go with the old getScanner() API only but
internally it would set a state in the RegionScanner saying that the results
are to be copied then and there as there is no cellblock creation for it. (We
don't create cellblock for non-java cases as you may be already knowing it).
Once the copying of the results are done we are sure that we can return the
blocks back.
This new state in the REgionScanner also helps us with the case of CPs wrapping
the scanner where though the scan is from java -client the partial results
could be used in the CPs and we may not know how the CPs will use the result.
Hence it is better to copy those results like we do for non-java client cases.
In case of scan() after every next() we make a call to return the blocks so
that we don't hold them up till the scanners are done.
Now coming to get(), we have now tweaked the logic of get() in RsrpcServices
such that what ever was being done in Region.get() is now moved to
RsRpcServices included CP hooks and once the result is formed we create the
cellblock in RsRpcServices.
For the non-java case the old Region.get() API is still used. Just like in
scan the results would be copied once it is formed.
Remember we need to use it for CP when it needs to do a get() on a region
directly.
In case of gets() the scanner.close() woudl ensure that the block ref count
decrement happens.
-> MemType and CacheType
First of all we can be sure that none of the blocks coming from L1 cache needs
to go through all these shared memory or copying stuff. Because they are just
objects referring to the blocks created from HDFS and nothing is serialized and
there is no problem of eviction.
We need to do this only for Bucketcache. BucketCache again has 3 things
-> RamQueue - which is like initially the blocks are added to this queue and
then moved to the actual buckets. So there are chances the reads are served
from these queue itself. Those cases there is no need for any copying or
sharing of memory. (this is similar to L1).
-> ByteBufferIOEngine
This is the actual case where we need to solve. Here the block is for sure
from L2 and also the memory is shared. So if you see in our patch we would
have set this L2 and shared_memory only at this layer.
-> fileIOEngine
Here though it is a bucketcache impl there is no need for having a shared_mem
because already there is copy from the file.
But who is responsible for setting the memory type-? It should be the engine.
The engine knows what type of memory it is creating. The engines just return a
BB and we don't know the type of it. so better the engine set the type of the
memory also.
-> ReturnBlock in between code in HFileREaders
This is important. As I earlier commented in the RB, the index or bloom blocks
from L1 cache can get demoted to L2 cache. Our code is such that in some case
we try to read a block and we iterate till we reach a data block. Those blocks
could have been from L1 or L2 - we are not sure.
So we have analysed the code and ensured that where ever we read blocks and
just throw it away till we end up in a data block- all such blocks have to be
returned back. If the block is from L1 it is going to be a noop.
I verified it once again and to be safe added some try/finally around it. I
can explain more on this if you have some queries.
->IPCUtil singleton
IPCUtil singleton was mainly for ease of use. The IPCutil had the code to
create Cellblock. So we wanted to reuse that code rather than duplicating it.
We could create IPCUTil inside Payload also. But for doing that we need the
'conf' object. Most of the cases Payload is created from RPcControllerfactory
where already we have 'conf'. But in case of AbstractRPcClient,
AbstractRpcCaller etc we need to get the access to these conf object so that we
can pass a 'conf' to Payload and use that conf to create the IPCUtil object.
So in order to avoid this in the latest will pass the IPCtuil to the Payload
using a helper class which will also have all the other ingredients needed to
create a cellblock. This wil avoid all the singleton related changes to
IPCUtil.
To which Stack had some comments
Hmm.
RsRpcServices is a weird class. It has not character. It is just something
Jimmy made to move a bunch of HRegionServer to its own class.
So, will Get be the only thing in Region that cannot be intercepted by CPs?
Other methods in Region can be intercepted by CPs? Having Get be an exception
will cause confusion.
Or is it that you are saying that the old Get can be used by CPs to
intercept... and in this case there will be the unavoidable copy. If so, that
sounds ok (where can I see current state of code?)
For the MemType and CacheType his reply was
bq.
Thats fine. I just don't like labeling each individual block. High-level, the
engine knows where the block came from, right? It can stamp the return with
where it came from rather than do it per block?
> Prevent block eviction under us if reads are in progress from the BBs
> ---------------------------------------------------------------------
>
> Key: HBASE-12295
> URL: https://issues.apache.org/jira/browse/HBASE-12295
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf,
> HBASE-12295_2.patch, HBASE-12295_4.patch, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache,
> we need to ensure that the blocks does not get evicted under us while
> reading. This JIRA is to discuss and implement a strategy for the same.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)