[ 
https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648808#comment-14648808
 ] 

Dave Latham commented on HBASE-12219:
-------------------------------------

Sorry I missed this JIRA at the time, but I have a couple of concerns if I'm 
understanding this change correctly.  I'd like to check to see if I am.  It 
looks like the effect is to back out the directory modtime caching and instead 
have the master just maintain a persistent in-memory cache.

Before this change it was safe to have any process on the cluster use 
FSTableDescriptors to read or write table descriptors.  Updates would be atomic 
and consistent, immediately available to all readers.  However, the cost of 
that was an HDFS NameNode (NN) operation on every table descriptor read to 
prove it had not changed.  Since in practice it appears that only the master 
process ever updates an existing table descriptor, it should be safe to have 
the master skip the directory modtime checks and proactively update its cached 
copy.  (hbck can also create table descriptors for orphaned tables, but 
hopefully that never happens for tables the master has already cached.)

This change makes the master's descriptor reads faster but imposes the 
constraint that only the active master should update table descriptors; any 
other writer would leave the master's cache stale.  It also means that no other 
process should use a persistent cache the way the master does, since the master 
could change the data underneath it and leave that cache stale.  Assuming this 
is the case, I think we'd be better served by reflecting that in the 
FSTableDescriptors API and javadoc.  For example, most constructors and usages 
now default to keeping a persistent cache as well as allowing updates, which 
sets a bad example for new uses.  There are also no warnings in the javadoc 
about the new contract.  It would probably be better to make the default 
constructor read-only with persistent caching disabled, and then add another 
constructor for the master that allows both writes and persistent caching.
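
The constructor suggestion above could look roughly like the following sketch. 
It is only an illustration under stated assumptions: DescriptorStoreSketch, 
forActiveMaster and the String-valued map standing in for the file system are 
hypothetical names, not the real FSTableDescriptors API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for FSTableDescriptors; NOT the real HBase class.
final class DescriptorStoreSketch {
    private final boolean readOnly;
    private final boolean persistentCache;
    // Stands in for the descriptor files on the file system (assumption).
    private final Map<String, String> backing;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Default: safe for any process. Read-only, no persistent cache,
    // so every read goes to the backing store and can never be stale.
    DescriptorStoreSketch(Map<String, String> backing) {
        this(backing, true, false);
    }

    // Explicit opt-in to writes and/or persistent caching.
    DescriptorStoreSketch(Map<String, String> backing,
                          boolean readOnly, boolean persistentCache) {
        this.backing = backing;
        this.readOnly = readOnly;
        this.persistentCache = persistentCache;
    }

    // Intended for the active master only: the single writer may trust
    // its own cache because nobody else is supposed to update descriptors.
    static DescriptorStoreSketch forActiveMaster(Map<String, String> backing) {
        return new DescriptorStoreSketch(backing, false, true);
    }

    Optional<String> get(String table) {
        if (persistentCache) {
            String cached = cache.get(table);
            if (cached != null) {
                return Optional.of(cached);
            }
        }
        String fromStore = backing.get(table);
        if (fromStore != null && persistentCache) {
            cache.put(table, fromStore);
        }
        return Optional.ofNullable(fromStore);
    }

    void update(String table, String descriptor) {
        if (readOnly) {
            throw new UnsupportedOperationException("read-only descriptor store");
        }
        backing.put(table, descriptor);
        if (persistentCache) {
            cache.put(table, descriptor); // keep the writer's cache current
        }
    }
}
```

With this shape, a caller that uses the one-argument constructor cannot write 
and cannot hold a stale copy, while the master has to ask for the riskier 
behaviour by name.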

This change also seems to remove all table descriptor caching from the region 
servers (the old directory modtime caching is gone and the new caching is 
disabled for region servers).  Thanks to HBASE-8778, reloading from the FS each 
time is cheaper than it used to be, but this change still increases the cost 
from 1 NN operation (check directory modtime) to 2 NN + 3 DataNode (DN) 
operations (find current file, get its block locations, open block, read, close 
block).  This slows things down a bit again for mass assignments/balances on 
huge tables.  It seems better for the region servers to retain the directory 
modtime caching, and simply skip the modtime check when running inside the 
master.
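
A minimal sketch of that idea follows: one modtime-validated cache with a flag 
that lets the master skip the validation.  ModtimeCacheSketch, its Store 
interface and the skipModtimeCheck flag are hypothetical names chosen for 
illustration, and descriptors are plain strings here; none of this is the 
actual HBase code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; NOT the real FSTableDescriptors implementation.
final class ModtimeCacheSketch {
    // Stands in for the file system (assumption).
    interface Store {
        long dirModtime(String table);       // cheap check: 1 NN operation
        String loadDescriptor(String table); // full reload: the expensive path
    }

    private static final class Entry {
        final long modtime;
        final String descriptor;

        Entry(long modtime, String descriptor) {
            this.modtime = modtime;
            this.descriptor = descriptor;
        }
    }

    private final Store store;
    private final boolean skipModtimeCheck; // true only inside the master
    private final Map<String, Entry> cache = new HashMap<>();

    ModtimeCacheSketch(Store store, boolean skipModtimeCheck) {
        this.store = store;
        this.skipModtimeCheck = skipModtimeCheck;
    }

    synchronized String get(String table) {
        Entry cached = cache.get(table);
        if (cached != null && skipModtimeCheck) {
            return cached.descriptor; // master: trust the cache, no FS trip
        }
        long modtime = store.dirModtime(table);
        if (cached != null && cached.modtime == modtime) {
            return cached.descriptor; // region server: one cheap check
        }
        // First access, or the directory changed: reload and remember.
        String descriptor = store.loadDescriptor(table);
        cache.put(table, new Entry(modtime, descriptor));
        return descriptor;
    }

    // The single writer refreshes its own cache after a successful update.
    synchronized void recordUpdate(String table, String descriptor, long modtime) {
        cache.put(table, new Entry(modtime, descriptor));
    }
}
```

A region server would construct this with skipModtimeCheck set to false and 
pay one modtime check per read; the master would pass true and call 
recordUpdate after each descriptor write.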

Does that understanding of this change sound correct - or did I botch it?  
Sorry I missed it at the time.  If that sounds right, a follow-up JIRA may be 
good, and if I see our table assignments slow down because of this and no one 
else gets to it, I can try to put up the changes.

> Cache more efficiently getAll() and get() in FSTableDescriptors
> ---------------------------------------------------------------
>
>                 Key: HBASE-12219
>                 URL: https://issues.apache.org/jira/browse/HBASE-12219
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.24, 0.99.1, 0.98.6.1
>            Reporter: Esteban Gutierrez
>            Assignee: Esteban Gutierrez
>              Labels: scalability
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: HBASE-12219-0.94.patch, HBASE-12219-0.98.patch, 
> HBASE-12219-0.98.v1.addendum.patch, HBASE-12219-0.98.v1.patch, 
> HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch, 
> HBASE-12219-0.99.v1.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch, 
> HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png
>
>
> Currently table descriptors and tables are cached once they are accessed for 
> the first time. Subsequent calls to the master only require a trip to HDFS to 
> look up the modified time in order to reload the table descriptors if 
> modified. However, in clusters with a large number of tables or concurrent 
> clients this can be too aggressive for HDFS and the master, causing 
> contention when processing other requests. A simple solution is to have a 
> TTL-based cache for FSTableDescriptors#getAll() and 
> FSTableDescriptors#TableDescriptorAndModtime() that allows the master to 
> process those calls faster, without causing contention and without having to 
> perform a trip to HDFS for every call to listtables() or getTableDescriptor().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
