[ https://issues.apache.org/jira/browse/HBASE-12219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648808#comment-14648808 ]
Dave Latham commented on HBASE-12219:
-------------------------------------
Sorry I missed this JIRA at the time, but I have a couple of concerns if I'm
understanding this change correctly, and I'd like to check that I am. It
looks like the effect is to back out the directory modtime caching and instead
have the master maintain a persistent in-memory cache.

Before this change it was safe for any process on the cluster to use
FSTableDescriptors to read or write table descriptors. Updates were atomic
and consistent, and immediately visible to all readers. However, the cost of
that was an HDFS NN operation on every table descriptor read to prove it had
not changed. Since in practice it appears that only the master process ever
updates an existing table descriptor, it should be safe to have the master skip
the directory modtime checks and proactively update its cached copy. (hbck can
also create table descriptors for orphaned tables, but hopefully that doesn't
happen for tables the master has already cached.)

This change makes the master's descriptor reads faster but imposes the
constraint that only the active master should update table descriptors; any
other writer would cause the master's cache to become stale. It also means
that no other process should use the persistent cache the way the master does,
since the master could change the data underneath it and leave that cache
stale. Assuming this is the case, I think we'd be better served by reflecting
that in the FSTableDescriptors API and javadoc. For example, most constructors
and usages now default to keeping a persistent cache as well as allowing
updates, which sets a bad example for new uses. There are also no warnings in
the javadoc about the new contract. It would possibly be better to make the
default constructor read-only with persistent caching disabled, and to add
another constructor for the master that allows both writes and persistent
caching.
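
A rough sketch of what such a constructor split could look like follows. It is
not the real FSTableDescriptors class: DescriptorStoreSketch, forActiveMaster,
and the in-memory map standing in for the filesystem are all hypothetical names
used only to illustrate the proposed contract.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for FSTableDescriptors; not the real HBase API. */
class DescriptorStoreSketch {
    private final Map<String, String> backing;  // stands in for the filesystem
    private final boolean readOnly;
    private final boolean persistentCache;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Safe default for most callers: read-only, no persistent cache. */
    DescriptorStoreSketch(Map<String, String> backing) {
        this(backing, true, false);
    }

    private DescriptorStoreSketch(Map<String, String> backing,
                                  boolean readOnly, boolean persistentCache) {
        this.backing = backing;
        this.readOnly = readOnly;
        this.persistentCache = persistentCache;
    }

    /** Only the active master should use this: writes plus persistent cache. */
    static DescriptorStoreSketch forActiveMaster(Map<String, String> backing) {
        return new DescriptorStoreSketch(backing, false, true);
    }

    Optional<String> get(String table) {
        if (persistentCache) {
            String cached = cache.get(table);
            if (cached != null) {
                return Optional.of(cached);  // served without touching the backing store
            }
        }
        String loaded = backing.get(table);
        if (loaded != null && persistentCache) {
            cache.put(table, loaded);
        }
        return Optional.ofNullable(loaded);
    }

    void update(String table, String descriptor) {
        if (readOnly) {
            throw new UnsupportedOperationException("read-only instance");
        }
        backing.put(table, descriptor);
        if (persistentCache) {
            cache.put(table, descriptor);  // writer keeps its own cache current
        }
    }
}
```

With this shape, a caller that reaches for the plain constructor cannot write
and never holds a cache that the master could invalidate behind its back.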

This change also seems to remove all table descriptor caching from the region
servers (the old directory modtime caching is gone and the new caching is
disabled for region servers). Thanks to HBASE-8778, reloading from the FS each
time is cheaper than it used to be, but this change still increases the cost
from 1 NN operation (check directory modtime) to 2 NN + 3 DN operations (find
current file, get its block locations, open block, read block, close block).
This slows things down a bit again for mass assignments/balances on huge
tables. It seems better for the region servers to retain the directory modtime
caching, and simply skip the modtime check when running inside the master.
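
The idea of one cache with two modes can be sketched roughly as below. This is
an illustration under assumptions, not HBase code: ModtimeCacheSketch,
DescriptorSource, and the skipModtimeCheck flag are hypothetical names, and the
two interface methods stand in for the cheap modtime lookup and the full
descriptor reload respectively.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical modtime-validated cache; not the real FSTableDescriptors. */
class ModtimeCacheSketch {
    /** Stand-in for the filesystem calls. */
    interface DescriptorSource {
        long modtime(String table);  // cheap check (the single NN operation)
        String load(String table);   // full reload (the expensive path)
    }

    private static final class Entry {
        final long modtime;
        final String descriptor;
        Entry(long modtime, String descriptor) {
            this.modtime = modtime;
            this.descriptor = descriptor;
        }
    }

    private final DescriptorSource source;
    private final boolean skipModtimeCheck;  // true only inside the master
    private final Map<String, Entry> cache = new HashMap<>();

    ModtimeCacheSketch(DescriptorSource source, boolean skipModtimeCheck) {
        this.source = source;
        this.skipModtimeCheck = skipModtimeCheck;
    }

    synchronized String get(String table) {
        Entry cached = cache.get(table);
        if (cached != null && skipModtimeCheck) {
            return cached.descriptor;  // master mode: trust the cached copy
        }
        long current = source.modtime(table);
        if (cached != null && cached.modtime == current) {
            return cached.descriptor;  // region server mode: validated by modtime
        }
        String loaded = source.load(table);
        cache.put(table, new Entry(current, loaded));
        return loaded;
    }

    /** Master-side write path: refresh the cached copy proactively. */
    synchronized void put(String table, long modtime, String descriptor) {
        cache.put(table, new Entry(modtime, descriptor));
    }
}
```

A region server would construct this with skipModtimeCheck set to false and
pay one modtime lookup per read, while the master would pass true and rely on
put() to keep its copy current.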

Does that understanding of this change sound correct, or did I botch it?
If that sounds right, a follow-up JIRA may be good, and if I see our table
assignments get slower because of this and no one else gets to it, I can try
to put up the changes.

> Cache more efficiently getAll() and get() in FSTableDescriptors
> ---------------------------------------------------------------
>
> Key: HBASE-12219
> URL: https://issues.apache.org/jira/browse/HBASE-12219
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.24, 0.99.1, 0.98.6.1
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Labels: scalability
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12219-0.94.patch, HBASE-12219-0.98.patch,
> HBASE-12219-0.98.v1.addendum.patch, HBASE-12219-0.98.v1.patch,
> HBASE-12219-0.99.addendum.patch, HBASE-12219-0.99.patch,
> HBASE-12219-0.99.v1.patch, HBASE-12219-v1.patch, HBASE-12219-v1.patch,
> HBASE-12219.v0.txt, HBASE-12219.v2.patch, HBASE-12219.v3.patch, list.png
>
>
> Currently table descriptors and tables are cached once they are accessed for
> the first time. Subsequent calls to the master only require a trip to HDFS to
> look up the modified time in order to reload the table descriptors if
> modified. However, in clusters with a large number of tables or concurrent
> clients this can be too aggressive for HDFS and the master, causing
> contention with other requests. A simple solution is to have a TTL-based
> cache for FSTableDescriptors#getAll() and
> FSTableDescriptors#TableDescriptorAndModtime() that allows the master to
> process calls to listtables() or getTableDescriptor() faster and without
> contention, because it no longer has to make a trip to HDFS for every call.