[
https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897447#comment-13897447
]
Colin Patrick McCabe commented on HDFS-5810:
--------------------------------------------
munmap only manipulates in-memory state, whereas mmap often has to hit disk;
that's why mmap is the more expensive of the two. Recent Linux kernels have
finer-grained locking here, although I'm not an expert on that part of the
kernel. We can't do I/O while holding a global client-side lock: clients like
HBase have on the order of 10k open files, and we don't want to block
everyone.
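To illustrate the point about not doing I/O under a global lock, here is a minimal sketch. It is not the cache code from the patch: the class name SlowLoadCache, its methods, and the use of CompletableFuture as a per-key placeholder are all assumptions made for the example. The global lock only guards the map; the slow load (standing in for mmap) runs after the lock has been released, so callers asking for other keys are never stuck behind it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical cache: the lock guards only the map, never the slow load. */
final class SlowLoadCache<K, V> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Map<K, CompletableFuture<V>> entries = new HashMap<>();

  V get(K key, Function<K, V> slowLoader) {
    CompletableFuture<V> future;
    boolean weLoad = false;
    lock.lock();
    try {
      future = entries.get(key);
      if (future == null) {
        // Publish a placeholder so later callers for this key wait on it
        // instead of starting a second load.
        future = new CompletableFuture<>();
        entries.put(key, future);
        weLoad = true;
      }
    } finally {
      lock.unlock();
    }
    if (weLoad) {
      try {
        // The slow part (mmap in the discussion above) runs without the lock.
        future.complete(slowLoader.apply(key));
      } catch (RuntimeException e) {
        // Drop the failed placeholder so a later call can retry the load.
        lock.lock();
        try {
          entries.remove(key, future);
        } finally {
          lock.unlock();
        }
        future.completeExceptionally(e);
        throw e;
      }
    }
    // Callers that did not load block here, on this key only.
    return future.join();
  }

  /** Exposed only so the example can check that the loader runs unlocked. */
  boolean isLockedByCurrentThread() {
    return lock.isHeldByCurrentThread();
  }
}
```

With this shape, a thread waiting on one slow key blocks only on that key's placeholder; the lock itself is held just long enough to look up or insert a map entry.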
bq. ClientContext#getFromConf, can we push the creation of a new DFSClient.Conf
into #get when it's necessary? Seems better to avoid doing all those hash
lookups.
That method is really only for tests, where it's inconvenient to dig around to
get a DFSClient.Conf. I will add a comment explaining that this is mostly for
testing. (I think JspHelper uses it too.)
bq. We removed the javadoc parameter descriptions in a few places, some of
which were helpful (e.g. len of -1 means read as many bytes as possible). Could
we add the one-line docs back to the builder variables?
Good idea. I added javadoc for the BlockReaderFactory members.
bq. Mind adding "dfs.client.cached.conn.retry" to hdfs-default.xml?
OK.
bq. cacheTries now counts down instead of counting up, so I think it needs a
new name. cacheTriesRemaining isn't great, but something like that.
OK.
bq. cacheTries used to also only tick when we got a stale peer out of the
cache. Now, nextTcpPeer and nextDomainPeer tick cacheTries unconditionally.
The effect is the same, since if we get a non-stale (i.e. usable) peer out of
the cache, we're done. Centralizing it is a good idea since it avoids the kind
of bugs we had in the past where we forgot to handle certain kinds of retries
correctly.
bq. Previously, we would disable domain sockets or throw an exception if we hit
an error when using a new Peer (domain or TCP respectively). Now, we don't know
if a peer is cached or new, and spin until we run out of cacheTries (which
isn't really related here).
OK, that's fair. That variable is supposed to be about how many times we'll
try the *cache*, not how many times we'll retry in general. Fixed.
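The retry behavior discussed in the last few points can be sketched roughly as follows. This is an illustration only, not the BlockReaderFactory code from the patch: PeerPicker, Peer, and every method name here are hypothetical. The counter is decremented in one place, each time the cache is consulted, and it bounds only cache attempts. A failure on a newly created peer is reported immediately instead of spinning until the cache budget runs out.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a countdown that limits cache attempts only. */
final class PeerPicker {
  /** Hypothetical stand-in for a connection to a DataNode. */
  static final class Peer {
    final boolean fromCache;
    final boolean usable;

    Peer(boolean fromCache, boolean usable) {
      this.fromCache = fromCache;
      this.usable = usable;
    }
  }

  private final Deque<Peer> cache = new ArrayDeque<>();
  private final boolean newPeersUsable;
  private int remainingCacheTries;

  PeerPicker(int cacheTries, boolean newPeersUsable) {
    this.remainingCacheTries = cacheTries;
    this.newPeersUsable = newPeersUsable;
  }

  void addStaleCachedPeer() {
    cache.addLast(new Peer(true, false));
  }

  int cachedCount() {
    return cache.size();
  }

  int triesLeft() {
    return remainingCacheTries;
  }

  /** The only place where the cache budget is decremented. */
  private Peer nextPeer() {
    if (remainingCacheTries > 0) {
      remainingCacheTries--;
      Peer cached = cache.pollFirst();
      if (cached != null) {
        return cached;
      }
    }
    // Stands in for opening a brand-new connection.
    return new Peer(false, newPeersUsable);
  }

  Peer connect() throws IOException {
    while (true) {
      Peer peer = nextPeer();
      if (peer.usable) {
        return peer;
      }
      if (!peer.fromCache) {
        // A new peer failing is a real error, not a reason to retry the cache.
        throw new IOException("newly created peer failed");
      }
      // Stale cached peer: discard it and try again.
    }
  }
}
```

For example, with a budget of two cache tries and three stale cached peers, connect() discards two cached peers and then returns a new one, leaving the third cached peer untouched.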
> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
> Key: HDFS-5810
> URL: https://issues.apache.org/jira/browse/HDFS-5810
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Affects Versions: 2.3.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch,
> HDFS-5810.006.patch, HDFS-5810.008.patch, HDFS-5810.015.patch,
> HDFS-5810.016.patch, HDFS-5810.018.patch, HDFS-5810.019.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.
> Since mmaps are granted corresponding to file descriptors in the cache
> (currently FileInputStreamCache), they have to be tracked together to do
> "smarter" things like HDFS-5182.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)