[jira] [Commented] (HDFS-5810) Unify mmap cache and short-circuit file descriptor cache

Colin Patrick McCabe (JIRA) Mon, 10 Feb 2014 18:05:14 -0800

    [ https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897447#comment-13897447 ]


Colin Patrick McCabe commented on HDFS-5810:
--------------------------------------------

munmap only manipulates state in memory, whereas mmap often has to hit 
disk.  That's why the latter is more expensive.  Recent Linux kernels have more 
fine-grained locking here, although I'm not an expert on that part of 
the kernel.  We can't do I/O while holding a global client-side lock: clients 
like HBase have on the order of 10k open files, and we don't want to block 
everyone.
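
As a rough illustration of keeping slow work outside a shared lock (a 
hypothetical sketch, not the code in the patch; SlowLoadCache and its loader 
argument are made-up names): the lock is held only long enough to find or 
register a pending entry, and the expensive load, standing in for mmap here, 
runs with the lock released.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical cache used for illustration; not an HDFS client class. */
class SlowLoadCache<K, V> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, CompletableFuture<V>> entries = new HashMap<>();

    V get(K key, Function<K, V> slowLoader) {
        CompletableFuture<V> future;
        boolean weLoad = false;
        lock.lock();
        try {
            // Under the lock: only cheap in-memory bookkeeping.
            future = entries.get(key);
            if (future == null) {
                future = new CompletableFuture<>();
                entries.put(key, future);
                weLoad = true;
            }
        } finally {
            lock.unlock();
        }
        if (weLoad) {
            try {
                // Outside the lock: the slow part (disk I/O in the real case).
                future.complete(slowLoader.apply(key));
            } catch (RuntimeException e) {
                // Drop the failed placeholder so a later call can retry.
                lock.lock();
                try {
                    entries.remove(key, future);
                } finally {
                    lock.unlock();
                }
                future.completeExceptionally(e);
                throw e;
            }
        }
        // Callers that lost the race wait on this one entry only,
        // not on a lock shared by every open file.
        return future.join();
    }

    /** Exposed only so a caller can check the lock is free during a load. */
    boolean isLockHeldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }
}
```

With this shape, a thread waiting for one slow load blocks only on that 
entry's future; lookups for other keys still get through the lock quickly.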

bq. ClientContext#getFromConf, can we push the creation of a new DFSClient.Conf 
into #get when it's necessary? Seems better to avoid doing all those hash 
lookups.

That method is really only for tests, where it's inconvenient to dig around to 
get a DFSClient.Conf.  I will add a comment explaining that this is mostly for 
testing.  (I think JspHelper uses it too.)

bq. We removed the javadoc parameter descriptions in a few places, some of 
which were helpful (e.g. len of -1 means read as many bytes as possible). Could 
we add the one-line docs back to the builder variables?

Good idea.  I added javadoc for the BlockReaderFactory members.

bq. Mind adding "dfs.client.cached.conn.retry" to hdfs-default.xml?

OK.

bq. cacheTries now counts down instead of counting up, so I think it needs a 
new name. cacheTriesRemaining isn't great, but something like that.

OK.

bq. cacheTries used to also only tick when we got a stale peer out of the 
cache. Now, nextTcpPeer and nextDomainPeer tick cacheTries unconditionally.

The effect is the same, since if we get a non-stale (i.e. usable) peer out of 
the cache, we're done.  Centralizing it is a good idea since it avoids the kind 
of bugs we had in the past where we forgot to handle certain kinds of retries 
correctly.

bq. Previously, we would disable domain sockets or throw an exception if we hit 
an error when using a new Peer (domain or TCP respectively). Now, we don't know 
if a peer is cached or new, and spin until we run out of cacheTries (which 
isn't really related here).

OK, that's fair.  That variable is supposed to be about how many times we'll 
try the *cache*, not how many times we'll retry in general.  Fixed.
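
A minimal sketch of that separation, using made-up types (PeerSketch and 
PeerSourceSketch are hypothetical, not the real BlockReaderFactory code): the 
cache-tries counter ticks in exactly one place and only governs how often the 
cache is consulted, while a failure on a newly created peer is reported 
immediately instead of spinning.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical stand-in for a peer; not the real HDFS Peer type. */
class PeerSketch {
    final boolean fromCache;
    final boolean usable;

    PeerSketch(boolean fromCache, boolean usable) {
        this.fromCache = fromCache;
        this.usable = usable;
    }
}

/** Hypothetical peer source with a countdown of remaining cache tries. */
class PeerSourceSketch {
    private final Deque<PeerSketch> cache = new ArrayDeque<>();
    private final Supplier<PeerSketch> connector;
    private int remainingCacheTries;

    PeerSourceSketch(int cacheTries, Supplier<PeerSketch> connector) {
        this.remainingCacheTries = cacheTries;
        this.connector = connector;
    }

    void addCached(PeerSketch peer) {
        cache.add(peer);
    }

    int cachedCount() {
        return cache.size();
    }

    /** The only place the countdown ticks, whether or not the peer is stale. */
    private PeerSketch nextPeer() {
        if (remainingCacheTries > 0) {
            remainingCacheTries--;
            PeerSketch cached = cache.poll();
            if (cached != null) {
                return cached;
            }
        }
        return connector.get();
    }

    PeerSketch connect() throws IOException {
        while (true) {
            PeerSketch peer = nextPeer();
            if (peer.usable) {
                return peer; // a usable peer ends the loop, cached or new
            }
            if (!peer.fromCache) {
                // A brand-new peer failed: this is a real error, unrelated
                // to the cache-tries budget, so report it right away.
                throw new IOException("newly created peer failed");
            }
            // Stale cached peer: try again; the countdown bounds this path.
        }
    }
}
```

The loop always terminates: each stale cached peer uses up one cache try, and 
once the budget or the cache is exhausted the next peer is new, which either 
works or raises the error.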

> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
>                 Key: HDFS-5810
>                 URL: https://issues.apache.org/jira/browse/HDFS-5810
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch, 
> HDFS-5810.006.patch, HDFS-5810.008.patch, HDFS-5810.015.patch, 
> HDFS-5810.016.patch, HDFS-5810.018.patch, HDFS-5810.019.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.  
> Since mmaps are granted corresponding to file descriptors in the cache 
> (currently FileInputStreamCache), they have to be tracked together to do 
> "smarter" things like HDFS-5182.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
