[
https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893011#comment-13893011
]
Andrew Wang commented on HDFS-5810:
-----------------------------------
I don't ask this lightly, but I really would like some patch splitting here. I
see a lot of nice refactors, but it's hard to separate out the intent of the
changes. One potential easy split is making BlockReader into a builder, since
it's pretty mechanical and touches a lot of files.
The below is just some nitty stuff and a read of ShortCircuitCache, I haven't
gotten to the rest yet:
Some small nits:
* ClientContext#confAsString needs to be updated to match the ClientContext
members, one dupe too.
* hdfs-default.xml, the default value of stale.threshold.ms of 3000000 does not
match the DFSConfigKeys default value of 1800000.
* SCReplica#Key is almost identical to FsDatasetCache#Key, but I guess that's
okay unless you want to make a shared BlockKey class.
* How did you decide on the default mmap cache size of 1024? I feel like since
we default to 4096 max datanode transfer threads, that'd be a more appropriate
number. If the default FD ulimit of 1024 is your concern, seems like we should
lower it below that for some headroom.
ShortCircuitCache:
* If the mmap and normal caches are configured to a small and a big value
respectively, an evicted mmap will go to the normal cache and sit there until
the big timeout, which is kind of weird. Can we unlink the two caches
completely?
* The max # of mmaps is a good reason to have two separate caches, would be
nice to comment this. It's more compelling to me than the perf reasons in the
current comment.
* These caches right now expire purely on a time basis, not actually based on
pressure like, say, the OS page cache. For the mmap cache at least, how about a
hybrid: have a really long timeout, but make eviction also pressure based? The
danger of caching forever seems to be holding on to a "bad fd" (e.g. DN deletes
the file), but with the forthcoming shm communication segment this is mitigated.
* {{s/mmaped/mmapped/g}}, good to be consistent. {{munmapped}} too, and case
insensitive.
* Can use java's built-in TimeUnit to convert rather than the new
nanosecondsToMilliseconds function
* In CacheCleaner, rather than a while loop, why not a for each? It'll use the
iterator order, which should be ascending insertion time.
* Could probably split out the two eviction loops into a single parameterized
function.
* ref, unref, fetchOrCreate javadoc say you need to hold the lock, but they
themselves take the cache lock.
* close, the try/finally lock idiom takes the lock inside the try, normally
this is outside?
* Propagating a string tag for trace logging is ugly. Can we walk the stack
instead to get the caller, e.g. via {{ExceptionUtils#getFullStackTrace}} or
other? I'm okay paying the stack construction cost if it's only for trace
logging.
* allocMmap, no longer a toClose param mentioned in the javadoc
* unref, we could get rid of {{newRefCount}}, it doesn't really improve
readability
* ShortCircuitReplica#unref calls ShortCircuitCache#unref, but there's no
purge. Will this precondition abort?
* Some references to eviction "lists" but they look to be maps now
> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
> Key: HDFS-5810
> URL: https://issues.apache.org/jira/browse/HDFS-5810
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Affects Versions: 2.3.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch,
> HDFS-5810.006.patch, HDFS-5810.008.patch, HDFS-5810.015.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.
> Since mmaps are granted corresponding to file descriptors in the cache
> (currently FileInputStreamCache), they have to be tracked together to do
> "smarter" things like HDFS-5182.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)