[ 
https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895233#comment-13895233
 ] 

Colin Patrick McCabe commented on HDFS-5810:
--------------------------------------------

bq. If the mmap and normal caches are configured to a small and a big value 
respectively...Can we unlink the two caches completely?

I guess it seemed elegant to me at the time: 
{{dfs.client.mmap.cache.timeout.ms}} controls how long something is mmapped, 
but even after the mmap is gone, we honor 
{{dfs.client.read.shortcircuit.streams.cache.expiry.ms}}.   Nothing is 
wasted... we reuse stuff.  Maybe we just make this clearer in the docs?  What 
do you think.

bq. For the tracing, my goal is to avoid runtime overhead if tracing is off. I 
think us savvy HDFS developers could figure out how to read a compressed stack 
trace, e.g. just showing the names of methods. Another alternative is putting 
the trace tag in a thread-local, maybe with a unique ID.

I don't think the strings will create that much runtime overhead.  They're 
constant, which means they will probably be stored in some permanent memory 
section that never gets GC'ed.  I agree that it would look a little "cleaner" 
without them, but the knowledge of who called what really saved my bacon during 
some debugging sessions, so I'm pretty resistant to removing them completely.  
Thread-locals seem messy for this.  I'll think about what the options are 
here... maybe a stack trace wouldn't be that bad?  Using stack traces would 
remove that "caller" parameter, but they might make things a bit harder to read 
when using TRACE.  I didn't change it in the latest patch...

bq. On the unref/purge, I missed that the refcount starts off at 2, which is 
unusual. I'd like to see the SCCache ref done explicitly in SCCache, or at 
least documented in SCCache too.

I added a comment in {{ShortCircuitCache#create}} explaining why we don't need 
to ref a new replica.

bq. On timeouts, we might want to be even more aggressive at increasing them. A 
recent FB HBase/HDFS paper published at FAST 14 shows that caching on the order 
of hours to provide continued benefit. The inflection point is something like 1 
or 2 hours, so that'd be my target. IIUC, allocMmap has pressure-based 
eviction, so we can tolerate it.

OK, let me increase the mmap cache timeout to an hour.

bq. Need timeouts for new tests (I just checked via grep in all changed/new 
test files)

Added.

bq. ...

bq. rename "SCReplica#checkStale" to "isStale" per java/hadoop convention

OK

bq. What's the rationale for the extremely polymorphic Object pointer in 
SCReplica? All the casting is messy

I guess the rationale is that we only use one of {condition variable, last 
failure time, clientmmap object} at a time.  I can take a look at whether this 
could be improved.  I kind of like the way the code flows in 
getOrCreateClientMmap now, though.  Maybe it could be improved, though...

bq. Do we really need to release and retake the lock for making an mmap? I 
didn't think this was expensive enough to warrant this, and I noticed you 
didn't do this when unmapping.

mmap is an expensive operation, and this is global lock we're dealing with 
here.  The existing code does this I/O after releasing the lock, and I wouldn't 
want to move backwards here.

bq. allocMmap, we decrement outstandingMmapCount, but isn't that already 
handled by the munmap in the line before? If this is legit, a verifying test 
case would be good.

Yes, this was a bug.  I didn't originally have a test for verifying the 'mmap 
scavenging' code path where we have to unmap and existing mmap.  Let me add one 
now.

> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
>                 Key: HDFS-5810
>                 URL: https://issues.apache.org/jira/browse/HDFS-5810
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch, 
> HDFS-5810.006.patch, HDFS-5810.008.patch, HDFS-5810.015.patch, 
> HDFS-5810.016.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.  
> Since mmaps are granted corresponding to file descriptors in the cache 
> (currently FileInputStreamCache), they have to be tracked together to do 
> "smarter" things like HDFS-5182.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to