[ https://issues.apache.org/jira/browse/HDFS-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894103#comment-13894103 ]

Andrew Wang commented on HDFS-5810:
-----------------------------------

Thanks for revving quickly. I looked at the latest patch and agree with the 
fixes you mentioned. The stylistic comments were just suggestions, so it's fine 
to skip them.

I also have a few follow-up replies and new comments (still working through 
the patch):

Replies:
* Was this addressed?
bq. If the mmap and normal caches are configured to a small and a big value 
respectively...Can we unlink the two caches completely?
* For the tracing, my goal is to avoid runtime overhead when tracing is off. I 
think we savvy HDFS developers could figure out how to read a compressed stack 
trace, e.g. one showing just the method names. Another alternative is putting 
the trace tag in a thread-local, maybe with a unique ID; there's a sketch of 
that idea after this list.
* On the unref/purge, I missed that the refcount starts off at 2, which is 
unusual. I'd like to see the SCCache ref taken explicitly in SCCache, or at 
least documented there.
* On timeouts, we might want to be even *more* aggressive about increasing 
them. A recent FB HBase/HDFS paper published at FAST '14 shows that caching on 
the order of hours continues to provide benefit. The inflection point is 
something like 1 or 2 hours, so that'd be my target. IIUC, {{allocMmap}} has 
pressure-based eviction, so we can tolerate it.
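
To make the thread-local idea concrete, here's a rough sketch ({{TraceTag}} 
and everything in it are hypothetical, not from the patch):

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: instead of capturing a full stack trace on every replica
// acquisition, stash a short tag plus a unique ID in a thread-local, and
// only capture a real trace when the tracing flag is enabled.
class TraceTag {
  private static final AtomicLong NEXT_ID = new AtomicLong(0);
  private static final ThreadLocal<String> TAG = new ThreadLocal<String>();

  static void set(String where) {
    // Cheap: one string concat and an atomic increment, no Throwable allocated.
    TAG.set(where + "#" + NEXT_ID.incrementAndGet());
  }

  static String get() {
    return TAG.get();  // null if the current thread never tagged anything
  }
}
{code}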

New comments:
* The new tests need timeouts (I just checked via grep in all changed/new test 
files); the usual pattern is shown right below.
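
For the record, that pattern is just the JUnit 4 per-test timeout (the class 
and method names here are made up):

{code:java}
import org.junit.Test;

public class TestShortCircuitCacheSketch {
  // A per-test timeout makes a hung test fail instead of stalling the run.
  @Test(timeout = 60000)
  public void testCacheEviction() throws Exception {
    // ...
  }
}
{code}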

More SCCache related:
* The condvar usage in SCCache is mildly complex. It'd be nice to document 
where and why we release the lock and wait on a CV, e.g. in fetchOrCreate (see 
the first sketch after this list)
* rename "SCReplica#checkStale" to "isStale" per java/hadoop convention
* What's the rationale for the extremely polymorphic Object pointer in 
SCReplica? All the casting is messy.
* Determining the correct cache sizes is currently complicated for users. I 
think users essentially want to specify a global resource limit like "use this 
many FDs for caching" and have the HDFS client do the right thing. Right now, 
though, these limits are a) per ClientContext, so you need to know how many 
ClientContexts you'll be using, and b) the two caches have entirely separate FD 
allocations, so you can waste part of your allocation unless you know in 
advance whether you'll be doing mmapped or normal reads. I hope this is fixable 
without config incompatibility (the second sketch after this list shows 
roughly what I mean).
* Do we really need to release and retake the lock when making an mmap? I 
didn't think creating one was expensive enough to warrant it, and I noticed 
you don't do the same when unmapping.
* In {{allocMmap}}, we decrement outstandingMmapCount, but isn't that already 
handled by the munmap on the line before? If this is legit, a verifying test 
case would be good.
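
First, a minimal sketch of the fetch-or-create pattern with the kind of 
comments I'd like to see (hypothetical names throughout; this is not the 
patch's actual code):

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class FetchOrCreateSketch<K, V> {
  interface Loader<K, V> { V load(K key) throws Exception; }

  private final ReentrantLock lock = new ReentrantLock();
  private final Condition loadFinished = lock.newCondition();
  private final Map<K, V> cache = new HashMap<K, V>();
  private final Set<K> loading = new HashSet<K>();

  V fetchOrCreate(K key, Loader<K, V> loader) throws Exception {
    lock.lock();
    try {
      for (;;) {
        V v = cache.get(key);
        if (v != null) {
          return v;               // fast path: already cached
        }
        if (loading.add(key)) {
          break;                  // we claimed the load; do it unlocked
        }
        // Another thread is loading this key. await() releases the lock,
        // blocks until a loader signals, reacquires, then we recheck.
        loadFinished.await();
      }
    } finally {
      lock.unlock();
    }
    // The load can block on I/O, so it runs without the lock held.
    V v;
    try {
      v = loader.load(key);
    } catch (Exception e) {
      lock.lock();
      try {
        loading.remove(key);      // let another thread retry the load
        loadFinished.signalAll();
      } finally {
        lock.unlock();
      }
      throw e;
    }
    lock.lock();
    try {
      cache.put(key, v);
      loading.remove(key);
      loadFinished.signalAll();   // wake everyone waiting on this key
    } finally {
      lock.unlock();
    }
    return v;
  }
}
{code}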
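
Second, a hypothetical sketch of the global-limit idea (none of these names 
exist in the patch, and real accounting would have to span ClientContexts and 
both caches):

{code:java}
// One FD budget shared by mmapped and normal short-circuit reads,
// consumed on demand instead of split statically between two caches.
class FdBudget {
  private final int maxFds;   // would come from a single user-facing config key
  private int usedFds = 0;

  FdBudget(int maxFds) {
    this.maxFds = maxFds;
  }

  synchronized boolean tryAcquire() {
    if (usedFds >= maxFds) {
      return false;           // caller evicts from whichever cache it prefers
    }
    usedFds++;
    return true;
  }

  synchronized void release() {
    usedFds--;
  }
}
{code}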

I skimmed through the rest of the changes and it looks like SCCache is really 
the meat of the change. I'm pretty happy with SCCache after the latest round of 
review, and I'll get through BlockReaderFactory and ClientMmap soon. We're 
getting close!

> Unify mmap cache and short-circuit file descriptor cache
> --------------------------------------------------------
>
>                 Key: HDFS-5810
>                 URL: https://issues.apache.org/jira/browse/HDFS-5810
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5810.001.patch, HDFS-5810.004.patch, 
> HDFS-5810.006.patch, HDFS-5810.008.patch, HDFS-5810.015.patch, 
> HDFS-5810.016.patch
>
>
> We should unify the client mmap cache and the client file descriptor cache.  
> Since mmaps are granted corresponding to file descriptors in the cache 
> (currently FileInputStreamCache), they have to be tracked together to do 
> "smarter" things like HDFS-5182.


