[jira] [Commented] (HDFS-5364) Add OpenFileCtx cache

Jing Zhao (JIRA) Tue, 05 Nov 2013 13:16:16 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814255#comment-13814255
 ]


Jing Zhao commented on HDFS-5364:
---------------------------------

Some comments for the current patch:

# Minor: In the javadoc of OpenFileCtxCache, maybe we can mention that each 
OpenFileCtx cache entry is used to maintain the writing context for a single
file? Or we can just add a link to the javadoc of OpenFileCtx.
# Currently, for every put operation on OpenFileCtxCache, if the cache is 
full, we need to scan the whole cache to find the "idlest" entry. Instead of
using a HashMap for the cache, can we use a queue/list-based map (e.g., 
ConcurrentSkipListMap) and keep the idlest entry at the tail by adjusting
the entries' positions based on their last access time?
# Is it possible that an OpenFileCtx cache entry becomes inactive while its 
last access time is still not the idlest? In that case, a put operation can
evict this kind of entry first (in case we still scan on put).
# OpenFileCtxCache#scan can be declared as a private method. Also, we may not 
need to hold the lock in the scan and cleanAll methods.
# In the shutdown method, it would be better to interrupt and join the 
streamMonitor thread before calling the cleanAll method. Also, we need a flag 
for StreamMonitor to indicate that it should stop (currently we use 
while(true)). Otherwise the stream monitor thread will not stop if its working 
time for each round is longer than 5s.
# Minor: StreamMonitor#rotation can be declared as final.
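
To illustrate the eviction idea in the second and third points, here is a minimal sketch, not code from the patch. It swaps in a java.util.LinkedHashMap in access order, guarded by a single lock, instead of the ConcurrentSkipListMap mentioned above, so the least recently accessed entry sits at the head of the iteration order rather than the tail. SketchCtx and AccessOrderedCtxCache are hypothetical names, and the isActive flag is an assumed stand-in for whatever OpenFileCtx exposes. Finding an inactive entry still walks the map; only the fallback to the idlest entry avoids a full scan.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the per-file writing context; not the real OpenFileCtx. */
class SketchCtx {
  private volatile boolean active = true;
  boolean isActive() { return active; }
  void deactivate() { active = false; }
}

/** Illustrative cache sketch; not the patch's OpenFileCtxCache. */
class AccessOrderedCtxCache<K> {
  private final int maxEntries;
  // accessOrder=true: iteration runs from least to most recently accessed.
  private final LinkedHashMap<K, SketchCtx> map =
      new LinkedHashMap<>(16, 0.75f, true);

  AccessOrderedCtxCache(int maxEntries) {
    this.maxEntries = maxEntries;
  }

  /** Looks up an entry; in access order this also marks it most recently used. */
  synchronized SketchCtx get(K key) {
    return map.get(key);
  }

  /** Inserts an entry; returns the evicted key, or null if nothing was evicted. */
  synchronized K put(K key, SketchCtx ctx) {
    K evicted = null;
    if (!map.containsKey(key) && map.size() >= maxEntries) {
      evicted = evictOne();
    }
    map.put(key, ctx);
    return evicted;
  }

  // Prefer an inactive entry; otherwise take the least recently accessed one.
  private K evictOne() {
    K victim = null;
    for (Map.Entry<K, SketchCtx> e : map.entrySet()) {
      if (victim == null) {
        victim = e.getKey(); // first entry in iteration order is the idlest
      }
      if (!e.getValue().isActive()) {
        victim = e.getKey();
        break;
      }
    }
    if (victim != null) {
      map.remove(victim);
    }
    return victim;
  }

  synchronized int size() {
    return map.size();
  }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", since "b" is then the least recently accessed entry.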

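The shutdown ordering in the fifth point could look roughly like the sketch below. StoppableMonitor, shouldRun, rotationMs, and the Runnable scan callback are hypothetical names chosen for illustration and do not come from the patch. The volatile flag replaces while(true), and shutdown sets the flag, interrupts the thread, and joins it, so the caller could run a cleanAll-style cleanup only after the monitor has exited.

```java
/** Illustrative monitor thread; not the patch's StreamMonitor. */
class StoppableMonitor extends Thread {
  private final long rotationMs;             // pause between scan rounds
  private final Runnable scan;               // work done in each round
  private volatile boolean shouldRun = true; // cleared by shutdown()

  StoppableMonitor(long rotationMs, Runnable scan) {
    super("stoppable-monitor");
    this.rotationMs = rotationMs;
    this.scan = scan;
    setDaemon(true);
  }

  @Override
  public void run() {
    while (shouldRun) {
      scan.run();
      try {
        Thread.sleep(rotationMs);
      } catch (InterruptedException e) {
        // Woken early, most likely by shutdown(); the loop condition
        // decides whether to exit.
      }
    }
  }

  /**
   * Clears the flag, interrupts the thread, and waits up to joinMs for it
   * to exit. Returns true if the thread has stopped.
   */
  boolean shutdown(long joinMs) {
    shouldRun = false;
    interrupt();
    try {
      join(joinMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
    return !isAlive();
  }
}
```

Because the loop checks the flag on every round, the thread exits even when the interrupt arrives in the middle of a scan rather than during the sleep.
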
> Add OpenFileCtx cache
> ---------------------
>
>                 Key: HDFS-5364
>                 URL: https://issues.apache.org/jira/browse/HDFS-5364
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: nfs
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>         Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, 
> HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch
>
>
> NFS gateway can run out of memory when the stream timeout is set to a 
> relatively long period (e.g., >1 minute) and a user uploads thousands of 
> files in parallel.  Each stream DFSClient creates a DataStreamer thread, and 
> the gateway will eventually run out of memory by creating too many threads.
> NFS gateway should have an OpenFileCtx cache to limit the total number of 
> open files.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
