[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100100#comment-13100100
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-2691:
----------------------------------------------------

Took a while to load back the NM code into my memory :)

Had a comprehensive look at the patch. Mostly looks good. Some comments:

General: Please wrap around lines longer than 80 chars (you can look at the 
patch itself to figure out the lines).

ContainerImpl.java
 - _handle()_ method already takes a writeLock, doesn't need to be synchronized.

ContainerManagerImpl.java
 - RM doesn't kill containers using _stopContainer()_, it instead sends a 
FINISH_CONTAINER in the heartbeat. So the error message should only refer to 
ApplicationMaster.

_RELEASE_CONTAINER_RESOURCES_ is always sent along with 
_CLEANUP_CONTAINER_RESOURCES_ event. I think we should just merge these into 
_CLEANUP_CONTAINER_RESOURCES_ event itself. This will also be inline with the 
fact that creation of container-dirs and the localization of files both happen 
as part of a single event _INIT_CONTAINER_RESOURCES_, so cleanup should also be 
a single event. We can send a {{Map<LocalResourceVisibility, 
Collection<LocalResourceRequest>>}} as the event payload. To be symmetric, we 
should probably also merge the multiple _INIT_CONTAINER_RESOURCES_ calls one 
for each _LocalResourceVisibility_ to be a single event. Thoughts?

TestContainer:
 - Shall we rename _ContainerRelMatcher_ to _ResourcesReleasedMatcher_? 
Similarly _ContainerReqMatcher_ to _ResourcesRequestedMatcher_?
 - Most of the new static methods and classes added can be private.
 - Good that you've refactored out a _WrappedContainer_!
 - When we use this _WrappedContainer_, we always want to drain events 
immediately. So, I think we should drop passing a drain boolean for every 
method and always drain the events. We can rename _maybeDrain(boolean)_ method 
to be _drainDispatcherEvents()_ and use the same. Improves readability IMO.

TestResourceLocalizationService:
 - _testLocalizationCleanup()_ can be renamed to _testResourcesRelease()_ as we 
are only verifying the ref-counts here?
 - We can try different refCount values for different resource types (all of 
the types use a ref-count 1 in your tests).

The tests are getting a bit difficult to read because of the mocks. In both the 
above tests, I think we can use the real objects particularly for records like 
_ContainerId_, _ApplicationId_ using the _BuilderUtil_ APIs. We don't really 
need to mock these, it just adds more code and confuses as to what fields are 
mocked and what aren't. We seem to mocking everything as opposed to mocking 
only those objects that need modified behaviour. Some of these records:
 - TestContainer
   -- getMockRsrc(), createResource(), getMockContainerId(), 
 - TestResourceLocalizationService
   -- creating appId in testLocalizationCleanup()
   -- creating records in getMockContainer()
   -- getPath(), getMockedResource(), getAppMockedResource(), 
getPublicMockedResource() and getPrivateMockedResource()

bq. It doesn't look like an LRU cleanup is implemented, which i believe was 
added recently to 20security - that would be a separate jira.
bq. LRU was put into trunk, but it was on MRV1.
There is existing code for purging of cache under disk pressure - See 
_ResourceLocalization.CacheCleanup_ and _ResourceRetentionSet_. (We need tests 
for this though, will file a ticket) This only deletes files that aren't in use 
at all. By LRU, do you mean selective deletion of these files based on their 
usage? Can you please point me to the relevant MRV1 JIRA? Thanks!

> Finish up the cleanup of distributed cache file resources and related tests.
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2691
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2691
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Amol Kekre
>            Assignee: Siddharth Seth
>             Fix For: 0.23.0, 0.24.0
>
>         Attachments: MR2691_1.patch, MR2691_2.patch, MR2691_3.patch
>
>
> Implement cleanup of distributed cache file resources

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to