[ https://issues.apache.org/jira/browse/MAPREDUCE-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793922#action_12793922 ]
Hemanth Yamijala commented on MAPREDUCE-1186: --------------------------------------------- Some feedback: - I would like to keep the call for cleanupStorage in TaskTracker.initialize(). It helps in cases of abrupt TT shutdown. Admittedly, it may not be fully effective - because we would not be able to cleanup files like we do in MAPREDUCE-896 (i.e. using the task-controller's help). Still, it would cleanup whatever it can and that would be a considerable amount of space under normal conditions, I think. - In TrackerDistributedCacheManager.localizeCache where we construct DistributedCacheContext, I think it makes sense to just call conf.get(JobContext.USER_NAME) rather than building a new JobConf object for this one value. - Should we rename DistributedCacheContext to DistributedCacheFileContext, as it is operating on only one file ? Same for TaskController.initializeDistributedCache - change to initializeDistributedCacheFile. Likewise in C code as well. - Document DistributedCacheContext mentioning what it is useful for. - Store the uniqueDir as a Path object instead of a string. This is consistent with our using higher level objects like File in the Contexts. - Remove LOG.info print in DefaultTaskController.initializeDistributedCache. Instead have a LOG.warn in the exception handler. - Guard the Debug logs using LOG.isDebugEnabled. - Do not send full paths (as given in localizedUniqueDir) to task-controller. This is for security purposes to prevent people from sending random paths and getting permissions changed as a hack. Instead, we *hardcode* the path construction as much as possible inside the task-controller. For e.g. in this case, if we send the selected base directory (mapred-local-dir), user and random-id, we can easily construct the full path from the pattern such as $mapred-local-dir/taskTracker/$user/distcache/$random-id. This is just another safeguard in case the task-controller is compromised somehow. This naturally means the C code changes a bit as well, so the code will need to construct the path in initialize_distributed_cache and then call secure_path. Please follow the pattern we use in other methods of task-controller while doing this. Specifically: -- We need to validate that the passed in mapred-local-dir is valid. Please take a look at check_tt_root -- We also need to construct the path using methods like concatenate. Please take a look at get_attempt_directory for an example. - It would be very helpful to add a comment on why testDeleteCache needs to override configuration to set a single mapred local dir. - I would like to see a regression test for this, that ensures that file permissions are being set only for the file being localized. So can we do something like this ? Localize a file. After localization is complete, create a file under the directory where the file is localized and ensure that it has permissions different from what is set by default. Then, localize another file. After checking that the second file has the right permissions, verify that the file created after the first localization has the permissions you set and not the default permissions. This should work for both the LinuxTaskController case and the default case. Also, I would just like you to watch for changes in MAPREDUCE-896, which I suspect will most likely clash with this patch. So, some merge may be required. > While localizing a DistributedCache file, TT sets permissions recursively on > the whole base-dir > ----------------------------------------------------------------------------------------------- > > Key: MAPREDUCE-1186 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1186 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tasktracker > Affects Versions: 0.21.0 > Reporter: Vinod K V > Assignee: Amareshwari Sriramadasu > Fix For: 0.22.0 > > Attachments: patch-1186-1.txt, patch-1186-2.txt, > patch-1186-3-ydist.txt, patch-1186-3-ydist.txt, patch-1186-ydist.txt, > patch-1186-ydist.txt, patch-1186.txt > > > This is a performance problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.