[ https://issues.apache.org/jira/browse/HADOOP-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-1027:
--------------------------------

    Attachment: 1027-new3.patch

Attached is a patch that applies cleanly to the current trunk. It also sets the merge threshold count (for the in-memory merge) to 1000 files in hadoop-default.xml. I think it makes sense, by default, to also trigger the merge based on the count of files accumulated in the ramfs; that gives us better control of memory usage. It avoids the various data structures growing out of bounds (and thereby causing OutOfMemory errors) when there are too many maps (generating small outputs) and the ramfs is large (so very many map outputs can be accommodated). Since we can accommodate so many map outputs in the ramfs in the first place, the sizes of the outputs are quite likely small, and hence the on-disk spills are small too. So although there would be disk IO, I think it would not be the major bottleneck. (A sketch of such a count-based trigger appears after this message.) Thoughts?

> Fix the RAM FileSystem/Merge problems (reported in HADOOP-1014)
> ---------------------------------------------------------------
>
>                 Key: HADOOP-1027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1027
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>         Attachments: 1027-new.patch, 1027-new2.patch, 1027-new3.patch
>
>
> 1) The merge algorithm implementation does not delete empty segments (sequence files with no key/value data) in cases where single-level merges don't happen on those segments (because the check "numberOfSegmentsRemaining <= factor" returns true). This affected the in-memory merge in a subtle way: the merge-spill file is given the same name as the 0th entry file in the ramfs. If that file was an empty file, it would not get deleted from the ramfs, and if a subsequent merge on the ramfs chose the same name for its merge-spill file, it would overwrite the previously created spill. This led to the inconsistent output sizes. (See the first sketch after this message.)
> 2) InMemoryFileSystem has a "close" method that is not protected; it is the only method where the pathToFileAttribs map is modified without first locking the InMemoryFileSystem instance. That quite likely leads to a ConcurrentModificationException if one thread calls InMemoryFileSystem.close (due to some exception) while another thread is in the middle of InMemoryFileSystem.getFiles(). (See the second sketch after this message.) However, this problem does not affect the correctness of the merge process (the task is going to fail anyway), and the more important point is that some other exception happened first (such as insufficient disk space, so map outputs could not be written) which may not be related to the merge process at all.
> 3) The number of outputs merged at once in RAM should be limited, to prevent OutOfMemory errors. Consider a case where there are tens of thousands of maps and all of them generate empty outputs. Given the default RAM FS size of 75 MB, we can possibly accommodate lots of map outputs in RAM without doing any merge, but that also makes the various other data structures explode in size. We have to make a trade-off here, especially because the in-memory merge is done in the TaskTracker process, which is already under a good amount of memory pressure.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
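First, an illustrative sketch for item 1 of the quoted description: even when the remaining segments fit in a single merge pass (numberOfSegmentsRemaining <= factor), empty segments must still be deleted so their names cannot be reused as a later merge-spill file name. All identifiers below are hypothetical stand-ins, not the actual merge code.

{code:java}
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

class MergePassSketch {
  /** Hypothetical view of one merge input; not the real segment type. */
  interface Segment {
    long getLength() throws IOException;   // key/value bytes in the sequence file
    void delete() throws IOException;      // remove the backing file from the ramfs
  }

  static void pruneEmptySegments(List<Segment> segments) throws IOException {
    for (Iterator<Segment> it = segments.iterator(); it.hasNext();) {
      Segment s = it.next();
      if (s.getLength() == 0) {
        s.delete();   // otherwise the empty file lingers in the ramfs, and its
        it.remove();  // name can be picked again as a merge-spill file name,
      }               // silently overwriting an earlier spill
    }
  }
}
{code}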
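Second, a sketch of the synchronization fix for item 2: close() must take the same lock that guards pathToFileAttribs in every other method, so a concurrent getFiles() iteration cannot observe the map being cleared underneath it. The field and method shapes here are assumptions for illustration, not the actual InMemoryFileSystem source.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InMemoryFileSystemSketch {
  private final Map<String, Object> pathToFileAttribs = new HashMap<String, Object>();

  public void close() {
    // Previously close() mutated the map without holding the instance lock,
    // racing against iterators in getFiles().
    synchronized (this) {
      pathToFileAttribs.clear();
    }
  }

  public synchronized List<String> getFiles() {
    // Iterates pathToFileAttribs under the instance lock; without the lock in
    // close(), this loop could throw ConcurrentModificationException.
    List<String> files = new ArrayList<String>();
    for (String path : pathToFileAttribs.keySet()) {
      files.add(path);
    }
    return files;
  }
}
{code}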
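Finally, a minimal sketch of the count-based merge trigger described in the comment above. The property name "mapred.inmem.merge.threshold" and this class are assumptions made for illustration; only Configuration.getInt() is standard Hadoop API.

{code:java}
import org.apache.hadoop.conf.Configuration;

class InMemMergeTriggerSketch {
  private final int mergeThreshold;
  private int numClosedFiles = 0;

  InMemMergeTriggerSketch(Configuration conf) {
    // Defaults to 1000 files, matching the new hadoop-default.xml entry.
    this.mergeThreshold = conf.getInt("mapred.inmem.merge.threshold", 1000);
  }

  /** Call after each map output has been fully written to the ramfs. */
  synchronized boolean shouldMerge() {
    numClosedFiles++;
    // Fire on file count, independently of how full the ramfs is, so the
    // per-file bookkeeping structures stay bounded even for tiny outputs.
    return numClosedFiles >= mergeThreshold;
  }

  /** Call when an in-memory merge pass completes. */
  synchronized void reset() {
    numClosedFiles = 0;
  }
}
{code}

The point of the count trigger is that it bounds metadata growth even when the byte-size trigger never fires, which is exactly the empty-output scenario in item 3.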