[ https://issues.apache.org/jira/browse/HADOOP-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1193:
----------------------------------

    Attachment: HADOOP-1193_1_20070517.patch

Here is a patch while I continue further testing... Hairong, could you try it and see if it works for you? Thanks!

Basically I went ahead and implemented a 'codec pool' to reuse the direct-buffer based codecs so that we don't keep creating new ones (a rough, illustrative sketch of the idea is appended after the quoted issue below).

Results from sorting 1 million records via TestSequenceFile with RECORD compression (number of codecs created):

                   trunk    H-1193
  Compressors:      1382         3
  Decompressors:    1520        12
  ---------------------------------
  Total:            2902        15

Results are even more dramatic for BLOCK compression, since with BLOCK compression we need 4 codecs per Reader (key, keyLen, val & valLen)... In fact, on the back of this patch I have gone ahead and bumped up the default direct-buffer size for zlib from 1K to 64K, which should lead to improved performance too.

Appreciate any review/feedback.

> Map/reduce job gets OutOfMemoryException when set map out to be compressed
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-1193
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1193
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.2
>            Reporter: Hairong Kuang
>         Assigned To: Arun C Murthy
>         Attachments: HADOOP-1193_1_20070517.patch
>
>
> One of my jobs quickly fails with an OutOfMemoryException when I set the map output to be compressed. It worked fine with release 0.10.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
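For reference, here is a minimal, self-contained sketch of the pooling idea described in the comment above. It is generic Java with no Hadoop dependencies; every class and method name in it (SimpleCodecPool, borrow, giveBack) is an illustrative assumption, not code taken from HADOOP-1193_1_20070517.patch.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- names are assumptions, not the classes
// introduced by the attached patch.
public class SimpleCodecPool<T> {

  // One free-list per pooled type (e.g. per compressor class), so a
  // zlib compressor is never handed out where a different codec's
  // compressor is expected.
  private final Map<Class<?>, Deque<T>> freeLists =
      new HashMap<Class<?>, Deque<T>>();

  // Borrow a cached instance; returns null on a pool miss, in which
  // case the caller creates a fresh one via the codec as before.
  public synchronized T borrow(Class<?> kind) {
    Deque<T> free = freeLists.get(kind);
    return (free == null || free.isEmpty()) ? null : free.pop();
  }

  // Return an instance when its stream is closed, so the next
  // Writer/Reader reuses the codec and its direct buffers instead of
  // allocating new ones.
  public synchronized void giveBack(Class<?> kind, T instance) {
    Deque<T> free = freeLists.get(kind);
    if (free == null) {
      free = new ArrayDeque<T>();
      freeLists.put(kind, free);
    }
    free.push(instance);
  }
}

A SequenceFile Writer/Reader would then call borrow() first and only fall back to creating a fresh compressor or decompressor on a pool miss, returning the instance on close so its direct buffers get reused; exactly how the patch keys the pool and where it resets codec state is not shown here.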