[ http://issues.apache.org/jira/browse/HADOOP-538?page=comments#action_12444771 ]

Owen O'Malley commented on HADOOP-538:
--------------------------------------

A couple more comments:

  1. The buffer sizes in all the new code should use conf.getInt("io.file.buffer.size") for the size rather than a hard-coded 4k.
  2. The CompressionCodec sub-classes (in particular DefaultCodec) need to be Configurable and use the Configuration that is given to them rather than a default Configuration.
  3. The built-in gzip codec is still going to be broken when used inside of SequenceFiles.
  4. Do the C source directories really need to be nested so deeply? Since we already have src/c++, wouldn't it make more sense to use src/c++/libhadoop/.. instead of src/native/..?

> Implement a nio's 'direct buffer' based wrapper over zlib to improve
> performance of java.util.zip.{De|In}flater as a 'custom codec'
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-538
>                 URL: http://issues.apache.org/jira/browse/HADOOP-538
>             Project: Hadoop
>          Issue Type: Improvement
>    Affects Versions: 0.6.1
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.8.0
>
>         Attachments: HADOOP-538.patch, HADOOP-538_20061005.tgz,
>                      HADOOP-538_20061011.tgz, HADOOP-538_20061026.tgz,
>                      HADOOP-538_benchmarks.tgz
>
>
> There has been more than one instance where java.util.zip's {De|In}flater
> classes perform unreliably, a simple wrapper over zlib-1.2.3 (latest stable)
> using java.nio.ByteBuffer (i.e. direct buffers) should go a long way in
> alleviating these woes.
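
For reference, a minimal sketch of the first two review comments (buffer size taken from the configuration, and a codec that keeps the Configuration it is handed). The classes Conf, Configurable, and SketchCodec below are hypothetical stand-ins so the example is self-contained; they are not the Hadoop classes or the patch under review. The 4096 fallback is an assumption that mirrors the "4k" mentioned above, and Hadoop's own Configuration.getInt takes a default value as its second argument.

```java
import java.util.HashMap;
import java.util.Map;

public class BufferSizeSketch {

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
    static class Conf {
        private final Map<String, String> props = new HashMap<>();

        void set(String name, String value) {
            props.put(name, value);
        }

        /** Returns the named property as an int, or defaultValue if it is unset. */
        int getInt(String name, int defaultValue) {
            String value = props.get(name);
            return value == null ? defaultValue : Integer.parseInt(value.trim());
        }
    }

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configurable. */
    interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    /** Hypothetical codec: keeps the Conf it is given instead of creating its own. */
    static class SketchCodec implements Configurable {
        private Conf conf;

        public void setConf(Conf conf) {
            this.conf = conf;
        }

        public Conf getConf() {
            return conf;
        }

        /** Buffer size read from the supplied Conf; 4096 is only the fallback. */
        int bufferSize() {
            return conf.getInt("io.file.buffer.size", 4096);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("io.file.buffer.size", "65536");

        SketchCodec codec = new SketchCodec();
        codec.setConf(conf);

        // The codec sizes its buffer from the caller's configuration.
        byte[] buffer = new byte[codec.bufferSize()];
        System.out.println(buffer.length);
    }
}
```

The point of the pattern is that the codec never constructs a configuration of its own, so a job-level override of io.file.buffer.size reaches every buffer the codec allocates.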