[ https://issues.apache.org/jira/browse/HADOOP-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tao Li updated HADOOP-13849:
----------------------------
Description:
I tested bzip2 compression with both the java-builtin and system-native
libraries, and I found that the compression speed is almost the same. (I
expected system-native to compress faster than java-builtin.)
My test case:
1. Input file: 2.7GB uncompressed text file
2. After bzip2 java-builtin compression: 457MB, 12min 4sec
3. After bzip2 system-native compression: 457MB, 12min 19sec
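To put "almost the same" into numbers, the timings above can be turned into throughput figures. This is only a rough sketch: it assumes 2.7GB means 2.7 * 1024 MB, and the class and method names (Bzip2Throughput, seconds, throughputMBps, slowdownPercent) are made up for illustration and are not part of Hadoop.

```java
import java.util.Locale;

public class Bzip2Throughput {
    // Input size from the test case; 1 GB is taken as 1024 MB (assumption).
    static final double INPUT_MB = 2.7 * 1024;
    // Output size reported for both libraries.
    static final double OUTPUT_MB = 457;

    // Convert a "Xmin Ysec" timing to seconds.
    static int seconds(int min, int sec) {
        return min * 60 + sec;
    }

    // Input megabytes consumed per second of wall-clock time.
    static double throughputMBps(int elapsedSeconds) {
        return INPUT_MB / elapsedSeconds;
    }

    // How much longer the second run took relative to the first, in percent.
    static double slowdownPercent(int baselineSeconds, int otherSeconds) {
        return 100.0 * (otherSeconds - baselineSeconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        int javaBuiltin = seconds(12, 4);    // 724 s
        int systemNative = seconds(12, 19);  // 739 s

        System.out.printf(Locale.ROOT, "java-builtin:  %d s, %.2f MB/s%n",
                javaBuiltin, throughputMBps(javaBuiltin));
        System.out.printf(Locale.ROOT, "system-native: %d s, %.2f MB/s%n",
                systemNative, throughputMBps(systemNative));
        System.out.printf(Locale.ROOT, "system-native took %.1f%% longer%n",
                slowdownPercent(javaBuiltin, systemNative));
        System.out.printf(Locale.ROOT, "output/input size ratio: %.1f%%%n",
                100.0 * OUTPUT_MB / INPUT_MB);
    }
}
```

Under that assumption both runs come out at roughly 3.7 to 3.8 MB/s, with system-native about 2% slower, and the output size is the same in both cases.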
My MapReduce config:
conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false");
conf.set("mapreduce.output.fileoutputformat.compress", "true");
conf.set("mapreduce.output.fileoutputformat.compress.type", "BLOCK");
conf.set("mapreduce.output.fileoutputformat.compress.codec",
    "org.apache.hadoop.io.compress.BZip2Codec");
// Only one of the following two lines is set per run:
conf.set("io.compression.codec.bzip2.library", "java-builtin"); // for java-builtin
conf.set("io.compression.codec.bzip2.library", "system-native"); // for system-native
> Bzip2 java-builtin and system-native have almost the same compress speed
> ------------------------------------------------------------------------
>
> Key: HADOOP-13849
> URL: https://issues.apache.org/jira/browse/HADOOP-13849
> Project: Hadoop Common
> Issue Type: Bug
> Components: common
> Affects Versions: 2.6.0
> Environment: os version: redhat6
> hadoop version: 2.6.0
> native bzip2 version: bzip2-devel-1.0.5-7.el6_0.x86_64
> Reporter: Tao Li