Todd Lipcon created MAPREDUCE-5125:
--------------------------------------
Summary: TestDFSIO should write less compressible data
Key: MAPREDUCE-5125
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5125
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 1.1.2, 2.0.3-alpha
Reporter: Todd Lipcon
Priority: Minor
Currently, TestDFSIO fills its files with a short repeating sequence of bytes,
(byte)0 through (byte)50. This makes its output highly compressible (I measured
250:1 by compressing the resulting file with LZO). As a result, TestDFSIO
numbers are very hard to compare between HDFS and other file systems that may
compress data on the network, on disk, or both: what is ostensibly a benchmark
of IO throughput yields results heavily skewed toward the system with
compression.
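
A rough way to see the effect outside Hadoop is sketched below. This is not the
TestDFSIO code: the class and method names are hypothetical, the pattern is
assumed to be the 51 byte values 0 through 50 inclusive, and java.util.zip's
DEFLATE stands in for LZO (which is not in the JDK), so the ratio it prints will
not match the 250:1 figure above. It only contrasts a repeating buffer with a
pseudo-random one of the same size.

```java
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical demo class; not part of Hadoop or TestDFSIO.
public class CompressibilityDemo {

    // Buffer filled with the repeating byte values 0..50 (assumed inclusive).
    public static byte[] repeatingPattern(int size) {
        byte[] buf = new byte[size];
        for (int i = 0; i < size; i++) {
            buf[i] = (byte) (i % 51);
        }
        return buf;
    }

    // Buffer filled with pseudo-random bytes; seeded so runs are repeatable.
    public static byte[] randomBytes(int size, long seed) {
        byte[] buf = new byte[size];
        new Random(seed).nextBytes(buf);
        return buf;
    }

    // Number of bytes DEFLATE produces for the input (DEFLATE, not LZO).
    public static long compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] out = new byte[8192];
        long total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(out);
        }
        deflater.end();
        return total;
    }

    // Uncompressed size divided by compressed size.
    public static double ratio(byte[] input) {
        return (double) input.length / compressedSize(input);
    }

    public static void main(String[] args) {
        int size = 1 << 20; // 1 MiB per buffer
        System.out.printf("repeating 0..50: %.1f:1%n", ratio(repeatingPattern(size)));
        System.out.printf("pseudo-random:   %.1f:1%n", ratio(randomBytes(size, 42L)));
    }
}
```

One possible direction for a fix, under the same assumptions, would be to fill
the write buffer from a seeded Random as randomBytes does here, so that runs
stay repeatable while the data gives a compressing file system little to gain.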