[ https://issues.apache.org/jira/browse/MAPREDUCE-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111830#comment-14111830 ]
Todd Lipcon commented on MAPREDUCE-6054:
----------------------------------------
I found a few things here:
- it currently tests all combinations of key and value writables. As far as I
can tell, there isn't any interaction between the key type and the value type;
i.e., if key type K works and value type V works, then it's not possible that
(K,V) together would be broken. So I think it's superfluous to test all
combinations of K and V (which makes for 144 test runs or something). Let's
instead just make sure, when we generate our parameter list, that each of the
keys and each of the values has been tested at least once.
- the generation of the test data files is currently super slow. For each
writable that it writes to the sequence file, it's constructing new objects,
rather than resetting a single Writable object. The String.format() call in
BytesUtil.toStringBinary is also very slow. Let's improve the performance of
this code path, even if we do the above.
- I also noticed a bug in BytesFactory - it's calling byte[].hashCode(),
expecting that to be unique in some way. In fact, we'll get better random data
by not re-seeding the RNG at all.
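
To illustrate the first point, here is a rough sketch of building a parameter list in which every key type and every value type shows up at least once, without the full cross product. The class name KvParameterPairs, the method coveringPairs, and the use of plain type-name strings are assumptions made for this example, not the actual KVTest code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KvParameterPairs {
    /**
     * Pairs keys with values so that each key and each value appears at
     * least once. Yields max(|keys|, |values|) pairs rather than
     * |keys| * |values|.
     */
    static List<String[]> coveringPairs(List<String> keys, List<String> values) {
        List<String[]> pairs = new ArrayList<>();
        if (keys.isEmpty() || values.isEmpty()) {
            return pairs;
        }
        int n = Math.max(keys.size(), values.size());
        for (int i = 0; i < n; i++) {
            // Wrap around the shorter list so nothing is left out.
            pairs.add(new String[] {
                keys.get(i % keys.size()),
                values.get(i % values.size())
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Illustrative type names only; the real test has its own list.
        List<String> keys = Arrays.asList("IntWritable", "LongWritable", "Text");
        List<String> values = Arrays.asList("BytesWritable", "Text");
        for (String[] p : coveringPairs(keys, values)) {
            System.out.println(p[0] + " / " + p[1]);
        }
    }
}
```

With 12 key types and 12 value types this would give 12 runs instead of 144, assuming the key and value types really are independent.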
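
For the String.format() cost in the second point, one possible approach is a hex lookup table and a single StringBuilder. This is only a sketch: the class name FastBinaryString is made up, and the output format (printable ASCII kept as-is, everything else written as \xNN) is an assumption that may not match what BytesUtil.toStringBinary actually produces.

```java
final class FastBinaryString {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /**
     * Renders bytes as text without String.format(): printable ASCII is
     * copied through, every other byte (and backslash) becomes \xNN.
     */
    static String toStringBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int ch = b & 0xFF; // treat the byte as unsigned
            if (ch >= ' ' && ch <= '~' && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append("\\x")
                  .append(HEX[ch >>> 4])
                  .append(HEX[ch & 0x0F]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {'k', 'e', 'y', 0, (byte) 0xFF};
        System.out.println(toStringBinary(sample));
    }
}
```

This avoids parsing a format string and allocating a temporary String for every byte.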
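
On the third point, a small demonstration of why byte[].hashCode() is not a useful source of variation: arrays inherit Object.hashCode(), which is identity-based and unrelated to the contents. The sketch also shows a generator that is seeded once and then only advanced. The class name ByteArrayHashDemo and the helper randomBytes are hypothetical and are not taken from BytesFactory.

```java
import java.util.Arrays;
import java.util.Random;

final class ByteArrayHashDemo {
    // One shared generator, seeded once and never re-seeded per value.
    private static final Random RNG = new Random();

    /** Fills a fresh array of the given length from the shared generator. */
    static byte[] randomBytes(int length) {
        byte[] out = new byte[length];
        RNG.nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        // Identity-based: says nothing about the array contents.
        System.out.println(a.hashCode() == System.identityHashCode(a));
        // Content-based: equal for arrays with equal contents.
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b));
        System.out.println(randomBytes(8).length);
    }
}
```

If content-based hashing is ever needed, Arrays.hashCode(byte[]) is the standard alternative; for generating test data, simply drawing from one long-lived Random is enough.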
> native-task: speed up test runs
> -------------------------------
>
> Key: MAPREDUCE-6054
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6054
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: task
> Reporter: Todd Lipcon
>
> Currently the KVTest compatibility test takes so long on my machine that it
> regularly times out maven. We should speed it up.