[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111830#comment-14111830
 ] 

Todd Lipcon commented on MAPREDUCE-6054:
----------------------------------------

I found a few things here:

- it currently tests all combinations of key and value writables. As far as I 
can tell, there isn't any interaction between the key type and the value type 
-- i.e. if key type K works, and value type V works, then it's not possible 
that (K,V) together would be broken. So, I think it's superfluous to test all 
combinations of K and V (which makes for 144 test runs or something). Let's 
instead just make sure that each of the keys and each of the values has been 
tested at least once when we generate our parameter list.

- the generation of the test data files is currently super slow. For each 
writable that it writes to the sequence file, it's constructing new objects, 
rather than resetting a single Writable object. The String.format() call in 
BytesUtil.toStringBinary is also very slow. Let's improve the performance of 
this code path, even if we do the above.

- I also noticed a bug in BytesFactory - it's calling byte[].hashCode() 
expecting that to be unique in some way. In fact, we'll get better random data 
by not re-seeding the RNG at all.
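
To illustrate the first point, here is a rough sketch of building a parameter 
list in which every key type and every value type appears at least once, 
without the full cross product. The class name ParamPairs, the method 
coveringPairs, and the use of plain type-name strings are all made up for 
illustration; this is not the actual KVTest code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the native-task tests.
final class ParamPairs {

  /**
   * Pairs keys with values so that each key and each value is used at least
   * once. Yields max(keys.size(), values.size()) pairs rather than
   * keys.size() * values.size().
   */
  static List<String[]> coveringPairs(List<String> keys, List<String> values) {
    List<String[]> pairs = new ArrayList<>();
    if (keys.isEmpty() || values.isEmpty()) {
      return pairs; // nothing to pair
    }
    int n = Math.max(keys.size(), values.size());
    for (int i = 0; i < n; i++) {
      // Wrap around the shorter list so every element of the longer one is covered.
      String key = keys.get(i % keys.size());
      String value = values.get(i % values.size());
      pairs.add(new String[] { key, value });
    }
    return pairs;
  }

  public static void main(String[] args) {
    // Type names here are placeholders standing in for the real writable classes.
    List<String> keys = Arrays.asList("IntWritable", "LongWritable", "Text");
    List<String> values = Arrays.asList("BytesWritable", "Text");
    for (String[] pair : coveringPairs(keys, values)) {
      System.out.println(pair[0] + " / " + pair[1]);
    }
  }
}
```

With 12 key types and 12 value types, a scheme like this would give 12 
parameter sets instead of 144, assuming key and value types really are 
independent as argued above.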

> native-task: speed up test runs
> -------------------------------
>
>                 Key: MAPREDUCE-6054
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6054
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: task
>            Reporter: Todd Lipcon
>
> Currently the KVTest compatibility test takes so long on my machine that it 
> regularly times out maven. We should speed it up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
