[
https://issues.apache.org/jira/browse/HADOOP-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689453#action_12689453
]
Jingkei Ly commented on HADOOP-5571:
------------------------------------
I was also thinking of raising a separate JIRA to replace the written field
in TupleWritable with a java.util.BitSet, so that joins over more than 64
datasets become possible. Do you have an opinion on this?
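To make the BitSet idea concrete, here is a rough sketch of how the written flags might be tracked. WrittenFlags and its methods are hypothetical names for illustration only, not the TupleWritable API, and serialization of the flags is not addressed.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Hadoop: tracks which tuple positions
// have been written without a fixed upper bound on the number of positions.
class WrittenFlags {
    private final BitSet written;
    private final int size;

    WrittenFlags(int size) {
        this.size = size;
        this.written = new BitSet(size);
    }

    // Mark position i as written.
    void setWritten(int i) {
        written.set(i);
    }

    // Mark position i as not written.
    void clearWritten(int i) {
        written.clear(i);
    }

    // True if position i has been marked as written.
    boolean has(int i) {
        return written.get(i);
    }

    // Number of tuple positions being tracked.
    int size() {
        return size;
    }

    public static void main(String[] args) {
        WrittenFlags flags = new WrittenFlags(100);
        flags.setWritten(70);
        System.out.println(flags.has(70));
        System.out.println(flags.has(6));
    }
}
```

Because a BitSet grows as needed, a position such as 70 does not collide with any lower position, which a fixed-width bitmask cannot guarantee.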
> TupleWritable can return incorrect results if it contains more than 32 values
> -----------------------------------------------------------------------------
>
> Key: HADOOP-5571
> URL: https://issues.apache.org/jira/browse/HADOOP-5571
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.19.1
> Reporter: Jingkei Ly
> Assignee: Jingkei Ly
> Attachments: HADOOP-5571-1.patch
>
>
> When attempting an outer join on 45 files with CompositeInputFormat, I've
> been getting unexpected results in the TupleWritable returned by the record
> reader. On closer inspection, the cause seems to be that
> TupleWritable.setWritten(int) incorrectly marks some tuple positions as
> written, i.e. when you call setWritten(42), position 10 is also reported as
> set.
> The following JUnit test demonstrates the problem:
> {code}
> public void testWideTuple() throws Exception {
>   Text emptyText = new Text("Should be empty");
>   Writable[] values = new Writable[64];
>   Arrays.fill(values, emptyText);
>   values[42] = new Text("Number 42");
>
>   TupleWritable tuple = new TupleWritable(values);
>   tuple.setWritten(42);
>
>   for (int pos = 0; pos < tuple.size(); pos++) {
>     boolean has = tuple.has(pos);
>     if (pos == 42) {
>       assertTrue(has);
>     } else {
>       assertFalse("Tuple position is incorrectly labelled as set: " + pos,
>                   has);
>     }
>   }
> }
> {code}
> Similarly, TupleWritable.setWritten(9) also causes TupleWritable.has(41) to
> incorrectly return true.
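
Both symptoms above (42 aliasing 10, and 9 aliasing 41) are what you would see if the bitmask were built with an int shift: Java uses only the low 5 bits of the shift distance when the left operand is an int, so 1 << 42 equals 1 << 10. The sketch below is a minimal illustration under that assumption. ShiftMaskDemo and its methods are hypothetical and are not copied from TupleWritable.

```java
// Hypothetical demo class, not Hadoop code.
class ShiftMaskDemo {
    // Assumed buggy form: the literal 1 is an int, so the shift distance is
    // reduced modulo 32 before shifting.
    static long setInt(long written, int pos) {
        return written | (1 << pos);
    }

    static boolean hasInt(long written, int pos) {
        return 0 != ((1 << pos) & written);
    }

    // Long literal: the shift distance is reduced modulo 64, so positions
    // 0..63 map to distinct bits.
    static long setLong(long written, int pos) {
        return written | (1L << pos);
    }

    static boolean hasLong(long written, int pos) {
        return 0 != ((1L << pos) & written);
    }

    public static void main(String[] args) {
        long buggy = setInt(0L, 42);
        System.out.println(hasInt(buggy, 42));          // true
        System.out.println(hasInt(buggy, 10));          // true, since 42 % 32 == 10
        System.out.println(hasInt(setInt(0L, 9), 41));  // true, since 41 % 32 == 9

        long fixed = setLong(0L, 42);
        System.out.println(hasLong(fixed, 42));         // true
        System.out.println(hasLong(fixed, 10));         // false
    }
}
```

With a long literal the aliasing disappears for positions up to 63, but anything wider would still need a different representation.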