[ https://issues.apache.org/jira/browse/HADOOP-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-3630:
----------------------------------
Attachment: 3630-0.patch
bq. I think this occurs because CompositeRecordReader#add() [Hadoop 0.17.0,
line 138] doesn't call rr.hasNext() to check if the RecordReader has any
records before it adds it to the PriorityQueue. Is this a bug or expected
behaviour?
This is definitely a bug, and your diagnosis is exactly right.
Attaching a fix and a testcase.
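
For readers without the patch at hand, the guard described in the quoted diagnosis can be sketched roughly as follows. This is an illustrative model only, not the contents of 3630-0.patch: JoinQueueSketch and Source are hypothetical stand-ins rather than Hadoop classes, and the sketch assumes the remedy is simply to check hasNext() before a reader enters the PriorityQueue.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class JoinQueueSketch {

    /** Hypothetical stand-in for one join input; not Hadoop's RecordReader. */
    public static final class Source {
        private final Iterator<Integer> it;
        private Integer head; // stays null until a record has actually been read

        public Source(List<Integer> keys) {
            this.it = keys.iterator();
        }

        public boolean hasNext() {
            return it.hasNext();
        }

        public void advance() {
            head = it.next();
        }

        public Integer key() {
            return head;
        }
    }

    // Sources are ordered by the key of their current record.
    private static final Comparator<Source> BY_KEY = Comparator.comparing(Source::key);

    private final PriorityQueue<Source> queue = new PriorityQueue<>(BY_KEY);

    /** Queue a source only if it can supply at least one record. */
    public void add(Source s) {
        if (!s.hasNext()) {
            return; // empty input: never enters the queue, so its null key is never compared
        }
        s.advance();
        queue.add(s);
    }

    public int size() {
        return queue.size();
    }

    public Integer smallestKey() {
        return queue.isEmpty() ? null : queue.peek().key();
    }

    public static void main(String[] args) {
        JoinQueueSketch q = new JoinQueueSketch();
        q.add(new Source(Arrays.asList(3, 7)));
        q.add(new Source(Collections.<Integer>emptyList())); // skipped by the guard
        q.add(new Source(Arrays.asList(1, 4)));
        System.out.println(q.size() + " sources queued, smallest key " + q.smallestKey());
    }
}
```

Without the hasNext() check in add(), the empty source would be queued with a null key, which is the analogue of the uninitialized key and values described in the report.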
> CompositeRecordReader: key and values can be in uninitialized state if files
> being joined have no records
> ---------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3630
> URL: https://issues.apache.org/jira/browse/HADOOP-3630
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Jingkei Ly
> Fix For: 0.19.0
>
> Attachments: 3630-0.patch
>
>
> I am using org.apache.hadoop.mapred.join.CompositeInputFormat to do an
> outer-join across a number of SequenceFiles. This works fine in most
> circumstances, but I get NullPointerExceptions/uninitialized data (where
> Writable#readFields() has not been called) when some of the files being
> joined have no records in them.