Hi,
I am glad that the new version supports map-side joins. However, I see
one limitation: CompositeInputFormat cannot join KeyValueTextInputFormat
inputs because of a NullPointerException (see below for details).
It works fine with SequenceFileInputFormat, so this is not a
critical problem for me for now.


Here is the problem: in org.apache.hadoop.mapred.join.Parser.WNode,
the InputFormat inf is instantiated without a Configuration being passed to it.
This causes a NullPointerException when KeyValueTextInputFormat#isSplitable is called,
because KeyValueTextInputFormat needs the JobConf to check whether the file
is compressed (and hence not splittable).
---
Exception in thread "main" java.lang.NullPointerException
        at org.apache.hadoop.mapred.KeyValueTextInputFormat.isSplitable(KeyValueTextInputFormat.java:44)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:185)
        at org.apache.hadoop.mapred.join.Parser$WNode.getSplits(Parser.java:304)
        at org.apache.hadoop.mapred.join.Parser$CNode.getSplits(Parser.java:374)
        at org.apache.hadoop.mapred.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:129)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:544)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:805)
---
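
To make the failure pattern concrete, here is a minimal sketch in plain Java.
It does not use Hadoop at all: ConfigureDemo, SketchTextFormat and the
"codec.suffixes" key are hypothetical stand-ins, assumed only for illustration.
The sketch shows an object that is created reflectively and then used before
its configure step has run, next to the same object created with the
configuration handed over first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT Hadoop classes: they only mimic the
// "instantiate, then configure" pattern described in this report.
class ConfigureDemo {
    /** Creates the format reflectively without passing any configuration. */
    static SketchTextFormat newUnconfigured() {
        try {
            return SketchTextFormat.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates the format and hands it the configuration before first use. */
    static SketchTextFormat newConfigured(Map<String, String> conf) {
        SketchTextFormat format = newUnconfigured();
        format.configure(conf);
        return format;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("codec.suffixes", ".gz"); // assumed key, for illustration only

        System.out.println(newConfigured(conf).isSplitable("part-00000"));
        System.out.println(newConfigured(conf).isSplitable("part-00000.gz"));
        try {
            newUnconfigured().isSplitable("part-00000");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: format was never configured");
        }
    }
}

class SketchTextFormat {
    // Stays null until configure() is called.
    private String[] compressedSuffixes;

    void configure(Map<String, String> conf) {
        compressedSuffixes = conf.getOrDefault("codec.suffixes", "").split(",");
    }

    /** A file is treated as splittable unless its name has a compressed suffix. */
    boolean isSplitable(String fileName) {
        for (String suffix : compressedSuffixes) { // throws NPE if never configured
            if (!suffix.isEmpty() && fileName.endsWith(suffix)) {
                return false;
            }
        }
        return true;
    }
}
```

In Hadoop itself, org.apache.hadoop.util.ReflectionUtils.newInstance(Class, Configuration)
performs this kind of configure-after-instantiate step, so using it in WNode
may be one way to address the problem. That is a suggestion only, not a tested patch.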

Best regards,
Jun Tatemura
NEC Laboratories America
