Hi,

I am using a map-side join to join three data sets, but I am not getting the expected output. Please guide me.

Hadoop version: 0.20.1

a.txt
====
9000000000,Dhana
9000000001,Sridhar
9000000002,Mani

b.txt
====
9000000000,Chennai
9000000001,Bangalore
9000000002,Madurai

c.txt
====
9000000000,Dev
9000000001,Mgr
9000000002,Lead

part-00000
========
9000000000 [Chennai]
9000000000 [Dhana]
9000000000 [Dev]
9000000001 [Mgr]
9000000001 [Bangalore]
9000000001 [Sridhar]
9000000002 [Mani]
9000000002 [Lead]
9000000002 [Madurai]

Expected Output
=============
9000000000 [Dhana,Chennai,Dev]
9000000001 [Sridhar,Bangalore,Mgr]
9000000002 [Mani,Madurai,Lead]

This is the command I ran. Am I missing something?

# hadoop jar hadoop-*-examples.jar join -D key.value.separator.in.input.line=',' -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat -outKey org.apache.hadoop.io.Text -joinOp outer mapred/join/ joinout

Also if I give unsorted input, I am getting the same output. Is it not mandatory to give sorted data?

Please guide.

Thanks in advance.
Dhana

--
There are only 10 types of people in the world: Those who understand binary, and those who don't.
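
P.S. To make the expected result concrete, here is a minimal sketch in plain Python (not Hadoop code) of the outer join I am hoping to get from the three files. The names `parse` and `outer_join` are only for illustration and are not part of any Hadoop API. The sketch groups records by key in memory, so it does not depend on the input being sorted, and it assumes each key appears at most once per file.

```python
def parse(lines, sep=","):
    """Split each 'key<sep>value' line into a (key, value) pair."""
    return [tuple(line.split(sep, 1)) for line in lines]


def outer_join(*sources):
    """Full outer join of several (key, value) lists on the key.

    Returns {key: [value from source 0, value from source 1, ...]},
    with None where a source has no record for that key.
    Assumes at most one record per key in each source.
    """
    joined = {}
    for index, source in enumerate(sources):
        for key, value in source:
            row = joined.setdefault(key, [None] * len(sources))
            row[index] = value
    # Sort by key so the output order is stable.
    return dict(sorted(joined.items()))


# Same records as a.txt, b.txt and c.txt above.
a = parse(["9000000000,Dhana", "9000000001,Sridhar", "9000000002,Mani"])
b = parse(["9000000000,Chennai", "9000000001,Bangalore", "9000000002,Madurai"])
c = parse(["9000000000,Dev", "9000000001,Mgr", "9000000002,Lead"])

result = outer_join(a, b, c)
for key, values in result.items():
    # Missing values (None) are printed as empty fields.
    print(key, "[" + ",".join(v if v is not None else "" for v in values) + "]")
```

With the three inputs above, every key is present in all three sources, so each output row carries one value from a.txt, b.txt and c.txt in that order, which is the shape shown under "Expected Output".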
