Hi Harshit,
PFA
Thanks
Ravikant
On Mon, Jun 29, 2015 at 11:31 AM, Harshit Mathur <[email protected]>
wrote:
> Can you share PALReducer also?
>
> On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <
> [email protected]> wrote:
>
>> Adding source code for more clarity
>>
>> The problem statement is simple:
>>
>> PartitionFileMapper: takes an input file with tab-separated values V, P.
>> It emits (V, -1#P).
>>
>> ALFileMapper: takes an input file with tab-separated values V, EL.
>> It emits (V, E#-1).
>>
>> In the reducer I want to emit
>> (V, E#P)
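>>
>> Roughly, the two mappers look like this (a minimal sketch of the above;
>> everything except the two class names and the emit formats is
>> illustrative, not my exact code):
>>
>> import java.io.IOException;
>> import org.apache.hadoop.io.LongWritable;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.mapreduce.Mapper;
>>
>> class PartitionFileMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
>>     @Override
>>     protected void map(LongWritable offset, Text line, Context context)
>>             throws IOException, InterruptedException {
>>         // Input line: V <tab> P
>>         String[] fields = line.toString().trim().split("\t");
>>         long v = Long.parseLong(fields[0]);
>>         // Emit (V, "-1#P"); "-1" marks the missing adjacency list
>>         context.write(new LongWritable(v), new Text("-1#" + fields[1]));
>>     }
>> }
>>
>> class ALFileMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
>>     @Override
>>     protected void map(LongWritable offset, Text line, Context context)
>>             throws IOException, InterruptedException {
>>         // Input line: V <tab> EL
>>         String[] fields = line.toString().trim().split("\t");
>>         long v = Long.parseLong(fields[0]);
>>         // Emit (V, "E#-1"); "-1" marks the missing partition id
>>         context.write(new LongWritable(v), new Text(fields[1] + "#-1"));
>>     }
>> }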
>>
>> Thanks
>> Ravikant
>>
>> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
>> [email protected]> wrote:
>>
>>> By custom key, did you mean some class object? If so, then no.
>>>
>>> I have two map methods, each taking a different file as input, and
>>> both emit a *LongWritable* key. But in the stdout of the container
>>> files I can see the following, with key and value separated by ':':
>>>
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>>
>>> For key 391 the reducer is called twice: once for the value from the
>>> first map and once for the value from the other map.
>>>
>>> In each map method I parse the string from the input file as a long
>>> and then emit it as a LongWritable.
>>>
>>> Is there something I am missing when I use MultipleInputs
>>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
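>>>
>>> For reference, the driver is wired up roughly like this (a sketch;
>>> the paths and job name are illustrative, not my exact configuration):
>>>
>>> Job job = Job.getInstance(new Configuration(), "VP_AP_to_PAL");
>>> job.setJarByClass(PALReducer.class);
>>> MultipleInputs.addInputPath(job, new Path("vpInput"),
>>>         TextInputFormat.class, PartitionFileMapper.class);
>>> MultipleInputs.addInputPath(job, new Path("alInput"),
>>>         TextInputFormat.class, ALFileMapper.class);
>>> job.setMapOutputKeyClass(LongWritable.class);
>>> job.setMapOutputValueClass(Text.class);
>>> job.setReducerClass(PALReducer.class);
>>> job.setOutputKeyClass(Text.class);
>>> job.setOutputValueClass(Text.class);
>>> FileOutputFormat.setOutputPath(job, new Path("palOutput"));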
>>>
>>> Thanks
>>> Ravikant
>>>
>>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <[email protected]>
>>> wrote:
>>>
>>>> As per MapReduce, it is not possible for two different reduce calls
>>>> to get the same key. Have you created some custom key type? If that
>>>> is the case, there may be an issue with the comparator.
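>>>>
>>>> For the built-in key types the grouping is done by the framework's
>>>> registered comparator, so equal longs should always fall into a
>>>> single reduce call. A quick sanity check (a sketch, using
>>>> org.apache.hadoop.io.WritableComparator):
>>>>
>>>> WritableComparator cmp = WritableComparator.get(LongWritable.class);
>>>> // compare() returning 0 means the two keys belong to one reduce group
>>>> System.out.println(cmp.compare(new LongWritable(391),
>>>>         new LongWritable(391)));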
>>>>
>>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Hadoop user,
>>>>>
>>>>> I have two map classes processing two different input files. Both
>>>>> map functions emit the same key/value format.
>>>>>
>>>>> But the reducer is called twice for the same key: once for the value
>>>>> from the first map and once for the value from the other map.
>>>>>
>>>>> I am printing (key, value) pairs in the reducer:
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>>
>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>>
>>>>> Both maps emit a LongWritable key and a Text value.
>>>>>
>>>>>
>>>>> Any idea why this is happening?
>>>>> Is there any way to get the hash values Hadoop generates for the
>>>>> keys emitted by the mappers?
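>>>>>
>>>>> (My understanding is that the default HashPartitioner chooses the
>>>>> reduce task roughly like this, so equal keys should land on the same
>>>>> reducer; a sketch, numReduceTasks is an illustrative value:)
>>>>>
>>>>> LongWritable key = new LongWritable(391);
>>>>> int numReduceTasks = 4;
>>>>> int partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
>>>>> System.out.println(partition);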
>>>>>
>>>>> Thanks
>>>>> Ravikant
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harshit Mathur
>>>>
>>>
>>>
>>
>
>
> --
> Harshit Mathur
>
package in.dream_lab.hadoopPipeline.cc;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
/*
 * Job ID: 2
 * Job Name: VP_AP_to_PAL
 * Job Description: Concatenate partition id with vertex adjacency list
 * Map Input Files: VP, EL
 * Map Input Format: V_id, [P_id]   V_src, [<E_id,V_sink>+]
 * Map Emit: V_id, [-1, P_id]   V_src, [V_sink, -1]
 * Reducer Emit: V_id, P_id, <E_id,V_sink>+
 * Reducer Output File: PAL
 * Note: Separator between P_id and <E_id,V_sink>+ is ":".
 *       Separator between V_id and P_id is '#'.
 */
public class PALReducer extends Reducer<LongWritable, Text, Text, Text> {

    @Override
    protected void reduce(LongWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String partitionId = "";
        String adjList = "";
        StringBuilder sb = new StringBuilder();
        for (Text value : values) {
            // Debug: log every (key, value) pair this reduce call sees
            System.out.println("Reduce:" + key + ":" + value);
            String[] strs = value.toString().trim().split("#");
            // Compare string contents with equals(), not != (which compares
            // references and is always true for strings produced by split()).
            if (!"-1".equals(strs[1])) { /* This value carries the partition id */
                partitionId = strs[1];
            } else { /* This value carries the adjacency list */
                adjList = strs[0];
            }
        }
        // Output key: "V#P"; output value: adjacency list
        sb.append(key).append("#").append(partitionId);
        context.write(new Text(sb.toString()), new Text(adjList));
    }
}
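/*
 * Example (a sketch, assuming both values for key 391 from earlier in
 * the thread, "-1#11" and "3278620528725786624:5352454#-1", arrive in
 * a single reduce call): the reducer writes key "391#11" with value
 * "3278620528725786624:5352454", joined by the configured output
 * separator.
 */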