Re: Basic Question

2012-08-07 Thread Harsh J
Each write call registers (writes) a KV pair to the output. The output
collector does not look for similar pairs, nor does it try to de-dupe
them. Even if the same object is passed each time, its value is copied
at write time, so reusing the object doesn't matter.

So you will get two KV pairs in your output, since duplication is
allowed and is normal in several MR cases. Think of wordcount, where a
map() call may emit lots of ("is", 1) pairs if there are multiple
occurrences of "is" in the line it processes, and can use set() calls
to its benefit to avoid creating too many objects.
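
The copy-on-write behaviour described above can be sketched in plain Java. This is only a model, not Hadoop code: MutableText and CopyingCollector are hypothetical stand-ins for Text and the output collector, written on the assumption that the collector copies the key and value bytes on every write.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReusedKeyDemo {
    /** Hypothetical stand-in for a mutable, reusable object such as Text. */
    static final class MutableText {
        private byte[] bytes = new byte[0];

        void set(String s) {
            bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        byte[] copyBytes() {
            return bytes.clone();
        }
    }

    /** Hypothetical collector: copies key and value on every write, never de-dupes. */
    static final class CopyingCollector {
        final List<String[]> records = new ArrayList<>();

        void write(MutableText key, MutableText value) {
            // Snapshot the current contents; later set() calls cannot alter this record.
            records.add(new String[] {
                new String(key.copyBytes(), StandardCharsets.UTF_8),
                new String(value.copyBytes(), StandardCharsets.UTF_8)
            });
        }
    }

    /** Writes the same key twice through one reused object and returns what was collected. */
    static List<String[]> emitTwice() {
        CopyingCollector collector = new CopyingCollector();
        MutableText zip = new MutableText();
        MutableText value = new MutableText();

        zip.set("9099");
        value.set("first");
        collector.write(zip, value);

        zip.set("9099");
        value.set("second");
        collector.write(zip, value);

        return collector.records;
    }

    public static void main(String[] args) {
        for (String[] kv : emitTwice()) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

Both records survive with their own values, even though the key and value objects were reused and the key was identical.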

On Tue, Aug 7, 2012 at 11:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
 In Mapper I often use a global Text object and throughout the map processing
 I just call set on it. My question is, what happens if the collector receives
 a similar byte array value? Does the last one overwrite the value in
 the collector? So if I did

 Text zip = new Text();
 zip.set("9099");
 collector.write(zip,value);
 zip.set("9099");
 collector.write(zip,value1);

 Should I expect to receive both values in reducer or just one?



-- 
Harsh J


Re: Basic Question

2012-08-07 Thread Mohit Anchlia
On Tue, Aug 7, 2012 at 11:33 AM, Harsh J ha...@cloudera.com wrote:

 Each write call registers (writes) a KV pair to the output. The output
 collector does not look for similar pairs, nor does it try to de-dupe
 them. Even if the same object is passed each time, its value is copied
 at write time, so reusing the object doesn't matter.

 So you will get two KV pairs in your output, since duplication is
 allowed and is normal in several MR cases. Think of wordcount, where a
 map() call may emit lots of ("is", 1) pairs if there are multiple
 occurrences of "is" in the line it processes, and can use set() calls
 to its benefit to avoid creating too many objects.


Thanks!





A basic question-Hadoop input from memory

2010-09-15 Thread lamchith.chathukutty
Hi,



A basic question. The input for the map program is usually from a
file in HDFS. Is it possible to use input from memory, such as an XML
DOM constructed in memory by a DOM parser?



Thanks in advance.



Regards,

Lamchith






Basic question

2010-08-25 Thread Mark

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

Does this mean the input to the reducer should be Text/IntWritable or 
the output of the reducer is Text/IntWritable?


What is the inverse of this, i.e. setInputKeyClass/setInputValueClass? Is
this inferred from the job's InputFormat class? Would someone mind briefly
explaining?


Thanks


Re: Basic question

2010-08-25 Thread James Seigel
The output of the reducer is Text/IntWritable. 

To set the input to the reducer you set the mapper output classes. 
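
As a rough sketch of how these settings fit together, assuming the org.apache.hadoop.mapreduce API with the Hadoop client libraries on the classpath: the class name TypeWiringSketch and the job name are placeholders, and Job.getInstance is the factory used in Hadoop 2 and later (older releases construct the Job directly). There is no setInputKeyClass; the mapper's input types come from the InputFormat, and TextInputFormat, for instance, supplies LongWritable offsets and Text lines.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class TypeWiringSketch {
    // Placeholder helper: shows only the type-related settings, not mapper/reducer classes or paths.
    public static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "type-wiring-sketch");

        // Mapper input types are determined by the InputFormat (here LongWritable, Text).
        job.setInputFormatClass(TextInputFormat.class);

        // Mapper output types, which are also the reducer's input types.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Reducer (final job) output types.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job;
    }
}
```

If the map output classes are not set, they default to the job output classes, which is why many examples show only the two setOutput* calls.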

Cheers
James

