Re: Basic Question
Each write call registers (writes) a KV pair to the output. The output collector does not look for similarities, nor does it try to de-duplicate. Even if the key object is the same, its value is copied at write time, so reusing the object doesn't matter.

So you will get two KV pairs in your output, since duplication is allowed and is normal in several MR cases. Think of wordcount, where a map() call may emit lots of (is, 1) pairs if there are multiple occurrences of "is" in the line it processes, and can use set() calls to its benefit to avoid creating too many objects.

On Tue, Aug 7, 2012 at 11:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
> In Mapper I often use a global Text object, and throughout the map processing I just call set on it. My question is, what happens if the collector receives a similar byte array value? Does the last one overwrite the value in the collector? So if I did
>
>   Text zip = new Text();
>   zip.set("9099");
>   collector.write(zip, value);
>   zip.set("9099");
>   collector.write(zip, value1);
>
> Should I expect to receive both values in the reducer or just one?

--
Harsh J
Re: Basic Question
On Tue, Aug 7, 2012 at 11:33 AM, Harsh J ha...@cloudera.com wrote:
> Each write call registers (writes) a KV pair to the output. The output collector does not look for similarities nor does it try to de-dupe it, and even if the object is the same, its value is copied so that doesn't matter. So you will get two KV pairs in your output - since duplication is allowed and is normal in several MR cases. Think of wordcount, where a map() call may emit lots of (is, 1) pairs if there are multiple is in the line it processes, and can use set() calls to its benefit to avoid too many object creation.

Thanks!
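The copy-on-write behaviour described in this thread can be illustrated without a cluster. The following is only a minimal sketch in plain Java: `MutableText` and `CopyingCollector` are hypothetical stand-ins invented for illustration, not Hadoop's `Text` or its output collector, and the only assumption they encode is the one stated above (the key's bytes are copied on every write, with no de-duplication).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical stand-in for a reusable, mutable key object. */
    static final class MutableText {
        private byte[] bytes = new byte[0];

        void set(String s) {
            bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        byte[] copyBytes() {
            return bytes.clone();
        }
    }

    /** Hypothetical collector: copies the key's bytes on each write, never de-duplicates. */
    static final class CopyingCollector {
        private final List<byte[][]> records = new ArrayList<>();

        void write(MutableText key, String value) {
            records.add(new byte[][] {
                key.copyBytes(), value.getBytes(StandardCharsets.UTF_8)
            });
        }

        List<String> dump() {
            List<String> out = new ArrayList<>();
            for (byte[][] r : records) {
                out.add(new String(r[0], StandardCharsets.UTF_8) + "\t"
                        + new String(r[1], StandardCharsets.UTF_8));
            }
            return out;
        }
    }

    public static List<String> run() {
        MutableText zip = new MutableText();
        CopyingCollector collector = new CopyingCollector();

        zip.set("9099");
        collector.write(zip, "value");
        zip.set("9099");
        collector.write(zip, "value1");

        // Mutating the reused key later does not touch records already written.
        zip.set("changed afterwards");
        return collector.dump();
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Under these assumptions both records survive with key 9099, which matches the answer above: the reducer would see both values for that key.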
A basic question-Hadoop input from memory
Hi,

A basic question. The input for the map program usually comes from a file in HDFS. Is it possible to use input from memory, for example from an XML DOM constructed in memory by a DOM parser?

Thanks in advance.

Regards,
Lamchith
Basic question
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);

Does this mean the input to the reducer should be Text/IntWritable, or that the output of the reducer is Text/IntWritable? What is the inverse of this: setInputKeyClass/setInputValueClass? Is this inferred by the JobInputFormatClass? Would someone mind briefly explaining?

Thanks
Re: Basic question
The output of the reducer is Text/IntWritable. To set the input to the reducer, you set the mapper output classes.

Cheers
James

On 2010-08-25, at 8:13 PM, Mark static.void@gmail.com wrote:
> job.setOutputKeyClass(Text.class);
> job.setOutputValueClass(IntWritable.class);
>
> Does this mean the input to the reducer should be Text/IntWritable or the output of the reducer is Text/IntWritable? What is the inverse of this.. setInputKeyClass/setInputValueClass? Is this inferred by the JobInputFormatClass? Would someone mind briefly explaining?
>
> Thanks
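To make the answer concrete: a job has three pairs of types (map input, map output, reduce output), and the reducer's input is always the mapper's output. In Hadoop's Job API the intermediate pair is declared with setMapOutputKeyClass/setMapOutputValueClass, while the map input types come from the InputFormat rather than from a setter. The sketch below models only that type flow in plain Java; the `Mapper` and `Reducer` interfaces and `runJob` are hypothetical simplifications written for this illustration, not Hadoop's classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class TypeFlowDemo {
    /** Hypothetical mapper: consumes (KIN, VIN), emits intermediate (KMID, VMID). */
    interface Mapper<KIN, VIN, KMID, VMID> {
        void map(KIN key, VIN value, BiConsumer<KMID, VMID> out);
    }

    /** Hypothetical reducer: its input types are exactly the mapper's output types. */
    interface Reducer<KMID, VMID, KOUT, VOUT> {
        void reduce(KMID key, List<VMID> values, BiConsumer<KOUT, VOUT> out);
    }

    static <KIN, VIN, KMID extends Comparable<KMID>, VMID, KOUT, VOUT>
    List<String> runJob(Map<KIN, VIN> input,
                        Mapper<KIN, VIN, KMID, VMID> mapper,
                        Reducer<KMID, VMID, KOUT, VOUT> reducer) {
        // "Shuffle": group intermediate values by key, sorted by key.
        Map<KMID, List<VMID>> shuffled = new TreeMap<>();
        input.forEach((k, v) -> mapper.map(k, v,
            (mk, mv) -> shuffled.computeIfAbsent(mk, x -> new ArrayList<>()).add(mv)));

        // Reduce each group and collect the final (KOUT, VOUT) pairs as text.
        List<String> result = new ArrayList<>();
        shuffled.forEach((k, vs) ->
            reducer.reduce(k, vs, (ok, ov) -> result.add(ok + "\t" + ov)));
        return result;
    }

    public static List<String> wordCount() {
        // Map input: (line offset, line text), as a text input format would supply.
        Map<Long, String> lines = new TreeMap<>();
        lines.put(0L, "is this is");

        // Map output / reduce input: (word, 1).
        Mapper<Long, String, String, Integer> mapper = (offset, line, out) -> {
            for (String word : line.split(" ")) {
                out.accept(word, 1);
            }
        };

        // Reduce output: (word, count). This is the pair that the
        // setOutputKeyClass/setOutputValueClass calls in the question describe.
        Reducer<String, Integer, String, Integer> reducer = (word, ones, out) -> {
            int sum = 0;
            for (int one : ones) {
                sum += one;
            }
            out.accept(word, sum);
        };

        return runJob(lines, mapper, reducer);
    }

    public static void main(String[] args) {
        wordCount().forEach(System.out::println);
    }
}
```

In this word-count shape the intermediate and final types happen to coincide (String/Integer here, Text/IntWritable in the question), which is why declaring only the output classes is often enough; when they differ, the map output classes need to be declared separately.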