Emitting Java Collection as mapper output

2012-07-10 Thread Mohammad Tariq
.JobClient: Job complete: job_local_0001 12/07/10 16:41:47 INFO mapred.JobClient: Counters: 0. Need some guidance from the experts. Please let me know where I am going wrong. Many thanks. Regards, Mohammad Tariq

Re: Emitting Java Collection as mapper output

2012-07-10 Thread Mohammad Tariq
Hello Harsh, Thank you so much for the valuable response. I'll proceed as suggested by you. Regards, Mohammad Tariq On Tue, Jul 10, 2012 at 5:05 PM, Harsh J wrote: > Short answer: Yes. > > With Writable serialization, there's *some* support for collection >
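The "*some* support for collection" Harsh mentions most likely refers to the stock `ArrayWritable` and `MapWritable` wrapper classes in `org.apache.hadoop.io`. A minimal sketch of the usual pattern (the subclass name `IntArrayWritable` is ours, not from the thread):

```java
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;

// ArrayWritable must be subclassed with a no-arg constructor so the
// framework can instantiate and deserialize it on the reduce side.
public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }
    public IntArrayWritable(IntWritable[] values) {
        super(IntWritable.class, values);
    }
}
```

In the mapper you would copy the `java.util.List` into an `IntWritable[]` and emit `context.write(key, new IntArrayWritable(arr))`; `MapWritable` covers map-shaped collections the same way.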

WholeFileInputFormat format

2012-07-10 Thread Mohammad Tariq
Hello list, What could be the approximate maximum size of the files that can be handled using WholeFileInputFormat? I mean, if the file is very big, is it feasible to use WholeFileInputFormat, as the entire load will go to one mapper? Many thanks. Regards, Mohammad Tariq
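For readers unfamiliar with it: WholeFileInputFormat is not a stock Hadoop class but the well-known custom format from Hadoop: The Definitive Guide. A sketch of its core idea (the `WholeFileRecordReader` it returns is not shown here), which also explains the concern in the question — one mapper must hold the entire file:

```java
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Marks every file as unsplittable, so one mapper receives the whole
// file as a single record. This is exactly why very large files are a
// poor fit: the full file contents must fit in that one mapper.
public class WholeFileInputFormat
        extends FileInputFormat<NullWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) throws IOException {
        // Reads the full file into a single BytesWritable value.
        return new WholeFileRecordReader();
    }
}
```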

Re: WholeFileInputFormat format

2012-07-10 Thread Mohammad Tariq
I have to compare the results coming for both the mappers and generate the final result. Need your guidance. Many thanks. Regards, Mohammad Tariq On Tue, Jul 10, 2012 at 6:55 PM, Harsh J wrote: > It depends on what you need. If your file is not splittable, or if you > need to read the

Re: WholeFileInputFormat format

2012-07-10 Thread Mohammad Tariq
ot able to tackle the situation. (Pardon my ignorance.) Many thanks. Regards, Mohammad Tariq On Tue, Jul 10, 2012 at 8:34 PM, Harsh J wrote: > I don't see why you'd have to use the WholeFileInputFormat for such a > task. Your task is very similar to joins, and you can

Re: Mapper basic question

2012-07-11 Thread Mohammad Tariq
. Regards, Mohammad Tariq On Wed, Jul 11, 2012 at 5:59 PM, Manoj Babu wrote: > Hi, > > The no. of mappers depends on the no. of blocks. Is it possible to limit > the no. of mappers without increasing the HDFS block size? > > Thanks in advance. > > Cheers! > Manoj. >
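One common answer to Manoj's question (not spelled out in the truncated preview): raise the minimum split size so `FileInputFormat` packs several blocks into one split, which reduces the mapper count without touching the HDFS block size. A config fragment using the old-API property name; the 256 MB value is just an example:

```xml
<!-- With 64 MB blocks, a 256 MB minimum split size means each mapper
     processes up to four blocks. Note the locality trade-off: a split
     spanning blocks may read some data over the network. -->
<property>
  <name>mapred.min.split.size</name>
  <value>268435456</value>
</property>
```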

Re: WholeFileInputFormat format

2012-07-11 Thread Mohammad Tariq
Hello Harsh, Does Hadoop-0.20.205.0 (new API) have Avro support? Regards, Mohammad Tariq On Wed, Jul 11, 2012 at 1:57 AM, Mohammad Tariq wrote: > Hello Harsh, > > I am sorry to be a pest of questions. Actually I am kinda > stuck. I have to write my MapRed

KeyValueTextInputFormat absent in hadoop-0.20.205

2012-07-25 Thread Mohammad Tariq
Hello list, I am trying to run a small MapReduce job that includes KeyValueTextInputFormat with the new API (hadoop-0.20.205.0), but it seems KeyValueTextInputFormat is not included in the new API. Am I correct? Regards, Mohammad Tariq

Re: KeyValueTextInputFormat absent in hadoop-0.20.205

2012-07-25 Thread Mohammad Tariq
Hello Bejoy, Thank you so much for the quick response. Regards, Mohammad Tariq On Wed, Jul 25, 2012 at 8:30 PM, Bejoy Ks wrote: > Hi Tariq > > KeyValueTextInputFormat is available from hadoop 1.0.1 version > onwards for the new mapreduce API > > http://hadoop.apac
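Once on Hadoop 1.0.1 or later, wiring the new-API KeyValueTextInputFormat into a job looks roughly like this. The separator property name has varied across releases (the one shown is from later versions); check your version's docs:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

public class KvDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default separator is a tab; override it here if needed.
        conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", ",");
        Job job = new Job(conf, "kv-demo");
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setMapOutputKeyClass(Text.class);   // key = text before the separator
        job.setMapOutputValueClass(Text.class); // value = rest of the line
    }
}
```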

Re: What are the right/best Java classes to use with Hadoop?

2012-07-30 Thread Mohammad Tariq
suits your requirements. Regards, Mohammad Tariq On Mon, Jul 30, 2012 at 8:32 PM, wrote: > Hi, > > > > I am a beginner in using Hadoop, and I would like to know what are the right > Java classes to use with Hadoop? In other words, which Java classes should > be used a

Reading fields from a Text line

2012-08-01 Thread Mohammad Tariq
hings in correct way. Need some guidance. Many thanks. Regards, Mohammad Tariq

Re: Reading fields from a Text line

2012-08-02 Thread Mohammad Tariq
ontains entire lines. Could you guys please point out the mistake I might have made. (Pardon my ignorance, as I am not very good at MapReduce.) Many thanks. Regards, Mohammad Tariq On Thu, Aug 2, 2012 at 10:58 AM, Sriram Ramachandrasekaran wrote: > Wouldn't it be better if you cou

Re: Reading fields from a Text line

2012-08-02 Thread Mohammad Tariq
class.newInstance().nextInt())); System.exit(job.waitForCompletion(true) ? 0 : 1); Bejoy : I have observed one strange thing. When I am using IntWritable, the output file contains the entire content of the input file, but if I am using LongWritable, the output file is empty. Sri, C
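A likely cause of the symptoms above (hedged, since the full code is truncated): with TextInputFormat the mapper's key is the byte offset, a LongWritable, so declaring it as IntWritable produces type mismatches, and the whole line arrives as the value. The field-splitting step itself is plain Java; a small standalone sketch with a hypothetical tab-separated record:

```java
public class FieldSplitDemo {
    public static void main(String[] args) {
        // Inside map() you would first call value.toString();
        // here we use a literal line for illustration.
        String line = "u1\t2012-08-01\t42";   // hypothetical record
        String[] fields = line.split("\t");
        System.out.println(fields[0]);  // prints "u1"
        System.out.println(fields[2]);  // prints "42"
    }
}
```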

Handling files with unclear boundaries

2012-08-06 Thread Mohammad Tariq
read 107 bytes from the line. Is it possible to use this length as a delimiter for creating splits somehow? And if so, which InputFormat would be appropriate? Many thanks. Regards, Mohammad Tariq
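Two approaches that may apply here, depending on the Hadoop version in use (both property names are from later releases than the 0.20/1.0 line discussed in this thread, so treat this as a forward-looking sketch):

```java
import org.apache.hadoop.conf.Configuration;

public class BoundaryConfigDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // 1. A custom record delimiter for TextInputFormat,
        //    supported in later releases (delimiter here is hypothetical):
        conf.set("textinputformat.record.delimiter", "\u0001");
        // 2. Fixed-length records: newer Hadoop ships
        //    FixedLengthInputFormat, configured with the record
        //    length in bytes -- a natural fit for 107-byte records:
        conf.setInt("fixedlengthinputformat.record.length", 107);
    }
}
```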

Re: Handling files with unclear boundaries

2012-08-06 Thread Mohammad Tariq
Thank you, guys. Syed: thank you for the pointer. Regards, Mohammad Tariq On Mon, Aug 6, 2012 at 11:54 PM, syed kather wrote: > Hi tariq , > > Have a look at this link, which can guide you. > There was a discussion previously for the same type of issue > > s

Re: Map output files and partitions.

2012-12-13 Thread Mohammad Tariq
u can have your own implementation of getPartition() to write your custom Partitioner. HTH Regards, Mohammad Tariq On Fri, Dec 14, 2012 at 12:59 PM, Harsh J wrote: > Map output files, by which you perhaps mean intermediate data files > for temporary K/V persistence, are stored in IFi
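The default HashPartitioner's getPartition() boils down to one line of arithmetic. A pure-Java illustration (no Hadoop classes, just the arithmetic; a real custom Partitioner would subclass `org.apache.hadoop.mapreduce.Partitioner` and be registered via `job.setPartitionerClass(...)`):

```java
public class PartitionDemo {
    // Same arithmetic HashPartitioner uses: clear the sign bit so the
    // result is non-negative, then take the hash modulo the reducer count.
    static int getPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Every key deterministically maps to one of 4 partitions,
        // so all values for a key reach the same reducer.
        System.out.println(getPartition("tariq", 4));
    }
}
```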

Re: How does mapper process partial records?

2013-01-24 Thread Mohammad Tariq
Hello Praveen, Do you mean the InputFormat splits the file across record boundaries? I actually didn't get your question. What do you mean by 'record' with respect to HDFS? Did you mean an HDFS block? Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Thu, Jan 24, 2013 a

MapReduce on Local files

2013-04-02 Thread Mohammad Tariq
Hello list, Is an MR job capable of reading even the hidden temp files present inside a directory located on my local FS? I noticed this for the first time today, because until now I never tried running MR jobs on local files. Thank you so much for your time. Warm Regards, Tari

How DBRecordReader works

2013-04-27 Thread Mohammad Tariq
Hello list, Due to a need that arose lately, I had to start looking into DBInputFormat. In order to get myself familiar with it I started to google, but could not find much about how records get created from splits in the case of DBInputFormat. I went through this post
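For context on the question: with DBInputFormat each split corresponds to a bounded query over a row range, and DBRecordReader then materializes one DBWritable per row of the result set. A sketch of the new-API setup; the connection details and the `MyRecord` DBWritable class are placeholders, not from the thread:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;

public class DBInputDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // JDBC driver, URL, and credentials are placeholders.
        DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
                "jdbc:mysql://localhost/mydb", "user", "pass");
        Job job = new Job(conf, "db-input-demo");
        job.setInputFormatClass(DBInputFormat.class);
        // MyRecord (not shown) implements DBWritable and maps the
        // selected columns; each split becomes a LIMIT/OFFSET-bounded
        // query that DBRecordReader iterates row by row.
        DBInputFormat.setInput(job, MyRecord.class,
                "employees", /* conditions */ null,
                /* orderBy */ "id", "id", "name");
    }
}
```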