Set number of Reducers per machine.

2010-10-05 Thread Pramy Bhats
Hi, I am trying to run a job on my Hadoop cluster, where I consistently get a heap space error. I increased the heap space to 4 GB in hadoop-env.sh and rebooted the cluster. However, I still get the heap space error. One of the things I want to try is to reduce the number of map/reduce process
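One thing worth noting on this thread: HADOOP_HEAPSIZE in hadoop-env.sh only sizes the Hadoop daemons (NameNode, JobTracker, TaskTracker, and so on); map and reduce tasks run in separate child JVMs whose heap is controlled by mapred.child.java.opts. A minimal sketch against the 0.20-era JobConf API follows; the class name and sizes are illustrative, not from the thread.

import org.apache.hadoop.mapred.JobConf;

public class HeapConfigSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(HeapConfigSketch.class);
        // Heap for each task's child JVM (the default is only -Xmx200m).
        // Budget per concurrent task slot on a node, not per node.
        conf.set("mapred.child.java.opts", "-Xmx1024m");
        // The number of concurrent tasks per node is a TaskTracker
        // setting, read from that node's mapred-site.xml when the
        // daemon starts, so it cannot be changed per job:
        //   mapred.tasktracker.map.tasks.maximum
        //   mapred.tasktracker.reduce.tasks.maximum
    }
}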

Reducers stuck in copy phase

2010-09-29 Thread Pramy Bhats
Hi, while trying to run a MapReduce job, the reducers get stuck in the copy phase indefinitely. Though all the mappers have finished, the reducers stay stuck at 15-20% completion. The log available at the reducers is as follows: 2010-09-29 11:33:24,535 INFO org.apache.hadoop.mapred.ReduceTask:

Too many fetch-failures

2010-09-27 Thread Pramy Bhats
Hello, I am trying to run a bigram count on a 12-node cluster setup. For an input file of 135 splits (around 7.5 GB), the job fails for some of the runs. The error that I get on the jobtracker is that out of 135 mappers, 1 of the mappers fails because of Too many fetch-failures Too many

Re: Too many fetch-failures

2010-09-27 Thread Pramy Bhats
that it cannot contact the datanodes it is trying to get data from. /* Joe Stein, 973-944-0094 http://www.medialets.com Twitter: @allthingshadoop */ On Sep 27, 2010, at 2:50 PM, Pramy Bhats pramybh...@googlemail.com wrote: Hello, I am trying to run a bigram count on a 12-node cluster setup

why getSplits() is called twice?

2010-08-13 Thread Pramy Bhats
Hi, I am trying to modify the splitter for mappers. But while looking at the code base, I fail to understand why getSplits() is called twice before the Mapper is launched. It essentially does the same thing both times (which is calling the getSplits(JobContext job) method in the file
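One way to see where the two calls come from is to subclass the input format and dump a stack trace on each invocation, so both call sites show up in the client log. A minimal sketch assuming the new-API TextInputFormat; the class name is invented for illustration.

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class TracingInputFormat extends TextInputFormat {
    @Override
    public List<InputSplit> getSplits(JobContext job) throws IOException {
        // Print a stack trace on every call so each caller is visible.
        new Exception("getSplits() invoked").printStackTrace();
        return super.getSplits(job);
    }
}

Register it in the driver with job.setInputFormatClass(TracingInputFormat.class) and run the job once; the stack traces identify who invokes getSplits() and when.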

Debugging hadoop core in distributed settings

2010-08-08 Thread Pramy Bhats
AM, Pramy Bhats pramybh...@googlemail.com wrote: Hi, I am trying to debug the newly built hadoop-core-dev.jar in Eclipse. To simplify the debug process, I first set up Hadoop in single-node mode on my localhost. a) configure debug in Eclipse, under tab main: project

Debugging hadoop core

2010-07-13 Thread Pramy Bhats
Hi, I am trying to debug the newly built hadoop-core-dev.jar in Eclipse. To simplify the debug process, I first set up Hadoop in single-node mode on my localhost. a) configure debug in Eclipse, under tab main: project: hadoop-all main-class: org.apache.hadoop.util.RunJar

Re: Debugging hadoop core

2010-07-13 Thread Pramy Bhats
and output filename, and therefore while debugging Hadoop reads and writes from the local file system. How can I specify the path of the input and output filenames as absolute HDFS paths for debugging purposes? thanks, --PB On Wed, Jul 14, 2010 at 12:07 AM, Pramy Bhats pramybh...@googlemail.com wrote
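The usual fix for this is to point the client configuration at the namenode explicitly, or to pass fully qualified hdfs:// URIs, since a bare default Configuration resolves paths against the local file system. A minimal driver sketch; the namenode URI, paths, and class name are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HdfsPathSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Without this (or the cluster's conf dir on the classpath),
        // the default file system is the local one.
        conf.set("fs.default.name", "hdfs://localhost:9000");
        Job job = new Job(conf, "debug-job");
        // Fully qualified URIs work too and override fs.default.name.
        FileInputFormat.addInputPath(job,
                new Path("hdfs://localhost:9000/user/pb/input"));
        FileOutputFormat.setOutputPath(job,
                new Path("hdfs://localhost:9000/user/pb/output"));
    }
}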

newly built Jar file

2010-07-11 Thread Pramy Bhats
Hi, I recently started playing with the Hadoop source code; I compiled the source code and built a new jar file using the class files in the build folder. However, when I tried to run the word count example using the new jar file, it gives me the following error. However, I didn't modify the

Re: Intermediate files generated.

2010-07-08 Thread Pramy Bhats
for writing the results of your processing. On Fri, Jul 2, 2010 at 5:17 AM, Pramy Bhats pramybh...@googlemail.com wrote: Hi, isn't it possible to hack into the intermediate files generated? I am writing a compilation framework, so I don't want to mess up the existing programming framework

Re: Intermediate files generated.

2010-07-02 Thread Pramy Bhats
wrote: You could use the HDFS API from within your mapper, and run with 0 reducers. Alex On Thu, Jul 1, 2010 at 3:07 PM, Pramy Bhats pramybh...@googlemail.com wrote: Hi, I am using the Hadoop framework for writing MapReduce jobs. I want to redirect the output of Map
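A minimal sketch of that suggestion: write side files from the mapper through the FileSystem API and run the job with zero reducers. The output directory here is a placeholder, and the file is named after the task attempt so that speculative or retried attempts do not collide.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SideFileMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
    private FSDataOutputStream out;

    @Override
    protected void setup(Context context) throws IOException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        // One file per task attempt, named after the attempt id.
        out = fs.create(new Path("/tmp/intermediate/"
                + context.getTaskAttemptID().toString()));
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException {
        // Write each record in whatever format the later job expects.
        out.writeBytes(value.toString() + "\n");
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        out.close();
    }
}

In the driver, set job.setNumReduceTasks(0) so the map output is not shuffled at all; a second job can then take /tmp/intermediate as its input path.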

Re: Intermediate files generated.

2010-07-02 Thread Pramy Bhats
by radiation. - Original Message - From: Pramy Bhats pramybh...@googlemail.com To: common-user@hadoop.apache.org common-user@hadoop.apache.org Sent: Fri Jul 02 01:05:25 2010 Subject: Re: Intermediate files generated. Hi Hemanth, I need to use the output of the mapper for some other

Intermediate files generated.

2010-07-01 Thread Pramy Bhats
Hi, I am using the Hadoop framework for writing MapReduce jobs. I want to redirect the output of Map into files of my choice and later use those files as input for the Reduce phase. Could you please suggest how to proceed? thanks, --PB.

Re: Intermediate files generated.

2010-07-01 Thread Pramy Bhats
that can be used as temp files. thanks, --PB. On Fri, Jul 2, 2010 at 12:14 AM, Alex Loddengaard a...@cloudera.com wrote: You could use the HDFS API from within your mapper, and run with 0 reducers. Alex On Thu, Jul 1, 2010 at 3:07 PM, Pramy Bhats pramybh...@googlemail.com wrote: Hi, I