Ok, for those who face the same problem, here is how I solved it. First of all, there is a Hadoop JIRA task for this: https://issues.apache.org/jira/browse/HADOOP-4927
The tutorial section at http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation explains how to solve it. Hadoop does indeed create empty part-00x files irrespective of what you do in the mapper class, so you have to call the following static method of LazyOutputFormat:

    LazyOutputFormat.setOutputFormatClass(job, SequenceFileOutputFormat.class);

Be aware that, from my experience, this method should be called after you set the output format class:

    job.setOutputFormatClass(SequenceFileOutputFormat.class);

On Mon, Jun 4, 2012 at 2:48 PM, murat migdisoglu <murat.migdiso...@gmail.com> wrote:

> Hi,
> Thanks for your answer. After I read your emails, I decided to empty
> my mapper method completely to see if I could disable the output of the
> mapper class at all, but it seems it did not work.
> So, here is my mapper method:
>
>     @Override
>     public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns,
>             Context context) throws IOException, InterruptedException {
>     }
>
> When I execute hadoop fs -ls, I still see many small output files such as
> the following:
>
>     -rw-r--r-- 3 mmigdiso supergroup 87 2012-06-04 12:44 /user/mmigdiso/output/part-m-00034
>     -rw-r--r-- 3 mmigdiso supergroup 87 2012-06-04 12:45 /user/mmigdiso/output/part-m-00037
>     -rw-r--r-- 3 mmigdiso supergroup 87 2012-06-04 12:45 /user/mmigdiso/output/part-m-00039
>     -rw-r--r-- 3 mmigdiso supergroup 87 2012-06-04 12:45 /user/mmigdiso/output/part-m-00040
>     -rw-r--r-- 3 mmigdiso supergroup 87 2012-06-04 12:45 /user/mmigdiso/output/part-m-00042
>
> Do you know if I have to put something special into the context to specify
> the "empty" output?
>
> Regards,
> Murat
>
> On Mon, Jun 4, 2012 at 2:38 PM, Devaraj k <devara...@huawei.com> wrote:
>
>> Hi Murat,
>>
>> As Praveenesh explained, you can control the map outputs as you want.
>>
>> The map() function will be called for each input, i.e. map() is invoked
>> multiple times with different inputs in the same mapper.
>> You can check what is happening in it by adding logs to the map function.
>>
>> Thanks
>> Devaraj
>>
>> ________________________________________
>> From: praveenesh kumar [praveen...@gmail.com]
>> Sent: Monday, June 04, 2012 5:57 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: What happens when I do not output anything from my mapper
>>
>> You can control your map outputs based on any condition you want. I have
>> done that; it worked for me.
>> It could be a problem in your code that makes it not work for you.
>> Can you please share your map code, or cross-check whether your conditions
>> are correct?
>>
>> Regards,
>> Praveenesh
>>
>> On Mon, Jun 4, 2012 at 5:52 PM, murat migdisoglu <murat.migdiso...@gmail.com> wrote:
>>
>>> Hi,
>>> I have a small application where only a mapper class is defined (no
>>> reducer, no combiner).
>>> Within the mapper class, I have an if condition according to which I
>>> decide whether to put something into the context or not.
>>> If my condition does not match, I want the mapper to write no output
>>> to HDFS.
>>> But apparently this does not work as I expected. Once I run my job,
>>> there is a file per mapper in HDFS, 87 bytes in size.
>>>
>>> The if block I'm using in the map method is as follows:
>>>
>>>     if (ip == null || ip.equals(cip)) {
>>>         Text value = new Text(mwrapper.toJson());
>>>         word.set(ip);
>>>         context.write(word, value);
>>>     } else {
>>>         log.info("ip not match [" + ip + "]");
>>>     }
>>>     }
>>> } // end of mapper method
>>>
>>> How can I manage that? Does a mapper always need to have an output?
>>>
>>> --
>>> "Find a job you enjoy, and you'll never work a day in your life."
>>> Confucius
>
> --
> "Find a job you enjoy, and you'll never work a day in your life."
> Confucius

--
"Find a job you enjoy, and you'll never work a day in your life."
Confucius
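P.S. Putting the fix from the top of the thread together, here is a minimal sketch of a map-only driver. The class name, paths, and the filtering condition are illustrative assumptions, not from the thread, and it needs the Hadoop MapReduce client jars on the classpath to compile and a cluster (or local runner) to execute:

```java
// Sketch of a map-only job that avoids empty part files via LazyOutputFormat.
// Job.getInstance is the Hadoop 2.x API; older releases used new Job(conf).
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class NoEmptyPartsDriver {

    // A mapper that only writes matching records; with LazyOutputFormat,
    // a mapper that never calls context.write() produces no part file.
    public static class FilterMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (value.toString().contains("match")) {    // illustrative condition
                context.write(new Text("matched"), value);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "no-empty-parts");
        job.setJarByClass(NoEmptyPartsDriver.class);
        job.setMapperClass(FilterMapper.class);
        job.setNumReduceTasks(0);                        // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Set the real output format first, as suggested above...
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        // ...then wrap it so output files are created lazily, on first write.
        LazyOutputFormat.setOutputFormatClass(job, SequenceFileOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```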