Re: merging small files in HDFS

2017-01-09 Thread Gabriel Balan
2016 7:24 PM To: Piyush Mukati <piyush.muk...@gmail.com>; user@hadoop.apache.org Subject: Re: merging small files in HDFS Hi, if I correctly understand your request you need only to merge s

Re: merging small files in HDFS

2016-12-30 Thread Chris Nauroth
ticular directory to single file .. >> >> hadoop fs -getmerge >> >> --Senthil >> -Original Message- >> From: Giovanni Mascari [mailto:giovanni.masc...@polito.it] >> Sent: Thursday, November 03, 2016 7:24 PM >> To: Piyush Mukati <piyush.muk...@gmail.com

Re: merging small files in HDFS

2016-11-03 Thread Piyush Mukati
rectory to single file .. > > hadoop fs -getmerge > > --Senthil > -Original Message- > From: Giovanni Mascari [mailto:giovanni.masc...@polito.it] > Sent: Thursday, November 03, 2016 7:24 PM > To: Piyush Mukati <piyush.muk...@gmail.com>; user@hadoop.apache.org >

Re: merging small files in HDFS

2016-11-03 Thread dileep kumar
Hi, You need to write a map method that just parses the input file and passes it to the reducer. Use only one reducer, so that all map output goes to that reducer and one file gets created, which is the merge of the input files. On 03-Nov-2016 8:54 pm, "Piyush Mukati" wrote: > Hi, > I

Re: merging small files in HDFS

2016-11-03 Thread Madhav Sharan
al Message- > > From: Giovanni Mascari [mailto:giovanni.masc...@polito.it] > > Sent: Thursday, November 03, 2016 7:24 PM > > To: Piyush Mukati <piyush.muk...@gmail.com>; user@hadoop.apache.org > > Subject: Re: merging small files in HDFS > > > > Hi, > > if I c

RE: merging small files in HDFS

2016-11-03 Thread kumar, Senthil(AWF)
<piyush.muk...@gmail.com>; user@hadoop.apache.org Subject: Re: merging small files in HDFS Hi, if I correctly understand your request you need only to merge some data resulting from an hdfs write operation. In this case, I suppose that your best option is to use hadoop-stream with 'cat' c

Re: merging small files in HDFS

2016-11-03 Thread Giovanni Mascari
Hi, if I correctly understand your request, you only need to merge some data resulting from an HDFS write operation. In this case, I suppose that your best option is to use Hadoop Streaming with the 'cat' command. Take a look here: https://hadoop.apache.org/docs/r1.2.1/streaming.html Regards Il
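
For illustration only, here is a rough sketch of how the streaming suggestion above might be put together. It only builds the argument list for a Hadoop Streaming job that uses 'cat' as both mapper and reducer, and it does not run anything. The function name, the example paths and the streaming jar location are hypothetical, and forcing a single reducer with mapreduce.job.reduces=1 is an assumption added so that one output file is produced (the r1.2.1 docs linked above use the older name mapred.reduce.tasks).

```python
from typing import List


def build_streaming_merge_command(input_dir: str,
                                  output_dir: str,
                                  streaming_jar: str) -> List[str]:
    """Build (but do not run) a Hadoop Streaming command line.

    All three arguments are placeholders supplied by the caller; the
    location of the streaming jar depends on the installation.
    """
    return [
        "hadoop", "jar", streaming_jar,
        # Generic options such as -D must come before streaming options.
        # Assumption: one reducer, so the job writes a single part file.
        "-D", "mapreduce.job.reduces=1",
        "-input", input_dir,
        "-output", output_dir,
        # 'cat' passes every input line through unchanged.
        "-mapper", "cat",
        "-reducer", "cat",
    ]


if __name__ == "__main__":
    # Hypothetical paths, shown only to print the resulting command.
    cmd = build_streaming_merge_command(
        "/data/small-files", "/data/merged", "/path/to/hadoop-streaming.jar")
    print(" ".join(cmd))
```

Note that going through a reducer means the records pass through shuffle/sort, so the original line order is not preserved and the job is not the shuffle-free approach asked about in the original question.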

merging small files in HDFS

2016-11-03 Thread Piyush Mukati
Hi, I want to merge multiple files in one HDFS dir into one file. I am planning to write a map-only job using an input format that creates only one InputSplit per dir. This way my job doesn't need to do any shuffle/sort (only read and write back to disk). Is there any such file format already