Hi,

Could you kindly provide the pros and cons of the MultiFile, CombineFile, and SequenceFile input formats?
Thanks in advance.

Cheers!
Manoj.

On Fri, Jul 13, 2012 at 10:15 AM, Bejoy KS <bejoy.had...@gmail.com> wrote:

> Hi Manoj
>
> If you are looking at a scheduler and a workflow manager to carry out
> this task, you can have a look at Oozie.
>
> If your XML files are small (smaller than the HDFS block size), then
> it is definitely better practice to combine them to form larger files.
> Combining them into SequenceFiles should be good.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> From: Manoj Babu <manoj...@gmail.com>
> Date: Fri, 13 Jul 2012 08:59:51 +0530
> To: <mapreduce-user@hadoop.apache.org>
> ReplyTo: mapreduce-user@hadoop.apache.org
> Subject: suggest Best way to upload xml files to HDFS
>
> Hi,
>
> I need to upload large XML files daily. Right now I have a small
> program that reads all the files from a local folder and writes them to
> HDFS as a single file. Is this the right way?
> If there are any best practices or an optimized way to achieve this,
> kindly let me know.
>
> Thanks in advance!
>
> Cheers!
> Manoj.
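
For illustration, the idea behind Bejoy's suggestion quoted above (combine many small XML files into one larger file of key/value records, keyed by file name) can be sketched roughly as follows. This is only a stand-in written with the Python standard library: it does not use the Hadoop API and does not produce a real SequenceFile. The record layout and the names pack_files and unpack_files are assumptions made up for this sketch.

```python
import struct
from pathlib import Path


def pack_files(paths, container):
    """Write each small file into one container as a (name, contents) record."""
    with open(container, "wb") as out:
        for path in paths:
            key = Path(path).name.encode("utf-8")
            value = Path(path).read_bytes()
            # Hypothetical record layout (not the SequenceFile format):
            # 4-byte key length, 4-byte value length, then key and value bytes.
            out.write(struct.pack(">II", len(key), len(value)))
            out.write(key)
            out.write(value)


def unpack_files(container):
    """Yield (name, contents) pairs back out of the container, in order."""
    with open(container, "rb") as src:
        while True:
            header = src.read(8)
            if not header:
                break  # clean end of container
            key_len, value_len = struct.unpack(">II", header)
            key = src.read(key_len).decode("utf-8")
            value = src.read(value_len)
            yield key, value
```

Keeping the original file name as the key means each XML document can still be identified after the files are merged, which is the property that makes a keyed container more convenient than plain concatenation into a single file.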