Have you installed Hive on your Hadoop cluster? If so, using Hive SQL may be simpler and more efficient. Otherwise, you can write a MapReduce program with org.apache.hadoop.mapred.lib.MultipleOutputFormat, so that the output from the Reducer can be written to more than one file.
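The Hive route could look roughly like this (a sketch only: the table name, paths, and trailing columns are placeholders, since the question below lists only eleven of the sixteen field names):

```sql
-- Hypothetical table and column names based on the question below.
CREATE TABLE records (
  id STRING, name STRING, sa STRING, dept STRING,
  exp STRING, address STRING, company STRING, phone STRING,
  mobile STRING, project STRING, redk STRING /* ... remaining columns ... */
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

LOAD DATA INPATH '/input/records.csv' INTO TABLE records;

-- Write the first eight columns to one directory ...
INSERT OVERWRITE DIRECTORY '/output/first_eight'
SELECT id, name, sa, dept, exp, address, company, phone FROM records;

-- ... and the remaining columns to another.
INSERT OVERWRITE DIRECTORY '/output/rest'
SELECT mobile, project, redk /* , ... */ FROM records;
```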
2013/12/27 Nitin Pawar <[email protected]>

> 1) If you have a CSV file and do this often without writing a lot of code,
> then create a Hive table with a "," delimiter, then select the columns you
> want from the table and write them to a file.
>
> 2) If you are good at scripting, then look at Pig scripting, and then
> write to files.
>
> 3) If you want to do it through a MapReduce program of your own, take a
> look at MultipleOutputFormat and TextInputFormat.
>
>
> On Fri, Dec 27, 2013 at 6:56 PM, Ranjini Rathinam
> <[email protected]> wrote:
>
>> Hi,
>>
>> I have a file with 16 fields such as
>> id,name,sa,dept,exp,address,company,phone,mobile,project,redk,... and so on.
>>
>> My scenario is to split the first eight attributes into one file and
>> the other eight attributes into another file using a MapReduce program.
>>
>> So the first eight attributes and their values go in one file as
>> id,name,sa,dept,exp,address,company,phone
>>
>> and the rest of the attributes and their values go in another file,
>> using a MapReduce program.
>>
>> I am using Hadoop 0.20 and Java 1.6.
>> Thanks in advance.
>>
>> Regards,
>> Ranjini R.
>
> --
> Nitin Pawar
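For the MapReduce option, the core of the map logic is just splitting each record at the eighth comma. A minimal sketch of that logic in plain Java follows (the sample field values are invented; in a real Hadoop 0.20 job this would sit inside Mapper.map(), with MultipleOutputs or a MultipleOutputFormat subclass directing each half to its own output file):

```java
// Sketch of the record-splitting logic from the question: the first
// eight comma-separated fields go to one output line, the rest to
// another. In an actual job, each half would be emitted to a
// differently named output via MultipleOutputs.
public class SplitFields {

    // Returns {firstEightFields, remainingFields} for one CSV record.
    static String[] split(String line) {
        String[] fields = line.split(",", -1);
        StringBuilder first = new StringBuilder();
        StringBuilder rest = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            StringBuilder target = (i < 8) ? first : rest;
            if (target.length() > 0) target.append(',');
            target.append(fields[i]);
        }
        return new String[] { first.toString(), rest.toString() };
    }

    public static void main(String[] args) {
        // Invented sample record with 16 fields.
        String record = "1,Ranjini,1000,IT,5,Chennai,Acme,12345,"
                + "67890,projX,redk,f12,f13,f14,f15,f16";
        String[] halves = split(record);
        System.out.println(halves[0]); // first eight fields
        System.out.println(halves[1]); // remaining fields
    }
}
```

Note the `-1` limit to String.split, which preserves trailing empty fields so records with blank values keep their column count.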
