Have you installed Hive on your Hadoop cluster? If so, using Hive SQL may be simpler and more efficient. Otherwise, you can write a MapReduce program with org.apache.hadoop.mapred.lib.MultipleOutputFormat, so that the output from the Reducer can be written to more than one file.
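The Hive route could look roughly like this (a sketch only: the table name, paths, and trailing columns are placeholders, since the question below lists only eleven of the sixteen field names):

```sql
-- Hypothetical table and column names based on the question below.
CREATE TABLE records (
  id STRING, name STRING, sa STRING, dept STRING,
  exp STRING, address STRING, company STRING, phone STRING,
  mobile STRING, project STRING, redk STRING /* ... remaining columns ... */
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

LOAD DATA INPATH '/input/records.csv' INTO TABLE records;

-- Write the first eight columns to one directory ...
INSERT OVERWRITE DIRECTORY '/output/first_eight'
SELECT id, name, sa, dept, exp, address, company, phone FROM records;

-- ... and the remaining columns to another.
INSERT OVERWRITE DIRECTORY '/output/rest'
SELECT mobile, project, redk /* , ... */ FROM records;
```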
2013/12/27 Nitin Pawar <[email protected]>

> 1) If you have a CSV file and do this often without writing a lot of code,
> then create a Hive table with a "," delimiter, then select the columns you
> want from the table and write them to a file.
>
> 2) If you are good at scripting, then look at Pig scripting, and then
> write to files.
>
> 3) If you want to do it through a MapReduce program of your own, take a
> look at MultipleOutputFormat and TextInputFormat.
>
>
> On Fri, Dec 27, 2013 at 6:56 PM, Ranjini Rathinam
> <[email protected]> wrote:
>
>> Hi,
>>
>> I have a file with 16 fields such as
>> id,name,sa,dept,exp,address,company,phone,mobile,project,redk,... and so on.
>>
>> My scenario is to split the first eight attributes into one file and
>> the other eight attributes into another file using a MapReduce program.
>>
>> So the first eight attributes and their values go in one file as
>> id,name,sa,dept,exp,address,company,phone
>>
>> and the rest of the attributes and their values go in another file,
>> using a MapReduce program.
>>
>> I am using Hadoop 0.20 and Java 1.6.
>> Thanks in advance.
>>
>> Regards,
>> Ranjini R.
>
> --
> Nitin Pawar
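For the MapReduce option, the core of the map logic is just splitting each record at the eighth comma. A minimal sketch of that logic in plain Java follows (the sample field values are invented; in a real Hadoop 0.20 job this would sit inside Mapper.map(), with MultipleOutputs or a MultipleOutputFormat subclass directing each half to its own output file):

```java
// Sketch of the record-splitting logic from the question: the first
// eight comma-separated fields go to one output line, the rest to
// another. In an actual job, each half would be emitted to a
// differently named output via MultipleOutputs.
public class SplitFields {

    // Returns {firstEightFields, remainingFields} for one CSV record.
    static String[] split(String line) {
        String[] fields = line.split(",", -1);
        StringBuilder first = new StringBuilder();
        StringBuilder rest = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            StringBuilder target = (i < 8) ? first : rest;
            if (target.length() > 0) target.append(',');
            target.append(fields[i]);
        }
        return new String[] { first.toString(), rest.toString() };
    }

    public static void main(String[] args) {
        // Invented sample record with 16 fields.
        String record = "1,Ranjini,1000,IT,5,Chennai,Acme,12345,"
                + "67890,projX,redk,f12,f13,f14,f15,f16";
        String[] halves = split(record);
        System.out.println(halves[0]); // first eight fields
        System.out.println(halves[1]); // remaining fields
    }
}
```

Note the `-1` limit to String.split, which preserves trailing empty fields so records with blank values keep their column count.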
