If I use FileInputFormat, it gives an instantiation error, since FileInputFormat is an abstract class.
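For what it's worth, TextInputFormat is a concrete subclass of FileInputFormat, so it can be set on the job where the abstract FileInputFormat cannot. A minimal driver sketch along those lines, assuming the org.apache.hadoop.mapreduce API of that era and the MyMapper/MyReducer classes from the reply below (class names are illustrative, and this is job wiring only -- it needs a Hadoop installation to run):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SumDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "sum");
        job.setJarByClass(SumDriver.class);

        // TextInputFormat is concrete, so this instantiates fine;
        // setting the abstract FileInputFormat here is what fails.
        job.setInputFormatClass(TextInputFormat.class);

        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```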
On Sat, Dec 18, 2010 at 3:21 AM, Aman <[email protected]> wrote:
>
> Use FileInputFormat
>
> Your mapper will look something like this:
>
> import java.io.IOException;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.NullWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
>
> public class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
>     // long, not int: the running sum over a 1.4 GB file can overflow an int
>     private long sum = 0;
>
>     @Override
>     public void map(LongWritable key, Text value, Context context) {
>         sum += Long.parseLong(value.toString());
>     }
>
>     @Override
>     public void cleanup(Context context)
>             throws IOException, InterruptedException {
>         // emit one partial sum per mapper under a common key
>         context.write(new Text("sum"), new Text(String.valueOf(sum)));
>     }
> }
>
> Your reducer will look something like:
>
> public class MyReducer extends Reducer<Text, Text, Text, NullWritable> {
>     private NullWritable outputValue = NullWritable.get();
>
>     @Override
>     public void reduce(Text key, Iterable<Text> values, Context context)
>             throws IOException, InterruptedException {
>         long sum = 0;
>         for (Text value : values) {
>             sum += Long.parseLong(value.toString());
>         }
>         context.write(new Text(String.valueOf(sum)), outputValue);
>     }
> }
>
> madhu phatak wrote:
> >
> > Hi
> > I have a very large file of size 1.4 GB. Each line of the file is a
> > number.
> > I want to find the sum of all those numbers.
> > I wanted to use NLineInputFormat as the InputFormat, but it sends only
> > one line to the mapper, which is very inefficient.
> > So can you guide me to write an InputFormat which splits the file into
> > multiple splits, so each mapper can read multiple lines from each split?
> >
> > Regards
> > Madhukar
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/InputFormat-for-a-big-file-tp2105461p2107514.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>
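On the original question: in the newer mapreduce API, NLineInputFormat's lines-per-split is configurable (via NLineInputFormat.setNumLinesPerSplit(job, n), if your Hadoop version has it), so a mapper need not receive a single line per split. The per-split accumulation the mapper does can also be checked locally without Hadoop. A minimal plain-JDK sketch of that logic (class and method names here are illustrative, not part of any Hadoop API):

```java
import java.util.Arrays;
import java.util.List;

public class LocalSum {
    // Mirrors the mapper's running sum: parse each line, accumulate into a
    // long (a long avoids the overflow an int sum could hit on a big file).
    static long sumLines(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                total += Long.parseLong(trimmed);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two "splits" summed independently, then combined -- the same shape
        // as per-mapper partial sums followed by a single reduce.
        long split1 = sumLines(Arrays.asList("10", "20", "30"));
        long split2 = sumLines(Arrays.asList("5", "15"));
        System.out.println(split1 + split2); // prints 80
    }
}
```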
