Re: fine granularity operation on HDFS

Amogh Vasekar Wed, 27 Jan 2010 10:41:06 -0800

Hi,
>>now that I can get the splits of a file in hadoop, is it possible to name 
>>some splits (not all) as the input to mapper?
I'm assuming that by "splits of a file in hadoop" you mean the splits 
generated by the InputFormat, not the blocks stored in HDFS.
The [File]InputFormat you use gives you access to the splits, their locations, 
etc. You can use this to pass only the splits you need to the mappers and 
discard the others (much as you can filter out whole files using PathFilters).
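
To make the idea concrete, here is a rough sketch of the selection step in 
plain Java, with no Hadoop classes involved. Everything named in it (Split, 
computeSplits, selectSplits, the path and the sizes) is made up for 
illustration. In a real job the same filtering would go into an overridden 
getSplits of a FileInputFormat subclass, operating on the FileSplit objects 
returned by the parent class. The split computation below is also simplified 
compared with what FileInputFormat actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of split selection; no Hadoop classes are used here.
class SplitSelectionSketch {

    // Hypothetical stand-in for Hadoop's FileSplit: a byte range of one file.
    static final class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return path + ":" + start + "+" + length;
        }
    }

    // Cut a file of fileLength bytes into consecutive ranges of at most
    // splitSize bytes (simplified; assumed, not Hadoop's exact algorithm).
    static List<Split> computeSplits(String path, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<Split>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new Split(path, start, length));
        }
        return splits;
    }

    // Keep only the splits whose start offset is in wantedStarts and
    // discard the others, preserving the original order.
    static List<Split> selectSplits(List<Split> all, Set<Long> wantedStarts) {
        List<Split> kept = new ArrayList<Split>();
        for (Split s : all) {
            if (wantedStarts.contains(s.start)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A 250-byte file cut into 100-byte splits; keep the first and last.
        List<Split> all = computeSplits("/data/input.txt", 250L, 100L);
        Set<Long> wanted = new HashSet<Long>(Arrays.asList(0L, 200L));
        System.out.println(selectSplits(all, wanted));
    }
}
```

Selecting by start offset is only one possible choice; any property of the 
split (path, length, host locations) could drive the filter in the same way.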


>>Or can I manually read some of these splits (not the whole file) using HDFS 
>>api?
Do you mean you would list these splits in a file beforehand, so that each 
mapper can read one line (one split)?

Amogh
