Hi,
I am exploring Hadoop and using it for one of my machine learning
applications.
I have a problem in which I need to route particular input to a particular
map task. For example, I have a list of <key, value> pairs in an input file,
sorted on some condition. I want to split the input file on some condition
(for example, all <key, value> pairs that share the same key should be given
as input to the same map task). I want to do this so that the extra
information related to that input can be loaded into memory once in that map
task, which should make my map procedure faster.
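
To illustrate what I mean by loading the extra information once, here is a
rough sketch in plain Python (not Hadoop code). It assumes the pairs arrive
sorted by key; load_side_info is a made-up stand-in for the expensive
per-key load, and the multiplication is only a placeholder for the real map
logic.

```python
from itertools import groupby


def load_side_info(key):
    # Hypothetical stand-in for the expensive per-key load
    # (for example, reading a model or lookup table for this key).
    return {"key": key, "scale": len(key)}


def process_sorted_pairs(pairs):
    """Process (key, value) pairs that are already sorted by key.

    Returns the processed pairs and the number of side-info loads,
    which equals the number of distinct consecutive keys.
    """
    results = []
    load_count = 0
    # groupby only merges *consecutive* equal keys, so the input
    # must be sorted (or at least clustered) by key.
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        side = load_side_info(key)  # loaded once per key group
        load_count += 1
        for _, value in group:
            # Placeholder for the real per-record map logic.
            results.append((key, value * side["scale"]))
    return results, load_count
```

If records with the same key were scattered across several map tasks, each
of those tasks would have to repeat the load, which is what I am trying to
avoid.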

So, can someone please tell me whether there is a way to specify a condition
such that a particular input is given to a particular map task?
Alternatively, I can split the files beforehand; in that case, is there a way
to make each map task work on one whole file, without any further splitting?
Please clarify.
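
For the second option, the pre-splitting step I have in mind could look
roughly like the sketch below. It is plain Python, not Hadoop code, and it
assumes one record per line with the key and value separated by a tab and
the file already sorted by key. The function name split_by_key and the
part-NNNNN.txt naming are made up for illustration.

```python
import os


def split_by_key(input_path, output_dir, sep="\t"):
    """Write each run of lines sharing a key to its own file.

    Assumes the input is sorted by key, so all lines for one key are
    consecutive. Returns the list of output file paths in order.
    """
    os.makedirs(output_dir, exist_ok=True)
    written = []
    current_key = None
    out = None
    try:
        with open(input_path, encoding="utf-8") as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                key = line.rstrip("\n").split(sep, 1)[0]
                if key != current_key:
                    # A new key starts: close the previous file, open the next.
                    if out is not None:
                        out.close()
                    name = "part-%05d.txt" % len(written)
                    path = os.path.join(output_dir, name)
                    out = open(path, "w", encoding="utf-8")
                    written.append(path)
                    current_key = key
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Each output file would then need to go to exactly one map task as a whole,
which is the part I do not know how to specify.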

Thanks in advance,
-- 
View this message in context: 
http://www.nabble.com/Specifying-Input-conditon-to-split-file-or-specifying-map-tasks-to-work-on-assigned-files-individually-tf4133840.html#a11756898
Sent from the Hadoop Users mailing list archive at Nabble.com.
