I have an application I would like to apply Hadoop to, but I'm not sure if the tasks are too small. I have a file that contains between 70,000 and 400,000 records. All the records can be processed in parallel, and I can currently process them at about 400 records a second single-threaded (give or take).

I thought I read somewhere (in one of the tutorials) that mapper tasks should run for at least a minute to offset the overhead of creating them. Is this really the case? I'm pretty sure that mapping one record per mapper is overkill, but I'm wondering whether batching records up for each mapper is still the way to go, or whether I should look at some other framework to help split up the processing.
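To make the batching idea concrete, something like the following is what I had in mind: a rough sketch using NLineInputFormat from the newer mapreduce API so each map task gets a fixed chunk of lines rather than one record each. The class names and the 24,000 figure (roughly 400 records/s x 60 s) are just placeholders, not a working job.

    // Sketch only: batch ~24,000 records per map task so each mapper runs
    // for about a minute instead of starting a task per record.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BatchedRecordJob {

        // Hypothetical mapper: still processes one record per map() call,
        // but each task is handed a batch of lines, amortizing startup cost.
        public static class RecordMapper
                extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable offset, Text record, Context ctx)
                    throws java.io.IOException, InterruptedException {
                // ... per-record processing would go here ...
                ctx.write(new Text(offset.toString()), record);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "batched-record-job");
            job.setJarByClass(BatchedRecordJob.class);
            job.setMapperClass(RecordMapper.class);
            job.setNumReduceTasks(0);                  // map-only job

            job.setInputFormatClass(NLineInputFormat.class);
            // ~24,000 records per split at ~400 rec/s keeps each mapper busy ~1 minute
            NLineInputFormat.setNumLinesPerSplit(job, 24000);
            NLineInputFormat.addInputPath(job, new Path(args[0]));

            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Is this the sort of batching people actually do in practice, or is there a better-suited approach for a job this small?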
Any insight would be appreciated.

Thanks,
Chris