I ran a test with a log file analysis job written in Java on Hadoop. The job ran on an Intel quad-core processor with 2 GB of memory, with the number of map tasks set to 40 and the number of reduce tasks set to 8.
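For reference, this is roughly how I set those task counts in the driver, assuming the old `org.apache.hadoop.mapred` API (the class name and input/output paths below are just placeholders, and I've left out my actual mapper and reducer classes):

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Placeholder driver class; the real job would also set its
// mapper and reducer classes on the JobConf.
public class LogAnalysisDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(LogAnalysisDriver.class);
        conf.setJobName("log-analysis");

        // Number of map tasks is only a hint to the framework
        // (the actual count depends on the input splits);
        // the reduce task count is honored as given.
        conf.setNumMapTasks(40);
        conf.setNumReduceTasks(8);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
```

Note that because `setNumMapTasks` is only a hint, the framework may still spawn a different number of mappers depending on the HDFS block size and how the input files split.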
The log files I tested ranged from 1 GB to 4 GB, because I ran out of storage. I compared the results against Webalizer running on a machine with a Celeron processor and 256 MB of memory, and Webalizer was about 10x faster. This was just a small experiment, and I probably still have more configuration tuning to do. Can anyone share their experience with the smallest input size at which Hadoop runs faster than other applications? Thanks.
