So I managed to get my fast InputFormat working. It does still use the FS, but in such a way that it improves mapper startup by over 2X. And last night I got a prototype working that allows the map task to run inside the TaskTracker's JVM, rather than spawning a new JVM.
The initial performance looks really, really good. I just ran a job with 1000 maps, each with a single input record (the mappers do no work, however), in a one-master, one-slave setup... on my laptop. It completed in a couple thousand seconds, or a couple of seconds per map. Earlier I ran a smaller 100-map job on a stable, quiesced system, and it came in at about 130 seconds. So this prototype can start and end map tasks in 1-2 seconds, and should scale flatly with respect to the number of nodes in the setup.

From: "Owen O'Malley" <[EMAIL PROTECTED]>
To: hadoop-user@lucene.apache.org
Date: 10/24/2007 01:05 PM
Subject: Re: InputFiles, Splits, Maps, Tasks Questions 1.3 Base
Please respond to: [EMAIL PROTECTED]

On Oct 24, 2007, at 12:42 PM, Doug Cutting wrote:

> Lance Amundsen wrote:
>> OK, that is encouraging. I'll take another pass at it. I succeeded
>> yesterday with an in-memory only InputFormat, but only after I
>> commented out some of the split referencing code, like the following
>> in MapTask.java:
>>
>> if (instantiatedSplit instanceof FileSplit) {
>>   FileSplit fileSplit = (FileSplit) instantiatedSplit;
>>   job.set("map.input.file", fileSplit.getPath().toString());
>>   job.setLong("map.input.start", fileSplit.getStart());
>>   job.setLong("map.input.length", fileSplit.getLength());
>> }
>
> Yes, that code should not exist, but it shouldn't affect you
> either. You should be subclassing InputSplit, not FileSplit, so
> this code shouldn't operate on your splits.

That code doesn't do anything if the splits are not file splits, so it absolutely shouldn't break anything. Applications depend on those attributes to know which split they are working on, and there isn't a better fix until we move to context objects. I know that non-file splits work because there are unit tests to make sure they don't break anything.

-- Owen
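
For readers following the instanceof discussion above, here is a minimal, self-contained sketch of why the guard leaves non-file splits alone. The types Split, FileBackedSplit, and InMemorySplit and the describe method are hypothetical stand-ins written for this illustration; they are not Hadoop's InputSplit, FileSplit, or JobConf. Only the three map.input.* keys are taken from the quoted MapTask.java snippet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitGuardSketch {

    /** Hypothetical stand-in for a generic input split (not Hadoop's interface). */
    interface Split {
        long getLength();
    }

    /** Hypothetical stand-in for a file-backed split. */
    static final class FileBackedSplit implements Split {
        final String path;
        final long start;
        final long length;

        FileBackedSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        public long getLength() {
            return length;
        }
    }

    /** Hypothetical stand-in for an in-memory split with no backing file. */
    static final class InMemorySplit implements Split {
        final int records;

        InMemorySplit(int records) {
            this.records = records;
        }

        public long getLength() {
            return records;
        }
    }

    /**
     * Mirrors the shape of the quoted guard: the map.input.* attributes are
     * set only when the split is file-backed; any other split is untouched.
     */
    static Map<String, String> describe(Split split) {
        Map<String, String> conf = new LinkedHashMap<>();
        if (split instanceof FileBackedSplit) {
            FileBackedSplit f = (FileBackedSplit) split;
            conf.put("map.input.file", f.path);
            conf.put("map.input.start", Long.toString(f.start));
            conf.put("map.input.length", Long.toString(f.length));
        }
        return conf;
    }

    public static void main(String[] args) {
        // The path and sizes below are made-up example values.
        System.out.println(describe(new FileBackedSplit("/data/part-0", 0L, 64L)));
        System.out.println(describe(new InMemorySplit(1)));
    }
}
```

In this sketch the in-memory split yields an empty attribute map, which matches the claim in the thread that the guard is a no-op for splits that are not file splits.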