Please disregard my question. I was looking at the wrong code.

On Wed, May 11, 2011 at 5:52 PM, Pedro Costa <[email protected]> wrote:
> Hi,
>
> I was looking at the mapred code, searching for the point where the
> split location is passed to the MapTask, and I found this line in
> the TaskInProgress class.
>
> [code]
> t = new MapTask(jobFile, taskid, partition, splitClass, split,
>                 rawSplit.getFileName(), rawSplit.getLocations());
> [/code]
>
> The split variable holds the split bytes:
>
> [code]
> BytesWritable split;
> if (!jobSetup && !jobCleanup) {
>   splitClass = rawSplit.getClassName();
>   split = rawSplit.getBytes();
> } else {
>   split = new BytesWritable();
> }
> [/code]
>
> "rawSplit.getFileName()" is the full URL of the split file
> (hdfs://chicon-7.fr:54310/user/xxx/gutenberg/A.txt), and the locations
> are the servers where the split is stored ([chicon-7.fr,
> chinqchint-21.fr, chinqchint-38.fr]).
>
> 1 - Why are the split, the filename, and the set of locations all
> passed when a MapTask is created? If the split is passed, I deduce
> that the map task already contains the split bytes that it will use.
> So why not just pass the split and ignore the filename and the set of
> locations?
>
> Thanks
>
> --
> ---------------------------
> PSC
--
---------------------------
Pedro Sá da Costa
@: [email protected]
@: [email protected]
