Hi,
The "part" in the default filename stands for "partition". In some
cases I agree you would not mind viewing them as a singular file
instead of having to read directories - but there are also use cases
where you would want each partition file to be unique cause you
partitioned and processed the
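For example, if you route keys with a custom Partitioner, each part file
ends up holding one partition's worth of keys, and keeping them separate is
the whole point. A minimal sketch with the newer org.apache.hadoop.mapreduce
API (the Text/IntWritable types and the class name are just illustrative
assumptions on my part):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Sends every key with the same leading letter to the same reducer, so keys
// sharing a leading letter always land in the same part-r-NNNNN file.
public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    if (key.getLength() == 0) {
      return 0;
    }
    char first = Character.toLowerCase(key.toString().charAt(0));
    return first % numPartitions;
  }
}

You would register it with job.setPartitionerClass(FirstLetterPartitioner.class)
and pick the number of output files with job.setNumReduceTasks(n); each
reducer then writes exactly one part-r-NNNNN file.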
I have a related question about blocks. Normally, a reduce job outputs
several files, all in the same directory. But why? Since we know that
Hadoop is abstracting the files for us, shouldn't the part-r- outputs
ultimately be thought of as a single file?
What is the correspondence between blocks and these output files?
Deepak
On Sun, Apr 8, 2012 at 9:46 PM, Deepak Nettem wrote:
> Hi,
>
> Is it possible to get the 'id' of the currently executing split or block
> from within the mapper? Using this block Id / split id, I want to be able
> to query the namenode to get the names of hosts having that block / split.
I think if you called getInputFormat on JobConf and then called getSplits
you would at least get the locations.
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/InputSplit.html
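Something along these lines, using the old org.apache.hadoop.mapred API
(the driver class name and the input path taken from args[0] are just
assumptions for illustration):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;

public class SplitLocations {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SplitLocations.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));

    // Ask the job's InputFormat to compute the splits it would hand to mappers.
    InputFormat<?, ?> inputFormat = conf.getInputFormat();
    InputSplit[] splits = inputFormat.getSplits(conf, conf.getNumMapTasks());

    for (InputSplit split : splits) {
      // getLocations() returns the hostnames that hold the split's data locally.
      for (String host : split.getLocations()) {
        System.out.println(split + " -> " + host);
      }
    }
  }
}

From inside a map task with the newer org.apache.hadoop.mapreduce API,
context.getInputSplit().getLocations() should give you the same information
for the split currently being processed.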
On Sun, Apr 8, 2012 at 9:16 AM, Deepak Nettem wrote:
> Hi,
>
> Is it possible to get the 'id' of the currently executing split or block
> from within the mapper?