Hello everyone, I’m in the process of implementing a custom StorageHandler and I have some questions.
1) What is the difference between org.apache.hadoop.mapred.InputFormat and org.apache.hadoop.mapreduce.InputFormat?
2) How is numSplits calculated in org.apache.hadoop.mapred.InputFormat.getSplits(JobConf job, int numSplits)?
3) Is there a way to enforce a maximum number of splits? What would happen if I ignored numSplits and just returned an array containing the actual maximum number of splits?
4) How is InputSplit.getLocations() used? If I’m accessing non-HDFS resources, what should I return? Currently I’m just returning an empty array.

Thanks for your time,
Andrew Long
