> 1. I am referring to org.apache.hama.bsp.PartitioningRunner, it's named
> as so in the HEAD (1429573) of trunk. It isn't removed but it isn't
> referred to anywhere else. I can't find any references to it in the
> workspace.
It is referenced in BSPJob#waitForCompletion, which runs it as a separate BSP job to create the specified splits.

> 2. job.setPartitioner is the same as setting
> "bsp.input.partitioner.class". Anyways, so acc. to me partitions are
> not being created because of which the following happens.
> If I am running the task on local fs and not hdfs, there's just one
> input split and even if I set a partitioner to create two partitions and
> set bsp.setNumTasks(2), this is overridden and only one task is
> executed.
> See BSPJobClient#submitJobInternal()
> where it does the following
> job.setNumBspTask(writeSplits(job, submitSplitFile, maxTasks)); Line
> 326.

This partitioning job is set to run if the number of splits != the number of tasks, or if it is forced by the configuration. I can share the current state of my HAMA-700 patch with you.

> 3. So here is what I think is happening, Partitioner is not in the
> codepath (try putting a breakpoint inside the partitioner and executing
> and non graph bsp task), so partitions are not being created and
> writeSplits() is returning 1.
> [ writeSplits() returns the number of splits in the input. ]

Probably because it is running as a separate process?
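
To make the condition concrete, the rule for when the partitioning job runs (number of splits != number of tasks, or forced by configuration) can be sketched roughly as below. This is only an illustration and not the actual Hama code: the class name, the method name, and the forcedByConfig parameter are made up for this example, with forcedByConfig standing in for whichever configuration setting forces partitioning.

```java
// Illustrative sketch only; these names are hypothetical and not part of Hama.
class PartitioningDecision {

    // Returns true when the separate partitioning job should run:
    // either the number of input splits differs from the requested
    // number of tasks, or the configuration forces partitioning.
    static boolean shouldRunPartitioning(int numSplits, int numTasks, boolean forcedByConfig) {
        return numSplits != numTasks || forcedByConfig;
    }

    public static void main(String[] args) {
        // Local fs case from this thread: one input split, two tasks requested.
        System.out.println(shouldRunPartitioning(1, 2, false)); // true
        // Splits already match the task count and nothing forces partitioning.
        System.out.println(shouldRunPartitioning(2, 2, false)); // false
        // Splits match, but the configuration forces partitioning anyway.
        System.out.println(shouldRunPartitioning(2, 2, true));  // true
    }
}
```

Under this rule the local fs case (one split, two tasks) should trigger the partitioning job, so if only one task still runs, that suggests the job is not being reached or its output is not being picked up.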
