1. I am referring to org.apache.hama.bsp.PartitioningRunner; that is its name in the HEAD (1429573) of trunk. It hasn't been removed, but nothing else refers to it: I can't find any references to it in the workspace.

2. job.setPartitioner is the same as setting "bsp.input.partitioner.class". Anyway, as far as I can tell, partitions are not being created, and that causes the following. If I run the task on the local fs rather than HDFS, there is just one input split. Even if I set a partitioner to create two partitions and call bsp.setNumTasks(2), this is overridden and only one task is executed. See BSPJobClient#submitJobInternal(), line 326, which does the following:

    job.setNumBspTask(writeSplits(job, submitSplitFile, maxTasks));

3. So here is what I think is happening: the Partitioner is not in the codepath (try putting a breakpoint inside the partitioner and executing a non-graph BSP task), so partitions are not being created and writeSplits() returns 1. [writeSplits() returns the number of splits in the input.]

--
Regards,
Apurv Verma

On Sun, Jan 6, 2013 at 9:05 PM, Suraj Menon <[email protected]> wrote:

> Are you referring to org.apache.hama.bsp.PartitionRunner ? I don't see a
> commit removing the class.
> PartitionRunner is designed to be a Hama job in itself to create the
> expected splits before starting the submitted job.
> You can use your own Partitioner in the config using
> "bsp.input.partitioner.class" . Hopefully I answered your question.
>
> I am trying to make things backward compatible [HAMA-700], but facing some
> problems. The goal is to have runtime partitioning of graphs done by
> PartitionRunner itself.
>
> -Suraj
>
> On Sun, Jan 6, 2013 at 9:54 AM, Apurv Verma <[email protected]> wrote:
>
> > Hey all,
> > I found that PartitioningRunner has been removed from the codepath, I
> > guess this is the right way to make jobs faster.
> > But in the current scenario is it possible to have something as
> > follows? I want that all values < some integer are designated to peer
> > index 0, all values in range 0-a to peer index 1, and so on and so
> > forth.
> > With the partitioning removed would I need to use an additional
> > superstep to do this classification of input records?
> >
> >
> > --
> > Regards,
> > Apurv Verma
> >
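
For reference, the range-based assignment asked about in the original question could be sketched roughly as below. This is only an illustration in plain Java, not Hama code: the class name RangePartitionSketch, the method partitionFor, and the fixed range width are all assumptions made for the example, and the logic would still have to be wrapped in whatever class is configured under "bsp.input.partitioner.class".

```java
// Hypothetical sketch: maps an integer value to a peer index by fixed-width ranges.
// Nothing here is part of the Hama API; names and the fixed width are assumptions.
final class RangePartitionSketch {

    private RangePartitionSketch() {
    }

    /**
     * Values below rangeWidth go to peer 0, values in [rangeWidth, 2 * rangeWidth)
     * go to peer 1, and so on. Values past the last range are assigned to the last
     * peer, and negative values are assigned to peer 0.
     */
    static int partitionFor(long value, long rangeWidth, int numPeers) {
        if (rangeWidth <= 0 || numPeers <= 0) {
            throw new IllegalArgumentException("rangeWidth and numPeers must be positive");
        }
        if (value < 0) {
            return 0;
        }
        long index = value / rangeWidth;
        // Clamp so that large values never produce an out-of-range peer index.
        return (int) Math.min(index, (long) numPeers - 1);
    }

    public static void main(String[] args) {
        long[] samples = {3, 10, 27, 1000};
        for (long v : samples) {
            System.out.println(v + " -> peer " + partitionFor(v, 10, 4));
        }
    }
}
```

As item 3 notes, a mapping like this only has an effect if the partitioner is actually invoked before the splits are counted.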
