> Keeping run-time (network-based) partitioning within GraphJobRunner is
> not a good idea.

It is not. I think I got testSubmitGraph to do run-time partitioning (in a
preprocessing step) of the single file into 2 files in the unit tests, in the
current state of my patch.

> >> - the number of splits found is not equal to the number of BSP tasks
> >> configured for the job. OR
>
> I have a question. If the input is an unsorted map and I want to
> re-partition by hashing, but the number of blocks and the number of
> desired tasks are the same, then what happens? Do you mean run-time
> partitioning?

You will have a runtime partitioner class defined and the partitioning flag on
by default. For the case of HAMA-561, a user can switch off partitioning using
the same flag.

> On Wed, Jan 9, 2013 at 7:07 AM, Suraj Menon <[email protected]> wrote:
> > Hi Apurv, yes, those are pending test cases to be fixed. GraphJobRunner is
> > expecting the input in the format of Vertex, but we have input files as
> > well as record key, values defined as Text. I have fixed only one unit
> > test case yet.
> >
> > On Tue, Jan 8, 2013 at 4:45 PM, Apurv Verma <[email protected]> wrote:
> >
> >> Hey all,
> >> I got the problem, the partitioner was not being set for the
> >> PartitionerRunner bsp task. :P I have fixed the partitioner with portions
> >> from your patch, Suraj. Now after this commit the partitioner will obey
> >> what you specified earlier, just to recapitulate.
> >>
> >> Repartitioning is done if :
> >> - the number of splits found is not equal to the number of BSP tasks
> >> configured for the job. OR
> >> - the flag is set to true by the user ("bsp.input.runtime.partitioning") OR
> >> - user has specified a Runtime Partitioner class and enabled runtime
> >> partitioning
> >>
> >> There was one special thing that I discovered about the partitioner, just
> >> sharing with you guys. Suppose I implement a partitioner which returns 0
> >> for a record, then it isn't necessary that this record will go to the peer
> >> with index 0. It might go to peer 1.
> >> The only certitude that partitioners provide is that all records
> >> returning 0 will go to the same peer. I needed the partitioner to work
> >> for the PrefixSum I was implementing.
> >>
> >> Things to do next.
> >> 1) RecordConverter, which Suraj is implementing in HAMA-700. (Please
> >> update, Suraj)
> >>
> >> B.T.W there are problems in mvn test.
> >> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> >> org.apache.hadoop.io.ArrayWritable
> >> at org.apache.hama.graph.GraphJobRunner.loadVertices(GraphJobRunner.java:287)
> >>
> >> I don't think my commit is breaking this.
> >>
> >> Thanks
> >>
> >> --
> >> Regards,
> >> Apurv Verma
> >>
> >> On Tue, Jan 8, 2013 at 11:07 PM, Suraj Menon <[email protected]> wrote:
> >>
> >> > Please explain the nature of problems you are facing with Partitioner?
> >> >
> >> > > Any reasons for deciding to move the
> >> > > PartitioningJob inside BSPJobClient from BSPJob?
> >> >
> >> > Twofold, BSPJob was just a configuration holder object, didn't want to add
> >> > the partitioning responsibility to the class.
> >> > And also I wanted to know the number of splits, before taking the decision
> >> > whether to repartition or not.
> >> > Repartitioning is done if :
> >> > - the number of splits found is not equal to the number of BSP tasks
> >> > configured for the job. OR
> >> > - the flag is set to true by the user ("bsp.input.runtime.partitioning") OR
> >> > - user has specified a Runtime Partitioner class and enabled runtime
> >> > partitioning
> >> >
> >> > Thanks,
> >> > Suraj
> >> >
> >> > On Tue, Jan 8, 2013 at 11:31 AM, Apurv Verma <[email protected]> wrote:
> >> >
> >> > > Thanks, let me have a careful look at it. On a cursory look, I seem to
> >> > > understand the basic idea. Any reasons for deciding to move the
> >> > > PartitioningJob inside BSPJobClient from BSPJob?
> >> > > BTW the current partitioner doesn't work as intended, only the default
> >> > > partitioner HashPartitioner works fine, if I try to put some custom
> >> > > partitioner there are problems.
> >> > >
> >> > > Let's resolve the partitioning completely before the spilling message
> >> > > queue.
> >> > >
> >> > > --
> >> > > Regards,
> >> > > Apurv Verma
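
For reference, the repartitioning rule quoted in this thread can be sketched roughly as below. This is only a minimal sketch under assumptions, not the code from the patch: the class name RepartitionRule, the method needsRepartitioning and its parameters are hypothetical names made up for illustration. Only the flag name "bsp.input.runtime.partitioning" comes from the thread.

```java
// Hypothetical sketch of the three-way rule from the thread.
// None of these names are taken from the Hama code base or the patch.
final class RepartitionRule {

    // Flag name as quoted in the thread.
    static final String RUNTIME_PARTITIONING_FLAG = "bsp.input.runtime.partitioning";

    private RepartitionRule() {
    }

    /**
     * @param numSplits               number of input splits found
     * @param numBspTasks             number of BSP tasks configured for the job
     * @param runtimePartitioningOn   value of the runtime partitioning flag
     * @param runtimePartitionerClass name of a user-supplied partitioner class, or null
     * @return true if any of the three conditions listed in the thread holds
     */
    static boolean needsRepartitioning(int numSplits,
                                       int numBspTasks,
                                       boolean runtimePartitioningOn,
                                       String runtimePartitionerClass) {
        // Condition 1: splits and tasks do not line up.
        boolean splitMismatch = numSplits != numBspTasks;

        // Condition 3: a partitioner class is given AND partitioning is enabled.
        // Read literally, this is already covered by condition 2; it is kept
        // as a separate term only to mirror the list in the thread.
        boolean customPartitionerEnabled =
                runtimePartitionerClass != null && runtimePartitioningOn;

        // Condition 2 is the flag itself.
        return splitMismatch || runtimePartitioningOn || customPartitionerEnabled;
    }
}
```

Under this reading, an input whose block count already equals the task count is re-partitioned only when the flag is on, which is the case asked about above; with the flag off, a partitioner class alone does not trigger repartitioning.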
