I've tried and it works with a small number of tasks (< 19), but it fails if it's not set (i.e., with the default behavior). I'm not sure I understand the rationale of the fix without going deeper into the code; I'm just concerned about whether this is only a corner case or whether it may affect other cases, which would be bad. I also see that after adding some more lines to my test file, the error doesn't occur anymore ...
If that is not a major issue but just a corner case then it's OK; otherwise I think it'd be better to fix it before releasing.

Regards,
Tommaso

2012/11/15 Edward J. Yoon <[email protected]>

> > Tommaso, does your job work correctly with different 'tasknum' for the
> > same input?
>
> Not working. (and I found HAMA-476)
>
> Let's release 0.6 first. I'll fix this problem ASAP, then release 0.6.1.
>
> What do you think?
>
> On Thu, Nov 15, 2012 at 7:10 PM, Edward J. Yoon <[email protected]> wrote:
> > I've changed only computeGoalSize().
> >
> >   protected long computeGoalSize(int numSplits, long totalSize) {
> > -   return totalSize / (numSplits == 0 ? 1 : numSplits);
> > +   // The minus 1 is for the remainder.
> > +   return totalSize / (numSplits <= 1 ? 1 : numSplits - 1);
> >   }
> >
> > I don't remember exactly: what happens if a split is not on a record
> > boundary?
> >
> > Tommaso, does your job work correctly with different 'tasknum' for the
> > same input?
> >
> > On Thu, Nov 15, 2012 at 6:23 PM, Thomas Jungblut
> > <[email protected]> wrote:
> >> Edward changed something in the split behaviour last night. Maybe it
> >> broke it.
> >>
> >> 2012/11/15 Tommaso Teofili <[email protected]>
> >>
> >>> Hi guys,
> >>>
> >>> I was just running a couple of tests with GradientDescentBSP when I
> >>> realized that, using the newly installed RC5, the algorithm fails at
> >>> its very beginning because it seems it cannot read from the input.
> >>>
> >>> java.io.IOException: cannot read input vector size
> >>>   at org.apache.hama.ml.regression.GradientDescentBSP.getXSize(GradientDescentBSP.java:268)
> >>>   at org.apache.hama.ml.regression.GradientDescentBSP.getInitialTheta(GradientDescentBSP.java:244)
> >>>   at org.apache.hama.ml.regression.GradientDescentBSP.bsp(GradientDescentBSP.java:72)
> >>>   at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:254)
> >>>   at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:284)
> >>>   at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
> >>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> >>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>>   at java.lang.Thread.run(Thread.java:680)
> >>>
> >>> Since I didn't change anything on that side and it works with
> >>> 0.6.0-SNAPSHOT, I wonder whether the latest input-split changes caused
> >>> the problem.
> >>>
> >>> WDYT?
> >>>
> >>> Tommaso
> >>>
> >>> p.s.: I noticed this just after my +1 on the RC vote, but please keep
> >>> it on hold while we track down this issue.
> >>>
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
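
For anyone following along, here's a minimal standalone sketch of how the goal size from computeGoalSize() drives the number of splits. The two formulas are copied verbatim from the diff above; the split loop and all names (GoalSizeSketch, countSplits) are hypothetical simplifications, not Hama's actual InputFormat code.

// Hypothetical sketch of a Hadoop/Hama-style split computation; only the
// two goalSize formulas are taken verbatim from the diff in this thread.
public class GoalSizeSketch {

  // Before the change: an even integer division of the total size.
  static long goalSizeOld(int numSplits, long totalSize) {
    return totalSize / (numSplits == 0 ? 1 : numSplits);
  }

  // After the change: divide by (numSplits - 1) so the integer-division
  // remainder is absorbed by larger splits instead of spilling over.
  static long goalSizeNew(int numSplits, long totalSize) {
    return totalSize / (numSplits <= 1 ? 1 : numSplits - 1);
  }

  // Simplified split loop: carve totalSize bytes into chunks of goalSize.
  static int countSplits(long totalSize, long goalSize) {
    int splits = 0;
    for (long left = totalSize; left > 0; left -= Math.min(goalSize, left)) {
      splits++;
    }
    return splits;
  }

  public static void main(String[] args) {
    long totalSize = 1000; // bytes of input
    int numSplits = 3;     // requested 'tasknum'
    // Old formula: goalSize = 333 -> splits of 333, 333, 333, 1 = 4 splits,
    // one more than requested, with a tiny tail split that is unlikely to
    // start on a record boundary.
    System.out.println(countSplits(totalSize, goalSizeOld(numSplits, totalSize)));
    // New formula: goalSize = 500 -> splits of 500, 500 = 2 splits, i.e.
    // fewer than requested.
    System.out.println(countSplits(totalSize, goalSizeNew(numSplits, totalSize)));
  }
}

If this simplification is faithful, it at least makes the trade-off visible: the old divisor can yield one extra tail split, while the new divisor can yield fewer splits than requested, which might explain why the failure depends on the input size and the 'tasknum' setting.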

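On Edward's question about splits that don't land on a record boundary: the usual convention in Hadoop-style line readers is sketched below. All class and method names here are hypothetical, and whether Hama's ML input format follows this convention is exactly what would need checking.

import java.io.IOException;

// Hypothetical skeleton of the common record-boundary convention (as in
// Hadoop's LineRecordReader): every reader except the first skips the
// partial record at the start of its split, and every reader finishes the
// record that straddles the end of its split.
abstract class BoundarySafeReader {
  protected long start; // first byte of this split
  protected long end;   // first byte past this split
  protected long pos;   // current position in the underlying stream

  // Advance to the next record boundary; returns bytes consumed.
  protected abstract long skipPartialRecord() throws IOException;

  // Read one complete record at pos; returns false at end of stream.
  protected abstract boolean readRecord() throws IOException;

  public void initialize() throws IOException {
    pos = start;
    if (start != 0) {
      // Not the first split: the record containing byte 'start' belongs
      // to the previous reader, so skip its tail.
      pos += skipPartialRecord();
    }
  }

  public boolean nextRecord() throws IOException {
    // A record is read as long as it *starts* inside this split, even if
    // it ends past 'end'; the next reader skips that overhang above.
    return pos < end && readRecord();
  }
}

Under that convention a split boundary in the middle of a record is harmless; if the reader instead assumes splits begin exactly on record boundaries, a tail split like the one produced by the old formula would fail in just the way the stack trace above shows.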