I guess technically that's a subject for another patch: the front end could
automatically enforce a lower bound of k+p on -r (block height). Right now,
if -r is below k+p, only the backend catches it; the backend should have a
meaningful message about it, but the front end does not. (The check itself
would be trivial; see the sketch below.)

But in practice, it is so much of a corner case that I did not really
consider adding this validation.
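
For reference, a minimal sketch of that kind of front-end check, with
illustrative names rather than the actual SSVDCli option handling:

  // Hypothetical front-end validation of -r against k+p
  // (names are illustrative, not actual SSVDCli code):
  static void validateBlockHeight(int blockHeight, int k, int p) {
    if (blockHeight < k + p) {
      throw new IllegalArgumentException("block height -r=" + blockHeight
          + " must be no less than k+p=" + (k + p));
    }
  }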

Indeed, with large inputs, your optimal block size is determined by your
split size (by default, 32 MB worth of sequence file data), the degree of
your input sparsity, and the memory available to mappers. Mappers allocate
dense buffers for bottom-up Q accumulation that are blockHeight x (k+p).
That is, with blocks of 30,000 rows and k+p=500 you are looking at
30,000 * 500 * 8 = 120 MB, which should be quite o.k. (For computations with
k+p=200, which is what I am actually running it with, it works even with the
default hadoop -Xmx of 200 MB -- in fact, I have never seen it break down
due to memory so far.) So, given that memory is never really an issue here,
it makes sense to use larger blocks: they reduce the flops needed a little,
whereas smaller blocks need more flops and two passes over the data in the
mappers.
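
To make that arithmetic concrete, here it is as a tiny Java snippet (the
numbers are just the example values from this paragraph, not anything the
solver reports):

  // Dense Q accumulation buffer: blockHeight x (k+p) doubles, 8 bytes each.
  long blockHeight = 30000L;                // -r
  long kp = 500L;                           // k + p
  long bufferBytes = blockHeight * kp * 8L; // = 120,000,000 bytes
  System.out.println("Q buffer: ~" + (bufferBytes / 1000000) + " MB"); // ~120 MB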

The optimal situation is when you preallocate just enough Q buffer that a
single split always fits into one block of Q after the projection and QR are
applied. In other words, -r should be at least the maximum number of input
rows in any split (if it is larger, it will just reserve memory that never
gets used). That's why I say that -r 30000 is quite a good number in
practice regardless of input sparsity.
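
If you'd rather estimate that than guess, here is a rough sketch (the
bytes-per-row figure is an assumption you would have to measure on your own
data; it is not something the solver tells you):

  // Pick -r so that one block covers a whole split, but never below k+p.
  int k = 40, p = 60;                     // rank and oversampling
  long splitSize = 32L * 1024 * 1024;     // default sequence-file split size
  long bytesPerRow = 1200L;               // assumed average serialized row size
  long rowsPerSplit = splitSize / bytesPerRow;
  long r = Math.max(rowsPerSplit, k + p); // candidate value for -r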

On Tue, Aug 16, 2011 at 10:35 AM, Dmitriy Lyubimov <[email protected]> wrote:

> This is an unusually small input. What's the block size? Use large blocks
> (such as 30,000). Block size can't be less than k+p.
>
> Can you please cut and paste the actual log of the qjob tasks that failed?
> This shows up as a front-end error, but the actual problem is in the
> backend, ranging anywhere from hadoop problems to algorithm problems.
> On Aug 16, 2011 9:44 AM, "Eshwaran Vijaya Kumar" <[email protected]>
> wrote:
> > Thanks again. I am using 0.5 right now. We will try to patch it up and
> > see how it performs. In the meantime, I am having another (possibly
> > user?) error: I have a 260 x 230 matrix. I set k+p = 40, and it fails
> > with
> >
> > Exception in thread "main" java.io.IOException: Q job unsuccessful.
> > at org.apache.mahout.math.hadoop.stochasticsvd.QJob.run(QJob.java:349)
> > at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:262)
> > at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.run(SSVDCli.java:91)
> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> > at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.main(SSVDCli.java:131)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> > at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> > at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >
> >
> > If I set k+p to be much smaller, say around 20, it works fine. Is it
> > just that my dataset is of low rank, or is there something else going on
> > here?
> >
> > Thanks
> > Esh
> >
> >
> >
> > On Aug 14, 2011, at 1:47 PM, Dmitriy Lyubimov wrote:
> >
> >> ... I need to allow some time for review before pushing to the ASF repo)...
> >>
> >>
> >> On Sun, Aug 14, 2011 at 1:47 PM, Dmitriy Lyubimov <[email protected]>
> wrote:
> >>
> >>> The patch is posted as MAHOUT-786.
> >>>
> >>> also 0.6 trunk with patch applied is here :
> >>> https://github.com/dlyubimov/mahout-commits/tree/MAHOUT-786
> >>>
> >>> I will commit to the ASF repo tomorrow night (even though it is
> >>> extremely simple, I need
> >>>
> >>>
> >>> On Sat, Aug 13, 2011 at 1:48 PM, Eshwaran Vijaya Kumar <
> >>> [email protected]> wrote:
> >>>
> >>>> Dmitriy,
> >>>> That sounds great. I eagerly await the patch.
> >>>> Thanks
> >>>> Esh
> >>>> On Aug 13, 2011, at 1:37 PM, Dmitriy Lyubimov wrote:
> >>>>
> >>>>> Ok, I got u0 working.
> >>>>>
> >>>>> The problem is of course that something called the BBt job has to be
> >>>>> coerced to have 1 reducer (that's fine: every mapper yields no more
> >>>>> than an upper-triangular matrix of k+p x k+p geometry, so even if you
> >>>>> end up having thousands of them, one reducer will sum them up just
> >>>>> fine).
> >>>>>
> >>>>> It worked before apparently because the configuration held 1 reducer
> >>>>> by default if not set explicitly; I am not quite sure whether it's
> >>>>> something in the hadoop MR client or a mahout change that now
> >>>>> precludes it from working.
> >>>>>
> >>>>> Anyway, I got a patch (really a one-liner), and an example equivalent
> >>>>> to yours worked fine for me with 3 reducers.
> >>>>>
> >>>>> Also, the tests request 3 reducers too, but the reason it works in
> >>>>> tests and not in distributed mapred is that local mapred doesn't
> >>>>> support multiple reducers. I investigated this issue before, and
> >>>>> apparently there were a couple of patches floating around, but for
> >>>>> some reason those changes did not take hold in cdh3u0.
> >>>>>
> >>>>> I will publish the patch in a JIRA shortly and will commit it Sunday-ish.
> >>>>>
> >>>>> Thanks.
> >>>>> -d
> >>>>>
> >>>>>
> >>>>> On Fri, Aug 5, 2011 at 7:06 PM, Eshwaran Vijaya Kumar <
> >>>>> [email protected]> wrote:
> >>>>>
> >>>>>> OK. So to add more info to this, I tried setting the number of
> >>>>>> reducers to 1 and now I don't get that particular error. The
> >>>>>> singular values and left and right singular vectors appear to be
> >>>>>> correct though (verified using Matlab).
> >>>>>>
> >>>>>> On Aug 5, 2011, at 1:55 PM, Eshwaran Vijaya Kumar wrote:
> >>>>>>
> >>>>>>> All,
> >>>>>>> I am trying to test Stochastic SVD and am facing some errors where
> >>>>>>> it would be great if someone could clarify what is going on. I am
> >>>>>>> trying to feed the solver a DistributedRowMatrix with the exact
> >>>>>>> same parameters that the test in
> >>>>>>> LocalSSVDSolverSparseSequentialTest uses, i.e., generate a
> >>>>>>> 1000 x 100 DRM with SequentialSparseVectors and then ask for
> >>>>>>> blockHeight = 251, p (oversampling) = 60, k (rank) = 40. I get the
> >>>>>>> following error:
> >>>>>>>
> >>>>>>> Exception in thread "main" java.io.IOException: Unexpected overrun in upper triangular matrix files
> >>>>>>> at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.loadUpperTriangularMatrix(SSVDSolver.java:471)
> >>>>>>> at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:268)
> >>>>>>> at com.mozilla.SSVDCli.run(SSVDCli.java:89)
> >>>>>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>>>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>>>>> at com.mozilla.SSVDCli.main(SSVDCli.java:129)
> >>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>> at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >>>>>>>
> >>>>>>> Also, I am using CDH3 with Mahout recompiled to work with CDH3
> >>>>>>> jars.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Esh
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >
>
