out-of-core = not-in-memory. The origins of the term are *very* old. Nobody I know has used core memory since the '70s.
http://en.wikipedia.org/wiki/Out-of-core_algorithm

On Wed, Aug 17, 2011 at 3:00 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Ted,
> sorry for my stupid question of the day: what does the "out-of-core" term mean?

On Aug 16, 2011 2:18 PM, "Ted Dunning" <[email protected]> wrote:
> I have several in-memory implementations almost ready to publish.
>
> These provide a straightforward implementation of the original SSVD algorithm
> from the Martinsson and Halko paper, a version that avoids the QR and LQ
> decompositions, and an out-of-core version that only keeps a moderately
> sized amount of data in memory at any time.
>
> My hangup at this point is getting my Cholesky decomposition reliable for
> rank-deficient inputs.

On Tue, Aug 16, 2011 at 1:57 PM, Eshwaran Vijaya Kumar <[email protected]> wrote:
> I have decided to do something similar: do the pipeline in memory and not
> invoke map-reduce for small datasets, which I think will handle the issue.
> Thanks again for clearing that up.
> Esh

On Aug 16, 2011, at 1:45 PM, Dmitriy Lyubimov wrote:
> PPS Mahout also has an in-memory SVD solver migrated from Colt, which is,
> by the way, what I am using in local tests to assert the SSVD results.
> Although it starts to feel slow pretty quickly and sometimes produces
> errors (I think it starts feeling slow around 10k x 1k inputs).

On Tue, Aug 16, 2011 at 12:52 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Also, with data as small as this, the stochastic noise ratio would be
> significant (as in the law of large numbers), so if you really think you
> might need to handle inputs that small, you'd better write a pipeline that
> detects this as a corner case and just runs an in-memory decomposition. In
> fact, I think dense matrices up to 100,000 rows can be quite comfortably
> computed in memory (Ted knows much more about the practical limits of tools
> like R, or even something as simple as apache.math).
>
> -d

On Tue, Aug 16, 2011 at 12:46 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Yep, that's what I figured. You have 193 rows or so, but distributed
> between 7 files, so they are small and would generate several mappers, and
> there are probably some among them with a small row count.
>
> See my other email. This method is for big data, big files. If you want to
> automate handling of small files, you can probably do some intermediate
> step with a heuristic that merges together all files, say, shorter than 1 MB.
>
> -d

On Tue, Aug 16, 2011 at 12:43 PM, Eshwaran Vijaya Kumar <[email protected]> wrote:
> The number of mappers is 7. The DFS block size is 128 MB; the reason I
> think there are 7 mappers being used is that I am using a Pig script to
> generate the sequence file of Vectors, and that script generates 7
> reducers. I am not setting minSplitSize though.

On Aug 16, 2011, at 12:15 PM, Dmitriy Lyubimov wrote:
> Hm. This is not common at all.
>
> This error would surface if a map split can't accumulate at least k+p rows.
>
> That's another requirement which is usually a non-issue -- any precomputed
> split must contain at least k+p rows, which would normally fail to hold
> only if the matrix is extra wide and dense, in which case --minSplitSize
> must be used to avoid it.
>
> But in your case, the matrix is so small it must fit in one split. Can you
> please verify how many mappers the job generates?
>
> If it's more than 1, then something fishy is going on with hadoop.
> Otherwise, something is fishy with the input (it's either not 293 rows, or
> k+p is more than 293).
>
> -d
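To make the rows-per-split requirement above concrete, here is a minimal sketch (plain Java, not Mahout code; the class and method names are made up for illustration) of the arithmetic with the numbers reported elsewhere in this thread: roughly 293 rows spread over 7 input files gives about 41 rows per split under an even spread, which is below k + p = 44 but above k + p = 24.

    // Illustrative check only -- not Mahout code. Estimates whether every map
    // split can be expected to hold at least k + p rows, which is the QJob
    // requirement discussed above.
    public final class SplitRowCheck {

      static boolean splitsLikelyTallEnough(long totalRows, int numSplits, int k, int p) {
        // Rough lower bound: rows spread evenly over the splits. Real splits
        // can be more skewed, so treat this as optimistic.
        long rowsPerSplit = totalRows / numSplits;
        return rowsPerSplit >= (long) k + p;
      }

      public static void main(String[] args) {
        // Numbers from the thread: ~293 rows in 7 files, k = 4, p = 40.
        System.out.println(splitsLikelyTallEnough(293, 7, 4, 40)); // false: 41 < 44
        // Reducing p to 20 (k + p = 24) passes, matching what was observed.
        System.out.println(splitsLikelyTallEnough(293, 7, 4, 20)); // true: 41 >= 24
      }
    }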
On Tue, Aug 16, 2011 at 11:39 AM, Eshwaran Vijaya Kumar <[email protected]> wrote:
> On Aug 16, 2011, at 10:35 AM, Dmitriy Lyubimov wrote:
>
>> This is unusually small input. What's the block size? Use large blocks
>> (such as 30,000). The block size can't be less than k+p.
>
> I did set blockSize to 30,000 (as recommended in the PDF that you wrote
> up). As far as the input size goes, the reason to use something this small
> is that it is easier to test and verify the map-reduce pipeline against my
> in-memory implementation of the algorithm.
>
>> Can you please cut and paste the actual log of the qjob tasks that failed?
>> This is a front-end error, but the actual problem is in the backend,
>> ranging anywhere from hadoop problems to algorithm problems.
>
> Sure. Refer to http://esh.pastebin.mozilla.org/1302059
> The input is a DistributedRowMatrix 293 x 236, k = 4, p = 40,
> numReduceTasks = 1, blockHeight = 30,000. Reducing p to 20 ensures the job
> goes through...
>
> Thanks again
> Esh
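A minimal sketch of the small-input fallback that Esh describes earlier in the thread (skip map-reduce entirely when the matrix fits in memory), assuming Mahout's in-memory Colt-derived SingularValueDecomposition mentioned above; the class name and row threshold here are illustrative, not part of Mahout.

    import org.apache.mahout.math.DenseMatrix;
    import org.apache.mahout.math.Matrix;
    import org.apache.mahout.math.SingularValueDecomposition;

    // Illustrative corner-case handling, not the actual SSVDCli code: if the
    // input is small enough to hold in memory, decompose it locally instead of
    // launching the map-reduce pipeline.
    public final class SmallInputFallback {

      // Arbitrary illustrative threshold; pick whatever your heap allows.
      private static final int MAX_IN_MEMORY_ROWS = 100000;

      static double[] singularValues(Matrix a) {
        if (a.numRows() <= MAX_IN_MEMORY_ROWS) {
          // In-memory path: the Colt-migrated solver referenced in the thread.
          return new SingularValueDecomposition(a).getSingularValues();
        }
        // Otherwise fall through to the distributed SSVD pipeline (not shown).
        throw new UnsupportedOperationException("run distributed SSVD instead");
      }

      public static void main(String[] args) {
        Matrix tiny = new DenseMatrix(new double[][] {{1, 2}, {3, 4}, {5, 6}});
        for (double sigma : singularValues(tiny)) {
          System.out.println(sigma);
        }
      }
    }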
On Aug 16, 2011 9:44 AM, "Eshwaran Vijaya Kumar" <[email protected]> wrote:
> Thanks again. I am using 0.5 right now. We will try to patch it up and see
> how it performs. In the meantime, I am running into another (possibly
> user?) error: I have a 260 x 230 matrix, and with k+p = 40 it fails with
>
> Exception in thread "main" java.io.IOException: Q job unsuccessful.
>     at org.apache.mahout.math.hadoop.stochasticsvd.QJob.run(QJob.java:349)
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:262)
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.run(SSVDCli.java:91)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.main(SSVDCli.java:131)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>
> Suppose I set k+p to be much smaller, say around 20: then it works fine. Is
> it just that my dataset is of low rank, or is there something else going on
> here?
>
> Thanks
> Esh

On Aug 14, 2011, at 1:47 PM, Dmitriy Lyubimov wrote:
> ... I need to let some time for review before pushing to the ASF repo) ...

On Sun, Aug 14, 2011 at 1:47 PM, Dmitriy Lyubimov <[email protected]> wrote:
> The patch is posted as MAHOUT-786.
>
> Also, 0.6 trunk with the patch applied is here:
> https://github.com/dlyubimov/mahout-commits/tree/MAHOUT-786
>
> I will commit to the ASF repo tomorrow night (even though it is extremely
> simple, i need ...

On Sat, Aug 13, 2011 at 1:48 PM, Eshwaran Vijaya Kumar <[email protected]> wrote:
> Dmitriy,
> That sounds great. I eagerly await the patch.
> Thanks
> Esh
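On the "is my dataset just low rank?" question above, one quick sanity check is to run the (small) matrix through an in-memory SVD and count the singular values above a tolerance. This is a generic sketch, not something the SSVD pipeline does for you; it assumes the Colt/JAMA-style API returns singular values in descending order, and the class name and tolerance are illustrative.

    import org.apache.mahout.math.DenseMatrix;
    import org.apache.mahout.math.Matrix;
    import org.apache.mahout.math.SingularValueDecomposition;

    // Illustrative numerical-rank estimate for a matrix small enough to fit in
    // memory; helps tell "my data is rank deficient" apart from a pipeline bug.
    public final class NumericalRank {

      static int numericalRank(Matrix a, double relativeTolerance) {
        // Assumption: singular values come back sorted largest first.
        double[] sigma = new SingularValueDecomposition(a).getSingularValues();
        if (sigma.length == 0) {
          return 0;
        }
        double cutoff = sigma[0] * relativeTolerance;
        int rank = 0;
        for (double s : sigma) {
          if (s > cutoff) {
            rank++;
          }
        }
        return rank;
      }

      public static void main(String[] args) {
        // Rank-1 example: the second row is a multiple of the first.
        Matrix a = new DenseMatrix(new double[][] {{1, 2, 3}, {2, 4, 6}});
        System.out.println(numericalRank(a, 1e-10)); // prints 1
      }
    }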
On Aug 13, 2011, at 1:37 PM, Dmitriy Lyubimov wrote:
> Ok, I got u0 working.
>
> The problem is, of course, that the job called BBt needs to be coerced to
> have 1 reducer (that's fine: no mapper will yield more than an
> upper-triangular matrix of k+p x k+p geometry, so even if you end up with
> thousands of them, the reducer will sum them up just fine).
>
> It apparently worked before because the configuration holds 1 reducer by
> default if it is not set explicitly; I am not quite sure whether it's
> something in the hadoop MR client or a Mahout change that now precludes it
> from working.
>
> Anyway, I have a patch (really a one-liner), and an example equivalent to
> yours worked fine for me with 3 reducers.
>
> Also, the tests likewise request 3 reducers, but the reason it works in the
> tests and not in distributed mapred is that local mapred doesn't support
> multiple reducers. I investigated this issue before, and apparently there
> were a couple of patches floating around, but for some reason those changes
> did not take hold in cdh3u0.
>
> I will publish the patch in a jira shortly and will commit it Sunday-ish.
>
> Thanks.
> -d

On Fri, Aug 5, 2011 at 7:06 PM, Eshwaran Vijaya Kumar <[email protected]> wrote:
> OK. To add more info to this: I tried setting the number of reducers to 1,
> and now I don't get that particular error. The singular values and the left
> and right singular vectors appear to be correct (verified using Matlab).
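To make the BBt point above concrete: each mapper emits at most one upper-triangular (k+p) x (k+p) partial product, so a single reducer that simply adds them up is cheap no matter how many mappers ran. Here is a minimal sketch of that accumulation over packed arrays; it is illustrative only and not Mahout's actual reducer or its writable types.

    import java.util.Arrays;
    import java.util.List;

    // Illustrative sum of packed upper-triangular partial products, mimicking
    // what a single BBt reducer has to do: add up one small (k+p) x (k+p)
    // upper-triangular matrix per mapper. Not Mahout's actual reducer code.
    public final class UpperTriangularSum {

      // Packed row-major upper-triangular storage holds n*(n+1)/2 doubles.
      static int packedLength(int n) {
        return n * (n + 1) / 2;
      }

      static double[] sum(List<double[]> partials, int n) {
        double[] acc = new double[packedLength(n)];
        for (double[] partial : partials) {
          if (partial.length != acc.length) {
            throw new IllegalArgumentException("geometry mismatch");
          }
          for (int i = 0; i < acc.length; i++) {
            acc[i] += partial[i];
          }
        }
        return acc;
      }

      public static void main(String[] args) {
        int kPlusP = 3; // tiny k+p just for the demo
        double[] fromMapper1 = {1, 2, 3, 4, 5, 6};
        double[] fromMapper2 = {6, 5, 4, 3, 2, 1};
        System.out.println(
            Arrays.toString(sum(Arrays.asList(fromMapper1, fromMapper2), kPlusP)));
        // prints [7.0, 7.0, 7.0, 7.0, 7.0, 7.0]
      }
    }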
On Aug 5, 2011, at 1:55 PM, Eshwaran Vijaya Kumar wrote:
> All,
> I am trying to test Stochastic SVD and am facing some errors where it would
> be great if someone could clarify what is going on. I am trying to feed the
> solver a DistributedRowMatrix with the exact same parameters that the test
> in LocalSSVDSolverSparseSequentialTest uses, i.e., generate a 1000 x 100
> DRM with SequentialSparseVectors and then ask for blockHeight = 251,
> p (oversampling) = 60, k (rank) = 40. I get the following error:
>
> Exception in thread "main" java.io.IOException: Unexpected overrun in upper triangular matrix files
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.loadUpperTriangularMatrix(SSVDSolver.java:471)
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:268)
>     at com.mozilla.SSVDCli.run(SSVDCli.java:89)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at com.mozilla.SSVDCli.main(SSVDCli.java:129)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>
> Also, I am using CDH3 with Mahout recompiled to work with the CDH3 jars.
>
> Thanks
> Esh
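For context on the error above: a packed upper-triangular matrix of order n holds exactly n*(n+1)/2 values, so a loader that expects one such matrix but receives output from several reducers presumably reads past that count and reports an overrun, which is consistent with the one-reducer fix discussed earlier in the thread. The loader below is illustrative only, not Mahout's SSVDSolver code.

    import java.util.Arrays;

    // Illustrative loader for a packed upper-triangular matrix of order n. If
    // the input carries more than n*(n+1)/2 values (e.g. output from more than
    // one reducer), the read overruns the expected length and fails, which is
    // the same flavour of failure reported above.
    public final class PackedUpperTriangularLoader {

      static double[] load(Iterable<Double> values, int n) {
        int expected = n * (n + 1) / 2;
        double[] packed = new double[expected];
        int i = 0;
        for (double v : values) {
          if (i >= expected) {
            throw new IllegalStateException("unexpected overrun in upper triangular data");
          }
          packed[i++] = v;
        }
        if (i < expected) {
          throw new IllegalStateException("upper triangular data ended early");
        }
        return packed;
      }

      public static void main(String[] args) {
        // n = 2 -> exactly 3 packed values are expected; a 4th would overrun.
        System.out.println(Arrays.toString(load(Arrays.asList(1.0, 2.0, 3.0), 2)));
      }
    }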
