What difference do you see between a "sequence file of dense vectors" and a
"distributed row matrix"?

A DRM (at least for the purposes of SSVD/PCA) is a set of sequence files whose
keys are any Writable and whose values are o.a.m...VectorWritable. It makes no
difference whether you use the embedded solver or the CLI; the definition of
the input is the same.
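
To make the input definition concrete: each value in such a sequence file is one row of the matrix. The PCA mean discussed in the quoted thread below is, as far as I understand it, the column-wise mean of those rows. The sketch below is plain Java only and uses no Mahout or Hadoop classes. The class name `PcaMeanSketch`, the method `columnMeans`, and the sample rows are hypothetical names chosen for illustration, not part of any Mahout API.

```java
import java.util.Arrays;

// Plain-Java illustration only; no Mahout or Hadoop classes are used.
// Each double[] stands in for one dense row vector of the input matrix.
final class PcaMeanSketch {

    // Column-wise mean of the rows: one entry per column of the matrix.
    static double[] columnMeans(double[][] rows) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("need at least one row");
        }
        double[] mean = new double[rows[0].length];
        for (double[] row : rows) {
            if (row.length != mean.length) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int j = 0; j < row.length; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < mean.length; j++) {
            mean[j] /= rows.length;
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0, 2.0, 3.0}, {3.0, 4.0, 5.0}};
        // prints [2.0, 3.0, 4.0]
        System.out.println(Arrays.toString(columnMeans(rows)));
    }
}
```

In the real pipeline this vector is not computed by hand. Per the thread below, SSVDCli computes it with DistributedRowMatrix and hands its location to the solver; the sketch only shows what quantity is being passed.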

-d


On Tue, Apr 16, 2013 at 9:36 AM, Chirag Lakhani <[email protected]> wrote:

> Thanks, the CLI code seems to help a great deal.  I am still confused
> about the distributed row format.  When I have used the command line in
> Mahout I had a sequence file of dense vectors and that seemed to be fine.
>  Is it possible to use that as an input or do I need to take that file and
> make it into a distributed row matrix type?
>
>
>
> On Fri, Apr 12, 2013 at 1:19 PM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > On Fri, Apr 12, 2013 at 8:42 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > No, this is not right.
> > >
> > > I will explain later when I have a moment.
> > > On Apr 12, 2013 8:08 AM, "Chirag Lakhani" <[email protected]> wrote:
> > >
> > >> I am having trouble understanding whether the following code is
> > >> sufficient for running PCA.
> > >>
> > >> I have a sequence file of dense vectors that I am using as input, and I
> > >> am trying to run the following code:
> > >>
> > >> SSVDSolver pcaFactory = new SSVDSolver(conf, new Path(vectorsFolder),
> > >>         new Path(pcaOutput), 18, 5, 3, 10);
> > >>
> > >>
> > >>         pcaFactory.setPcaMeanPath(pcaFactory.getPcaMeanPath());
> > >>
> > > The SSVD solver doesn't compute the PCA mean; it requires it. This line
> > > therefore achieves nothing.
> >
> > SSVDCli.java computes the PCA mean using DistributedRowMatrix and passes it
> > to the SSVD solver. This behavior is switched on by the -pca option. See the
> > SSVDCli code for details.
> >
> > -d
> >
> >
> > >>         pcaFactory.run();
> > >>
> > >>
> > >> Is this enough for PCA, or does anyone have example code they are
> > >> willing to share that shows how PCA works using the SSVD solver?
> > >>
> > >
> >
>
