What difference do you see between a "sequence file of dense vectors" and a "distributed row matrix"?
DRM (at least for the purposes of SSVD/PCA) is a set of sequence files with keys of any Writable type and values of o.a.m...VectorWritable. It makes no difference whether you use the embedded solver or the CLI; the definition of the input is the same.

-d

On Tue, Apr 16, 2013 at 9:36 AM, Chirag Lakhani <[email protected]> wrote:

> Thanks, the CLI code seems to help a great deal. I am still confused
> about the distributed row format. When I have used the command line in
> Mahout I had a sequence file of dense vectors and that seemed to be fine.
> Is it possible to use that as an input or do I need to take that file and
> make it into a distributed row matrix type?
>
> On Fri, Apr 12, 2013 at 1:19 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > On Fri, Apr 12, 2013 at 8:42 AM, Dmitriy Lyubimov <[email protected]> wrote:
> >
> > > No, this is not right.
> > >
> > > I will explain later when I have a moment.
> > > On Apr 12, 2013 8:08 AM, "Chirag Lakhani" <[email protected]> wrote:
> > >
> > >> I am having trouble understanding whether the following code is
> > >> sufficient for running PCA
> > >>
> > >> I have a sequence file of dense vectors that I am calling and then I am
> > >> trying to run the following code
> > >>
> > >> SSVDSolver pcaFactory = new SSVDSolver(conf, new Path(vectorsFolder), new
> > >> Path(pcaOutput),18,5,3,10);
> > >>
> > >> pcaFactory.setPcaMeanPath(pcaFactory.getPcaMeanPath());
> > >>
> > > ssvd solver doesn't compute pca mean -- it requires it. this line
> > > therefore achieves nothing
> >
> > SSVDCli.java computes PCA mean using DistributedRowMatrix and passes it
> > over to SSVD Solver. This behavior is switched on by -pca option. See the
> > SSVDCli code for details.
> >
> > -d
> >
> > >> pcaFactory.run();
> > >>
> > >> Is this enough for PCA or does anyone have example code they are willing
> > >> to share to see how PCA works using the SSVD solver.
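
For anyone unsure what the "PCA mean" discussed in the quoted thread refers to: it is the column-wise mean of the input rows, which the solver expects to have been computed beforehand. Below is a minimal plain-Java sketch of that computation. It is only an illustration and uses no Mahout or Hadoop classes; the names `ColumnMeanSketch` and `columnMean` are hypothetical, and the rows are assumed to fit in memory as `double[]` arrays rather than being read from sequence files.

```java
import java.util.Arrays;

public class ColumnMeanSketch {

    // Column-wise mean of a dense matrix given as an array of equal-length rows.
    // Conceptually, this is the vector that PCA subtracts from every row.
    public static double[] columnMean(double[][] rows) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("need at least one row");
        }
        int n = rows[0].length;
        double[] mean = new double[n];
        for (double[] row : rows) {
            if (row.length != n) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int j = 0; j < n; j++) {
                mean[j] += row[j]; // accumulate per-column sums
            }
        }
        for (int j = 0; j < n; j++) {
            mean[j] /= rows.length; // turn sums into means
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] rows = { {1.0, 2.0}, {3.0, 6.0} };
        System.out.println(Arrays.toString(columnMean(rows))); // prints [2.0, 4.0]
    }
}
```

In Mahout itself, per the thread above, SSVDCli performs this step over a DistributedRowMatrix when -pca is given and hands the result to the solver; the sketch only shows what is being computed, not how Mahout does it.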
