I am sorry, I meant 'subtract the mean', not the median. PCA needs the data centered on the column means before you run the SVD.
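To make the centering step concrete, here is a minimal in-memory sketch in Python/NumPy. It is only an illustration of "subtract the mean, then SVD" on a small random matrix, not what 'bin/mahout ssvd' does internally; the function name `pca_via_svd`, the parameter `k`, and the toy data are my own assumptions. For ~10 million vectors you would compute the column means in a distributed pass and hand the centered data to the distributed SVD instead.

```python
import numpy as np


def pca_via_svd(X, k):
    """Hypothetical helper: PCA of X (rows = vectors) via mean-centering + SVD."""
    mean = X.mean(axis=0)              # per-column mean (one value per component)
    Xc = X - mean                      # center the data: this is the "subtract a mean" step
    # Thin SVD of the centered matrix: Xc = U * diag(s) * Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                # top-k principal directions, one per row
    projected = Xc @ components.T      # coordinates of each vector in PCA space
    return mean, components, s[:k], projected


# Toy stand-in for the real data set (assumption: 1000 vectors of 400 components).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 400))

mean, components, singular_values, projected = pca_via_svd(X, k=10)
```

The rows of `components` are orthonormal, and `singular_values` come back in descending order, so the first projected coordinate carries the most variance.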

On Tue, Sep 6, 2011 at 10:50 AM, Dmitriy Lyubimov <[email protected]> wrote:
> You need to massage your data to compute (and subract) a median first,
> as far as i understand. That should be relatively easy to do. Then you
> can run a distributed SVD on it ('bin/mahout ssvd' command from trunk
> should be quite good to try).
>
> -d
>
>
> On Tue, Sep 6, 2011 at 5:33 AM, Amr Desoky <[email protected]> wrote:
>> Hi,
>> It is mentioned on the web site :
>> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
>> That you implement the following algorithms within Mahout :
>> Gaussian Discriminative Analysis
>> Independent Component Analysis
>> Principal Components Analysis
>>
>> But unfortunately, I could not find any help or documentation on how to use
>> these algorithms!!
>> specially I would like to try PCA on a huge data set of ~10Million vectors
>> of 400 components each.
>>
>> Please give me some help on how to run PCA (and also ICA, GDA) whatever
>> available.
>>
>> Best regards,
>> Amr
>>
>>
>> Amr Ibrahim El-Desoky, Mousa
>> PhD Student, Computer Science (i6),
>> RWTH-Aachen University,
>> Aachen, Germany
>> Cel. : +49 0176 56418470
>> Office : +49 241 8021620
>> Fax : +49 241 8022219
>
