Many people also use the PCA workflow with SSVD and then cluster the U*Sigma output, which is a dimensionally reduced representation of the original row-wise dataset. To enable PCA and U*Sigma output, use
ssvd -pca true -us true -u false -v false -k=... -q=1 ...

-q=1 is recommended for accuracy.

On Wed, Jul 31, 2013 at 5:09 AM, Stuti Awasthi <[email protected]> wrote:

> Hi All,
>
> I want to group together documents that share the same context but
> belong to a single domain. I have tried the KMeans and LDA
> implementations provided in Mahout to perform the clustering, but the
> groups they generate are not very good. Hence I thought of using LSA to
> identify the context related to each word and then performing the
> clustering.
>
> I am able to run Mahout's SSVD, and it generated 3 files (Sigma, U, V)
> as output.
> I am not sure how to feed the SSVD output to the clustering algorithm
> so that we can generate clusters of documents that might be talking
> about the same context.
>
> Any pointers on how I can achieve this?
>
> Regards
> Stuti Awasthi
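
To illustrate what the U*Sigma output mentioned in the reply represents, here is a minimal NumPy sketch, not Mahout code. It assumes an exact SVD of the mean-centered matrix in place of Mahout's stochastic SVD, a tiny made-up document-term matrix, and a bare-bones k-means with farthest-point initialization. The names `pca_us` and `kmeans` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def pca_us(a, k):
    """Return the first k columns of U * Sigma for the mean-centered rows of a."""
    centered = a - a.mean(axis=0)  # PCA step: subtract each column's mean
    u, s, _vt = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]  # scale each kept column of U by its singular value


def kmeans(x, n_clusters, n_iter=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, n_clusters):
        # Squared distance from every row to its nearest already-chosen center.
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Assign each row to its closest center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its rows.
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


# Hypothetical toy data: 6 documents (rows) x 4 terms (columns).
docs = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [5, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
    [0, 0, 5, 5],
], dtype=float)

us = pca_us(docs, k=2)            # one reduced 2-dimensional row per document
labels = kmeans(us, n_clusters=2)  # cluster the reduced rows, not the raw counts
print(us.shape, labels)
```

Each row of `us` corresponds to one input document, so whatever cluster a row lands in is the cluster of that document. In Mahout the same idea would apply to the U*Sigma rows written by the ssvd job, which would then be passed as input vectors to the clustering job.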
