Hi All, I want to group documents that share the same context, where all of the documents belong to a single domain. I have tried the KMeans and LDA implementations provided in Mahout to perform the clustering, but the groups they generate are not very good. Hence I thought of using LSA to identify the context related to each word and then perform the clustering.
I was able to run Mahout's SSVD, which generated three outputs: Sigma, U and V. I am not sure how to feed the SSVD output to a clustering algorithm so that it generates clusters of documents that talk about the same context. Any pointers on how I can achieve this? Regards, Stuti Awasthi
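To make the step in question concrete, here is a minimal sketch of one common way LSA output is used for clustering. It rests on several assumptions: the matrix given to SSVD had documents as rows, so each row of U corresponds to one document; the reduced document vectors are taken as the rows of U scaled by the singular values (using U alone is another option); and the numbers in `U` and `SIGMA` are made up for illustration, not real SSVD output. The small `kmeans` function is a plain-Python stand-in, not Mahout's KMeans.

```python
import random

# Hypothetical SSVD output: 6 documents reduced to rank 2.
# In a real run these values would come from the U and Sigma files.
U = [
    [0.50, 0.02],
    [0.48, -0.03],
    [0.52, 0.01],
    [0.03, 0.55],
    [-0.02, 0.60],
    [0.01, 0.57],
]
SIGMA = [4.0, 3.0]  # singular values, i.e. the diagonal of Sigma


def document_vectors(u, sigma):
    """Scale each row of U by the singular values (row i of U * Sigma)."""
    return [[x * s for x, s in zip(row, sigma)] for row in u]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Tiny k-means: returns one cluster label per input point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as seeds
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


vectors = document_vectors(U, SIGMA)
labels = kmeans(vectors, k=2)
for doc_id, label in enumerate(labels):
    print(f"document {doc_id} -> cluster {label}")
```

With Mahout itself, the analogous step would be to pass the reduced document vectors, rather than the original term vectors, as the input to the clustering job. The choice of rank and of whether to scale by Sigma are tuning decisions this sketch does not settle.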
