It should. I worked on this issue, and the last time it was checked, name propagation was still working. If it isn't, then it is a bug.
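
A rough way to check a dump for names, as a sketch only: it assumes seqdumper prints one record per line in the two shapes shown in the quoted message below (`Value: {…}` for a plain vector, `Value: /File_1:{…}` for a named one). The helper name `vector_name` is made up for illustration and is not part of Mahout.

```python
import re

# Assumed record shapes, taken from the seqdumper output quoted in this thread:
#   Key: /File_1: Value: {0:0.027,...}            -> plain vector
#   Key: /File_1: Value: /File_1:{0:0.027,...}    -> named vector
LINE_RE = re.compile(r"^Key: (?P<key>.*?): Value: (?P<value>.*)$")


def vector_name(line):
    """Return the name prefix of a dumped vector, or None if it has none."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # not a "Key: ...: Value: ..." record
    value = match.group("value")
    brace = value.find("{")
    if brace <= 0:
        return None  # value starts with "{" (or has no "{"): no name
    prefix = value[:brace].rstrip()
    # A named vector is printed as "<name>:{...}", so drop the trailing colon.
    return prefix[:-1] if prefix.endswith(":") else None


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): mahout seqdumper -i <path to U> | python check_names.py
    for raw in sys.stdin:
        if raw.startswith("Key: "):
            print(vector_name(raw) or "<no name>")
```

If every record prints `<no name>`, the names were lost somewhere between seq2sparse and the ssvd output.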

On Fri, Aug 2, 2013 at 3:33 AM, Stuti Awasthi <[email protected]> wrote:

> Hey Ted,
>
> As suggested, I tried SSVD with Mahout 0.8. I think the issue of
> NamedVector not propagating to the output U ,still persists.
> Here is what I have done :
>
> 1. Created featureVector using "seq2sparse" with -nv option. Checked the
> output, named vector created.
> 2. Provided this featureVector to "ssvd" with params " -k 100 -U true -V
> true". After execution, 3 output got generated namely : sigma, U, V
> 3. I dumped "U" to check the output if it contained namedVectors:
>
> mahout-distribution-0.7/bin/mahout seqdumper -i /stuti/SSVD/Output/U | more
> Output:
> Key: /File_1: Value:
> {0:0.027019746696983288,1:0.006124424321845726,2:0.0334311500858222,.....}
>
> I did not see the NamedVector getting created in the output of ssvd.
> Please point out if I have missed any step in between.
>
> As I wanted to perform the Clustering, I took the output of "U" and
> generated the NamedVector with custom code. The output looks like this :
> Key: /File_1: Value:
> /File_1:{0:0.027019746696983288,1:0.006124424321845726,2:0.0334311500858222,.....}
>
> Then I fed this namedvector file to KMeans to generate 10 Clusters. In
> this I have used Random Centroid selection with KMeans.
> Finally I dumped the ClusterOutput as :
> <ClusterId>,<DocumentID1>,<DocumentId2>.....
>
> Please let me know if I have performed any mistake in the end to end
> execution as well Im not sure Why SSVD output is not generating the named
> vectors as the issue id fixed..
>
> Please suggest
>
> Regards
> Stuti Awasthi
>
> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
> Sent: Thursday, August 01, 2013 8:37 PM
> To: [email protected]
> Subject: Re: How to SSVD output to generate Clusters
>
> On Thu, Aug 1, 2013 at 5:49 AM, Stuti Awasthi <[email protected]>
> wrote:
>
> > I think there is a problem because of NamedVector as after some search
> > I get this Jira.
> > https://issues.apache.org/jira/browse/MAHOUT-1067
>
> Note also that this bug is fixed in 0.8
