No, it's zero-mean uniform, of course: a murmur hash scaled to the -1...1 range. I used to use normal too, but you advised there was not much difference, and I actually did not see much either.
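
A minimal sketch of that kind of generator, for illustration only: a hash of (seed, row, column) scaled into [-1, 1). The Python standard library has no murmur hash, so BLAKE2b stands in for it here; the names `hashed_uniform` and `omega_mean` are hypothetical and are not Mahout's API. The 100 x 13 shape is just n = 100 and k + p = 13 from the test discussed in this thread.

```python
import hashlib
import struct


def hashed_uniform(seed: int, row: int, col: int) -> float:
    """Deterministically map (seed, row, col) to a value in [-1, 1)."""
    # Stand-in hash: BLAKE2b from the standard library, NOT murmur.
    key = struct.pack("<qqq", seed, row, col)
    digest = hashlib.blake2b(key, digest_size=8).digest()
    h = int.from_bytes(digest, "little")  # unsigned 64-bit integer
    # Keep the top 53 bits so the scaling is exact in double precision:
    # (h >> 11) lies in [0, 2^53), so the result lies in [-1, 1).
    return (h >> 11) / 2.0**52 - 1.0


def omega_mean(seed: int, rows: int, cols: int) -> float:
    """Sample mean of the hashed entries; should be close to zero."""
    total = sum(hashed_uniform(seed, i, j)
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


if __name__ == "__main__":
    # Assumed shape: n = 100 rows, k + p = 13 columns.
    print(omega_mean(42, 100, 13))
```

Because every entry is a pure function of its indices and the seed, the random matrix never has to be stored, and the sample mean gives a quick check on the non-zero-mean concern.
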
I also think that in this case, moving the input to R via decimals actually introduced precision errors too. I will double check. My synthetic test input also has a flat tail in the lower singular values, which of course messes up some singular vectors in the tail but doesn't affect the singular values. I will check for these things and look again. But I don't see a fundamental problem with the results I see: they are the same down to the eighth digit after the decimal point.

On Sep 1, 2012 1:03 AM, "Ted Dunning" <[email protected]> wrote:

> Oho...
>
> If the uniform randoms have non-zero means, then this could be a
> significant effect that leads to some loss of significance in the results.
> For small matrices the resulting difference shouldn't be huge but it might
> well be observable.
>
> On Sat, Sep 1, 2012 at 3:45 AM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > sorry, i meant "random trinary"
> >
> > On Sat, Sep 1, 2012 at 12:39 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> > > Hm. there is slight error between R full rank SVD and Mahout MR SSVD
> > > for my unit test modified for 100x100 k= 3 p=10.
> > >
> > > First left vector (R/SSVD) :
> > >> s$u[,1]
> > >   [1] -0.050741660 -0.083985411  0.078767108 -0.044487425 -0.010380367
> > >   [6]  0.069635451  0.158337400  0.029102044 -0.168156173 -0.127921554
> > >  [11]  0.012698809 -0.027140724  0.069357925 -0.015605283  0.076614201
> > >  [16] -0.158582188  0.143656275  0.033886221 -0.055111330 -0.029299261
> > >  [21]  0.059667350  0.039205405  0.042027376  0.048541162  0.158267382
> > >  [26] -0.045441433  0.044529295 -0.038681358 -0.024035611 -0.054543123
> > >  [31]  0.027365365 -0.054029635 -0.021845631  0.053124795  0.050475680
> > >  [36] -0.093776477  0.094699229 -0.030911885 -0.169810667  0.149075410
> > >  [41]  0.102150407  0.165651229  0.175798233 -0.048390507  0.175243690
> > >  [46] -0.170793896  0.059918820 -0.132466003 -0.131783388 -0.178422266
> > >  [51]  0.079304233 -0.054428953  0.057820900  0.120791505  0.095287617
> > >  [56]  0.036671894 -0.081203386  0.153768112  0.014849405  0.027470798
> > >  [61] -0.064944829 -0.007538214  0.069034637 -0.133978151 -0.022290433
> > >  [66] -0.038094067  0.168947231 -0.100797474 -0.054253041 -0.040255069
> > >  [71]  0.124817481 -0.059689202  0.018821181 -0.131237426 -0.141223359
> > >  [76]  0.128026731 -0.170388319  0.080445852  0.071966615 -0.029745918
> > >  [81]  0.049479520 -0.121362268 -0.077338205 -0.061950828 -0.168851635
> > >  [86] -0.073192796  0.087453086 -0.085166577  0.160026655 -0.060816556
> > >  [91]  0.015420973  0.117780809  0.083415819 -0.160806975  0.171932591
> > >  [96]  0.170064367  0.001479280 -0.161878123  0.129685305 -0.104231610
> > >> U[,1]
> > >            1            2            3            4            5            6
> > >  0.050741634  0.083985464 -0.078767344  0.044487660  0.010380470 -0.069635561
> > >            7            8            9           10           11           12
> > > -0.158337117 -0.029102012  0.168156073  0.127921760 -0.012698756  0.027140487
> > >           13           14           15           16           17           18
> > > -0.069358074  0.015605295 -0.076614050  0.158582091 -0.143656127 -0.033886485
> > >           19           20           21           22           23           24
> > >  0.055111560  0.029299084 -0.059667201 -0.039205182 -0.042027356 -0.048541087
> > >           25           26           27           28           29           30
> > > -0.158267335  0.045441521 -0.044529241  0.038681577  0.024035604  0.054543106
> > >           31           32           33           34           35           36
> > > -0.027365256  0.054029674  0.021845620 -0.053124833 -0.050475677  0.093776656
> > >           37           38           39           40           41           42
> > > -0.094699463  0.030911730  0.169810791 -0.149075076 -0.102150266 -0.165651017
> > >           43           44           45           46           47           48
> > > -0.175798375  0.048390265 -0.175243708  0.170793758 -0.059918703  0.132465938
> > >           49           50           51           52           53           54
> > >  0.131783579  0.178422152 -0.079304282  0.054428751 -0.057820999 -0.120791565
> > >           55           56           57           58           59           60
> > > -0.095287586 -0.036671995  0.081203324 -0.153767938 -0.014849361 -0.027471027
> > >           61           62           63           64           65           66
> > >  0.064944979  0.007538413 -0.069034788  0.133978044  0.022290513  0.038094051
> > >           67           68           69           70           71           72
> > > -0.168947352  0.100797649  0.054253165  0.040255237 -0.124817480  0.059689502
> > >           73           74           75           76           77           78
> > > -0.018821295  0.131237429  0.141223597 -0.128027116  0.170388135 -0.080445760
> > >           79           80           81           82           83           84
> > > -0.071966482  0.029745819 -0.049479559  0.121362303  0.077338278  0.061950724
> > >           85           86           87           88           89           90
> > >  0.168851648  0.073193002 -0.087453189  0.085166809 -0.160026464  0.060816590
> > >           91           92           93           94           95           96
> > > -0.015421147 -0.117780975 -0.083415727  0.160806958 -0.171932343 -0.170064514
> > >           97           98           99          100
> > > -0.001479434  0.161878089 -0.129685379  0.104231530
> > >
> > > Same thing for the right singular vectors. The only thing is that they
> > > seem to change the sign between R and Mahout's version but otherwise
> > > they fit more or less exactly.
> > >
> > > So yeah i am seeing some stochastic effects in these for k and p being
> > > so low -- so are you saying your errors are greater than those? I did
> > > not test sequential version with similar parameters.
> > >
> > > One significant difference between MR and sequential version is that
> > > sequential version is using ternary random matrix (instead of uniform
> > > one), perhaps that may affect accuracy a little bit.
> > >
> > > On Fri, Aug 31, 2012 at 10:55 PM, Ted Dunning <[email protected]>
> > > wrote:
> > >> Can you provide your test code?
> > >>
> > >> What difference did you observe?
> > >>
> > >> Did you account for the fact that your matrix is small enough that it
> > >> probably wasn't divided correctly?
> > >>
> > >> On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary <[email protected]>
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I used mahout's stochastic svd implementation to find the singular
> > >>> vectors and the singular vectors of a small matrix 99x100. Then, I
> > >>> compared the results to the singular values and the singular vectors
> > >>> obtained using the svd function in matlab and the single threaded
> > >>> version of the ssvd. I got pretty much the same singular values using
> > >>> the 3 implementations. however, the singular vectors of mahout's ssvd
> > >>> were significantly different. I tried multiple values for the
> > >>> parameters P and Q but, that does not seem to solve the problem. Does
> > >>> MR implementation of the SSVD do extra approximations over the single
> > >>> threaded ssvd so their results might not be the same? Any advice how
> > >>> I can tune mahout's ssvd to get the same singular vectors of the
> > >>> single threaded ssvd?
> > >>>
> > >>> thanks,
> > >>>
> > >>> --ahmed
> > >>>
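
Since a singular vector is only determined up to sign, the sign flip between R and Mahout noted in the quoted thread has to be removed before two implementations can be compared entry by entry. A minimal sketch of such a check, using only the first five entries of `s$u[,1]` and `U[,1]` from the listing in the thread; the helper names `align_sign` and `max_abs_diff` are hypothetical:

```python
def align_sign(reference, candidate):
    """Return candidate, negated if it points opposite to reference."""
    dot = sum(r * c for r, c in zip(reference, candidate))
    return [-c for c in candidate] if dot < 0 else list(candidate)


def max_abs_diff(reference, candidate):
    """Largest entrywise difference after removing the sign ambiguity."""
    aligned = align_sign(reference, candidate)
    return max(abs(r - c) for r, c in zip(reference, aligned))


# First five entries of each vector, copied from the quoted output.
r_u = [-0.050741660, -0.083985411, 0.078767108, -0.044487425, -0.010380367]
mahout_u = [0.050741634, 0.083985464, -0.078767344, 0.044487660, 0.010380470]

if __name__ == "__main__":
    print(max_abs_diff(r_u, mahout_u))
```

On these five entries the difference after alignment is below 1e-6. Comparing the raw vectors without the alignment step would instead report a difference of roughly twice the largest entry, which can make matching vectors look significantly different.
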
