Changes in sign are, of course, not important.

On Sat, Sep 1, 2012 at 3:45 AM, Dmitriy Lyubimov <[email protected]> wrote:
> sorry, i meant "random trinary"
>
> On Sat, Sep 1, 2012 at 12:39 AM, Dmitriy Lyubimov <[email protected]> wrote:
> > Hm. there is a slight error between R full rank SVD and Mahout MR SSVD
> > for my unit test modified for 100x100 k=3 p=10.
> >
> > First left vector (R/SSVD) :
> >> s$u[,1]
> >  [1] -0.050741660 -0.083985411  0.078767108 -0.044487425 -0.010380367
> >  [6]  0.069635451  0.158337400  0.029102044 -0.168156173 -0.127921554
> > [11]  0.012698809 -0.027140724  0.069357925 -0.015605283  0.076614201
> > [16] -0.158582188  0.143656275  0.033886221 -0.055111330 -0.029299261
> > [21]  0.059667350  0.039205405  0.042027376  0.048541162  0.158267382
> > [26] -0.045441433  0.044529295 -0.038681358 -0.024035611 -0.054543123
> > [31]  0.027365365 -0.054029635 -0.021845631  0.053124795  0.050475680
> > [36] -0.093776477  0.094699229 -0.030911885 -0.169810667  0.149075410
> > [41]  0.102150407  0.165651229  0.175798233 -0.048390507  0.175243690
> > [46] -0.170793896  0.059918820 -0.132466003 -0.131783388 -0.178422266
> > [51]  0.079304233 -0.054428953  0.057820900  0.120791505  0.095287617
> > [56]  0.036671894 -0.081203386  0.153768112  0.014849405  0.027470798
> > [61] -0.064944829 -0.007538214  0.069034637 -0.133978151 -0.022290433
> > [66] -0.038094067  0.168947231 -0.100797474 -0.054253041 -0.040255069
> > [71]  0.124817481 -0.059689202  0.018821181 -0.131237426 -0.141223359
> > [76]  0.128026731 -0.170388319  0.080445852  0.071966615 -0.029745918
> > [81]  0.049479520 -0.121362268 -0.077338205 -0.061950828 -0.168851635
> > [86] -0.073192796  0.087453086 -0.085166577  0.160026655 -0.060816556
> > [91]  0.015420973  0.117780809  0.083415819 -0.160806975  0.171932591
> > [96]  0.170064367  0.001479280 -0.161878123  0.129685305 -0.104231610
> >> U[,1]
> >            1            2            3            4            5            6
> >  0.050741634  0.083985464 -0.078767344  0.044487660  0.010380470 -0.069635561
> >            7            8            9           10           11           12
> > -0.158337117 -0.029102012  0.168156073  0.127921760 -0.012698756  0.027140487
> >           13           14           15           16           17           18
> > -0.069358074  0.015605295 -0.076614050  0.158582091 -0.143656127 -0.033886485
> >           19           20           21           22           23           24
> >  0.055111560  0.029299084 -0.059667201 -0.039205182 -0.042027356 -0.048541087
> >           25           26           27           28           29           30
> > -0.158267335  0.045441521 -0.044529241  0.038681577  0.024035604  0.054543106
> >           31           32           33           34           35           36
> > -0.027365256  0.054029674  0.021845620 -0.053124833 -0.050475677  0.093776656
> >           37           38           39           40           41           42
> > -0.094699463  0.030911730  0.169810791 -0.149075076 -0.102150266 -0.165651017
> >           43           44           45           46           47           48
> > -0.175798375  0.048390265 -0.175243708  0.170793758 -0.059918703  0.132465938
> >           49           50           51           52           53           54
> >  0.131783579  0.178422152 -0.079304282  0.054428751 -0.057820999 -0.120791565
> >           55           56           57           58           59           60
> > -0.095287586 -0.036671995  0.081203324 -0.153767938 -0.014849361 -0.027471027
> >           61           62           63           64           65           66
> >  0.064944979  0.007538413 -0.069034788  0.133978044  0.022290513  0.038094051
> >           67           68           69           70           71           72
> > -0.168947352  0.100797649  0.054253165  0.040255237 -0.124817480  0.059689502
> >           73           74           75           76           77           78
> > -0.018821295  0.131237429  0.141223597 -0.128027116  0.170388135 -0.080445760
> >           79           80           81           82           83           84
> > -0.071966482  0.029745819 -0.049479559  0.121362303  0.077338278  0.061950724
> >           85           86           87           88           89           90
> >  0.168851648  0.073193002 -0.087453189  0.085166809 -0.160026464  0.060816590
> >           91           92           93           94           95           96
> > -0.015421147 -0.117780975 -0.083415727  0.160806958 -0.171932343 -0.170064514
> >           97           98           99          100
> > -0.001479434  0.161878089 -0.129685379  0.104231530
> >
> > Same thing for the right singular vectors. The only thing is that they
> > seem to change the sign between R and Mahout's version but otherwise
> > they fit more or less exactly.
> >
> > So yeah i am seeing some stochastic effects in these for k and p being
> > so low -- so are you saying your errors are greater than those? I did
> > not test sequential version with similar parameters.
> >
> > One significant difference between MR and sequential version is that
> > sequential version is using ternary random matrix (instead of uniform
> > one), perhaps that may affect accuracy a little bit.
> >
> > On Fri, Aug 31, 2012 at 10:55 PM, Ted Dunning <[email protected]> wrote:
> >> Can you provide your test code?
> >>
> >> What difference did you observe?
> >>
> >> Did you account for the fact that your matrix is small enough that it
> >> probably wasn't divided correctly?
> >>
> >> On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> I used mahout's stochastic svd implementation to find the singular
> >>> values and the singular vectors of a small matrix 99x100. Then, I
> >>> compared the results to the singular values and the singular vectors
> >>> obtained using the svd function in matlab and the single threaded
> >>> version of the ssvd. I got pretty much the same singular values using
> >>> the 3 implementations. However, the singular vectors of mahout's ssvd
> >>> were significantly different. I tried multiple values for the
> >>> parameters P and Q, but that does not seem to solve the problem. Does
> >>> MR implementation of the SSVD do extra approximations over the single
> >>> threaded ssvd so their results might not be the same? Any advice how I
> >>> can tune mahout's ssvd to get the same singular vectors of the single
> >>> threaded ssvd?
> >>>
> >>> thanks,
> >>>
> >>> --ahmed
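
For anyone who wants to check this kind of thing mechanically: below is a minimal sketch of a sign-insensitive comparison of two singular vectors. It is plain Python, not part of R or Mahout, and the function names (align_sign, max_abs_diff) are made up for illustration. The data are just the first five entries of s$u[,1] and U[,1] quoted in this thread, so the result only says something about those five entries.

```python
def align_sign(reference, candidate):
    """Return candidate, negated if it points opposite to reference.

    A singular vector is only defined up to sign, so we pick the sign
    that makes the dot product with the reference non-negative.
    """
    dot = sum(r * c for r, c in zip(reference, candidate))
    sign = -1.0 if dot < 0 else 1.0
    return [sign * c for c in candidate]


def max_abs_diff(reference, candidate):
    """Largest entry-wise difference after sign alignment."""
    aligned = align_sign(reference, candidate)
    return max(abs(r - a) for r, a in zip(reference, aligned))


# First five entries of the first left singular vector from the thread.
r_u1 = [-0.050741660, -0.083985411, 0.078767108, -0.044487425, -0.010380367]
ssvd_u1 = [0.050741634, 0.083985464, -0.078767344, 0.044487660, 0.010380470]

# The raw difference is large only because of the sign flip;
# after alignment the entries agree to roughly six decimal places.
print("max |diff| after sign alignment:", max_abs_diff(r_u1, ssvd_u1))
```

The same check applies column by column to the right singular vectors. When singular values are close together the columns themselves can mix, so a per-column comparison like this one is only meaningful for well-separated singular values.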
