sorry, i meant "random trinary"
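
To make the distinction concrete, here is a minimal sketch of the two kinds of random test matrix, in plain Python. The function names, the uniform range [-1, 1], and the equal 1/3 probabilities for the trinary entries are assumptions for illustration only; nothing in this thread states what the Mahout code actually uses. The 100 x 13 shape assumes n = 100 and k + p = 3 + 10 from the unit test quoted below.

```python
import random


def uniform_matrix(rows, cols, rng):
    """Dense test matrix with entries drawn uniformly from [-1, 1] (assumed range)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def trinary_matrix(rows, cols, rng):
    """Test matrix with entries in {-1, 0, +1}, each equally likely (assumed weights)."""
    return [[rng.choice((-1, 0, 1)) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
omega_uniform = uniform_matrix(100, 13, rng)  # n x (k + p)
omega_trinary = trinary_matrix(100, 13, rng)
```

Either matrix plays the same role in the projection step; only the distribution of the entries differs.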
On Sat, Sep 1, 2012 at 12:39 AM, Dmitriy Lyubimov <[email protected]> wrote:
> Hm. there is slight error between R full rank SVD and Mahout MR SSVD
> for my unit test modified for 100x100 k= 3 p=10.
>
> First left vector (R/SSVD) :
>> s$u[,1]
>  [1] -0.050741660 -0.083985411  0.078767108 -0.044487425 -0.010380367
>  [6]  0.069635451  0.158337400  0.029102044 -0.168156173 -0.127921554
> [11]  0.012698809 -0.027140724  0.069357925 -0.015605283  0.076614201
> [16] -0.158582188  0.143656275  0.033886221 -0.055111330 -0.029299261
> [21]  0.059667350  0.039205405  0.042027376  0.048541162  0.158267382
> [26] -0.045441433  0.044529295 -0.038681358 -0.024035611 -0.054543123
> [31]  0.027365365 -0.054029635 -0.021845631  0.053124795  0.050475680
> [36] -0.093776477  0.094699229 -0.030911885 -0.169810667  0.149075410
> [41]  0.102150407  0.165651229  0.175798233 -0.048390507  0.175243690
> [46] -0.170793896  0.059918820 -0.132466003 -0.131783388 -0.178422266
> [51]  0.079304233 -0.054428953  0.057820900  0.120791505  0.095287617
> [56]  0.036671894 -0.081203386  0.153768112  0.014849405  0.027470798
> [61] -0.064944829 -0.007538214  0.069034637 -0.133978151 -0.022290433
> [66] -0.038094067  0.168947231 -0.100797474 -0.054253041 -0.040255069
> [71]  0.124817481 -0.059689202  0.018821181 -0.131237426 -0.141223359
> [76]  0.128026731 -0.170388319  0.080445852  0.071966615 -0.029745918
> [81]  0.049479520 -0.121362268 -0.077338205 -0.061950828 -0.168851635
> [86] -0.073192796  0.087453086 -0.085166577  0.160026655 -0.060816556
> [91]  0.015420973  0.117780809  0.083415819 -0.160806975  0.171932591
> [96]  0.170064367  0.001479280 -0.161878123  0.129685305 -0.104231610
>> U[,1]
>            1            2            3            4            5            6
>  0.050741634  0.083985464 -0.078767344  0.044487660  0.010380470 -0.069635561
>            7            8            9           10           11           12
> -0.158337117 -0.029102012  0.168156073  0.127921760 -0.012698756  0.027140487
>           13           14           15           16           17           18
> -0.069358074  0.015605295 -0.076614050  0.158582091 -0.143656127 -0.033886485
>           19           20           21           22           23           24
>  0.055111560  0.029299084 -0.059667201 -0.039205182 -0.042027356 -0.048541087
>           25           26           27           28           29           30
> -0.158267335  0.045441521 -0.044529241  0.038681577  0.024035604  0.054543106
>           31           32           33           34           35           36
> -0.027365256  0.054029674  0.021845620 -0.053124833 -0.050475677  0.093776656
>           37           38           39           40           41           42
> -0.094699463  0.030911730  0.169810791 -0.149075076 -0.102150266 -0.165651017
>           43           44           45           46           47           48
> -0.175798375  0.048390265 -0.175243708  0.170793758 -0.059918703  0.132465938
>           49           50           51           52           53           54
>  0.131783579  0.178422152 -0.079304282  0.054428751 -0.057820999 -0.120791565
>           55           56           57           58           59           60
> -0.095287586 -0.036671995  0.081203324 -0.153767938 -0.014849361 -0.027471027
>           61           62           63           64           65           66
>  0.064944979  0.007538413 -0.069034788  0.133978044  0.022290513  0.038094051
>           67           68           69           70           71           72
> -0.168947352  0.100797649  0.054253165  0.040255237 -0.124817480  0.059689502
>           73           74           75           76           77           78
> -0.018821295  0.131237429  0.141223597 -0.128027116  0.170388135 -0.080445760
>           79           80           81           82           83           84
> -0.071966482  0.029745819 -0.049479559  0.121362303  0.077338278  0.061950724
>           85           86           87           88           89           90
>  0.168851648  0.073193002 -0.087453189  0.085166809 -0.160026464  0.060816590
>           91           92           93           94           95           96
> -0.015421147 -0.117780975 -0.083415727  0.160806958 -0.171932343 -0.170064514
>           97           98           99          100
> -0.001479434  0.161878089 -0.129685379  0.104231530
>
> Same thing for the right singular vectors. The only thing is that they
> seem to change the sign between R and Mahout's version but otherwise
> they fit more or less exactly.
>
> So yeah i am seeing some stochastic effects in these for k and p being
> so low -- so are you saying your errors are greater than those? I did
> not test sequential version with similar parameters.
>
> One significant difference between MR and sequential version is that
> sequential version is using ternary random matrix (instead of uniform
> one), perhaps that may affect accuracy a little bit.
>
> On Fri, Aug 31, 2012 at 10:55 PM, Ted Dunning <[email protected]> wrote:
>> Can you provide your test code?
>>
>> What difference did you observe?
>>
>> Did you account for the fact that your matrix is small enough that it
>> probably wasn't divided correctly?
>>
>> On Sat, Sep 1, 2012 at 1:27 AM, Ahmed Elgohary <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I used mahout's stochastic svd implementation to find the singular vectors
>>> and the singular vectors of a small matrix 99x100. Then, I compared the
>>> results to the singular values and the singular vectors obtained using the
>>> svd function in matlab and the single threaded version of the ssvd. I got
>>> pretty much the same singular values using the 3 implementations. however,
>>> the singular vectors of mahout's ssvd were significantly different. I tried
>>> multiple values for the parameters P and Q but, that does not seem to solve
>>> the problem. Does MR implementation of the SSVD do extra approximations
>>> over the single threaded ssvd so their results might not be the same? Any
>>> advice how I can tune mahout's ssvd to get the same singular vectors of the
>>> single threaded ssvd?
>>>
>>> thanks,
>>>
>>> --ahmed
>>>

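Singular vectors are only determined up to sign, so a comparison between implementations is more meaningful after the signs are aligned. A minimal sketch in plain Python follows; the helper name `sign_aligned_max_diff` is hypothetical, and the data is just the first six entries of each first left singular vector quoted above.

```python
def sign_aligned_max_diff(u, v):
    """Largest entrywise gap between u and v, after flipping v if it points the opposite way."""
    dot = sum(a * b for a, b in zip(u, v))
    sign = -1.0 if dot < 0 else 1.0
    return max(abs(a - sign * b) for a, b in zip(u, v))


# First six entries of the first left singular vector, copied from the thread.
r_u1 = [-0.050741660, -0.083985411, 0.078767108,
        -0.044487425, -0.010380367, 0.069635451]
ssvd_u1 = [0.050741634, 0.083985464, -0.078767344,
           0.044487660, 0.010380470, -0.069635561]

gap = sign_aligned_max_diff(r_u1, ssvd_u1)
print(gap)
```

For these six entries the gap is on the order of 1e-7, which matches the "fit more or less exactly" observation once the sign flip is taken into account.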