OK, got a little farther. Now, what is useful about projecting with the factorization? My outputs looked maybe a little more regular. One odd thing turned up: the vector set has one clump and 3 far-outlier points. The fact that these are outliers did not show up in other visualizations, but when the vectors were projected with QR.q, the 3 outliers jumped far away from the clump.
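
Here is a minimal R sketch of the kind of projection being described (random projection, then QR, then SVD of the re-projection). The data matrix and the three planted outliers are made up; they stand in for the real recommender vectors, which are not shown in this thread.

# Sketch only: stand-in data run through the random projection + QR + SVD
# steps from the recipe further down the thread.
set.seed(1)
a <- matrix(rnorm(500 * 7), ncol = 7)        # stand-in for the 7-D vector set
a[1:3, ] <- a[1:3, ] * 20                    # plant 3 artificial outliers

y <- a %*% matrix(rnorm(7 * 3), ncol = 3)    # random projection
q <- qr.Q(qr(y))                             # orthonormal basis of the projected data
b <- t(q) %*% a                              # short, fat re-projection of a
v <- svd(b)$v                                # approximate right singular vectors of a
aa <- a %*% v[, 1:2]                         # 2-D coordinates, e.g. for plotting

# how far is each point from the centroid of the 2-D picture?
d.from.center <- sqrt(rowSums(sweep(aa, 2, colMeans(aa))^2))
order(d.from.center, decreasing = TRUE)[1:3] # the 3 planted outliers should top the list

Because the transform is orthonormal, the reduced coordinates keep the original pairwise geometry up to whatever the dropped singular values carried, which would explain why points that really are far from the clump finally show up as far away here when they did not in lossier views.
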
What exactly is going on here? Why does rotating with QR.q create very segregated clumps?

On Sun, Jul 10, 2011 at 8:02 PM, Lance Norskog <[email protected]> wrote:
> I've tried the QR factorization trick described above, and posted the new pictures:
> http://ultrawhizbang.blogspot.com/2011/07/dimensional-reduction-via-random_10.html
>
> I really need to see movie names to get farther with this.
>
> On 7/9/11, Ted Dunning <[email protected]> wrote:
>> The problem is still not precisely stated. You have 95% confidence in what? Do you know that 95% of the entries are exact and the others may be arbitrarily corrupted? If so, you can say nothing, because the error is unbounded in terms of squared error.
>>
>> In order to say something stronger you have to phrase your error in terms of something that has invariant properties. This is why invariants are such important concepts in such a variety of quantitative endeavors.
>>
>> On Jul 9, 2011, at 15:51, Lance Norskog <[email protected]> wrote:
>>> Yes, I misphrased it. Let's say I have a dataset in which I have 95% confidence, and I multiply it by a matrix which does not lose information; then I will have 95% confidence in the output data.
>>>
>>> Now, let's downsize the matrix by the above numbers. The remaining matrix contains the singular vectors whose values sum to 0.9, with 0.1 of the singular value mass missing. What is my confidence in the output data?
>>>
>>> On Sat, Jul 9, 2011 at 11:46 AM, Ted Dunning <[email protected]> wrote:
>>>> I don't understand the question.
>>>>
>>>> A rotation leaves the Frobenius norm unchanged. Period. Any rank-limited optimal least-squares approximation in one rotated/translated frame is optimal in any other such frame. What do you mean by 99% confidence here? The approximation is optimal or not.
>>>>
>>>> On Fri, Jul 8, 2011 at 7:49 PM, Lance Norskog <[email protected]> wrote:
>>>>> Getting closer: if I have a matrix that does a particular rotation/translation etc. with a 99% confidence interval, and I do the above trimming operation, is it possible to translate this into a new confidence level? Are there specific ways to translate these numbers into probabilistic estimates? Is it just way too hairy?
>>>>>
>>>>> Lance
>>>>>
>>>>> On Thu, Jul 7, 2011 at 10:15 PM, Ted Dunning <[email protected]> wrote:
>>>>>> This means that the rank-2 reconstruction of your matrix is close to your original in the sense that the Frobenius norm of the difference will be small.
>>>>>>
>>>>>> In particular, the Frobenius norm of the delta should be the same as the Frobenius norm of the missing singular values (the root of the sum of the squared missing values, that is).
>>>>>>
>>>>>> Here is an example. First I use a random 20x7 matrix to get an SVD into which I transplant your singular values. This gives me a matrix whose decomposition is the same as the one you are using.
>>>>>>
>>>>>> I then do that decomposition and truncate the singular values to get an approximate matrix. The Frobenius norm of the difference is the same as the Frobenius norm of the missing singular values.
>>>>>>
>>>>>>> m = matrix(rnorm(20*7), nrow=20)
>>>>>>> svd1 = svd(m)
>>>>>>> length(svd1$d)
>>>>>> [1] 7
>>>>>>> d = c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)
>>>>>>> m2 = svd1$u %*% diag(d) %*% t(svd1$v)
>>>>>>> svd = svd(m2)
>>>>>>> svd$d
>>>>>> [1] 0.70 0.20 0.05 0.02 0.01 0.01 0.01
>>>>>>> m3 = svd$u[,1:2] %*% diag(svd$d[1:2]) %*% t(svd$v[,1:2])
>>>>>>> dim(m3)
>>>>>> [1] 20 7
>>>>>>> m2-m3
>>>>>> [a 20 x 7 matrix of residuals; every entry has magnitude below 0.016]
>>>>>>> sqrt(sum((m2-m3)^2))
>>>>>> [1] 0.05656854
>>>>>>> sqrt(sum(d[3:7]^2))
>>>>>> [1] 0.05656854
>>>>>>
>>>>>> On Thu, Jul 7, 2011 at 8:46 PM, Lance Norskog <[email protected]> wrote:
>>>>>>> Rats. "My 3D coordinates" should be "My 2D coordinates". Also, there is a preposition missing in the first sentence.
>>>>>>>
>>>>>>> On Thu, Jul 7, 2011 at 8:44 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>> The singular values in my experiments drop like a rock. What is the information/probability loss formula when dropping low-value singular vectors?
>>>>>>>>
>>>>>>>> That is, I start with a 7D vector set, go through this random projection/SVD exercise, and get these singular values: [0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01]. I drop the last five to create a matrix that gives 2D coordinates. The sum of the remaining singular values is 0.9. My 3D vectors will be missing 0.10 of *something* compared to the original 7D vectors. What is this something, and what other concepts does it plug into?
>>>>>>>>
>>>>>>>> Lance
>>>>>>>>
>>>>>>>> On Sat, Jul 2, 2011 at 11:54 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>> The singular values on my recommender vectors come out: 90, 10, 1.2, 1.1, 1.0, 0.95... This was playing with your R code. Based on this, I'm adding the QR stuff to my visualization toolkit.
>>>>>>>>>
>>>>>>>>> Lance
>>>>>>>>>
>>>>>>>>> On Sat, Jul 2, 2011 at 10:15 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>> All pairwise distances are preserved? There must be a qualifier on pairwise distance algorithms.
>>>>>>>>>>
>>>>>>>>>> On Sat, Jul 2, 2011 at 6:49 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>> Cool. The plots are fun. The way Gaussian spots project into spinning chains is very educational about entropy.
>>>>>>>>>>>
>>>>>>>>>>> For full Random Projection, a lame random number generator (java.lang.Random) will generate a higher standard deviation than a high-quality one like MurmurHash.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 1, 2011 at 5:25 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>>>> Here is R code that demonstrates what I mean by stunning (aka 15 significant figures). Note that I only check dot products here. From the fact that the final transform is orthonormal we know that all distances are preserved.
>>>>>>>>>>>>
>>>>>>>>>>>> # make a big random matrix with rank 20
>>>>>>>>>>>>> a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20);
>>>>>>>>>>>>> dim(a)
>>>>>>>>>>>> [1] 1000 1000
>>>>>>>>>>>> # random projection
>>>>>>>>>>>>> y = a %*% matrix(rnorm(30000), ncol=30);
>>>>>>>>>>>> # get basis for y
>>>>>>>>>>>>> q = qr.Q(qr(y))
>>>>>>>>>>>>> dim(q)
>>>>>>>>>>>> [1] 1000 30
>>>>>>>>>>>> # re-project a to get b, do svd on the result
>>>>>>>>>>>>> b = t(q) %*% a
>>>>>>>>>>>>> v = svd(b)$v
>>>>>>>>>>>>> d = svd(b)$d
>>>>>>>>>>>> # note how singular values drop like a stone at index 21
>>>>>>>>>>>>> plot(d)
>>>>>>>>>>>>> dim(v)
>>>>>>>>>>>> [1] 1000 30
>>>>>>>>>>>> # finish svd just for fun
>>>>>>>>>>>>> u = q %*% svd(b)$u
>>>>>>>>>>>>> dim(u)
>>>>>>>>>>>> [1] 1000 30
>>>>>>>>>>>> # u is orthogonal, right?
>>>>>>>>>>>>> diag(t(u) %*% u)
>>>>>>>>>>>> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>>>>>>>>>>> # and u diag(d) v' reconstructs a very precisely, right?
>>>>>>>>>>>>> max(abs(a - u %*% diag(d) %*% t(v)))
>>>>>>>>>>>> [1] 1.293188e-12
>>>>>>>>>>>>
>>>>>>>>>>>> # now project a into the reduced dimensional space
>>>>>>>>>>>>> aa = a %*% v
>>>>>>>>>>>>> dim(aa)
>>>>>>>>>>>> [1] 1000 30
>>>>>>>>>>>> # check a few dot products
>>>>>>>>>>>>> sum(aa[1,] %*% aa[2,])
>>>>>>>>>>>> [1] 6835.152
>>>>>>>>>>>>> sum(a[1,] %*% a[2,])
>>>>>>>>>>>> [1] 6835.152
>>>>>>>>>>>>> sum(a[1,] %*% a[3,])
>>>>>>>>>>>> [1] 3337.248
>>>>>>>>>>>>> sum(aa[1,] %*% aa[3,])
>>>>>>>>>>>> [1] 3337.248
>>>>>>>>>>>>
>>>>>>>>>>>> # wow, that's close. let's try a hundred dot products
>>>>>>>>>>>>> dot1 = rep(0,100); dot2 = rep(0,100)
>>>>>>>>>>>>> for (i in 1:100) {dot1[i] = sum(a[1,] * a[i,]); dot2[i] = sum(aa[1,] * aa[i,])}
>>>>>>>>>>>>
>>>>>>>>>>>> # how close to the same are those?
>>>>>>>>>>>>> max(abs(dot1-dot2))
>>>>>>>>>>>> # VERY
>>>>>>>>>>>> [1] 3.45608e-11
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 1, 2011 at 4:54 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>>>>> Yes. Been there. Done that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The correlation is stunningly good.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is this true? "Preserving pairwise distances" means the relative distances, so the ratios of new distance to old distance should be similar. The standard deviation of the ratios gives a rough-and-ready measure of the fidelity of the reduction. The standard deviation of simple RP should be highest, then this RP + orthogonalization, then MDS.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>>>>>>> Lance,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You would get better results from the random projection if you did the first part of the stochastic SVD. Basically, you do the random projection:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Y = A \Omega
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> where A is your original data, \Omega is the random matrix, and Y is the result. Y will be tall and skinny.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Then, find an orthogonal basis Q of Y:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Q R = Y
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This orthogonal basis will be very close to the orthogonal basis of A. In fact, there are strong probabilistic guarantees on how good Q is as a basis of A. Next, you project A using the transpose of Q:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> B = Q' A
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This gives you a short, fat matrix that is the projection of A into a lower-dimensional space. Since this is a left projection, it isn't quite what you want in your work, but it is the standard way to phrase things.
>>>>>>>>>>>>>>> The exact same thing can be done with left random projection:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Y = \Omega A
>>>>>>>>>>>>>>> L Q = Y
>>>>>>>>>>>>>>> B = A Q'
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In this form, B is tall and skinny as you would like, and Q' is essentially an orthogonal reformulation of the random projection. This projection is about as close as you are likely to get to something that exactly preserves distances. As such, you should be able to use MDS on B to get exactly the same results as you want.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Additionally, if you start with the original form and do an SVD of B (which is fast), you will get a very good approximation of the prominent right singular vectors of A. If you do that, the first few of these should be about as good as MDS for visualization purposes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>>>>>>> I did some testing and made a lot of pretty charts:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://ultrawhizbang.blogspot.com/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you want to get quick visualizations of your clusters, this is a great place to start.

--
Lance Norskog
[email protected]
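
A minimal R sketch of the left random projection recipe described above; the sizes and the rank-deficient test matrix are made up for illustration, not the real data.

# Left random projection: Y = Omega A, L Q = Y, B = A Q'
set.seed(2)
n <- 1000; d <- 500; k <- 30
A <- matrix(rnorm(n * 20), ncol = 20) %*% matrix(rnorm(20 * d), nrow = 20)  # rank-20 stand-in data

Y <- matrix(rnorm(k * n), nrow = k) %*% A      # Y = Omega A  (short and fat, k x d)
Q <- t(qr.Q(qr(t(Y))))                         # L Q = Y: Q has orthonormal rows (k x d)
B <- A %*% t(Q)                                # B = A Q'  (tall and skinny, n x k)

# B should preserve the pairwise geometry of the rows of A very closely
sum(A[1, ] * A[2, ])                           # dot product in the original space
sum(B[1, ] * B[2, ])                           # nearly identical in the reduced space

Each row of B then gives low-dimensional coordinates for the corresponding row of A, which is the tall-and-skinny shape suitable for feeding into MDS or plotting directly.
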

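
One more sketch, this time of the distance-ratio fidelity check proposed in the middle of the thread (the standard deviation of new-distance to old-distance ratios). The data matrix and projection width here are made-up stand-ins.

# Rough-and-ready fidelity measure: project, then look at the spread of
# (projected distance / original distance) over sampled point pairs.
set.seed(3)
x  <- matrix(rnorm(200 * 7), ncol = 7)                 # original 7-D vectors
xp <- x %*% qr.Q(qr(matrix(rnorm(7 * 2), ncol = 2)))   # orthonormalized 2-D random projection

pairs  <- t(combn(20, 2))                              # all pairs among the first 20 points
ratios <- apply(pairs, 1, function(p) {
  d.old <- sqrt(sum((x[p[1], ]  - x[p[2], ])^2))       # distance before projection
  d.new <- sqrt(sum((xp[p[1], ] - xp[p[2], ])^2))      # distance after projection
  d.new / d.old
})
sd(ratios)   # smaller spread = higher fidelity; 0 would mean distance ratios exactly constant

Under the hypothesis quoted above, plain RP should show the largest spread, RP plus orthogonalization a smaller one, and MDS the smallest.
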