I don't understand the question. A rotation leaves the Frobenius norm unchanged. Period. Any rank-limited optimal least-squares approximation in one rotated/translated frame is optimal in any other such frame. What do you mean by a 99% confidence interval here? The approximation is either optimal or it isn't.
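If it helps, here is a quick R sketch of that invariance (the 20x7 size is just an arbitrary illustration): rotating by any orthonormal matrix changes neither the Frobenius norm nor the Frobenius error of the rank-2 truncation, so the truncation that is optimal in one frame is optimal in every rotated frame.

# a random 20x7 data matrix and a random 20x20 orthonormal (rotation) matrix
m = matrix(rnorm(20*7), nrow=20)
q = qr.Q(qr(matrix(rnorm(20*20), nrow=20)))
mr = q %*% m                      # the same data, expressed in a rotated frame

# the rotation leaves the Frobenius norm unchanged
sqrt(sum(m^2)); sqrt(sum(mr^2))

# rank-2 least-squares truncation via the SVD
trunc2 = function(x) {
  s = svd(x)
  s$u[, 1:2] %*% diag(s$d[1:2]) %*% t(s$v[, 1:2])
}

# the truncation error is identical in both frames
sqrt(sum((m - trunc2(m))^2)); sqrt(sum((mr - trunc2(mr))^2))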
On Fri, Jul 8, 2011 at 7:49 PM, Lance Norskog <[email protected]> wrote:
> Getting closer: if I have a matrix that does a particular rotation/translation etc. with a 99% confidence interval, and I do the above trimming operation, is it possible to translate this into a new confidence level?
> Are there specific ways to translate these numbers into probabilistic estimates? Is it just way too hairy?
>
> Lance
>
> On Thu, Jul 7, 2011 at 10:15 PM, Ted Dunning <[email protected]> wrote:
>> This means that the rank 2 reconstruction of your matrix is close to your original in the sense that the Frobenius norm of the difference will be small.
>>
>> In particular, the Frobenius norm of the delta should be the same as the Frobenius norm of the missing singular values (root sum squared missing values, that is).
>>
>> Here is an example. First I use a random 20x7 matrix to get an SVD into which I transplant your singular values. This gives me a matrix whose decomposition is the same as the one you are using.
>>
>> I then do that decomposition and truncate the singular values to get an approximate matrix. The Frobenius norm of the difference is the same as the Frobenius norm of the missing singular values.
>>
>> > m = matrix(rnorm(20*7), nrow=20)
>> > svd1 = svd(m)
>> > length(svd1$d)
>> [1] 7
>> > d = c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)
>> > m2 = svd1$u %*% diag(d) %*% t(svd1$v)
>> > svd = svd(m2)
>> > svd$d
>> [1] 0.70 0.20 0.05 0.02 0.01 0.01 0.01
>> > m3 = svd$u[,1:2] %*% diag(svd$d[1:2]) %*% t(svd$v[,1:2])
>> > dim(m3)
>> [1] 20 7
>> > m2-m3
>>               [,1]          [,2]          [,3]          [,4]          [,5]          [,6]          [,7]
>>  [1,]  0.0069233794  0.0020467352 -0.0071659763 -4.099546e-03  0.0056399256 -0.0023953930 -0.0119370905
>>  [2,] -0.0018546491  0.0011631030  0.0013261685 -1.193252e-03  0.0002839689  0.0014320601  0.0036207164
>>  [3,]  0.0011612177  0.0027845827 -0.0023368373 -4.240565e-03  0.0009362635 -0.0032987483 -0.0019110953
>>  [4,] -0.0061414783  0.0070092709  0.0066429461  2.240401e-03 -0.0003033182 -0.0031444510  0.0027860257
>>  [5,]  0.0004910556 -0.0057660226  0.0014586550 -3.383020e-03 -0.0015763103  0.0011357677  0.0101147998
>>  [6,]  0.0016672016 -0.0043701670 -0.0002311687 -1.706181e-04 -0.0032324629 -0.0033587690  0.0018471306
>>  [7,] -0.0024146270  0.0007510238  0.0052282604  7.724380e-04 -0.0004411600 -0.0026622302  0.0050655693
>>  [8,]  0.0036106469  0.0028629467 -0.0007957853  1.333764e-03  0.0074933368  0.0008158132 -0.0091284389
>>  [9,]  0.0013369776  0.0036364763 -0.0009691292 -2.050044e-03  0.0021208815 -0.0042241753 -0.0043885229
>> [10,]  0.0031153692  0.0003852343 -0.0053822410 -6.538480e-04  0.0005221515 -0.0003594550 -0.0077290438
>> [11,] -0.0012286952  0.0026373981  0.0017958449  4.693112e-05  0.0003753286 -0.0024000476 -0.0001261246
>> [12,] -0.0024890888 -0.0018374670  0.0048781861 -1.065282e-03 -0.0018902396 -0.0013280442  0.0096305420
>> [13,]  0.0099545328 -0.0012843802 -0.0035220130  1.599559e-03  0.0081592109 -0.0047310711 -0.0158840779
>> [14,] -0.0029835169  0.0046807105  0.0016607724  4.339315e-03 -0.0015926183 -0.0026172305 -0.0048268815
>> [15,] -0.0102632616  0.0033271770  0.0104700407  2.728651e-03 -0.0037697307  0.0016053567  0.0145899365
>> [16,] -0.0074888872 -0.0002277379  0.0068370652 -4.688963e-05 -0.0044921343  0.0024889009  0.0150436991
>> [17,] -0.0068501952 -0.0017733601  0.0076497285  1.743932e-03 -0.0055472565  0.0006109667  0.0142443162
>> [18,] -0.0020245716 -0.0011431425  0.0044837803  3.219527e-04  0.0007887701  0.0019836205  0.0070585228
>> [19,] -0.0016059867 -0.0028328316  0.0032223649  2.025061e-03 -0.0019194344  0.0009643023  0.0052452638
>> [20,]  0.0042324909 -0.0063013966 -0.0041269199 -9.848214e-04 -0.0029591571 -0.0015911580 -0.0012584919
>> > sqrt(sum((m2-m3)^2))
>> [1] 0.05656854
>> > sqrt(sum(d[3:7]^2))
>> [1] 0.05656854
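A side note on the "missing 0.10 of *something*" question quoted further down: the something is Frobenius mass, and it combines as a root sum of squares rather than a plain sum of singular values. A quick check with the same d as above shows the rank-2 truncation drops roughly 7.7% of the Frobenius norm, not 10%:

d = c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)
# absolute Frobenius error of the rank-2 truncation (matches the 0.05656854 above)
sqrt(sum(d[3:7]^2))
# relative error: dropped Frobenius mass over total Frobenius mass, roughly 0.077
sqrt(sum(d[3:7]^2)) / sqrt(sum(d^2))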
>>
>> On Thu, Jul 7, 2011 at 8:46 PM, Lance Norskog <[email protected]> wrote:
>>> Rats. "My 3D coordinates" should be "My 2D coordinates". Also, there is a preposition missing in the first sentence.
>>>
>>> On Thu, Jul 7, 2011 at 8:44 PM, Lance Norskog <[email protected]> wrote:
>>>> The singular values in my experiments drop like a rock. What is the information/probability loss formula when dropping low-value vectors?
>>>>
>>>> That is, I start with a 7D vector set, go through this random projection/svd exercise, and get these singular values: [0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01].
>>>> I drop the last five to create a matrix that gives 2D coordinates. The sum of the remaining singular values is 0.9.
>>>> My 3D vectors will be missing 0.10 of *something* compared to the original 7D vectors. What is this something and what other concepts does it plug into?
>>>>
>>>> Lance
>>>>
>>>> On Sat, Jul 2, 2011 at 11:54 PM, Lance Norskog <[email protected]> wrote:
>>>>> The singular values on my recommender vectors come out: 90, 10, 1.2, 1.1, 1.0, 0.95..... This was playing with your R code.
>>>>> Based on this, I'm adding the QR stuff to my visualization toolkit.
>>>>>
>>>>> Lance
>>>>>
>>>>> On Sat, Jul 2, 2011 at 10:15 PM, Lance Norskog <[email protected]> wrote:
>>>>>> All pairwise distances are preserved? There must be a qualifier on pairwise distance algorithms.
>>>>>>
>>>>>> On Sat, Jul 2, 2011 at 6:49 PM, Lance Norskog <[email protected]> wrote:
>>>>>>> Cool. The plots are fun. The way Gaussian spots project into spinning chains is very educational about entropy.
>>>>>>>
>>>>>>> For full Random Projection, a lame random number generator (java.lang.Random) will generate a higher standard deviation than a high-quality one like MurmurHash.
>>>>>>>
>>>>>>> On Fri, Jul 1, 2011 at 5:25 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>> Here is R code that demonstrates what I mean by stunning (aka 15 significant figures). Note that I only check dot products here.
>>>>>>>> From the fact that the final transform is orthonormal we know that all distances are preserved.
>>>>>>>>
>>>>>>>> # make a big random matrix with rank 20
>>>>>>>> > a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20);
>>>>>>>> > dim(a)
>>>>>>>> [1] 1000 1000
>>>>>>>> # random projection
>>>>>>>> > y = a %*% matrix(rnorm(30000), ncol=30);
>>>>>>>> # get basis for y
>>>>>>>> > q = qr.Q(qr(y))
>>>>>>>> > dim(q)
>>>>>>>> [1] 1000 30
>>>>>>>> # re-project b, do svd on result
>>>>>>>> > b = t(q) %*% a
>>>>>>>> > v = svd(b)$v
>>>>>>>> > d = svd(b)$d
>>>>>>>> # note how singular values drop like a stone at index 21
>>>>>>>> > plot(d)
>>>>>>>> > dim(v)
>>>>>>>> [1] 1000 30
>>>>>>>> # finish svd just for fun
>>>>>>>> > u = q %*% svd(b)$u
>>>>>>>> > dim(u)
>>>>>>>> [1] 1000 30
>>>>>>>> # u is orthogonal, right?
>>>>>>>> > diag(t(u) %*% u)
>>>>>>>> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>>>>>>> # and u diag(d) v' reconstructs a very precisely, right?
>>>>>>>> > max(abs(a - u %*% diag(d) %*% t(v)))
>>>>>>>> [1] 1.293188e-12
>>>>>>>>
>>>>>>>> # now project a into the reduced dimensional space
>>>>>>>> > aa = a %*% v
>>>>>>>> > dim(aa)
>>>>>>>> [1] 1000 30
>>>>>>>> # check a few dot products
>>>>>>>> > sum(aa[1,] %*% aa[2,])
>>>>>>>> [1] 6835.152
>>>>>>>> > sum(a[1,] %*% a[2,])
>>>>>>>> [1] 6835.152
>>>>>>>> > sum(a[1,] %*% a[3,])
>>>>>>>> [1] 3337.248
>>>>>>>> > sum(aa[1,] %*% aa[3,])
>>>>>>>> [1] 3337.248
>>>>>>>>
>>>>>>>> # wow, that's close. let's try a hundred dot products
>>>>>>>> > dot1 = rep(0,100); dot2 = rep(0,100)
>>>>>>>> > for (i in 1:100) {dot1[i] = sum(a[1,] * a[i,]); dot2[i] = sum(aa[1,] * aa[i,])}
>>>>>>>>
>>>>>>>> # how close to the same are those?
>>>>>>>> > max(abs(dot1 - dot2))
>>>>>>>> # VERY
>>>>>>>> [1] 3.45608e-11
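The rough-and-ready fidelity measure Lance suggests further down (the standard deviation of new:old distance ratios) is easy to tack onto the session above. A minimal sketch, assuming a and aa are still defined as in that transcript; for this orthonormal projection the ratios should all be essentially 1 and the standard deviation essentially 0:

# pairwise distances from point 1 to points 2..100, before and after projection
d1 = rep(0, 99); d2 = rep(0, 99)
for (i in 2:100) {
  d1[i-1] = sqrt(sum((a[1,] - a[i,])^2))    # distances in the original 1000-d space
  d2[i-1] = sqrt(sum((aa[1,] - aa[i,])^2))  # distances in the 30-d projection
}
ratio = d2 / d1
mean(ratio)   # ~1
sd(ratio)     # the rough & ready fidelity measure; ~0 here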
>>>>>>>>
>>>>>>>> On Fri, Jul 1, 2011 at 4:54 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>> Yes. Been there. Done that.
>>>>>>>>>
>>>>>>>>> The correlation is stunningly good.
>>>>>>>>>
>>>>>>>>> On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Is this true? - "Preserving pairwise distances" means the relative distances. So the ratios of new distance:old distance should be similar.
>>>>>>>>>> The standard deviation of the ratios gives a rough & ready measure of the fidelity of the reduction.
>>>>>>>>>> The standard deviation of simple RP should be highest, then this RP + orthogonalization, then MDS.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>>> Lance,
>>>>>>>>>>>
>>>>>>>>>>> You would get better results from the random projection if you did the first part of the stochastic SVD. Basically, you do the random projection:
>>>>>>>>>>>
>>>>>>>>>>> Y = A \Omega
>>>>>>>>>>>
>>>>>>>>>>> where A is your original data, \Omega is the random matrix, and Y is the result. Y will be tall and skinny.
>>>>>>>>>>>
>>>>>>>>>>> Then, find an orthogonal basis Q of Y:
>>>>>>>>>>>
>>>>>>>>>>> Q R = Y
>>>>>>>>>>>
>>>>>>>>>>> This orthogonal basis will be very close to the orthogonal basis of A. In fact, there are strong probabilistic guarantees on how good Q is as a basis of A. Next, you project A using the transpose of Q:
>>>>>>>>>>>
>>>>>>>>>>> B = Q' A
>>>>>>>>>>>
>>>>>>>>>>> This gives you a short fat matrix that is the projection of A into a lower dimensional space. Since this is a left projection, it isn't quite what you want in your work, but it is the standard way to phrase things.
>>>>>>>>>>> The exact same thing can be done with left random projection:
>>>>>>>>>>>
>>>>>>>>>>> Y = \Omega A
>>>>>>>>>>> L Q = Y
>>>>>>>>>>> B = A Q'
>>>>>>>>>>>
>>>>>>>>>>> In this form, B is tall and skinny as you would like and Q' is essentially an orthogonal reformulation of the random projection.
>>>>>>>>>>> This projection is about as close as you are likely to get to something that exactly preserves distances. As such, you should be able to use MDS on B to get exactly the same results as you want.
>>>>>>>>>>>
>>>>>>>>>>> Additionally, if you start with the original form and do an SVD of B (which is fast), you will get a very good approximation of the prominent right singular vectors of A.
>>>>>>>>>>> If you do that, the first few of these should be about as good as MDS for visualization purposes.
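For reference, a minimal R sketch of the left-projection variant described above; the 1000x1000 rank-20 test matrix and the choice of 30 projection dimensions are illustrative assumptions that mirror the earlier session, not anything prescribed in the thread:

# left random projection: approximate the row space of a instead of its column space
a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20)  # 1000x1000, rank 20
omega = matrix(rnorm(30 * 1000), nrow=30)  # 30x1000 random matrix
y = omega %*% a                            # Y = \Omega A, short and fat (30x1000)
qt = qr.Q(qr(t(y)))                        # 1000x30 with orthonormal columns; this is Q' from L Q = Y
b = a %*% qt                               # B = A Q', tall and skinny (1000x30)
dim(b)                                     # [1] 1000 30

# dot products (and hence distances) between rows survive the projection
sum(a[1,] * a[2,])
sum(b[1,] * b[2,])                         # should agree essentially exactly for this low-rank example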
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>>> I did some testing and made a lot of pretty charts:
>>>>>>>>>>>>
>>>>>>>>>>>> http://ultrawhizbang.blogspot.com/
>>>>>>>>>>>>
>>>>>>>>>>>> If you want to get quick visualizations of your clusters, this is a great place to start.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Lance Norskog
>>>>>>>>>>>> [email protected]
>
> --
> Lance Norskog
> [email protected]
