The problem is still not precisely stated. You have 95% confidence in what? Do
you know that 95% of the entries are exact and that the rest may be arbitrarily
corrupted? If so, you can say nothing, because the error is unbounded in terms
of squared error.
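
Here is a rough sketch of that point in R (the 5% corruption rate and the
corruption scales are made up, just to show the effect):

# 95% of the entries stay exact, 5% are corrupted by an arbitrary amount;
# the squared (Frobenius) error grows with the corruption scale, so
# "95% correct" by itself puts no bound on it.
m = matrix(rnorm(100), nrow = 10)
bad = sample(length(m), size = 5)                 # the 5% of entries to corrupt
for (scale in c(1, 1e3, 1e6)) {
  mc = m
  mc[bad] = mc[bad] + scale * rnorm(length(bad))
  cat("scale", scale, "-> error", sqrt(sum((m - mc)^2)), "\n")
}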

In order to say something stronger you have to phrase your error in terms of 
something that has invariant properties. This is why invariants are such 
important concepts in such a variety of quantitative endeavors. 

On Jul 9, 2011, at 15:51, Lance Norskog <[email protected]> wrote:

> Yes, I misphrased it. Let's say I have a dataset in whose correctness I
> have 95% confidence. If I multiply it by a matrix which does not lose
> information, I will have 95% confidence in the output data.
> 
> Now, let's downsize the matrix by the above numbers. The remaining matrix
> contains the singular vectors whose singular values sum to 0.9, with the
> singular vectors accounting for the other 0.1 missing. What is my
> confidence in the output data?
> 
> On Sat, Jul 9, 2011 at 11:46 AM, Ted Dunning <[email protected]> wrote:
>> I don't understand the question.
>> 
>> A rotation leaves the Frobenius norm unchanged.  Period.  Any rank-limited
>> optimal least-squares approximation in one rotated/translated frame is
>> optimal in any other such frame.  What do you mean by 99% confidence here?
>>  The approximation is optimal or not.
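>> 
>> As a quick numerical check, here is a sketch (any matrix q with orthonormal
>> columns will do, and the sizes are arbitrary):
>> 
>> q = qr.Q(qr(matrix(rnorm(49), nrow = 7)))   # a random 7x7 orthonormal matrix
>> a = matrix(rnorm(7 * 20), nrow = 7)         # random data
>> sqrt(sum(a^2))                              # Frobenius norm before rotation
>> sqrt(sum((q %*% a)^2))                      # ... and after: equal, up to rounding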
>> 
>> On Fri, Jul 8, 2011 at 7:49 PM, Lance Norskog <[email protected]> wrote:
>> 
>>> Getting closer: if I have a matrix that does a particular
>>> rotation/translation etc. with a 99% confidence interval, and I do the
>>> above trimming operation, is it possible to translate this into a new
>>> confidence level? Are there specific ways to translate these numbers
>>> into probabilistic estimates? Is it just way too hairy?
>>> 
>>> Lance
>>> 
>>> On Thu, Jul 7, 2011 at 10:15 PM, Ted Dunning <[email protected]> wrote:
>>>> This means that the rank 2 reconstruction of your matrix is close to your
>>>> original in the sense that the Frobenius norm of the difference will be
>>>> small.
>>>> 
>>>> In particular, the Frobenius norm of the delta should be the same as the
>>>> Frobenius norm of the missing singular values (root sum squared missing
>>>> values, that is).
>>>> 
>>>> Here is an example.  First I use a random 20x7 matrix to get an SVD into
>>>> which I transplant your singular values.  This gives me a matrix whose
>>>> decomposition is the same as the one you are using.
>>>> 
>>>> I then do that decomposition and truncate the singular values to get an
>>>> approximate matrix.  The Frobenius norm of the difference is the same as the
>>>> Frobenius norm of the missing singular values.
>>>> 
>>>>> m = matrix(rnorm(20*7), nrow=20)
>>>>> svd1 = svd(m)
>>>>> length(svd1$d)
>>>> [1] 7
>>>>> d = c(0.7, 0.2,0.05, 0.02, 0.01, 0.01, 0.01)
>>>>> m2 = svd1$u %*% diag(d) %*% t(svd1$v)
>>>>> svd = svd(m2)
>>>>> svd$d
>>>> [1] 0.70 0.20 0.05 0.02 0.01 0.01 0.01
>>>>> m3 = svd$u[,1:2] %*% diag(svd$d[1:2]) %*% t(svd$v[,1:2])
>>>>> dim(m3)
>>>> [1] 20  7
>>>>> m2-m3
>>>>                [,1]          [,2]          [,3]          [,4]          [,5]          [,6]          [,7]
>>>>  [1,]  0.0069233794  0.0020467352 -0.0071659763 -4.099546e-03  0.0056399256 -0.0023953930 -0.0119370905
>>>>  [2,] -0.0018546491  0.0011631030  0.0013261685 -1.193252e-03  0.0002839689  0.0014320601  0.0036207164
>>>>  [3,]  0.0011612177  0.0027845827 -0.0023368373 -4.240565e-03  0.0009362635 -0.0032987483 -0.0019110953
>>>>  [4,] -0.0061414783  0.0070092709  0.0066429461  2.240401e-03 -0.0003033182 -0.0031444510  0.0027860257
>>>>  [5,]  0.0004910556 -0.0057660226  0.0014586550 -3.383020e-03 -0.0015763103  0.0011357677  0.0101147998
>>>>  [6,]  0.0016672016 -0.0043701670 -0.0002311687 -1.706181e-04 -0.0032324629 -0.0033587690  0.0018471306
>>>>  [7,] -0.0024146270  0.0007510238  0.0052282604  7.724380e-04 -0.0004411600 -0.0026622302  0.0050655693
>>>>  [8,]  0.0036106469  0.0028629467 -0.0007957853  1.333764e-03  0.0074933368  0.0008158132 -0.0091284389
>>>>  [9,]  0.0013369776  0.0036364763 -0.0009691292 -2.050044e-03  0.0021208815 -0.0042241753 -0.0043885229
>>>> [10,]  0.0031153692  0.0003852343 -0.0053822410 -6.538480e-04  0.0005221515 -0.0003594550 -0.0077290438
>>>> [11,] -0.0012286952  0.0026373981  0.0017958449  4.693112e-05  0.0003753286 -0.0024000476 -0.0001261246
>>>> [12,] -0.0024890888 -0.0018374670  0.0048781861 -1.065282e-03 -0.0018902396 -0.0013280442  0.0096305420
>>>> [13,]  0.0099545328 -0.0012843802 -0.0035220130  1.599559e-03  0.0081592109 -0.0047310711 -0.0158840779
>>>> [14,] -0.0029835169  0.0046807105  0.0016607724  4.339315e-03 -0.0015926183 -0.0026172305 -0.0048268815
>>>> [15,] -0.0102632616  0.0033271770  0.0104700407  2.728651e-03 -0.0037697307  0.0016053567  0.0145899365
>>>> [16,] -0.0074888872 -0.0002277379  0.0068370652 -4.688963e-05 -0.0044921343  0.0024889009  0.0150436991
>>>> [17,] -0.0068501952 -0.0017733601  0.0076497285  1.743932e-03 -0.0055472565  0.0006109667  0.0142443162
>>>> [18,] -0.0020245716 -0.0011431425  0.0044837803  3.219527e-04  0.0007887701  0.0019836205  0.0070585228
>>>> [19,] -0.0016059867 -0.0028328316  0.0032223649  2.025061e-03 -0.0019194344  0.0009643023  0.0052452638
>>>> [20,]  0.0042324909 -0.0063013966 -0.0041269199 -9.848214e-04 -0.0029591571 -0.0015911580 -0.0012584919
>>>>> sqrt(sum((m2-m3)^2))
>>>> [1] 0.05656854
>>>>> sqrt(sum(d[3:7]^2))
>>>> [1] 0.05656854
>>>>> 
>>>>
>>>> On Thu, Jul 7, 2011 at 8:46 PM, Lance Norskog <[email protected]> wrote:
>>>> 
>>>>> Rats. "My 3D coordinates" should be "My 2D coordinates". There is also a
>>>>> word missing in the first sentence.
>>>>> 
>>>>> On Thu, Jul 7, 2011 at 8:44 PM, Lance Norskog <[email protected]> wrote:
>>>>>> The singular values in my experiments drop like a rock. What is
>>>>>> information/probability loss formula when dropping low-value vectors?
>>>>>> 
>>>>>> That is, I start with a 7D vector set, go through this random
>>>>>> projection/svd exercise, and get these singular values: [0.7, 0.2,
>>>>>> 0.05, 0.02, 0.01, 0.01, 0.01]. I drop the last five to create a matrix
>>>>>> that gives 2D coordinates. The sum of the remaining singular values is
>>>>>> 0.9. My 3D vectors will be missing 0.10 of *something* compared to the
>>>>>> original 7D vectors. What is this something and what other concepts
>>>>>> does it plug into?
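>>>>>> 
>>>>>> For scale, here are the two candidate numbers I can compute from those
>>>>>> values (a rough sketch in R; which of them is the meaningful loss?):
>>>>>> 
>>>>>> d = c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)
>>>>>> sum(d[3:7])           # plain sum of the dropped singular values: 0.10
>>>>>> sqrt(sum(d[3:7]^2))   # root-sum-square of the dropped values: about 0.057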
>>>>>> 
>>>>>> Lance
>>>>>> 
>>>>>> On Sat, Jul 2, 2011 at 11:54 PM, Lance Norskog <[email protected]> wrote:
>>>>>>> The singular values on my recommender vectors come out: 90, 10, 1.2,
>>>>>>> 1.1, 1.0, 0.95, ... This was from playing with your R code. Based on
>>>>>>> this, I'm adding the QR stuff to my visualization toolkit.
>>>>>>> 
>>>>>>> Lance
>>>>>>> 
>>>>>>> On Sat, Jul 2, 2011 at 10:15 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>> All pairwise distances are preserved? There must be a qualifier on
>>>>>>>> pairwise distance algorithms.
>>>>>>>> 
>>>>>>>> On Sat, Jul 2, 2011 at 6:49 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>> Cool. The plots are fun. The way Gaussian spots project into spinning
>>>>>>>>> chains is very educational about entropy.
>>>>>>>>> 
>>>>>>>>> For full Random Projection, a lame random number generator
>>>>>>>>> (java.lang.Random) will generate a higher standard deviation than a
>>>>>>>>> high-quality one like MurmurHash.
>>>>>>>>> 
>>>>>>>>> On Fri, Jul 1, 2011 at 5:25 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>> Here is R code that demonstrates what I mean by stunning (aka 15
>>>>>>>>>> significant figures).  Note that I only check dot products here.  From
>>>>>>>>>> the fact that the final transform is orthonormal we know that all
>>>>>>>>>> distances are preserved.
>>>>>>>>>> 
>>>>>>>>>> # make a big random matrix with rank 20
>>>>>>>>>>> a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20);
>>>>>>>>>>> dim(a)
>>>>>>>>>> [1] 1000 1000
>>>>>>>>>> # random projection
>>>>>>>>>>> y = a %*% matrix(rnorm(30000), ncol=30);
>>>>>>>>>> # get basis for y
>>>>>>>>>>> q = qr.Q(qr(y))
>>>>>>>>>>> dim(q)
>>>>>>>>>> [1] 1000   30
>>>>>>>>>> # re-project b, do svd on result
>>>>>>>>>>> b = t(q) %*% a
>>>>>>>>>>> v = svd(b)$v
>>>>>>>>>>> d = svd(b)$d
>>>>>>>>>> # note how singular values drop like a stone at index 21
>>>>>>>>>>> plot(d)
>>>>>>>>>>> dim(v)
>>>>>>>>>> [1] 1000   30
>>>>>>>>>> # finish svd just for fun
>>>>>>>>>>> u = q %*% svd(b)$u
>>>>>>>>>>> dim(u)
>>>>>>>>>> [1] 1000   30
>>>>>>>>>> # u is orthogonal, right?
>>>>>>>>>>> diag(t(u)%*% u)
>>>>>>>>>>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>>>>>>>>> # and u diag(d) v' reconstructs a very precisely, right?
>>>>>>>>>>> max(abs(a-u %*% diag(d) %*% t(v)))
>>>>>>>>>> [1] 1.293188e-12
>>>>>>>>>> 
>>>>>>>>>> # now project a into the reduced dimensional space
>>>>>>>>>>> aa = a%*%v
>>>>>>>>>>> dim(aa)
>>>>>>>>>> [1] 1000   30
>>>>>>>>>> # check a few dot products
>>>>>>>>>>> sum(aa[1,] %*% aa[2,])
>>>>>>>>>> [1] 6835.152
>>>>>>>>>>> sum(a[1,] %*% a[2,])
>>>>>>>>>> [1] 6835.152
>>>>>>>>>>> sum(a[1,] %*% a[3,])
>>>>>>>>>> [1] 3337.248
>>>>>>>>>>> sum(aa[1,] %*% aa[3,])
>>>>>>>>>> [1] 3337.248
>>>>>>>>>> 
>>>>>>>>>> # wow, that's close.  let's try a hundred dot products
>>>>>>>>>>> dot1 = rep(0,100); dot2 = rep(0,100)
>>>>>>>>>>> for (i in 1:100) {dot1[i] = sum(a[1,] * a[i,]); dot2[i] = sum(aa[1,] * aa[i,])}
>>>>>>>>>> 
>>>>>>>>>> # how close to the same are those?
>>>>>>>>>>> max(abs(dot1-dot2))
>>>>>>>>>> # VERY
>>>>>>>>>> [1] 3.45608e-11
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jul 1, 2011 at 4:54 PM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Yes.  Been there.  Done that.
>>>>>>>>>>> 
>>>>>>>>>>> The correlation is stunningly good.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>> 
>>>>>>>>>>>> Is this true? - "Preserving pairwise distances" means the relative
>>>>>>>>>>>> distances. So the ratios of new distance:old distance should be
>>>>>>>>>>>> similar. The standard deviation of the ratios gives a rough-and-ready
>>>>>>>>>>>> measure of the fidelity of the reduction. The standard deviation of
>>>>>>>>>>>> simple RP should be highest, then this RP + orthogonalization, then
>>>>>>>>>>>> MDS.
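>>>>>>>>>>>> 
>>>>>>>>>>>> Something like this rough sketch is what I have in mind (the sizes, the
>>>>>>>>>>>> 1/sqrt(k) scaling, and looking only at distances from the first point
>>>>>>>>>>>> are all arbitrary choices):
>>>>>>>>>>>> 
>>>>>>>>>>>> a = matrix(rnorm(200 * 50), ncol = 50)                    # 200 points in 50-D
>>>>>>>>>>>> p = a %*% (matrix(rnorm(50 * 10), ncol = 10) / sqrt(10))  # simple RP down to 10-D
>>>>>>>>>>>> ratios = rep(0, 100)
>>>>>>>>>>>> for (i in 1:100) {
>>>>>>>>>>>>   j = sample(2:200, 1)
>>>>>>>>>>>>   ratios[i] = sqrt(sum((p[1,] - p[j,])^2)) / sqrt(sum((a[1,] - a[j,])^2))
>>>>>>>>>>>> }
>>>>>>>>>>>> sd(ratios)   # smaller spread = distances better preserved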
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <[email protected]> wrote:
>>>>>>>>>>>>> Lance,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> You would get better results from the random projection if you did
>>>>>>>>>>>>> the first part of the stochastic SVD.  Basically, you do the random
>>>>>>>>>>>>> projection:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>       Y = A \Omega
>>>>>>>>>>>>> 
>>>>>>>>>>>>> where A is your original data, \Omega is the random matrix and Y is
>>>>>>>>>>>>> the result.  Y will be tall and skinny.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Then, find an orthogonal basis Q of Y:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>       Q R = Y
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This orthogonal basis will be very close to the orthogonal basis of A.
>>>>>>>>>>>>> In fact, there are strong probabilistic guarantees on how good Q is as
>>>>>>>>>>>>> a basis of A.  Next, you project A using the transpose of Q:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>       B = Q' A
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This gives you a short fat matrix that is the projection of A into a
>>>>>>>>>>>>> lower dimensional space.  Since this is a left projection, it isn't
>>>>>>>>>>>>> quite what you want in your work, but it is the standard way to phrase
>>>>>>>>>>>>> things.  The exact same thing can be done with left random projection:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>      Y = \Omega A
>>>>>>>>>>>>>      L Q = Y
>>>>>>>>>>>>>      B = A Q'
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In this form, B is tall and skinny as you would like and Q' is
>>>>>>>>>>>>> essentially an orthogonal reformulation of the random projection.
>>>>>>>>>>>>> This projection is about as close as you are likely to get to
>>>>>>>>>>>>> something that exactly preserves distances.  As such, you should be
>>>>>>>>>>>>> able to use MDS on B to get exactly the same results as you want.
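>>>>>>>>>>>>> 
>>>>>>>>>>>>> In R the left/row form looks roughly like this (a sketch with arbitrary
>>>>>>>>>>>>> sizes; base R has no LQ routine, so I take the QR of the transpose,
>>>>>>>>>>>>> which yields the same Q):
>>>>>>>>>>>>> 
>>>>>>>>>>>>> n = 1000; m = 200; k = 30
>>>>>>>>>>>>> a = matrix(rnorm(n * m), nrow = n)       # original data, tall
>>>>>>>>>>>>> omega = matrix(rnorm(k * n), nrow = k)   # random matrix
>>>>>>>>>>>>> y = omega %*% a                          # Y = \Omega A, short and fat
>>>>>>>>>>>>> q1 = qr.Q(qr(t(y)))                      # m x k; t(q1) plays the role of Q
>>>>>>>>>>>>> b = a %*% q1                             # B = A Q', tall and skinny
>>>>>>>>>>>>> dim(b)                                   # n x k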
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Additionally, if you start with the original form and do an SVD of B
>>>>>>>>>>>>> (which is fast), you will get a very good approximation of the
>>>>>>>>>>>>> prominent right singular vectors of A.  If you do that, the first few
>>>>>>>>>>>>> of these should be about as good as MDS for visualization purposes.
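>>>>>>>>>>>>> 
>>>>>>>>>>>>> For a quick 2-D picture that can be as simple as this sketch (here b is
>>>>>>>>>>>>> the short, fat B = Q' A of the original form and a is the original
>>>>>>>>>>>>> data):
>>>>>>>>>>>>> 
>>>>>>>>>>>>> v = svd(b)$v              # approximate right singular vectors of A
>>>>>>>>>>>>> coords = a %*% v[, 1:2]   # project onto the first two of them
>>>>>>>>>>>>> plot(coords)              # one point per row of A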
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I did some testing and made a lot of pretty charts:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> http://ultrawhizbang.blogspot.com/
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If you want to get quick visualizations of your clusters, this is a
>>>>>>>>>>>>>> great place to start.
