None of this actually applies because real data are not uniformly
distributed (not even close).  Sample pairs of points from your own data,
measure the distances between them, and pick a good guess from that.
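
A minimal sketch of that sampling, assuming the data fit in memory as a list
of equal-length numeric rows.  The function name, the pair count, and the
three stand-in data sets are illustrative assumptions, not anything from this
thread; swap in your own rows.  For reference, the uniform case on [0,1]^n has
E (X_i - Y_i)^2 = 1/6, so sqrt(n/6) is the rough figure there, next to the
sqrt(n/2) for {0,1}^n in the quoted message below.

```python
import math
import random

def mean_pairwise_distance(rows, n_pairs=5000, seed=0):
    """Estimate the mean Euclidean distance between random pairs of rows."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        a, b = rng.sample(rows, 2)  # two distinct rows, chosen at random
        total += math.dist(a, b)
    return total / n_pairs

# Stand-in data only (assumption): replace these with your real vectors.
n = 50
gen = random.Random(42)
uniform_rows = [[gen.random() for _ in range(n)] for _ in range(500)]
binary_rows = [[float(gen.randint(0, 1)) for _ in range(n)] for _ in range(500)]
# A skewed distribution on [0,1], as a crude stand-in for non-uniform ratings.
skewed_rows = [[gen.betavariate(2, 5) for _ in range(n)] for _ in range(500)]

est_uniform = mean_pairwise_distance(uniform_rows)
est_binary = mean_pairwise_distance(binary_rows)
est_skewed = mean_pairwise_distance(skewed_rows)

print("uniform [0,1]^n:", est_uniform, "vs sqrt(n/6) =", math.sqrt(n / 6))
print("binary {0,1}^n: ", est_binary, "vs sqrt(n/2) =", math.sqrt(n / 2))
print("skewed [0,1]^n: ", est_skewed)
```

The skewed stand-in should come out noticeably below the uniform figure,
which is the point: the estimate depends on how the data are actually
distributed, so the number worth using is the one sampled from real data.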

On Wed, Oct 19, 2011 at 11:40 AM, Sean Owen <[email protected]> wrote:

> Ah, I'm looking for the distance between points within, rather than
> on, the hypercube. (Think of it as random rating vectors, in the range
> 0..1, across all movies. They're not binary ratings but ratings from 0
> to 1.)
>
> On Wed, Oct 19, 2011 at 6:30 PM, Justin Cranshaw <[email protected]>
> wrote:
> > I think the analytic answer should be sqrt(n/2).
> >
> > So let's suppose X and Y are random points in the n-dimensional hypercube
> > {0,1}^n.  Let Z_i be an indicator variable that is 1 if X_i != Y_i and 0
> > otherwise.  Then d(X,Y)^2 = sum (X_i - Y_i)^2 = sum(Z_i).  Then the expected
> > squared distance is E d(X,Y)^2 = sum(E Z_i) = sum(Pr[X_i != Y_i]) = n/2.
> >
> >
>
