Independent unit normal values are what I used in all of my R prototypes. That worked to machine precision.
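A minimal sketch of the property in question, in Python rather than R and not taken from any prototype: vectors whose entries are independent unit normal draws are approximately orthogonal, and since the norm of such a vector concentrates around sqrt(n), scaling by 1/sqrt(n) makes them approximately orthonormal. The function names, the dimensions and the seed are all illustrative assumptions.

```python
import math
import random


def gaussian_vector(n, rng):
    """Return n independent draws from the unit normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def scaled_gram(k, n, seed=0):
    """Gram matrix of k Gaussian vectors of length n, each scaled by 1/sqrt(n)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    scale = 1.0 / math.sqrt(n)
    vectors = [[x * scale for x in gaussian_vector(n, rng)] for _ in range(k)]
    return [[dot(u, v) for v in vectors] for u in vectors]


k, n = 4, 20000
gram = scaled_gram(k, n)

# Diagonal entries should be close to 1, off-diagonal entries close to 0.
max_diag_error = max(abs(gram[i][i] - 1.0) for i in range(k))
max_off_diag = max(abs(gram[i][j]) for i in range(k) for j in range(k) if i != j)
print(max_diag_error, max_off_diag)
```

Both deviations shrink roughly like 1/sqrt(n), so the Gram matrix approaches the identity as the vectors get longer.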
On Wed, Apr 6, 2011 at 1:50 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Ok i can re-do this to use murmur then.
>
> "unit Gaussian vectors" is what the original paper ("finding structure
> with randomness... ") refers to as the first suggested way of
> generating Omega.
>
> As you said, i tried to find the exact meaning of this but till this
> day am a little bit fuzzy about what they meant there.
>
> I have always suspected that norm distribution should be o.k. too
>
> On Wed, Apr 6, 2011 at 1:44 PM, Ted Dunning <[email protected]> wrote:
> > Interesting point. If you don't need to re-use the vector ever, then there
> > is no need to generate it coherently.
> > Unit Gaussian vector is an ambiguous term just the way you say, btw. For
> > random projections using a vector
> > composed of random elements each independently drawn from a unit normal
> > distribution should be fine. The
> > result will be approximately orthogonal and if you divide by the square
> > root of the number of elements, approximately orthonormal.
> > That is all that is required for the random projection work.
> >
> > On Wed, Apr 6, 2011 at 1:33 PM, Dmitriy Lyubimov <[email protected]> wrote:
> >>
> >> Actually I ended up not to use even that i have an implementation. I
> >> am currently using just Random.nextGaussian since i needed to generate
> >> single Gaussian vectors and I meant to ask if that's the best way to
> >> do it.
> >>
> >> I had a version once that used conversion from uniformly generated
> >> murmur hash to gaussian similarly to what you discussed but again, I
> >> had doubts that's the way. What's the way?
> >>
> >> -D
> >>
> >> On Wed, Apr 6, 2011 at 1:02 AM, Ted Dunning <[email protected]> wrote:
> >> > The random matrix that dmitriy has uses MurmurHash based on the two
> >> > indices to create the random values. They aren't cached since they are
> >> > generated fairly quickly.
> >> >
> >> > On Wed, Apr 6, 2011 at 12:36 AM, Sean Owen (JIRA) <[email protected]> wrote:
> >> >
> >> >> [ https://issues.apache.org/jira/browse/MAHOUT-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016285#comment-13016285 ]
> >> >>
> >> >> Sean Owen commented on MAHOUT-550:
> >> >> ----------------------------------
> >> >>
> >> >> Well it turns out MersenneTwisterRNG won't take a new seed, but it just
> >> >> means setSeed() in RandomWrapper needs to make a new RNG instead. I can
> >> >> add that, it's a detail.
> >> >>
> >> >> > Add RandomVector and RandomMatrix
> >> >> > ---------------------------------
> >> >> >
> >> >> > Key: MAHOUT-550
> >> >> > URL: https://issues.apache.org/jira/browse/MAHOUT-550
> >> >> > Project: Mahout
> >> >> > Issue Type: New Feature
> >> >> > Components: Math
> >> >> > Reporter: Lance Norskog
> >> >> > Assignee: Sean Owen
> >> >> > Attachments: MAHOUT-550.patch, MAHOUT-550.patch, RandomMatrix.patch
> >> >> >
> >> >> >
> >> >> > Add Vector and Matrix implementations that generate a unique and
> >> >> > reproducible random number for each index.
> >> >>
> >> >> --
> >> >> This message is automatically generated by JIRA.
> >> >> For more information on JIRA, see:
> >> >> http://www.atlassian.com/software/jira

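The index-based generation discussed in the quoted thread (a reproducible value per matrix index, computed on demand instead of cached) can be sketched roughly as follows. This is not the Mahout implementation: `hashlib.blake2b` stands in for MurmurHash because it ships with the Python standard library, the Box-Muller transform is just one assumed way to turn hashed uniforms into unit normal values, and the names `index_uniform` and `index_gaussian` are hypothetical.

```python
import hashlib
import math
import struct


def index_uniform(row, col, seed, salt):
    """Map (row, col, seed, salt) to a reproducible uniform value in (0, 1]."""
    key = struct.pack("<qqqq", row, col, seed, salt)
    digest = hashlib.blake2b(key, digest_size=8).digest()
    bits = int.from_bytes(digest, "little") >> 11  # keep the top 53 bits
    return (bits + 1) / (1 << 53)  # never 0, so log() below is safe


def index_gaussian(row, col, seed=0):
    """Unit normal value for one matrix cell, recomputed on every call."""
    u1 = index_uniform(row, col, seed, 0)
    u2 = index_uniform(row, col, seed, 1)
    # Box-Muller: two independent uniforms give one standard normal value.
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


# The same index and seed always yield the same value, with no storage.
print(index_gaussian(3, 7, seed=42))
print(index_gaussian(3, 7, seed=42))

# Rough sanity check over a 100 x 100 block of cells.
values = [index_gaussian(r, c, seed=42) for r in range(100) for c in range(100)]
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)
print(mean, variance)
```

Because each value depends only on the indices and the seed, any cell of the matrix can be regenerated independently, which is what makes caching unnecessary.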