Re: [math] Improving speed of UnitSphereRandomVectorGenerator
Hi. On Wed, 30 Jan 2013 11:54:50 +, Sean Owen wrote: Hello all, I have a small proposal to improve the speed of UnitSphereRandomVectorGenerator. This class picks a random point on the unit n-sphere -- a unit vector, chosen uniformly from all possible directions. It does so using a rejection process -- generates a random point in the unit n-cube (well, with side lengths 2) and rejects any points outside the unit n-sphere, then normalizes the length. This is correct and works well at low dimension. However the volume of the unit n-sphere compared to the unit n-cube drops exponentially. This method eventually takes an extraordinary amount of time when dimensions get past about 20, since virtually no samples are usable. For example, here is the time taken to make pick 10 points as a function of dimension up to 20: 1 : 11 2 : 1 3 : 0 4 : 1 5 : 0 6 : 1 7 : 1 8 : 17 9 : 4 10 : 3 11 : 13 12 : 32 13 : 15 14 : 41 15 : 220 16 : 897 17 : 1770 18 : 7426 19 : 48457 20 : 122647 ... It's equally correct, and much faster, to generate these points by picking n standard Gaussians and normalizing. This method takes negligible time even into thousands of dimensions. Is this non-trivial enough that I might propose a patch? I wanted to ask users first. Patch certainly welcome! Also, please open a ticket on the bug-tracking system. Thanks, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [math] Improving speed of UnitSphereRandomVectorGenerator
On Wed, Jan 30, 2013 at 4:56 PM, Gilles gil...@harfang.homelinux.orgwrote: Hi. On Wed, 30 Jan 2013 11:54:50 +, Sean Owen wrote: Hello all, I have a small proposal to improve the speed of UnitSphereRandomVectorGenerato**r. This class picks a random point on the unit n-sphere -- a unit vector, chosen uniformly from all possible directions. It does so using a rejection process -- generates a random point in the unit n-cube (well, with side lengths 2) and rejects any points outside the unit n-sphere, then normalizes the length. This is correct and works well at low dimension. However the volume of the unit n-sphere compared to the unit n-cube drops exponentially. This method eventually takes an extraordinary amount of time when dimensions get past about 20, since virtually no samples are usable. For example, here is the time taken to make pick 10 points as a function of dimension up to 20: 1 : 11 2 : 1 3 : 0 4 : 1 5 : 0 6 : 1 7 : 1 8 : 17 9 : 4 10 : 3 11 : 13 12 : 32 13 : 15 14 : 41 15 : 220 16 : 897 17 : 1770 18 : 7426 19 : 48457 20 : 122647 ... It's equally correct, and much faster, to generate these points by picking n standard Gaussians and normalizing. This method takes negligible time even into thousands of dimensions. Is this non-trivial enough that I might propose a patch? I wanted to ask users first. Patch certainly welcome! Also, please open a ticket on the bug-tracking system. ohoh, I had the reply open and was distracted, and in the mean time you already responded ;-). Thomas