I opened a ticket: http://trac.osgeo.org/grass/ticket/1828
On Fri, Dec 7, 2012 at 10:16 AM, Markus Metz <[email protected]>wrote: > On Thu, Dec 6, 2012 at 10:19 PM, Aren Cambre <[email protected]> wrote: > > OK, I think I get it now. After you've hit 4 SDs (or even arguably 3 > SDs!) > > the contribution of more points to that square fades to almost nothing > > without extreme clusters of points. > > Right. > > > > Seems like this ought to be clarified in the UI and documentation? > Should I > > file an enhancement request? > > I would rather have the option 'stddeviation' renamed to 'radius', > kernel radius in map units, which it actually is currently for all > kernels but gaussian, and internally adjust standard deviation instead > of radius for the gaussian kernel. I guess that the kernel radius is > of more importance to users than the standard deviation. Makes sense? > > Markus M > > > > > BTW, I dropped down to a SD of 200, and it ran in 212 minutes. I'm > trying a > > SD of 300 now but on a grid with 1/4 as many squares as I cut the rows > and > > cols by half. We'll see how it does. > > > > Aren > > > > > > On Thu, Dec 6, 2012 at 12:17 PM, Markus Metz < > [email protected]> > > wrote: > >> > >> On Thu, Dec 6, 2012 at 3:13 PM, Aren Cambre <[email protected]> > wrote: > >> > Thanks. I am using EPSG:3081, and its unit is meters. So is it the > case > >> > that > >> > a SD of x on EPSG:3081 means a search radius of 4x meters? If so, why > >> > 4x? > >> > >> Because the gaussian function is infinite, a cutoff of 3 to 6 SDs is > >> usually applied when using a gaussian function as kernel density > >> function. In theory, an alternative is to combine the infinite > >> gaussian function with a finite function, e.g. uniform. I think > >> historically v.kernel had only one kernel function, gaussian. The > >> original authors decided to ask for SD and not for the search radius > >> as input and have set the cutoff to 4 x SD where the gaussian function > >> is reasonably close to zero. > >> > >> BTW, search radius = 4 x SD applies only to the gaussian kernel, for > >> all other kernel functions the search radius is equal to SD. > >> > >> Markus M > >> > >> > > >> > On Thu, Dec 6, 2012 at 2:29 AM, Markus Metz > >> > <[email protected]> > >> > wrote: > >> >> > >> >> On Thu, Dec 6, 2012 at 3:58 AM, Aren Cambre <[email protected]> > >> >> wrote: > >> >> > It's gotten slow again. This run will probably take more than 10 > >> >> > hours. > >> >> > However, I am using a standard deviation of 1000. Is that what > could > >> >> > be > >> >> > causing this? > >> >> > >> >> Yes. With a standard deviation of 1000, the search radius is now > 4000, > >> >> that is, for each cell a 8000x8000 box is searched. With many densely > >> >> packed points, this can take quite some time. > >> >> > >> >> Markus M > >> >> > >> >> > > >> >> > v.kernel input=tickets@PERMANENT output=tickets_new_heatmap_1000 > >> >> > stddeviation=1000 > >> >> > STDDEV: 1000.000000 > >> >> > RES: 18.290457 ROWS: 2370 COLS: 2650 > >> >> > > >> >> > Writing output raster map using smooth parameter=1000.000000. > >> >> > > >> >> > Normalising factor=6482635.018778. > >> >> > > >> >> > On Sat, Nov 24, 2012 at 9:03 PM, Aren Cambre <[email protected]> > >> >> > wrote: > >> >> >> > >> >> >> I installed r53983. The v.kernel execution that took almost a day > >> >> >> now > >> >> >> executes in 25.5 minutes. Thank you! > >> >> >> > >> >> >> Aren > >> >> >> > >> >> >> > >> >> >> On Fri, Nov 23, 2012 at 12:51 PM, Markus Metz > >> >> >> <[email protected]> wrote: > >> >> >>> > >> >> >>> On Fri, Nov 23, 2012 at 5:35 PM, Aren Cambre < > [email protected]> > >> >> >>> wrote: > >> >> >>> > Thanks! > >> >> >>> > > >> >> >>> > I am not familiar with GRASS's release customs. Will this > become > >> >> >>> > part > >> >> >>> > of a > >> >> >>> > binary release soon, or should I just pull down the latest > >> >> >>> > release > >> >> >>> > in > >> >> >>> > the > >> >> >>> > 6.4.2 trunk? I'm assuming this has been merged into the > trunk... > >> >> >>> > >> >> >>> It should be available as a binary for Windows by tomorrow in the > >> >> >>> nightly builds [0]. > >> >> >>> > >> >> >>> Markus M > >> >> >>> > >> >> >>> [0] http://wingrass.fsv.cvut.cz/grass64/ > >> >> >>> > >> >> >>> > > >> >> >>> > Aren > >> >> >>> > > >> >> >>> > > >> >> >>> > On Fri, Nov 23, 2012 at 7:32 AM, Markus Metz > >> >> >>> > <[email protected]> > >> >> >>> > wrote: > >> >> >>> >> > >> >> >>> >> On Fri, Nov 23, 2012 at 2:07 PM, Aren Cambre > >> >> >>> >> <[email protected]> > >> >> >>> >> wrote: > >> >> >>> >> > Isn't taking about 10,000% too much time considered a bug? > :-) > >> >> >>> >> > >> >> >>> >> Hmm, yes. v.kernel is fixed in devbr6 and relbr6 with r53982 > and > >> >> >>> >> r53983, respectively. > >> >> >>> >> > >> >> >>> >> Markus M > >> >> >>> >> > >> >> >>> >> > > >> >> >>> >> > On Nov 23, 2012 5:11 AM, "Markus Metz" > >> >> >>> >> > <[email protected]> > >> >> >>> >> > wrote: > >> >> >>> >> >> > >> >> >>> >> >> On Fri, Nov 23, 2012 at 4:14 AM, Aren Cambre > >> >> >>> >> >> <[email protected]> > >> >> >>> >> >> wrote: > >> >> >>> >> >> > I'm able to reproduce reliably here. I'll email you > details > >> >> >>> >> >> > privately. > >> >> >>> >> >> > >> >> >>> >> >> Thanks. I can confirm that v.kernel takes a long time in > >> >> >>> >> >> GRASS 6 > >> >> >>> >> >> with > >> >> >>> >> >> the settings provided by you. It does not crash, however. > >> >> >>> >> >> > >> >> >>> >> >> I can speed up v.kernel in GRASS 6 to complete in 10 > minutes > >> >> >>> >> >> instead > >> >> >>> >> >> of 16+ hours, but I am not sure if this fix can/will go > into > >> >> >>> >> >> GRASS > >> >> >>> >> >> 6.4 > >> >> >>> >> >> because by now only bugs should be fixed. > >> >> >>> >> >> > >> >> >>> >> >> Markus M > >> >> >>> >> >> > >> >> >>> >> >> > > >> >> >>> >> >> > Aren > >> >> >>> >> >> > > >> >> >>> >> >> > > >> >> >>> >> >> > On Thu, Nov 22, 2012 at 9:02 AM, Markus Metz > >> >> >>> >> >> > <[email protected]> > >> >> >>> >> >> > wrote: > >> >> >>> >> >> >> > >> >> >>> >> >> >> On Sat, Nov 17, 2012 at 4:06 PM, Aren Cambre > >> >> >>> >> >> >> <[email protected]> > >> >> >>> >> >> >> wrote: > >> >> >>> >> >> >> > I have a dataset of just over 700,000 incidents that > >> >> >>> >> >> >> > happened > >> >> >>> >> >> >> > in > >> >> >>> >> >> >> > square-ish > >> >> >>> >> >> >> > Texas county that's about 30 miles on each side. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > Here's the parameters reported by v.kernel as it's > >> >> >>> >> >> >> > executing: > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > STDDEV: 1000.000000 > >> >> >>> >> >> >> > RES: 111.419043 ROWS: 458 COLS: 447 > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > Writing output raster map using smooth > >> >> >>> >> >> >> > parameter=1000.000000. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > Normalising factor=6482635.018778. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > I am running this on a Windows 7 x64 machine with 8 GB > >> >> >>> >> >> >> > RAM > >> >> >>> >> >> >> > and > >> >> >>> >> >> >> > an > >> >> >>> >> >> >> > Intel > >> >> >>> >> >> >> > Core > >> >> >>> >> >> >> > i7 Q720 1.6 GHz with 4 physical cores. I notice that > >> >> >>> >> >> >> > it's > >> >> >>> >> >> >> > not > >> >> >>> >> >> >> > multithreaded, > >> >> >>> >> >> >> > only using 1 core. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > It takes about 16 hours to complete. Is this correct? > >> >> >>> >> >> >> > I'd > >> >> >>> >> >> >> > like > >> >> >>> >> >> >> > to > >> >> >>> >> >> >> > use > >> >> >>> >> >> >> > this > >> >> >>> >> >> >> > on a dataset with closer to 5 million records, and I'm > >> >> >>> >> >> >> > really > >> >> >>> >> >> >> > concerned > >> >> >>> >> >> >> > how > >> >> >>> >> >> >> > long it may take. > >> >> >>> >> >> >> > >> >> >>> >> >> >> The time required by v.kernel is a function of the > number > >> >> >>> >> >> >> of > >> >> >>> >> >> >> cells > >> >> >>> >> >> >> and > >> >> >>> >> >> >> the input parameter stddeviation. The larger any of > these > >> >> >>> >> >> >> values > >> >> >>> >> >> >> is, > >> >> >>> >> >> >> the more time v.kernel will need. Nevertheless, I think > >> >> >>> >> >> >> that > >> >> >>> >> >> >> the > >> >> >>> >> >> >> 16+ > >> >> >>> >> >> >> hours are not correct. I tested with a vector with 3 > >> >> >>> >> >> >> million > >> >> >>> >> >> >> points > >> >> >>> >> >> >> for a grid with 2700 rows and 1087 columns, magnitudes > >> >> >>> >> >> >> larger > >> >> >>> >> >> >> than > >> >> >>> >> >> >> the > >> >> >>> >> >> >> grid used by you. v.kernel completes in just over one > >> >> >>> >> >> >> minute. > >> >> >>> >> >> >> > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > I posted my question about the 16+ hours at > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > http://gis.stackexchange.com/questions/41058/how-do-i-compute-v-kernel-maps-in-less-than-16-hours/ > . > >> >> >>> >> >> >> > Bill Huber, who si apparently knowledgeable about > kernel > >> >> >>> >> >> >> > density > >> >> >>> >> >> >> > calculations in general, posted a response, and he > felt > >> >> >>> >> >> >> > like a > >> >> >>> >> >> >> > kernel > >> >> >>> >> >> >> > density map shouldn't take much time at all. But > digging > >> >> >>> >> >> >> > more > >> >> >>> >> >> >> > deeply, > >> >> >>> >> >> >> > turns > >> >> >>> >> >> >> > out he had come up with a kernel density calculation > >> >> >>> >> >> >> > method > >> >> >>> >> >> >> > over a > >> >> >>> >> >> >> > decade > >> >> >>> >> >> >> > ago using Fourier transforms. See > >> >> >>> >> >> >> > > http://www.directionsmag.com/features/convolution/129753 > >> >> >>> >> >> >> > and > >> >> >>> >> >> >> > the > >> >> >>> >> >> >> > next > >> >> >>> >> >> >> > two > >> >> >>> >> >> >> > articles linked to it (they are short articles). > >> >> >>> >> >> >> > Apparently > >> >> >>> >> >> >> > this > >> >> >>> >> >> >> > transforms > >> >> >>> >> >> >> > it from an O(n^2) problem to an O(n ln n) complexity > >> >> >>> >> >> >> > problem. > >> >> >>> >> >> >> > >> >> >>> >> >> >> The approach of Bill Huber is raster-based, not vector > >> >> >>> >> >> >> based, > >> >> >>> >> >> >> making > >> >> >>> >> >> >> some things easier, at the cost of precision. The > >> >> >>> >> >> >> coordinate > >> >> >>> >> >> >> precision, however, is only needed for kernel functions > >> >> >>> >> >> >> other > >> >> >>> >> >> >> than > >> >> >>> >> >> >> uniform. In GRASS, you could get something like a > >> >> >>> >> >> >> raster-based > >> >> >>> >> >> >> density > >> >> >>> >> >> >> map by > >> >> >>> >> >> >> > >> >> >>> >> >> >> - exporting the points with v.out.ascii > >> >> >>> >> >> >> - re-importing the points with r.in.xyz method=n to get > >> >> >>> >> >> >> the > >> >> >>> >> >> >> number > >> >> >>> >> >> >> of > >> >> >>> >> >> >> points per cell > >> >> >>> >> >> >> - running a neighborhood analysis using a circular > window > >> >> >>> >> >> >> with > >> >> >>> >> >> >> r.neighbors method=sum -c > >> >> >>> >> >> >> > >> >> >>> >> >> >> Optionally you could use the gauss option of r.neighbors > >> >> >>> >> >> >> to > >> >> >>> >> >> >> get > >> >> >>> >> >> >> an > >> >> >>> >> >> >> equivalent to v.kernel kernel=gaussian > >> >> >>> >> >> >> > >> >> >>> >> >> >> HTH, > >> >> >>> >> >> >> > >> >> >>> >> >> >> Markus M > >> >> >>> >> >> >> > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > I inspected v.kernel's main.c > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > ( > http://trac.osgeo.org/grass/browser/grass/trunk/vector/v.kernel/main.c), > >> >> >>> >> >> >> > and looks like v.kernel uses an output-centric method > >> >> >>> >> >> >> > (using > >> >> >>> >> >> >> > Bill's > >> >> >>> >> >> >> > wording) > >> >> >>> >> >> >> > of calculating the output, which seems like O(n^2) > >> >> >>> >> >> >> > complexity. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > So I guess what I'm getting at is it appears to me > that > >> >> >>> >> >> >> > the > >> >> >>> >> >> >> > algorithm > >> >> >>> >> >> >> > behind > >> >> >>> >> >> >> > GRASS GIS's v.kernel is straightforward but is a > greedy > >> >> >>> >> >> >> > algorithm > >> >> >>> >> >> >> > (http://en.wikipedia.org/wiki/Greedy_algorithm), > which > >> >> >>> >> >> >> > is > >> >> >>> >> >> >> > fine, > >> >> >>> >> >> >> > but > >> >> >>> >> >> >> > it > >> >> >>> >> >> >> > make > >> >> >>> >> >> >> > take a while to execute. Is this true? > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > Is there not spatial indexing I could add to the > >> >> >>> >> >> >> > dataset? > >> >> >>> >> >> >> > I've > >> >> >>> >> >> >> > done > >> >> >>> >> >> >> > various > >> >> >>> >> >> >> > Google searches on that and can't come up with > anything > >> >> >>> >> >> >> > clear. > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > Aren > >> >> >>> >> >> >> > > >> >> >>> >> >> >> > _______________________________________________ > >> >> >>> >> >> >> > grass-user mailing list > >> >> >>> >> >> >> > [email protected] > >> >> >>> >> >> >> > http://lists.osgeo.org/mailman/listinfo/grass-user > >> >> >>> >> >> >> > > >> >> >>> >> >> > > >> >> >>> >> >> > > >> >> >>> > > >> >> >>> > > >> >> >> > >> >> >> > >> >> > > >> > > >> > > > > > >
_______________________________________________ grass-user mailing list [email protected] http://lists.osgeo.org/mailman/listinfo/grass-user
