Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-10 Thread Israel Brewster
> On Jan 9, 2017, at 1:54 PM, Kevin Grittner wrote: > > On Mon, Jan 9, 2017 at 11:49 AM, Israel Brewster > wrote: > >> [load of new data] > >> Limit (cost=354643835.82..354643835.83 rows=1 width=9) (actual >> time=225998.319..225998.320 rows=1

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Kevin Grittner
On Mon, Jan 9, 2017 at 11:49 AM, Israel Brewster wrote: > [load of new data] > Limit (cost=354643835.82..354643835.83 rows=1 width=9) (actual > time=225998.319..225998.320 rows=1 loops=1) > [...] I ran the query again [...] > Limit (cost=354643835.82..354643835.83

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Rémi Cura
Hey, I like your curiosity! At the billion range, you __have__ to use pgpointcloud, a pyramid raster solution (actually the more common way to perform this task), or another database (hello MonetDB). Cheers, Rémi-C 2017-01-09 20:11 GMT+01:00 Jonathan Vanasco : > > On Jan 9,

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Jonathan Vanasco
On Jan 9, 2017, at 12:49 PM, Israel Brewster wrote: > Planning time: 4.554 ms > Execution time: 225998.839 ms > (20 rows) > > So a little less than four minutes. Not bad (given the size of the database), > or so I thought. > > This morning (so a couple of days later) I ran the query again

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Paul Ramsey
At BILLIONS, you're getting to a point where the point index is probably (a) very large and (b) very deep, so you might want to do something different with your data storage, like loading the data in spatially compact patches of several 10s of points. Then the index will float more nicely in
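One plausible way to get "spatially compact patches" in plain PostGIS (as opposed to moving to pgpointcloud) is to collect nearby points into multipoint rows, so the index holds far fewer, fatter entries. A rough sketch only; the table names, columns, and grid size are assumptions, not from the thread:

    -- Group nearby points into one multipoint row per grid cell.
    -- The grid size (0.05, in the data's SRID units) is a placeholder and would need tuning.
    CREATE TABLE elevation_patches AS
    SELECT ST_Collect(location) AS patch,
           array_agg(elevation) AS elevations
    FROM elevation
    GROUP BY ST_SnapToGrid(location, 0.05);

    -- Spatial index over the patches instead of the raw points
    CREATE INDEX elevation_patches_gix ON elevation_patches USING gist (patch);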

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-09 Thread Israel Brewster
So just for interests sake, to kick things up a notch (and out of sheer morbid curiosity), I loaded a higher-resolution dataset (Elevation data for the state of Alaska, 2 arc second resolution, as opposed to 100 meter resolution before). Same structure/indexes and everything, just higher

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Israel Brewster
On Jan 5, 2017, at 1:38 PM, Rémi Cura wrote: > > Hey, > 1 sec seems really good in this case, > and I'm assuming you tuned Postgres so the main index fits into RAM (work_mem > and all other stuff). > > You could avoid a CTE by merging both CTEs. > > WITH pts AS ( >

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Israel Brewster
Ah, yes indeed. Upping the segment length to 1,000 brings the execution time down to 642 ms, and further upping it to 10,000 brings the execution time down again to 442.104 ms. I'll have to play around with it and see where the minimum is. Would that be likely to vary depending on initial path
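The knob being varied here is the maximum-vertex-spacing argument to ST_Segmentize. A trivial illustration of where that parameter sits; routes and route_geom are hypothetical names, and the useful range of values depends on the data's SRID/units and the search radius:

    -- Larger spacing: fewer, longer segments (fewer inner-loop iterations,
    -- but each segment's bounding box pulls in more candidate points).
    SELECT ST_Segmentize(route_geom, 1000)  FROM routes;   -- spacing 1,000 (placeholder units)
    SELECT ST_Segmentize(route_geom, 10000) FROM routes;   -- spacing 10,000 (placeholder units)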

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Rémi Cura
Hey, 1 sec seems really good in this case, and I'm assuming you tuned Postgres so the main index fits into RAM (work_mem and all the other stuff). You could avoid a CTE by merging both CTEs. WITH pts AS ( SELECT (pt).geom, (pt).path[1] as vert FROM ST_DumpPoints( ST_Segmentize(
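Read charitably, the truncated snippet seems to keep a single CTE of dumped vertices and then pair consecutive vertices inline instead of building the segments in a second CTE. A hedged reconstruction along those lines; the route WKT, spacing, distance, and table/column names are all placeholders:

    WITH pts AS (
        SELECT (pt).geom, (pt).path[1] AS vert
        FROM ST_DumpPoints(
               ST_Segmentize('LINESTRING(-150.0 61.2, -147.7 64.8)'::geometry, 0.05)
             ) AS pt
    )
    SELECT max(e.elevation)
    FROM pts a
    JOIN pts b ON b.vert = a.vert + 1                  -- consecutive vertices
    JOIN elevation e
      ON ST_DWithin(e.location, ST_MakeLine(a.geom, b.geom), 0.001);  -- placeholder radius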

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Paul Ramsey
Varying the segment length upwards might have a salutary effect for a while, as the efficiency improvement of fewer inner loops battles with the inefficiency of having more points selected by the index filter. Worth an experiment. P On Thu, Jan 5, 2017 at 1:00 PM, Israel Brewster

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Israel Brewster
> On Jan 5, 2017, at 10:38 AM, Paul Ramsey wrote: > > Yes, you did. You want a query that spits out a tupleset of geometries (one > each for each wee segment), and then you can join that set to your main table > using st_dwithin() as the join clause. > So start by

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Paul Ramsey
Yes, you did. You want a query that spits out a tupleset of geometries (one each for each wee segment), and then you can join that set to your main table using st_dwithin() as the join clause. So start by ditching the main table and just work on a query that generates a pile of wee segments. On
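A sketch of what that join could look like end to end, under assumed names: the route WKT, vertex spacing, search radius, and the elevation table/columns are placeholders, not taken from the thread:

    WITH pts AS (
        -- densify the route and dump its vertices
        SELECT (dp).path[1] AS n, (dp).geom AS geom
        FROM ST_DumpPoints(
               ST_Segmentize('LINESTRING(-150.0 61.2, -147.7 64.8)'::geometry, 0.05)
             ) AS dp
    ),
    segs AS (
        -- rebuild consecutive vertices into short two-point segments
        SELECT ST_MakeLine(lag(geom) OVER (ORDER BY n), geom) AS seg
        FROM pts
    )
    SELECT max(e.elevation)
    FROM segs s
    JOIN elevation e ON ST_DWithin(e.location, s.seg, 0.001)  -- placeholder radius
    WHERE s.seg IS NOT NULL;                                  -- drop the NULL first row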

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Israel Brewster
On Jan 5, 2017, at 8:50 AM, Paul Ramsey wrote: > > The index filters using bounding boxes. A long, diagonal route will have a > large bounding box, relative to the area you actually care about (within a > narrow strip of the route). Use ST_Segmentize() to add points

Re: [GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Paul Ramsey
The index filters using bounding boxes. A long, diagonal route will have a large bounding box, relative to the area you actually care about (within a narrow strip of the route). Use ST_Segmentize() to add points to your route, ST_DumpPoints() to dump those out as points, and ST_MakeLine to generate
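A minimal sketch of the densify-then-split idea being described here; the route WKT, the vertex spacing, and the identifiers are placeholders, not from the thread:

    -- Densify a long route, dump the added vertices, and rebuild them into
    -- many short two-point segments, each with a small bounding box.
    WITH route AS (
        SELECT ST_Segmentize(
                 'LINESTRING(-150.0 61.2, -147.7 64.8)'::geometry,  -- placeholder route
                 0.05                                               -- max vertex spacing, SRID units
               ) AS geom
    ),
    pts AS (
        SELECT (dp).path[1] AS n, (dp).geom AS geom
        FROM route, ST_DumpPoints(route.geom) AS dp
    )
    SELECT seg
    FROM (
        SELECT ST_MakeLine(lag(geom) OVER (ORDER BY n), geom) AS seg
        FROM pts
    ) s
    WHERE seg IS NOT NULL;   -- the first vertex has no predecessor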

[GENERAL] Improve PostGIS performance with 62 million rows?

2017-01-05 Thread Israel Brewster
I have a database (PostgreSQL 9.6.1) containing 62,702,675 rows of latitude (numeric), longitude (numeric), elevation (integer) data, along with a PostGIS (2.3.0) geometry column (location), running on a CentOS 6.8 box with 64GB RAM and a RAID10 SSD data drive. I'm trying to get the maximum
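For reference, a minimal sketch of the kind of schema being described; the table name, index name, and SRID are assumptions, not taken from the original post:

    CREATE TABLE elevation (
        lat       numeric,
        lng       numeric,
        elevation integer,
        location  geometry(Point, 4326)   -- PostGIS point built from lat/lng
    );

    -- spatial index used by bounding-box filters / ST_DWithin
    CREATE INDEX elevation_location_idx ON elevation USING gist (location);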