> On Sep 16, 2020, at 7:35 AM, Marco Boeringa <[email protected]> wrote:
> 
> Hi all,
> 
> This is probably more of a PostgreSQL question than a PostGIS one, but I have 
> wondered if there is actually any work going on in allowing PostgreSQL / 
> PostGIS to build GiST type spatial indexes in parallel, and / or if this is 
> even logically and technically feasible? According to the PostgreSQL 
> documentation, only B-tree indexes can be indexed in parallel.
> 
> With the ever growing size of spatial databases like OpenStreetMap, with 
> tables running into the 100s of million records, spatial indexing using GiST 
> is one of the major bottle necks in re-creating or reloading a spatial 
> PostGIS database. The indexing process seems highly CPU bound, with 
> negligible disk activity for the majority of the time the indexing process 
> runs, hence being able to take advantage of multiple cores seems like a 
> possible big win. Nonetheless, there seems little to no mention of such 
> (future) option for GiST type indexing when searching on the internet for 
> relevant information.

Marco, 
I do not know if there is active work in the area of making GIST index builds 
faster, but I have heard discussions of various approaches from people much 
smarter than I, so I am sure there are potential areas of improvement 
available. The single-threaded performance of index build might be made faster 
with some bulk/batch handling of inserts, though how that interacts with the 
generic GIST API expectation of one-at-a-time insertion I do not know.
Probably the biggest hurdle is just that the number of size-constrained GIST 
data sets is much smaller than that of BTREE data sets, so it's a lower 
priority. Certainly the growth in OSM ubiquity is increasing the number of 
users with very large spatial databases they need to index though, so we can 
expect more pressure as time goes on.
ATB,
P


> 
> I have no knowledge of the internals of PostgreSQL, so there may be technical 
> limitations I am unaware of that make this impossible or technically very 
> difficult to implement, but it would be nice to hear something about this 
> even if it is not feasible.
> 
> Yes, I know there are BRIN type spatial indexes for PostGIS, which are 
> comparatively super fast to create and lead to very small indexes even for 
> ultra large tables, but from the little information and personal experience I 
> gathered, BRIN seems most suited for Point data only, and for static, not 
> updated data, due to its requirement of clustered data for efficiency 
> (actually not a problem in my particular case, since I don't do updates, but 
> only reloads). The few times I tried to use it for large, spatially 
> clustered, Polygon data sets, it seemed less efficient when accessing the 
> data spatially in a GIS, with clearly longer display times, although I don't 
> have real benchmarks for that.
> 
> Most OpenStreetMap related tools like e.g. osm2pgsql also default to GiST, 
> and probably with good reason.
> 
> Marco
> 
> _______________________________________________
> postgis-users mailing list
> [email protected]
> https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/postgis-users

Reply via email to