Paul,

Just wondering, given the fact that, as it now turns out, the specific step to calculate the bounding boxes is such a large part of the overall time of the building of the GiST spatial index, shouldn't this particular step be parallelized in PostgreSQL via parallel workers?

I realize that with that question, this has become more of a question for the PostgreSQL mailing list, but you likely have some idea if that is even viable given your detailed knowledge of what goes on with GiST spatial indexing.

However, with apparently 90% or so of all of the building time of the GiST spatial index for Polygon geometries being determined by this particular step, and no indications of use parallel processing taking place (low IO, only one core at 100% from what I usually see), that question seems warranted.

Parallelizing this step would likely also benefit 'geometry' storage spatial indexing as well. Even though it is faster than 'geography', it is still a major process taking many hours for Planet data.

It might be a nice GSOC project to explore if this step could be parallelized.

Marco

Op 1-11-2024 om 16:28 schreef Paul Ramsey:
Basically, yes. Calculating a bbox on a plane is just some min/max calls. Doing it on a sphere involves a lot of trig and huge functions. Calculating boxes is step one of index building. The geocentric geography index involves a varsize key, while the 2d geometry index uses a fixed size key. Even things like distance cost a lot more

https://docs.google.com/presentation/d/1G7UkT9szpyRcWPp59aVRfN1-f1jfuf40dwxs034M2RA/edit#slide=id.g4c694067b3_0_79

p


On Nov 1, 2024, at 8:13 AM, thi...@gelassene-pferde.biz wrote:

Is that because geometry is calculated in the plane whereas geography on a curved surface?

Paul Ramsey <pram...@cleverelephant.ca> escribió:

Yes, building a geography index is a lot more computationally expensive.


Reply via email to