Re: [Tile-serving] [osm2pgsql-dev/osm2pgsql] parallelize the COPY phase (Discussion #2426)

Jochen Topf via Tile-serving Wed, 29 Oct 2025 00:49:38 -0700

In the usual configuration there are two threads doing COPYs: one for the 
"middle" tables (in slim mode only) and one for the output tables. Data is 
collected in chunks and then sent via a queue to those threads for the actual 
COPY operation. We could use a thread pool instead of those two threads for the 
actual COPY, but we never thought that this would improve the situation much. 
In the end the bottleneck is probably the I/O, isn't it? And doing more of this 
in parallel means more contention on the WAL and, if we are writing to the same 
table in multiple COPYs at once, more contention on that table. So it is 
unclear to me why having more parallelism would help significantly. Doing 
anything with multithreading in C++ code is always a pain, so keeping this code 
as simple as possible is also important.
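
For anyone who wants to experiment, the chunk-and-queue setup described above
can be sketched roughly as follows. This is not osm2pgsql's actual code: the
names copy_chunk, chunk_queue and run_copy_workers are made up for
illustration, a single shared queue stands in for whatever queues the real
threads use, and the "COPY" is simulated by counting bytes instead of talking
to a database.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one buffered chunk of COPY data for one table.
struct copy_chunk {
    std::string table;
    std::string data;
};

// Minimal blocking queue (illustrative only, not osm2pgsql's implementation).
class chunk_queue {
public:
    void push(copy_chunk chunk) {
        {
            std::lock_guard<std::mutex> lock{m_mutex};
            m_queue.push(std::move(chunk));
        }
        m_cond.notify_one();
    }

    // Signal that no more chunks will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock{m_mutex};
            m_closed = true;
        }
        m_cond.notify_all();
    }

    // Blocks until a chunk is available; returns false once the queue is
    // closed and fully drained.
    bool pop(copy_chunk *out) {
        std::unique_lock<std::mutex> lock{m_mutex};
        m_cond.wait(lock, [this] { return m_closed || !m_queue.empty(); });
        if (m_queue.empty()) {
            return false;
        }
        *out = std::move(m_queue.front());
        m_queue.pop();
        return true;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::queue<copy_chunk> m_queue;
    bool m_closed = false;
};

// Starts num_workers threads, feeds them all chunks, and returns the total
// number of bytes the workers "copied".
std::size_t run_copy_workers(std::vector<copy_chunk> chunks,
                             unsigned num_workers) {
    chunk_queue queue;
    std::mutex total_mutex;
    std::size_t total_bytes = 0;

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < num_workers; ++i) {
        workers.emplace_back([&queue, &total_mutex, &total_bytes] {
            copy_chunk chunk;
            while (queue.pop(&chunk)) {
                // A real worker would send chunk.data to the database over
                // its own connection here; this sketch only counts bytes.
                std::lock_guard<std::mutex> lock{total_mutex};
                total_bytes += chunk.data.size();
            }
        });
    }

    for (auto &chunk : chunks) {
        queue.push(std::move(chunk));
    }
    queue.close();

    for (auto &worker : workers) {
        worker.join();
    }
    return total_bytes;
}
```

Calling run_copy_workers(chunks, 2) loosely corresponds to the two-thread
setup, and a larger worker count to the thread pool idea. Note that with
several workers pulling from one queue, chunks for the same table are no
longer processed in order and may be written concurrently, which is exactly
where the table and WAL contention mentioned above would show up.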


But maybe we are wrong there and didn't take some issue into account. And if 
somebody wanted to try this, that would be great; we'd get actual data.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/osm2pgsql-dev/osm2pgsql/discussions/2426#discussioncomment-14812699
You are receiving this because you are subscribed to this thread.

Message ID: 
<osm2pgsql-dev/osm2pgsql/repo-discussions/2426/comments/[email protected]>

_______________________________________________
Tile-serving mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/tile-serving
