This is assuming that you are feeding a database, which is not our use case. But I realize it is for most people, which is why I wanted to open this discussion in the first place.

Martijn

Martijn van Exel
skype: mvexel
On Wed, Apr 29, 2015 at 12:28 PM, François Battail <francois.batt...@sipibox.fr> wrote:
> On 29/04/2015 19:27, Paul Norman wrote:
>>
>> The real gains from the threading branch were not multi-threaded PBF
>> reading, but more concurrency in the
>> geometry processing and database parts.
>
>
> Completely agree. Parsing the planet file *without* doing anything else
> takes around 800 s (less than 15 min), and copying it takes something like
> 200 s, so ideally we could gain a 4x speedup by using tricky things (AIO,
> look-ahead, threading...). That is simply negligible compared with the time
> needed to process OSM objects and invoke libpq, even when using the binary
> format and prepared statements.
>
> Maybe it could be of interest for some specific applications, but for
> integrating OSM data into a database there's no value in optimizing parsing,
> as the database workers are mostly the limiting factor.
>
> In my application, with 32 GB of memory (and 32 GB of swap), I need to pause
> the parser because the queue is full and I'm waiting for the database to
> process the bulk loading (without indexes).
>
> I've tried to optimize all stages as much as possible, even the parsing, by
> using a custom allocation system. I don't see the point in optimizing this
> part further, as the bottleneck is the database (and I don't want to rewrite
> PostgreSQL, which is very good software!).
>
> Best regards
>
> _______________________________________________
> dev mailing list
> dev@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/dev