Hartmut Holzgraefe wrote:
[...]

For the sorting code to perform well, work_mem is key, while
the index recreation step needs maintenance_work_mem.
Currently the wiki guide on pgsql configuration suggests static
values for both (and until recently did not mention work_mem at all).
There's also an issue with osm2pgsql not really returning the
cache memory to the operating system due to heap fragmentation.

Better results for large imports (table sizes much larger than
RAM size) could probably be achieved by:

* making sure osm2pgsql properly returns the memory used for its
   cache to the operating system; for this I've got a working patch:

    https://github.com/hholzgra/osm2pgsql/tree/freeable_cache

OK, I have committed your patch and extended it to have a command line option to fall back to the old behavior.

On Linux, allocating the node cache as one large chunk works well: internally, Linux overcommits memory and only allocates physical RAM for those pages that are actually written to. So you can still specify a large cache value and use only as much physical memory as is needed to cache all the nodes. As it is hard to guess how much cache one needs, particularly for diff imports, this is pretty helpful for not wasting memory.

Other operating systems (Solaris? Mac OS X? Windows?) might behave differently and actually reserve the full amount of memory, in which case one might want to fall back to the old behavior, at least for diff imports.


* serializing the index creation and clustering steps

   Running these in parallel makes sense when everything fits
   into caches in RAM so that things become CPU-bound, but on large
   imports things will be I/O-bound anyway, and parallel index
   builds just lead to lower cache hit rates, which causes even
   more I/O load.

I have committed this patch too, although I have changed it so that the default remains to do the indexing in parallel, and the command line switch changes the behavior to serial indexing.

In my preliminary benchmarks, doing the indexing in parallel was slightly faster than doing it one table at a time. However, I only tried it on small planet extracts (about 100 MB for the osm.pbf file). I also didn't play around with the PostgreSQL settings of work_mem and maintenance_work_mem, which is potentially where the benefit of doing things sequentially comes from, by being able to set those values higher.




* start with low default work_mem and maintenance_work_mem settings
   and raise them on a per-statement level, making the appropriate
   buffer for a given operation (work_mem for ORDER BY,
   maintenance_work_mem for CREATE INDEX) as large as possible,
   then shrink it back to its default size afterwards

Is it possible to adjust work_mem and maintenance_work_mem at run time?
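
If it is, something along these lines is what I have in mind; SET LOCAL
reverts to the previous value at commit, so nothing has to be reset by
hand (table and index names are just placeholders):

    -- raise the maintenance buffer only for the index build
    BEGIN;
    SET LOCAL maintenance_work_mem = '1GB';
    CREATE INDEX planet_osm_line_way_idx
        ON planet_osm_line USING gist (way);
    COMMIT;

    -- same idea for the sort step, which uses work_mem
    BEGIN;
    SET LOCAL work_mem = '512MB';
    CREATE TABLE planet_osm_line_sorted AS
        SELECT * FROM planet_osm_line ORDER BY way;
    COMMIT;

Plain SET (without LOCAL) would also work, but then the new value stays
in effect for the rest of the session.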


Whether CLUSTER or the current approach is better/faster for our
imports needs to be benchmarked; my personal bet would be that
CLUSTER wins, as our data distribution over time is not totally
random, but this really needs to be tested out.

I haven't yet benchmarked the difference between CLUSTER and the current sorting-based method.
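
Just so we are comparing the same thing, here is roughly how I picture
the two alternatives (again with placeholder table and column names):

    -- current approach: rewrite the table in sorted order, then index it
    CREATE TABLE planet_osm_line_sorted AS
        SELECT * FROM planet_osm_line ORDER BY way;
    DROP TABLE planet_osm_line;
    ALTER TABLE planet_osm_line_sorted RENAME TO planet_osm_line;
    CREATE INDEX planet_osm_line_way_idx
        ON planet_osm_line USING gist (way);

    -- CLUSTER-based alternative: build the index first, then let
    -- PostgreSQL physically reorder the table by that index
    CREATE INDEX planet_osm_line_way_idx
        ON planet_osm_line USING gist (way);
    CLUSTER planet_osm_line USING planet_osm_line_way_idx;

In both cases the index build could get a temporarily raised
maintenance_work_mem as discussed above.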

Has anyone else got numbers on this?

Kai


(one additional advantage of CLUSTER would be that peak disk
space usage during the operation would only be about two times
the data size instead of three times with the current approach)


