> I made a rough parallelizing test by making 4 copies of finland.osm.pbf and
> running ogr2ogr in four separate windows. This way the total CPU load of the
> 8 cores was staying around 50%.
> Result: All four conversions were ready after 3 minutes (45 seconds per
> conversion) while a single conversion takes 2 minutes.
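
To keep the figures straight, here is the arithmetic behind the numbers quoted above, as a small sketch. The inputs come from the quoted test; the variable names are only illustrative.

```python
# Figures taken from the quoted test; names are illustrative only.
n_jobs = 4                # conversions started at the same time
single_run_min = 2.0      # one conversion running alone
parallel_wall_min = 3.0   # wall-clock time until all four were done

# Running the four conversions one after another.
serial_total_min = n_jobs * single_run_min

# Throughput gain of the parallel run over the serial run.
speedup = serial_total_min / parallel_wall_min

# Wall-clock time "per file" when the 3 minutes are spread over 4 files.
amortized_s = parallel_wall_min * 60 / n_jobs

# How much longer each individual conversion took than when run alone.
slowdown = parallel_wall_min / single_run_min

print(f"serial total:      {serial_total_min:.0f} min")
print(f"throughput gain:   {speedup:.2f}x")
print(f"amortized per job: {amortized_s:.0f} s")
print(f"per-job slowdown:  {slowdown:.1f}x")
```

So the 45 seconds is an amortized figure: throughput improves by roughly 2.7x, while each individual conversion takes 1.5 times as long as it does alone.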

In my opinion, "45 seconds per conversion" isn't really a good summary: I'd
say that your computer could handle 4 conversions in parallel in 3 minutes.
But running the conversions in parallel didn't make them *individually* faster
than running a single one (that would be nonsense). We probably agree; it's
just the way of presenting the info that is a bit strange.

> Conclusion: 4 parallel conversions in 3 minutes vs. within 8 minutes if
> performed as serial runs is much faster. 50% CPU load may tell that the speed
> of SATA disk is the limiting factor now. Test with SSD drive should give
> more information about this.

Yes, at some point the disk is the limiting factor, whatever the number of
CPUs you have.

> Somehow it feels like the laptop has only 4 real processors/cores
> even the resource manager is showing eight.

I've not followed what the current CPU state of the art is, but perhaps it is
a quad-core with hyper-threading? The hyper-threaded virtual cores wouldn't be
as efficient as normal cores.

> I believe that by parallelizing the conversion program it is hard to take the
> juice as effectively from all the cores.

Yes, if you parallelize I/O operations, there's a risk that it actually makes
things slower. Only the CPU-intensive operations should be parallelized, to
limit that risk. But when reading OSM data, there isn't that much computation
involved. Way resolving is rather dumb work and mostly about I/O after all.
Only the resolving of multipolygons might involve CPU-intensive operations, to
compute the spatial relation between rings, but that's a tiny amount of the
data of an OSM file, and even if it is slow, it is perhaps 10 or 20% of the
overall conversion time.

> It may be difficult to feed rendering chain by having a bunch of source
> databases but it looks strongly that by splitting Germany into four distinct
> OSM source files it would be possible to import the whole country in 15
> minutes with a good laptop.
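
Running several conversions side by side, as in the test quoted earlier, is at
least easy to script. A minimal sketch, assuming ogr2ogr is on the PATH and that
SQLite output is wanted (the test doesn't say which output format was used);
build_commands and run_parallel are made-up helper names, not part of GDAL:

```python
import subprocess
import sys
from pathlib import Path


def build_commands(inputs, out_dir):
    """Return one ogr2ogr command line per input file.

    Assumption: SQLite output, written to <out_dir>/<input name>.sqlite.
    ogr2ogr takes the destination first, then the source.
    """
    out_dir = Path(out_dir)
    commands = []
    for src in inputs:
        dst = out_dir / (Path(src).name + ".sqlite")
        commands.append(["ogr2ogr", "-f", "SQLite", str(dst), str(src)])
    return commands


def run_parallel(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

With hypothetical file names, run_parallel(build_commands(["part1.osm.pbf",
"part2.osm.pbf"], ".")) would start both conversions together and return their
exit codes. This only covers launching the jobs; it says nothing about how the
input files were produced.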

I still maintain that splitting a file is a non-trivial task. I strongly
believe that to do so, you must import the whole country and run spatial
queries afterwards. So, if the data producer doesn't do it for you, there's no
point in doing it at your end. However, if you get it already split, then it
might indeed be beneficial to operate on smaller extracts (with a risk of some
duplicated, truncated and/or missing objects at the borders of the tiles).

_______________________________________________
gdal-dev mailing list
http://lists.osgeo.org/mailman/listinfo/gdal-dev
