Even Rouault wrote:

>> I made a rough parallelizing test by making 4 copies of finland.osm.pbf and
>> running ogr2ogr in four separate windows. This way the total CPU load of the
>> 8 cores was staying around 50%.
>> Result: All four conversions were ready after 3 minutes (45 seconds per
>> conversion) while a single conversion takes 2 minutes.
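For reference, the parallel test described above can be sketched in a few lines of Python. The copied file names and the SQLite output driver are assumptions for illustration, not necessarily what was actually used:

```python
import subprocess  # used by the commented-out launch step below

# Hypothetical input copies and output format; adjust to the real test setup.
inputs = [f"finland{i}.osm.pbf" for i in range(1, 5)]
commands = [
    ["ogr2ogr", "-f", "SQLite", src.replace(".osm.pbf", ".sqlite"), src]
    for src in inputs
]

# Launch all four conversions at once and wait for them to finish,
# mimicking the "four separate windows" test:
# procs = [subprocess.Popen(cmd) for cmd in commands]
# for p in procs:
#     p.wait()
```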
> In my opinion, "45 seconds per conversion" isn't really a good summary: I'd say
> that your computer could handle 4 conversions in parallel in 3 minutes. But the
> fact of running conversions in parallel didn't make them *individually* faster
> (that would be nonsense) than running a single one. We probably agree, that's
> just the way of presenting the info that is a bit strange.

Ok, let's use other units. Some suggestions:

- data processing rate as MB/sec or MB/minute (input file size in PBF format)
- node conversion rate as nodes/sec
- way or feature conversion rate as count/sec

None of them is a perfect speed unit. Nodes/sec feels most exact, but practical speed depends on the nature of the data, especially on the amount of relations and how complicated they are. Megabytes of PBF data per minute could be a rather good measure too. In my single-process vs. four-parallel-processes example the conversion rates were 60 MB/minute vs. 160 MB/minute, respectively. By looking at the file sizes at http://download.geofabrik.de/osm/europe/ one can quickly estimate that converting the 300 MB of data for Spain should take about 5 minutes. With parallel runs, Finland, Sweden and Norway would also be ready at the same time at no extra cost.

......

>> It may be difficult to feed the rendering chain from a bunch of source
>> databases, but it looks strongly like splitting Germany into four distinct
>> OSM source files would make it possible to import the whole country in 15
>> minutes with a good laptop.

> I still maintain that splitting a file is a non-trivial task. I strongly believe
> that to do so, you must import the whole country and do spatial requests
> afterwards. So, if the data producer doesn't do it for you, there's no point in
> doing it at your end. However, if you get it split, then it might indeed be
> beneficial to operate on smaller extracts. (With a risk of some duplicated
> and/or truncated and/or missing objects at the border of the tiles)

I agree.
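The back-of-the-envelope estimate above works out as follows, using the rates measured in this thread:

```python
# Measured aggregate rates from the finland.osm.pbf experiment
single_rate_mb_per_min = 60.0     # one ogr2ogr process
parallel_rate_mb_per_min = 160.0  # four processes combined

# Estimate for a ~300 MB extract (roughly the size of Spain on Geofabrik)
spain_mb = 300.0
print(spain_mb / single_rate_mb_per_min)  # -> 5.0 minutes for a single run

# Four parallel runs give roughly 2.7x the aggregate throughput,
# so the other extracts finish within the same wall-clock time.
speedup = parallel_rate_mb_per_min / single_rate_mb_per_min
print(speedup)
```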
Splitting OSM data files on the client side was my ancient idea from more than a week ago. It does not make sense nowadays. Data should come pre-split from the data producer. It would need some thinking about how to split the data so that there would not be trouble at the seams of the data sets. This GSoC project seems to aim at something similar:
http://wiki.openstreetmap.org/wiki/Google_Summer_of_Code/2012/Data_Tile_Service

-Jukka-

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
