Jon Burgess [mailto:[EMAIL PROTECTED] wrote:
>Sent: 22 December 2007 12:39 PM
>To: Andy Robinson (blackadder)
>Cc: 'osm-dev List'
>Subject: Re: [OSM-dev] Messing with planet
>
>On Sat, 2007-12-22 at 11:24 +0000, Andy Robinson (blackadder) wrote:
>> I tried to uncompress the latest planet last night (first time in quite a
>> while) which failed when the disk on my hand-me-down dev box was full :-(
>>
>> Would be helpful if those with planet experience could share your current
>> method of processing and analysing planet and the amount of disk and memory
>> you think is the current minimum.
>
>It depends, but in general I think you want your tools to be able to
>read a .bz2 or .gz file directly or be able to read from STDIN (so you
>can pipe from bzcat etc). When I was working a lot with the planet files
>I would convert the .bz2 to a .gz file since that is much faster to
>decompress (if you're going to be reading it a lot).
>
>The memory requirement entirely depends on how much state information
>you need to keep while processing the file. Don't expect to be able to
>keep the whole planet file in RAM unless you can devise a very compact
>representation of the data (even then you'll probably need >1GB of ram).
>
>> For instance, do all the tools require an
>> uncompressed planet (I'm thinking osmosis I guess now?).
>
>Osmosis will read .gz or .bz2 files directly:
> --read-xml compressionMethod=bzip2 ...
>
>> Also any guidance
>> on free HDD space size and memory settings when processing planet and also
>> if importing it into a blank mysql database. Is import of the whole planet
>> into the rails database scheme a workable option for an individual anymore?
>
>I have not looked at importing the planet into MySQL for a few months.
>Even then, it would take several hours to import into MySQL and a
>couple of GB of disk space. The planet file has grown to several times the
>size since then so I imagine it might easily take a day to import it now
>and over 10GB of disk space.
>
>I think you'd be much better off importing a subset, like the UK planet:
>http://nick.dev.openstreetmap.org/downloads/planet/
>
>
>> These are questions that were not really an issue 6 months ago when planet
>> was so much smaller. I appreciate once I have got back up to speed I could
>> be using the diff files to limit requirements.
>>
>> Specifically I wanted to be working on two aspects over the festive season.
>> 1. some evaluation of the rails stuff now that I have a working rails port
>> 2. some tag analysis and user stats stuff
>
>The user information is not present in the planet dumps so that may be
>tricky.
>
>My recommendation for any analysis would be to use a streaming solution,
>reading the file from STDIN.
>
>Alternatively osm2pgsql could be useful for you. It will import the
>planet file in a few hours*. The keys get mapped into columns in a
>couple of tables. Depending on what you need the existing tables may be
>OK for you, or you might need to adjust the list of exported tags in the
>source if you need more keys. If you are good with SQL then you can get
>all kinds of stats from the database tables,
> e.g. To retrieve the top 10 highway= values
>
>gis=> select highway,count(highway) as num from planet_osm_roads group by
>highway order by num desc limit 10;
>    highway     |  num
>----------------+--------
> secondary      | 323227
> primary        | 175034
> motorway_link  | 119377
> motorway       |  53166
> trunk          |  36471
> trunk_link     |  11810
> primary_link   |   8441
> secondary_link |    458
> residential    |    433
> unclassified   |    113
>(10 rows)
>
>This query took just a few seconds to run.
>
>Because the map features are stored in a spatial format it is also
>relatively easy to obtain geo-referenced results without needing to deal
>with the node+way hierarchy. For more details see:
>http://trac.openstreetmap.org/browser/applications/utils/export/osm2pgsql/README.txt
>
> Jon
>
>
>
>* The import of the planet on tile.openstreetmap.org this week took 8
>hours. The Postgres DB is 7GB, but the peak disk usage is probably in
>the region of 10 - 20GB during the import. Tile has 2GB of RAM and I'd
>recommend this as the useful minimum for the full planet import. 1GB is
>probably the real minimum but will need 1-2GB of swap space and will be
>slower.
>
> Jon
>
Jon,

Thanks for this. Most useful. Quite a bit of it is not evident on the wiki
(mainly because planet has grown so much so quickly I guess) so I'll add some
updates once I have any other responses in.

Cheers

Andy

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
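For anyone wanting to try the streaming approach suggested in the quoted message (read the .bz2/.gz directly or from STDIN, and keep as little state as possible), here is a minimal sketch in Python. It is only an illustration and not a tool mentioned in the thread: the function names open_planet and count_tag_values are made up for this example, and it assumes the planet is the usual OSM XML with <tag k="..." v="..."/> children under node/way/relation elements.

```python
import bz2
import gzip
import sys
import xml.etree.ElementTree as ET
from collections import Counter


def open_planet(path):
    """Open a planet file as a binary stream (hypothetical helper).

    "-" means STDIN, so the data can be piped in from bzcat or zcat.
    """
    if path == "-":
        return sys.stdin.buffer
    if path.endswith(".bz2"):
        return bz2.open(path, "rb")
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    return open(path, "rb")


def count_tag_values(stream, key):
    """Count the values seen for one tag key (hypothetical helper).

    Only the Counter is kept as state; parsed elements are discarded
    as soon as each node/way/relation has been read.
    """
    counts = Counter()
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <osm> root
    for event, elem in context:
        if event != "end":
            continue
        if elem.tag == "tag" and elem.get("k") == key:
            counts[elem.get("v")] += 1
        elif elem.tag in ("node", "way", "relation"):
            # Drop finished elements so memory use stays roughly flat.
            root.clear()
    return counts
```

Calling count_tag_values(open_planet("-"), "highway").most_common(10) on data piped in from bzcat would give a rough equivalent of the highway= query in the quoted message, though counted over all elements in the file rather than over the planet_osm_roads table, so the numbers would not match.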

