Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
Did you see the patch posted a few days ago about making creates into modifies? Otherwise I'll just commit it to SVN.

Have a nice day,

On Wed, Sep 3, 2008 at 7:00 PM, Michal Migurski [EMAIL PROTECTED] wrote:
> I'm revisiting the planet.osm stuff this week, from below.
>
> If the planet.osm file is from 2008-08-27, do I start running daily diffs at 20080827-20080828 or 20080828-20080829? I would assume the former, but I get duplicate key errors when I try:
>
> ERROR: duplicate key value violates unique constraint osm_bayarea_ways_pkey (7)
> Arguments were: 26580292, {26469086,11080906,165095606,11080816}, {ref,A232,highway,trunk,name,Croydon Road}, f,
> Error occurred, cleaning up
>
> Am I correct to go in this order?
>
> create planet-080827.osm.bz2
> ignore 20080827-20080828.osc.gz
> append 20080828-20080829.osc.gz
> append 20080829-20080830.osc.gz
> etc.
>
> -mike.
>
> On Aug 12, 2008, at 12:30 PM, Jon Burgess wrote:
>> On Sun, 2008-08-10 at 18:45 -0700, Michal Migurski wrote:
>>>>> So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?
>>>> Perhaps the slim mode is not taking the bounding box into account. I'll take a look.
>>> Any news?
>>
>> The news is mixed. The slim mode code does correctly exclude nodes outside of the bounding box when reading them in from the file. Unfortunately all the ways and relations still make it to the intermediate tables. It isn't until the code tries to extract the geometries from the ways that it can discover if the nodes for the way are outside the bounding box.
>>
>> It may be possible to improve this but it would need to make the assumption that all the nodes are in the file. I don't have time to look at this right now though.
>
> michal migurski- [EMAIL PROTECTED] 415.558.1610

--
Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
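A minimal Python sketch of the two ideas in the message above, not osm2pgsql's code. `daily_diff_names` only reproduces the diff file naming quoted in the thread; whether the first diff to apply starts on the planet's date or the day after is the open question, so the start day is a parameter. `apply_create` is a hypothetical helper showing one reading of "making creates into modifies": a create for an ID that already exists overwrites the row instead of failing with a duplicate key. The dict stands in for the ways table and is an assumption for illustration.

```python
from datetime import date, timedelta


def daily_diff_names(first_day, count):
    """Return `count` daily diff names; the first covers first_day to first_day + 1."""
    names = []
    for offset in range(count):
        start = first_day + timedelta(days=offset)
        end = start + timedelta(days=1)
        names.append(f"{start:%Y%m%d}-{end:%Y%m%d}.osc.gz")
    return names


def apply_create(table, way_id, row, creates_as_modifies=False):
    """Insert a row keyed by way_id.

    With creates_as_modifies=False a repeated ID fails, like the duplicate
    key error in the thread. With True, the create is treated as a modify.
    """
    if way_id in table and not creates_as_modifies:
        raise KeyError(f"duplicate key {way_id}")
    table[way_id] = row


# Planet dump dated 2008-08-27; skip the first diff, as in the order proposed above.
planet_day = date(2008, 8, 27)
for name in daily_diff_names(planet_day + timedelta(days=1), 2):
    print("append", name)

# A way already loaded from the planet dump shows up again as a "create" in a diff.
ways = {26580292: {"name": "Croydon Road"}}
apply_create(ways, 26580292, {"name": "Croydon Road", "ref": "A232"},
             creates_as_modifies=True)
```

Treating creates as modifies makes an overlapping diff harmless to replay, at the cost of no longer detecting genuinely unexpected duplicates.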
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
On Sun, 2008-08-10 at 18:45 -0700, Michal Migurski wrote:
>>> So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?
>> Perhaps the slim mode is not taking the bounding box into account. I'll take a look.
> Any news?

The news is mixed. The slim mode code does correctly exclude nodes outside of the bounding box when reading them in from the file. Unfortunately all the ways and relations still make it to the intermediate tables. It isn't until the code tries to extract the geometries from the ways that it can discover if the nodes for the way are outside the bounding box.

It may be possible to improve this but it would need to make the assumption that all the nodes are in the file. I don't have time to look at this right now though.

Jon
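The behaviour described above can be sketched in Python as three passes. This is an illustration of the explanation only, not osm2pgsql's implementation; the function names, the dict-based "tables", and the sample coordinates are all assumptions.

```python
def in_bbox(lon, lat, bbox):
    """bbox is (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def slim_import(nodes, ways, bbox):
    """nodes: {node_id: (lon, lat)}, ways: {way_id: [node_id, ...]}."""
    # Pass 1: nodes outside the box are dropped as they are read.
    kept_nodes = {nid: pos for nid, pos in nodes.items() if in_bbox(*pos, bbox)}

    # Pass 2: every way is stored in the intermediate table, because a way
    # carries only node IDs and nothing about where those nodes lie.
    intermediate_ways = dict(ways)

    # Pass 3: only when geometries are built does it become clear which
    # ways have nodes inside the box.
    geometries = {}
    for way_id, refs in intermediate_ways.items():
        coords = [kept_nodes[r] for r in refs if r in kept_nodes]
        if len(coords) >= 2:
            geometries[way_id] = coords
    return kept_nodes, intermediate_ways, geometries


# Hypothetical data: one way near San Francisco, one far outside the box.
nodes = {1: (-122.4, 37.8), 2: (-122.3, 37.7), 3: (0.0, 51.5), 4: (0.1, 51.4)}
ways = {10: [1, 2], 20: [3, 4]}
bay_area = (-123.0, 37.0, -121.0, 38.5)

kept, stored, geoms = slim_import(nodes, ways, bay_area)
print(len(stored), "ways stored,", len(geoms), "geometries built")
```

Dropping way 20 during pass 2 would require treating "node not seen" as "node outside the box", which is the assumption Jon mentions: that all the nodes are in the file.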
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
On Sun, 2008-08-10 at 18:45 -0700, Michal Migurski wrote:
>>> So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?

Your best bet would probably be to start with a pre-filtered planet extract like the one for California here:

http://downloads.cloudmade.com/north_america/united_states/california

Jon
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
> Slim mode over the full planet won't work without the intarray module, but it's not included in contrib/ for postgresql-8.3 on Debian Lenny. Where should I be looking for this?

If it's anything like Ubuntu then it is included -- but freakily it's called _int rather than intarray. So see if you can find that instead.

Dave
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
>> So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?
> Perhaps the slim mode is not taking the bounding box into account. I'll take a look.

Any news?

>> Probably the right thing to do would be to get the import done once with a larger volume available to Postgres (EC2 does give you a secondary disk at /mnt that's over 100GB), then keep up with incrementals moving forward after the initial inconvenience.
> 100GB would definitely be enough. I've seen the full slim-mode planet import taking around 40GB. This code to handle the diff mode import is all very new and does not scale up to handling the whole planet yet.

So I'm running into further frustrations here. Slim mode over the full planet won't work without the intarray module, but it's not included in contrib/ for postgresql-8.3 on Debian Lenny. Where should I be looking for this?

-mike.
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
2008/8/5 Michal Migurski [EMAIL PROTECTED]:
>> Another way to save more disk space is to filter out the data you don't require, either by applying a bounding box or by removing items from the default.style.
> So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?

Perhaps the slim mode is not taking the bounding box into account. I'll take a look.

> Probably the right thing to do would be to get the import done once with a larger volume available to Postgres (EC2 does give you a secondary disk at /mnt that's over 100GB), then keep up with incrementals moving forward after the initial inconvenience.

100GB would definitely be enough. I've seen the full slim-mode planet import taking around 40GB. This code to handle the diff mode import is all very new and does not scale up to handling the whole planet yet.

--
Jon
[OSM-talk] Bug in using osm2pgsql to keep up with dailies
Nice, works very well. One hiccup I see is that if I run the executable from a directory other than the one where it was built, it complains that default.style can't be found. Otherwise works beautifully.

So, two frustrating things about osm2pgsql's --slim mode:

I first tried doing the whole planet.osm without --slim, which worked well. However, when I then used the --slim option to catch up on dailies, I found that a number of tables (prefix_nodes, prefix_ways, etc.) hadn't been created. It was not possible to do the dailies unless they had been planned for from the start.

The second thing is that upon going back to the original planet.osm with the much-slower --slim mode turned on, it required so much disk space that it maxed out an EC2 standard disk image.

It would be nice if it were possible to do the initial planet.osm import without --slim for speed and space, and still import subsequent diffs.

-mike.
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
On Mon, 2008-08-04 at 14:51 -0700, Michal Migurski wrote:
> Nice, works very well. One hiccup I see is that if I run the executable from a directory other than the one where it was built, it complains that default.style can't be found. Otherwise works beautifully.
>
> So, two frustrating things about osm2pgsql's --slim mode:
>
> I first tried doing the whole planet.osm without --slim, which worked well. However, when I then used the --slim option to catch up on dailies, I found that a number of tables (prefix_nodes, prefix_ways, etc.) hadn't been created. It was not possible to do the dailies unless they had been planned for from the start.
>
> The second thing is that upon going back to the original planet.osm with the much-slower --slim mode turned on, it required so much disk space that it maxed out an EC2 standard disk image.
>
> It would be nice if it were possible to do the initial planet.osm import without --slim for speed and space, and still import subsequent diffs.

I'm afraid that is not possible. The conversion from OSM to postgres is lossy. It converts all the node references on the ways into linestring geometries referencing the individual lat/lon of the nodes without any reference to the IDs. This makes it impossible to update this data without storing a copy of all the raw nodes and ways in the extra tables generated by the slim-mode import.

An alternative way to do this is to use osmosis to update the planet file with the daily diff and then reload this into postgres. Unfortunately this may take too long to be a practical solution.

Another way to save more disk space is to filter out the data you don't require, either by applying a bounding box or by removing items from the default.style.

Jon
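A small Python sketch of why the conversion is lossy, under assumed data. `way_to_linestring` is a hypothetical helper, not an osm2pgsql function, and the node IDs and coordinates are made up. It builds WKT text from a way's node references; the node IDs do not appear in the result, so a later diff that moves a node cannot be matched against the stored geometry unless the raw node and way data were kept, as the slim-mode tables do.

```python
def way_to_linestring(way_refs, nodes):
    """Build a WKT LINESTRING from node references; the IDs are not kept."""
    coords = ", ".join(f"{nodes[ref][0]} {nodes[ref][1]}" for ref in way_refs)
    return f"LINESTRING({coords})"


# Hypothetical raw data: node ID -> (lon, lat), and one way listing node IDs.
nodes = {101: (-122.42, 37.77), 102: (-122.41, 37.78)}
way_refs = [101, 102]

wkt = way_to_linestring(way_refs, nodes)
print(wkt)

# A daily diff moves node 101. The geometry alone does not say which
# vertex came from node 101, so it cannot be patched in place.
nodes[101] = (-122.43, 37.76)

# With the raw refs and nodes still available (the role of the slim-mode
# tables), the geometry can simply be rebuilt.
rebuilt = way_to_linestring(way_refs, nodes)
print(rebuilt)
```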
Re: [OSM-talk] Bug in using osm2pgsql to keep up with dailies
>> It would be nice if it were possible to do the initial planet.osm import without --slim for speed and space, and still import subsequent diffs.
> I'm afraid that is not possible. The conversion from OSM to postgres is lossy. It converts all the node references on the ways into linestring geometries referencing the individual lat/lon of the nodes without any reference to the IDs. This makes it impossible to update this data without storing a copy of all the raw nodes and ways in the extra tables generated by the slim-mode import.

Gotcha.

> An alternative way to do this is to use osmosis to update the planet file with the daily diff and then reload this into postgres. Unfortunately this may take too long to be a practical solution.
>
> Another way to save more disk space is to filter out the data you don't require, either by applying a bounding box or by removing items from the default.style.

So I'm definitely doing the bbox thing - I ran out of space on the volume when doing a slim import of planet.osm with a box that covered only the extended SF Bay Area. Seems like that should be fairly reasonable, right?

Probably the right thing to do would be to get the import done once with a larger volume available to Postgres (EC2 does give you a secondary disk at /mnt that's over 100GB), then keep up with incrementals moving forward after the initial inconvenience.

-mike.