Re: [OSM-dev] Deleting TIGER node tags
> in a 12-hour editing session, a very simple analysis would suggest about a 0.5% chance of conflicts. this isn't very much, imho
>
> cheers,
> matt

It has only been running for a week and I already have my first conflict, and that was in an editing session of at most 2 hours. The latest JOSM resolved all the conflicts, so it was no big deal, but it is a real possibility.

___ dev mailing list dev@openstreetmap.org http://lists.openstreetmap.org/listinfo/dev
Re: [OSM-dev] Deleting TIGER node tags
On Sat, 08 Aug 2009 06:03:37 +0300, Eugene Alvin Villar sea...@gmail.com wrote:
> On Wed, Jul 22, 2009 at 6:44 PM, John Smith delta_foxt...@yahoo.com wrote:
>> However it'd be nice for editors to strip it out automatically.
>
> -1
>
> I don't favor editors stripping out created_by=* tags automatically whenever a user edits an area. These changes mask the actual edits the user makes and make analyzing changesets harder. (Imagine looking at a changeset where hundreds of ways were edited by stripping out the created_by=* tags but only one way had a significant change, e.g. adding oneway=yes.) What I have been doing is to strip these created_by=* tags in an explicit separate changeset and add the tag comment=cleaned up created_by=* tags.

JOSM already strips the created_by tags automatically when uploading, but only from those objects that the user has modified. It doesn't touch any other objects that the user has downloaded and that contain a created_by tag, so there will be no spam. This way the created_by tags will gradually disappear as the data is edited.
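The behaviour described here (created_by is dropped only from objects the user has actually modified) can be sketched roughly as below. This is an illustration in Python, not JOSM's code: the `objects` list and its `id`, `modified` and `tags` keys are a hypothetical data model made up for the example.

```python
from copy import deepcopy


def strip_created_by_on_upload(objects):
    """Return the objects to upload, with created_by removed.

    `objects` is a list of dicts with the hypothetical keys 'id',
    'modified' (bool) and 'tags' (dict). Only modified objects are
    uploaded, so only they lose their created_by tag.
    """
    to_upload = []
    for obj in objects:
        if not obj["modified"]:
            # Downloaded but untouched: not uploaded, tags stay as they are.
            continue
        cleaned = deepcopy(obj)  # leave the local copy unchanged
        cleaned["tags"].pop("created_by", None)
        to_upload.append(cleaned)
    return to_upload


downloaded = [
    {"id": 1, "modified": True,
     "tags": {"highway": "residential", "created_by": "JOSM"}},
    {"id": 2, "modified": False,
     "tags": {"created_by": "Potlatch 0.10"}},
]
upload = strip_created_by_on_upload(downloaded)
```

Because untouched objects never enter the upload, a changeset built this way contains no edits whose only purpose is to remove the tag, which is the "no spam" property mentioned above.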
Re: [OSM-dev] Deleting TIGER node tags
On Sat, Aug 8, 2009 at 11:32 AM, Teemu Koskinen teemu.koski...@mbnet.fi wrote:
> On Sat, 08 Aug 2009 06:03:37 +0300, Eugene Alvin Villar sea...@gmail.com wrote:
>> On Wed, Jul 22, 2009 at 6:44 PM, John Smith delta_foxt...@yahoo.com wrote:
>>> However it'd be nice for editors to strip it out automatically.
>>
>> -1
>>
>> I don't favor editors stripping out created_by=* tags automatically whenever a user edits an area. These changes mask the actual edits the user makes and make analyzing changesets harder. (Imagine looking at a changeset where hundreds of ways were edited by stripping out the created_by=* tags but only one way had a significant change, e.g. adding oneway=yes.) What I have been doing is to strip these created_by=* tags in an explicit separate changeset and add the tag comment=cleaned up created_by=* tags.
>
> JOSM already strips the created_by tags automatically when uploading, but only from those objects that the user has modified. It doesn't touch any other objects that the user has downloaded and that contain a created_by tag, so there will be no spam. This way the created_by tags will gradually disappear as the data is edited.

Ah, I see. This makes sense. Thanks for clarifying! (Now if only Potlatch and Merkaartor would do the same...)
Re: [OSM-dev] Deleting TIGER node tags
--- On Fri, 31/7/09, Andy Allan gravityst...@gmail.com wrote:
>> Each diff upload would take about 100 seconds, each changeset would take about 40 minutes, we'd be doing about 30-35 changesets per day and finish the thing after about 100 days (some time in November if we start soon).
>
> Gogogogogogogogo For It! IMHO, IANAOSMAdmin etc.

I wonder if, during that time, both "all" and "non-US" changesets could be published, so that only those who wanted the US data would get the excess information?
Re: [OSM-dev] Deleting TIGER node tags
Wait a second, this will create a lot of editing conflicts when editing in JOSM. I'm not sure how Potlatch handles it in save mode. JOSM still has some bugs in its conflict resolution. Can we delay until JOSM has fully mastered conflict resolution? One bug was fixed today, and I filed a new ticket today, #3141. Someone with better Potlatch knowledge should comment. I can run some tests on the weekend. Otherwise mappers will hate you if they lose hours of edits because of conflicts.

On Jul 31, 2009, at 6:19 AM, Andy Allan wrote:
> On Thu, Jul 30, 2009 at 11:01 AM, Frederik Ramm frede...@remote.org wrote:
>> Each diff upload would take about 100 seconds, each changeset would take about 40 minutes, we'd be doing about 30-35 changesets per day and finish the thing after about 100 days (some time in November if we start soon).
>
> Gogogogogogogogo For It! IMHO, IANAOSMAdmin etc.
>
> Cheers,
> Andy
Re: [OSM-dev] Deleting TIGER node tags
Matt Amos wrote:
> in a 12-hour editing session, a very simple analysis would suggest about a 0.5% chance of conflicts. this isn't very much, imho

And then only if you're editing the USA data.

-- 
Lennard
Re: [OSM-dev] Deleting TIGER node tags
Hi Frederik,

When a way is deleted in JOSM, its nodes that carry tags remain in the db. This causes tons of orphans. Is it possible to delete these nodes in the same cleanup? After deleting the TIGER tags it will be difficult to identify these useless nodes without a full history lookup. The check should be:
- no other tags remain on the node
- no way or relation uses this node

apo

On Jul 30, 2009, at 3:01 AM, Frederik Ramm wrote:
> Hi,
>
> Frederik Ramm wrote:
>> I just did a little test, prepared an .osc document that removed the node tags from about 1000 nodes: http://www.openstreetmap.org/browse/changeset/1894387 It came out at roughly 10 node changes per second.
>
> Some more tests made directly from the dev server suggest that performance is around 20 changes per second, slightly deteriorating if you upload too many changes in one diff upload (the peak performance seems to be at around 1k-2k changes per diff upload). Anything larger than 10k changes per diff upload is not feasible (you get into a territory where you have to manually increase default timeouts and all that), and also takes performance down into the 10-15 changes per second range PLUS increases the probability of having edit conflicts.
>
> If we wanted to do this cleanup through normal API requests, the best way thus seems to be dividing the data into roughly 88k batches of 2k edits each and uploading them as diff uploads; possibly grouping them in changesets of up to 25 batches each (=50k edits), which would result in roughly 3500 changesets.
>
> Each diff upload would take about 100 seconds, each changeset would take about 40 minutes, we'd be doing about 30-35 changesets per day and finish the thing after about 100 days (some time in November if we start soon).
>
> An average day in OSM currently has roughly 150k node modifications. For the 100 days of this operation, this would increase to 1.5m node modifications (factor 10).
>
> An average daily OSM diff currently has roughly 200 MB uncompressed (some days it's 100 MB, some days it's 400 MB). For the 100 days of this operation, daily diffs would be approximately 150 MB larger, increasing the strain on downstream systems by roughly 75%.
>
> I have not done any osm2pgsql testing. If it is clever then it will detect that no geometry change has been effected by the node modification and the additional cost would mainly result from having to parse 75% more node updates. If however it automatically re-calculates the geometry of every way that contains a modified node, then it is likely that any osm2pgsql based sites running incremental updates would take anywhere between 2 and 10 times as long to process updates during the 100 days of this operation.
>
> Everything said here is of course highly speculative and based on the haphazard assumption that our systems always perform roughly as they did when I did my tests.
>
> Bye
> Frederik
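The batch arithmetic in Frederik's estimate can be reproduced with a few lines of Python. The inputs are the figures quoted in the thread (2k edits per diff upload, 25 uploads per changeset, about 20 changes per second); the total of 177 million nodes is taken from his follow-up message. The sketch assumes uploads run back to back around the clock, which the thread does not state explicitly.

```python
import math

# Figures quoted in the thread.
TOTAL_NODES = 177_000_000      # nodes carrying TIGER tags
EDITS_PER_BATCH = 2_000        # one diff upload
BATCHES_PER_CHANGESET = 25     # 25 * 2k = 50k edits per changeset
CHANGES_PER_SECOND = 20        # throughput measured from the dev server


def cleanup_estimate(total_nodes=TOTAL_NODES,
                     edits_per_batch=EDITS_PER_BATCH,
                     batches_per_changeset=BATCHES_PER_CHANGESET,
                     changes_per_second=CHANGES_PER_SECOND):
    """Estimate the size and duration of the cleanup.

    Assumes uploads run continuously, 24 hours a day.
    """
    batches = math.ceil(total_nodes / edits_per_batch)
    changesets = math.ceil(batches / batches_per_changeset)
    seconds_per_batch = edits_per_batch / changes_per_second
    minutes_per_changeset = seconds_per_batch * batches_per_changeset / 60
    total_days = batches * seconds_per_batch / 86_400
    return {
        "batches": batches,
        "changesets": changesets,
        "seconds_per_batch": seconds_per_batch,
        "minutes_per_changeset": minutes_per_changeset,
        "total_days": total_days,
        "changesets_per_day": changesets / total_days,
    }


estimate = cleanup_estimate()
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs the result is 88,500 batches and 3,540 changesets, about 42 minutes per changeset and a little over 100 days in total, in line with the rounded numbers in the message.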
Re: [OSM-dev] Deleting TIGER node tags
Hi,

Apollinaris Schoell wrote:
> when a way is deleted in JOSM the nodes with tags remain in the db. this causes tons of orphans. is it possible to delete these nodes in the same cleanup.

It would certainly be preferable, *if* these nodes are really useless, to remove them instead of burdening the database with an unnecessary update. My gut feeling is that only a small portion of the 177m nodes in question fall into this useless category, but I will have them counted.

Bye
Frederik

-- 
Frederik Ramm ## eMail frede...@remote.org ## N49°00'09" E008°23'33"
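The two-part check Apollinaris proposed (no other tags remain on the node, and no way or relation uses it) could be counted along the following lines. This is only a sketch: the input structures are hypothetical, and the rule for recognising TIGER import tags (a `tiger:` key prefix, or a `source` value starting with `tiger_import`) is an assumption based on the example tags quoted in this thread.

```python
def is_tiger_import_tag(key, value):
    """Assumed rule for what counts as a TIGER import tag on a node."""
    return key.startswith("tiger:") or (
        key == "source" and value.startswith("tiger_import"))


def find_orphan_candidates(node_tags, referenced_node_ids):
    """Return ids of nodes that would be useless after the tag cleanup.

    node_tags: dict mapping node id -> dict of tags (hypothetical input).
    referenced_node_ids: set of node ids used by any way or relation.
    """
    orphans = []
    for node_id, tags in node_tags.items():
        if node_id in referenced_node_ids:
            continue  # still part of a way or relation
        if not tags:
            continue  # not one of the TIGER-tagged nodes in question
        if all(is_tiger_import_tag(k, v) for k, v in tags.items()):
            orphans.append(node_id)  # nothing but TIGER tags would remain
    return orphans


tiger = {"source": "tiger_import_dch_v0.6_20070813",
         "tiger:county": "St. Louis, MO"}
nodes = {
    1: dict(tiger),                           # used by a way
    2: dict(tiger),                           # unused, TIGER tags only
    3: dict(tiger, highway="traffic_signals"),  # unused, but has a real tag
    4: {},                                    # unused and untagged
}
candidates = find_orphan_candidates(nodes, referenced_node_ids={1})
```

Running the check before the tags are removed avoids the full history lookup mentioned above, since the TIGER tags are still there to identify the nodes.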
Re: [OSM-dev] Deleting TIGER node tags
--- On Wed, 22/7/09, Marc Schütz schue...@gmx.net wrote:
> Should the editors be changed to automatically remove created_by? Or maybe just well-known values like /JOSM.*/, /Potlatch.*/?

Right, wrong or indifferent, I've been removing them from things I edit, as I see them as a waste: the tag only tells you what created the object, not which editor last edited it, so it does nothing to help track down most editing bugs in apps, except perhaps on new nodes/ways/etc.
Re: [OSM-dev] Deleting TIGER node tags
On Wed, Jul 22, 2009 at 10:46 AM, John Smith delta_foxt...@yahoo.com wrote:
> --- On Wed, 22/7/09, Marc Schütz schue...@gmx.net wrote:
>> Should the editors be changed to automatically remove created_by? Or maybe just well-known values like /JOSM.*/, /Potlatch.*/?
>
> Right, wrong or indiff I've been removing them on things I edit as I see them as a waste, since it only tells you what created them, not the last editor that edited them and so does nothing to figure out most editing bugs in apps, except perhaps on new nodes/ways/etc.

Well, that wasn't necessarily true. Potlatch always updated it, for instance. But anyway, all the editors should be adding the created_by tag to the changeset now instead, so no new ones are being created. I remove them if I'm editing something anyway and I can be bothered, but there's not much point in going out of your way.

Dave
Re: [OSM-dev] Deleting TIGER node tags
On 22 Jul 2009, at 10:39, Marc Schütz wrote:
>>> what about created tags? they are pretty much useless too. 2G difference on the uncompressed planet file.
>>
>> Now that the editors are ignoring the created_by tag when creating and changing objects, the tag will slowly decrease over time. There is no point in creating a huge number of changesets and downstream activity for that, when it will naturally occur without any manual editing by the user.
>
> Should the editors be changed to automatically remove created_by? Or maybe just well-known values like /JOSM.*/, /Potlatch.*/?

The created_by tag should now be on the changeset, i.e. one level up. JOSM recently (1735 1720) implemented a change where all the created_by tags are dropped. That was the original intention when changesets were introduced. You can use the history to get the old versions of the created_by tag prior to the changeset introduction.

Shaun
Re: [OSM-dev] Deleting TIGER node tags
Russ Nelson wrote:
> Where the TIGER data is in good shape, they do line up exactly. I can point you to some examples. Can you point me to some examples where they don't line up?

I think practically along every county border I have followed in northern California. I have joined up hundreds of roads.

Some still unfixed examples in NorCal:
http://www.openstreetmap.org/?lat=37.5677&lon=-121.5432&zoom=14&layers=B000FTF
http://www.openstreetmap.org/?lat=38.6975&lon=-122.50186&zoom=17&layers=B000FTF
http://www.openstreetmap.org/?lat=38.853751&lon=-122.401245&zoom=18&layers=B000FTF
http://www.openstreetmap.org/?lat=38.080017&lon=-120.927692&zoom=18&layers=B000FTF
http://www.openstreetmap.org/?lat=37.55852&lon=-120.57501&zoom=16&layers=B000FTF
http://www.openstreetmap.org/?lat=37.4335&lon=-120.89862&zoom=17&layers=B000FTF

One from NM:
http://www.openstreetmap.org/?lat=35.18295&lon=-103.04354&zoom=17&layers=B000FTF

Washington:
http://www.openstreetmap.org/?lat=47.22859&lon=-123.48972&zoom=16&layers=B000FTF

Idaho:
http://www.openstreetmap.org/?lat=47.35137&lon=-116.88245&zoom=16&layers=B000FTF

Oregon/Nevada:
http://www.openstreetmap.org/?lat=41.99973&lon=-118.11559&zoom=16&layers=B000FTF

The worst so far was the border between Santa Cruz and Santa Clara counties along Summit rd/Skyline dr. Think sausage lakes...

Probably much of this is just the generally uneven quality of TIGER data. Cities/counties/states clearly have differing standards as to precision, what they consider road enough to report, how to report freeways, and how often they update; some may just not be trying very hard. I have found several housing areas with the road grid reported at double or half scale compared to reality. Maybe somebody did it on graph paper...

To be fair, along most boundaries, most road ends are in the same location, so the import could have connected them; but it didn't, and so the routing still breaks.

/Stellan
Re: [OSM-dev] Deleting TIGER node tags
This is all redundant information. Be bold, delete it.

80n

On Fri, Jun 26, 2009 at 5:59 PM, Andy Allan gravityst...@gmail.com wrote:
> Hello Devs,
>
> source = tiger_import_dch_v0.6_20070813
> tiger:county = St. Louis, MO
> tiger:tlid = 100111260:100111261:10055:10059
> tiger:upload_uuid = bulk_upload.pl-6143e1a9-589d-43a0-9248-e95658773ef4
>
> Stuff like this appears on every node in the US, which is a pain. I reckon it's all pointless, since all that info is on the ways in the first place, and it's worth deleting them.
>
> Here's some numbers from Matt to consider:
>
> Tiger node tags make up 85.43% of all node tags and take up:
> * 12.97% of the bzipped planet size (805 MB).
> * 34.68% of the uncompressed planet size.
> * 42.20% of the lines in the planet.
> * 31.51% of the parsing time of the planet (based on xmllint --stream).
>
> So, can anyone think of a good reason to keep them? Should we just delete tags like these? I'd love to hear if anyone thinks we should keep them (bearing in mind all the info, including ids, would remain on the ways in any case).
>
> http://wiki.openstreetmap.org/wiki/TIGER_fixup/node_tags
>
> Cheers,
> Andy
Re: [OSM-dev] Deleting TIGER node tags
2009/6/27 Minh Nguyen m...@zoomtown.com:
> On 6/26/09 12:59 PM, Andy Allan wrote:
>> Hello Devs,
>>
>> source = tiger_import_dch_v0.6_20070813
>> tiger:county = St. Louis, MO
>> tiger:tlid = 100111260:100111261:10055:10059
>> tiger:upload_uuid = bulk_upload.pl-6143e1a9-589d-43a0-9248-e95658773ef4
>>
>> Stuff like this appears on every node in the US, which is a pain. I reckon it's all pointless, since all that info is on the ways in the first place, and it's worth deleting them.
>>
>> Here's some numbers from Matt to consider:
>>
>> Tiger node tags make up 85.43% of all node tags and take up:
>> * 12.97% of the bzipped planet size (805 MB).
>> * 34.68% of the uncompressed planet size.
>> * 42.20% of the lines in the planet.
>> * 31.51% of the parsing time of the planet (based on xmllint --stream).
>>
>> So, can anyone think of a good reason to keep them? Should we just delete tags like these? I'd love to hear if anyone thinks we should keep them (bearing in mind all the info, including ids, would remain on the ways in any case).
>>
>> http://wiki.openstreetmap.org/wiki/TIGER_fixup/node_tags
>>
>> Cheers,
>> Andy
>
> The tiger:county data is actually kinda useful. I've been using it to know where the county lines are according to TIGER, to make the county boundaries more precise in my area. It's also nice to see tiger:county values on nearby streets when mapping power lines in Potlatch, so I know just how carried away I got without having to leave the editor and zoom way out. But that's probably a fringe usage. :)

Andy's only talking about the node tags at the moment -- this data will still be on the ways for now, so you'll still be able to do these things.

Dave
Re: [OSM-dev] Deleting TIGER node tags
On Sat, 2009-06-27 at 09:26 +0100, 80n wrote:
> This is all redundant information. Be bold, delete it.

Indeed. Leave it in the ways, remove it from the nodes.
Re: [OSM-dev] Deleting TIGER node tags
> Indeed. Leave it in the ways, remove it from the nodes.

Is this a low (database) level delete? Or will we gain a new history entry for every node, showing the node as it was before the delete and as it was after? If the latter, then won't the database end up bigger rather than smaller?

Ed
Re: [OSM-dev] Deleting TIGER node tags
2009/6/27 Ed Loach e...@loach.me.uk:
>> Indeed. Leave it in the ways, remove it from the nodes.
>
> Is this a low (database) level delete? Or will we gain a new history entry for every node showing the node as it was before the delete and as it was after? If the latter then won't the database end up bigger rather than smaller?
>
> Ed

The database is a matter for the sysadmins. This will, however, have a great impact on the planet file size, which is what more people struggle with.

-- 
Regards,
Thomas Wood
(Edgemaster)
Re: [OSM-dev] Deleting TIGER node tags
Greg Stark wrote:
> Also, are ways actually entirely in one county or another? It seems to me they would often span borders.

That is how they are reported to the census bureau. Every county reports on the parts of roads that lie within its borders. They rarely match up exactly enough at the boundary for the import to be able to join them up. In some extreme cases, like roads along a ridge that is also the county border, every left turn is reported as a piece from county A, and every right turn as a piece from county B :)
Re: [OSM-dev] Deleting TIGER node tags
On Jun 27, 2009, at 10:16 AM, Stellan Lagerstrom wrote:
> Greg Stark wrote:
>> Also, are ways actually entirely in one county or another? It seems to me they would often span borders.
>
> That is how they are reported to the census bureau. Every county reports on the parts of roads that lie within its borders. They rarely match up exactly enough at the boundary for the import to be able to join them up.

Where the TIGER data is in good shape, they do line up exactly. I can point you to some examples. Can you point me to some examples where they don't line up?

-- 
Russ Nelson - http://community.cloudmade.com/blog - http://wiki.openstreetmap.org/wiki/User:RussNelson
r...@cloudmade.com - Twitter: Russ_OSM - http://openstreetmap.org/user/RussNelson
[OSM-dev] Deleting TIGER node tags
Hello Devs,

source = tiger_import_dch_v0.6_20070813
tiger:county = St. Louis, MO
tiger:tlid = 100111260:100111261:10055:10059
tiger:upload_uuid = bulk_upload.pl-6143e1a9-589d-43a0-9248-e95658773ef4

Stuff like this appears on every node in the US, which is a pain. I reckon it's all pointless, since all that info is on the ways in the first place, and it's worth deleting them.

Here's some numbers from Matt to consider:

Tiger node tags make up 85.43% of all node tags and take up:
* 12.97% of the bzipped planet size (805 MB).
* 34.68% of the uncompressed planet size.
* 42.20% of the lines in the planet.
* 31.51% of the parsing time of the planet (based on xmllint --stream).

So, can anyone think of a good reason to keep them? Should we just delete tags like these? I'd love to hear if anyone thinks we should keep them (bearing in mind all the info, including ids, would remain on the ways in any case).

http://wiki.openstreetmap.org/wiki/TIGER_fixup/node_tags

Cheers,
Andy
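As an illustration of the proposal, the sketch below removes the import tags from a single node's tag set while leaving everything else alone. The matching rule (any `tiger:` key, plus a `source` whose value starts with `tiger_import`) is an assumption derived from the four example tags above; the wiki page linked in the message is the place to check for the agreed list. The `highway` tag on the example node is made up to show that unrelated tags survive.

```python
def strip_tiger_node_tags(tags):
    """Return a copy of a node's tags without the TIGER import tags.

    Assumed rule: drop every 'tiger:*' key, and drop 'source' only when
    its value marks the TIGER import. Any other tag is kept untouched.
    """
    kept = {}
    for key, value in tags.items():
        if key.startswith("tiger:"):
            continue
        if key == "source" and value.startswith("tiger_import"):
            continue
        kept[key] = value
    return kept


node_tags = {
    "source": "tiger_import_dch_v0.6_20070813",
    "tiger:county": "St. Louis, MO",
    "tiger:tlid": "100111260:100111261:10055:10059",
    "tiger:upload_uuid": "bulk_upload.pl-6143e1a9-589d-43a0-9248-e95658773ef4",
    "highway": "traffic_signals",  # hypothetical tag added by a mapper
}
cleaned = strip_tiger_node_tags(node_tags)
print(cleaned)
```

A node whose `source` was set by a mapper (for example `source=survey`) keeps that tag, since only values that identify the import are matched.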