Hi,

On 05/16/2012 03:13 AM, Alan Mintz wrote:
> I can understand why that is - it's being worked on by many people, may need partial revertability, will probably run for a long time, etc. Removal of one tag in bulk doesn't present these issues, and may be possible, which is why I'm asking: a) does it help; and b) is it possible?
It is certainly possible to remove created_by from the database without a trace; it just requires a couple (two?) SQL statements.
It would be an almost unprecedented action; the last time we kicked something out of the database like that was when we removed the first, aborted, TIGER import.
There is some reservation among sysadmins against doing something like that because, being outside of the envelope of "normal operations", it could have side effects that nobody foresaw. It would also falsify history in that, by removing that tag, we would essentially claim that the tag never was there in the first place. This is of course not terribly important but still - for objects created in pre-0.6 API times the created_by tag that you can look up in the object history is the only thing that tells us what editor was used when the object was created.
So yes, it is possible and I believe if the benefit was big enough it could be done.
But what's the benefit really? Most people who run a local database instance will run an osm2pgsql database and will not have imported created_by in the first place, so no space is wasted there. When new diffs are generated and pushed out, they are unlikely to contain many created_by tags because modern editors delete created_by on sight, so that's a non-issue too. As for planet file size, I removed all 1.75 million created_by tags from a 1300 MB germany.osm.pbf and ended up with a 1297 MB file, which suggests that 45 MB of the planet file could be saved by removing all created_by tags (that's about 0.3%).
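For anyone who wants to redo the size estimate, here is a minimal sketch of the arithmetic. It uses only the figures quoted above; the function and constant names are made up for illustration, and the planet size it prints is merely what the 45 MB figure implies, not a measured value.

```python
# Figures quoted in the message above (file sizes are rounded to the MB).
GERMANY_BEFORE_MB = 1300   # germany.osm.pbf with created_by tags
GERMANY_AFTER_MB = 1297    # same extract with created_by tags removed
PLANET_SAVING_MB = 45      # planet-wide saving estimated in the message


def saved_fraction(before_mb: float, after_mb: float) -> float:
    """Fraction of the file size that disappears along with the tags."""
    return (before_mb - after_mb) / before_mb


def implied_total_mb(saving_mb: float, fraction: float) -> float:
    """Total file size for which `fraction` corresponds to `saving_mb`."""
    return saving_mb / fraction


fraction = saved_fraction(GERMANY_BEFORE_MB, GERMANY_AFTER_MB)
implied_planet_mb = implied_total_mb(PLANET_SAVING_MB, fraction)

print(f"saved in extract: {GERMANY_BEFORE_MB - GERMANY_AFTER_MB} MB ({fraction:.2%})")
print(f"planet size implied by a {PLANET_SAVING_MB} MB saving: {implied_planet_mb:.0f} MB")
```

With the rounded extract sizes the ratio works out to roughly a quarter of a percent, the same order of magnitude as the "about 0.3%" above; either way the saving is a very small share of the file.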
It might make a larger difference on the history planet file, and there surely will be some places where the response to a "map" API call might be more thoroughly affected (there are stretches of coastline, I believe, where every node carries a source and a created_by tag).
However, on the whole, I don't think that there's a large enough benefit for drastic action; as I said, most editors will already drop created_by on upload so the tag is slowly dying out anyway.
Bye
Frederik

-- 
Frederik Ramm ## eMail [email protected] ## N49°00'09" E008°23'33"

