At 2012-05-13 02:49, Frederik Ramm wrote:
Removing ele=0 from objects is, in my opinion, totally unnecessary;

And maybe incorrect, as ele=0 means we know the elevation is 0, while no ele tag means we do not know the elevation.


like created_by, over which WorstFixer made a similar fuss, such information could be removed where an object is touched for some other reason but I don't see why it would have to be mass-removed.

The reason for not mass-removing may not be obvious to some. I assume it's because we store the history of all objects, so every removal creates a new version of each object touched. That's a waste of space, not to mention the bandwidth and processing resources needed to push the changes out to the mirrors, for almost no benefit. I just add "created_by=''" to my JOSM presets (or maybe it does this automatically now) so I clean it up when performing other edits.


Even so, a mass-removal would be ok if proposed, discussed, and accepted by the community like we expect everyone to; it's not ok to just do it on your own and see if someone notices.

Yes. Having said all that, OSMTI says there are 23 million nodes (33% of the total) with created_by tags! This seemed surprisingly high to me.

I retrieved nodes from 300 random 0.1x0.1 degree bboxes. Of those, only 37 returned any nodes at all**. Of those 37, all but 6 had no "created_by" tags on their nodes. Of those 6, only 2 had a significant percentage of tagged nodes*, both in Norway.
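A minimal sketch of how such a survey could be scripted, not the script actually used here. It assumes each bbox's nodes are already available as OSM XML text (how they were retrieved is not stated), and that boxes are drawn uniformly in latitude and longitude, which is also an assumption. The function names are hypothetical.

```python
import random
import xml.etree.ElementTree as ET

BOX_SIZE = 0.1  # degrees per side, as in the survey described above


def random_bbox(rng, size=BOX_SIZE):
    """Return (bottom, left, top, right), i.e. the BLTR order used in this post.

    Assumption: corners are drawn uniformly in lat/lon, so boxes near the
    poles cover less ground than boxes near the equator.
    """
    bottom = round(rng.uniform(-90.0, 90.0 - size), 3)
    left = round(rng.uniform(-180.0, 180.0 - size), 3)
    return (bottom, left, round(bottom + size, 3), round(left + size, 3))


def count_created_by(osm_xml):
    """Return (total_nodes, nodes_with_created_by) for one OSM XML document."""
    root = ET.fromstring(osm_xml)
    total = tagged = 0
    for node in root.iter("node"):
        total += 1
        # A node counts once, however many tags it carries.
        if any(tag.get("k") == "created_by" for tag in node.findall("tag")):
            tagged += 1
    return total, tagged


def summarize(index, bbox, total, tagged):
    """Format one result line in the same shape as the figures in this post."""
    pct = round(100 * tagged / total) if total else 0
    return ("#%d had %d nodes, %d of which (%d%%) have created_by tags. "
            "BLTR: %.3f %.3f %.3f %.3f" % ((index, total, tagged, pct) + tuple(bbox)))
```

Calling `random_bbox(random.Random())` 300 times, fetching each box, and passing the XML through `count_created_by` would reproduce the shape of the survey. Boxes that return no nodes simply report a total of 0.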

#137 had 1558 nodes, 801 of which (51%) have created_by tags. BLTR: 68.137 13.766 68.237 13.866
#264 had 2297 nodes, 1946 of which (85%) have created_by tags. BLTR: 60.787 4.900 60.887 5.000

In #137, they are mostly tagged:
    <tag k="created_by" v="JOSM"/> (TI says this makes up 63% of the values)

In #264, they are mostly tagged:
    <tag k="created_by" v="almien_coastlines"/> (TI says this makes up 10% of the values)
    <tag k="source" v="PGS(could be inacurately)"/>


My questions are:

1. Would removing the created_by from 33% of the nodes in the database save significant storage space, dump size, backup time, etc.?

2. Is it possible to remove these in bulk from the database without having to keep the history, push those diffs to mirrors, etc.? Do the mirrors occasionally start fresh from a new dump? Or can they run the same bulk purge? Or do I overestimate the necessity of doing it this way (and we can just clean it up with the regular tools and processes)?



* While not a significant portion of the total nodes in the area (only 4%), there were almost 600 created-by-tagged nodes in this file from England:

#123 had 14013 nodes, 594 of which (4%) have created_by tags. BLTR: 51.086 0.088 51.186 0.188


** I guess this clarifies why old satellites that fall from their orbits and other space junk never seem to hit anything, even if they survive re-entry :)

--
Alan Mintz <alan_mintz+...@earthlink.net>


