This message explains a problem that user interfaces such as Nominatim have when choosing the correct localized 'name' tag to show to the user, says why I believe this is caused by incomplete tagging, and proposes a fix.
I was surprised, when searching on the OSM home page, to find that Edinburgh was not in Scotland but in 'Ecosse'. My preferred language is English, although I do understand a few words of French, and I had set my browser language preferences accordingly. The browser sends an HTTP Accept-Language header giving English a higher score than French. So why does Nominatim think I would prefer to see the French name for the country?

It is because it sees some tags like (simplified to illustrate):

    name=Scotland
    name:fr=Ecosse
    name:es=Escocia

Given that, and the user's preferred languages [en, fr], what name should be picked? The program cannot know that the name 'Scotland' is in English, so the best course of action is to pick a name that the browser says it will accept. If none of the names is tagged with an accepted language then it can fall back to the ordinary 'name' tag as a last resort, but if some localized names are there then they should be used. The alternative would be no localization.

I have noticed similar problems when searching in the USA: someone added Serbian Cyrillic names for the 50 states, which now pop up instead of the English names because I have included Serbian in my language list, albeit with a tiny score.

I believe the answer, as so often, is to improve the tagging so that software has the information it needs. In this case an explicit English-language name should be added, so we have

    name=Scotland
    name:en=Scotland
    name:fr=Ecosse
    name:es=Escocia

(Another way to tag the same info would be to invent a new tag 'language_of_main_name=en', but this seems cumbersome and would not be understood by existing software.)

In an attempt to fix this I have asked the maintainer of <http://keepright.ipax.at/> to add a data check. Where a choice of languages exists for a name, there should be one that corresponds to the main 'name' tag. In other words, the example above had name=Scotland but no name:XX=Scotland.
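
To make the selection rule and the proposed check concrete, here is a minimal Python sketch. It is an illustration only, not Nominatim's or keepright's actual code. The function names (parse_accept_language, pick_name, main_name_language_known) are made up for this example, and it makes two simplifying assumptions: language codes are matched exactly (so 'en-GB' does not match 'name:en'), and every 'name:*' key is treated as a language variant.

```python
def parse_accept_language(header):
    """Return the language codes from an Accept-Language header, best first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        lang = parts[0].strip().lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means full weight
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            if key.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            # Sort by descending weight, then by position in the header.
            prefs.append((-q, position, lang))
    return [lang for _, _, lang in sorted(prefs)]


def pick_name(tags, languages):
    """Use the first name:XX for an accepted language; else fall back to 'name'."""
    for lang in languages:
        localized = tags.get("name:" + lang)
        if localized:
            return localized
    return tags.get("name")


def main_name_language_known(tags):
    """Proposed check: if any name:XX tags exist, one of them must equal 'name'."""
    main = tags.get("name")
    # Assumption: every 'name:*' key is a language variant.
    localized = [v for k, v in tags.items() if k.startswith("name:")]
    if main is None or not localized:
        return True  # a single name used for all languages is fine
    return main in localized


before = {"name": "Scotland", "name:fr": "Ecosse", "name:es": "Escocia"}
after = dict(before)
after["name:en"] = "Scotland"

languages = parse_accept_language("en,fr;q=0.5")
print(pick_name(before, languages), main_name_language_known(before))
print(pick_name(after, languages), main_name_language_known(after))
```

With the original tags the sketch picks 'Ecosse' and the check fails; once name:en=Scotland is added it picks 'Scotland' and the check passes.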
A matching name:XX tag should be added to indicate the language of the main name, so that user interfaces can choose among the name:XX tags. Of course, if an object has just a single name tag to be used for all languages, that's fine.

What I plan to do is to work through these 'language unknown' warnings and, with help from a tool, add explicit language tags. I have manually fixed the small number of cases in London, but it gets more interesting in Wales (where a user who understands both English and Welsh, but prefers English, will currently be given the Welsh names) or Turkey (where a user preferring Turkish to Greek will be given Greek names for many places).

In the new year I plan to write a small tool to help fix these, prompting a human being to decide, or at least verify, the language of each name. Then an additional name:XX tag will be added to the object.

Sound sensible?

-- 
Ed Avis <e...@waniasset.com>