On 21.02.2013 13:01, Hans Schmidt wrote:
On 21.02.2013 12:36, Peter Wendorff wrote:
Well... if there's no localized name, then you may omit the name:xx tag for that language, as there's no alternative. On the other hand, name:de might be useful even then, as it's possible to translate programmatically if the software knows the language. The German suffixes -straße, -weg, -platz could be automatically converted to street, way and square; the (AFAIK) Swedish -gatan is street again, -väg is way, and so on. But if you try to translate something into another language this way without knowing the source language, it's much more difficult.

Why would you want to translate the street names? Do you want to translate Paris' “Avenue des Champs-Élysées” to “Allee der Champs-Élysées”? Nobody would know what it is anymore. Also, nobody wants to translate a “Lindenallee” in some minor German town to “Linden avenue”. Also, automatic translation would be error-prone.
For complete names you may be right, but for Natural Language Generation in tools based on OSM data, it might be useful to translate parts of names. For the Lindenallee this might translate to "Go down the alley...", where "alley" might not be a classification given by tags, but derived from the name only.
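A minimal sketch of the suffix idea, assuming the language of the name is already known. The table and the function name street_type are hypothetical, and the entries are only the examples mentioned in this thread, not a complete list:

```python
# Hypothetical suffix table built from the examples in this thread.
# Note: the thread renders -allee as "alley"; "avenue" is arguably closer.
SUFFIXES = {
    "de": {"straße": "street", "weg": "way", "platz": "square", "allee": "alley"},
    "sv": {"gatan": "street", "väg": "way"},
}


def street_type(name, lang):
    """Return the English street-type word implied by the name's suffix.

    Returns None when the language is unknown or no suffix matches,
    which is exactly the case where guessing becomes difficult.
    """
    lowered = name.lower()
    for suffix, english in SUFFIXES.get(lang, {}).items():
        if lowered.endswith(suffix):
            return english
    return None


print(street_type("Lindenallee", "de"))
print(street_type("Lindenallee", None))
```

Without a language code the lookup has nothing to work with, which is the argument for keeping name:de even when it equals name.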
So a recommendation might be to
- always tag name
- if you translate name into different languages, always add name:originalLanguageCode with the same content - if you want, add that even if you don't translate it to different languages.

Yes, that's redundant - but it's easy for software to cut out, if wanted (drop every language attribute that equals the plain name); and it's less error-prone than a tag like "language=de" or the lists of default language areas you propose above. Sure: these lists are helpful for all cases where only name is given, and they are a necessity for great software dealing with that. But that's the way defaults in OSM work: there should be few defaults for mappers, where they have to decide not to add a tag, but more defaults for data consumers, who could/should be able to make a best guess where data is missing.
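The "cut out" step could look roughly like this. It is only a sketch: tags are modelled as a plain dict, and the function name drop_redundant_names is made up for illustration:

```python
def drop_redundant_names(tags):
    """Return a copy of tags without name:xx entries that equal the plain name."""
    plain = tags.get("name")
    return {
        key: value
        for key, value in tags.items()
        if not (key.startswith("name:") and value == plain)
    }


tags = {"name": "Montréal", "name:fr": "Montréal", "name:en": "Montreal"}
print(drop_redundant_names(tags))
```

The reverse direction (recovering the original language once the tag has been dropped) is not possible without guessing, which is why the redundancy is cheap to remove but expensive to restore.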
You say that there should be few defaults for mappers. But what you propose is exactly the opposite: You'd have a default, meaning that you would need to create a name:originallanguage even if there is a name present. I would bet that nobody does this. And if you don’t do it like that, chaos will occur if you decide to display the name.
Wait...
I agree: even in the long term, the majority of objects will certainly not have a name:originallanguage in addition to the plain name tag. This is part of the incompleteness we have everywhere in OSM.
I disagree that this by itself would lead to chaos.

Imagine a text-based application that could be read aloud by software. To do that properly, names should be spoken with the pronunciation of the language they come from. Let's consider a screen reader for browsers and a browser-based application as an example. The output of "Dies ist der Times Square in New York" (this is the Times Square in New York) is simple to produce, but a screen reader based only on German would speak it roughly like (not sure if I get it comparable for English speakers here): "Dees ist der Teames Square in Nu Johk", because nothing tells it that Times Square and New York are names from the English language. In a website, additional markup could ideally solve that (given that the screen reader supports English as well in the user's setup): <p lang="de">Dies ist der <span lang="en">Times Square</span> in <span lang="en">New York</span></p>. But to generate markup like this, the software has to know the language. Sure: this may be approximated based on the area in the world, and yes, developers have to use something like that for the usual case where the language is still unknown, but in the text-to-speech area this would produce many wrong results by accident.
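As a rough illustration of how such markup might be generated once the language of each name is known, here is a small sketch. The function name mark_up and the (text, language) pair format are assumptions made for this example only:

```python
from html import escape


def mark_up(sentence_lang, parts):
    """Build a <p> element from (text, lang) pairs.

    A lang of None means "same language as the sentence"; any other
    language gets its own <span lang="..."> so a screen reader can
    switch pronunciation.
    """
    out = []
    for text, lang in parts:
        if lang is None or lang == sentence_lang:
            out.append(escape(text))
        else:
            out.append('<span lang="{}">{}</span>'.format(escape(lang), escape(text)))
    body = "".join(out)
    return '<p lang="{}">{}</p>'.format(escape(sentence_lang), body)


print(mark_up("de", [
    ("Dies ist der ", None),
    ("Times Square", "en"),
    (" in ", None),
    ("New York", "en"),
]))
```

The "en" values here have to come from somewhere: either from a name:en tag equal to name, or from a guess.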

In contrast, if you do it based on region, it would simplify things much more:

1. You take the nodes/relation for Canada, add language=en.
2. You take the nodes/relation for Québec: language=fr

Then everybody would just continue using name=British Columbia and name=Montréal, and no problem. The multilingual renderer would then show, in case the user wants to see French names, name=Montréal and name:fr=Colombie-Britannique. If the user is English, it would show name:en=Montreal and name=British Columbia.
I completely agree, as long as it's only about displaying. I completely agree that this is a valid fallback, but as I showed above, it cannot solve all problems. Even for rendering, I'm not sure that's really an optimal solution for languages written right-to-left or top-to-bottom. Here you have to know at least these characteristics of the language to decide about label sizes and placement - not sure if that's really given by the Unicode characters themselves.
Tell me where this is not easier than adding a redundant name:en or name:fr for every town, bus stop and street in Canada. You would only have to change the multilingual renderer so that it would display it like that. This is no problem because I guess it is still in development. It could be done relatively easily (speaking from a non-developer standpoint).
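A sketch of how a renderer might implement this region-based proposal. It assumes the enclosing regions are already known and ordered from innermost to outermost; the language key is the tag proposed above, and both function names are hypothetical:

```python
def inherited_language(enclosing_regions):
    """Return the language of the innermost enclosing region that declares one."""
    for region in enclosing_regions:
        lang = region.get("language")
        if lang:
            return lang
    return None


def display_name(tags, user_lang):
    """Prefer name:<user_lang>, otherwise fall back to the plain name."""
    return tags.get("name:" + user_lang) or tags.get("name")


canada = {"name": "Canada", "language": "en"}
quebec = {"name": "Québec", "language": "fr"}

print(inherited_language([quebec, canada]))
print(display_name({"name": "British Columbia", "name:fr": "Colombie-Britannique"}, "fr"))
print(display_name({"name": "Montréal", "name:en": "Montreal"}, "fr"))
```

Note that display_name gives the same result whether or not the region language is known; the inherited language only matters when the software needs to know which language the plain name is in.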
Examples above.
And yes, it's easier to skip the native language as a separate tag. It will work for most cases, but it won't for many others. We're not a map, we're a geo database, and languages are important for that as well, and especially interesting for foreign languages. It is in fact interesting to see which pubs and restaurants in Germany have names from the English/French/Spanish/Italian/... languages. It's fascinating to see where e.g. pubs have Cymraeg (Welsh) or Gaelic names, in their "native" areas and outside.
And these are examples that occur often.
Plus, most of today's nodes only have a name=... tag, not a name:xyz=... one. You would not need to change anything.
Sure. Software has to support that, and has to make a best guess, but it's only that: a best guess - sometimes it's wrong, especially in multi-language parts of the world. To assume English as the language in the Hispanic cities, towns or suburbs of the United States (e.g. Santa Fe, New Mexico [1]) is error-prone, and I'm sure there are areas where two or more languages are used roughly equally.
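To make the difference between "tagged" and "guessed" concrete, a data consumer could keep track of where its answer came from. This is only a sketch: the function name name_language and the "tagged"/"guessed" labels are assumptions for illustration:

```python
def name_language(tags, region_default):
    """Return (language, source) for the plain name tag.

    If some name:xx tag has the same value as name, trust it ("tagged").
    Otherwise fall back to the regional default, which is only a best
    guess ("guessed") and may be wrong in multi-language areas.
    """
    plain = tags.get("name")
    for key, value in tags.items():
        if key.startswith("name:") and value == plain:
            return key[len("name:"):], "tagged"
    return region_default, "guessed"


print(name_language({"name": "Santa Fe", "name:es": "Santa Fe"}, "en"))
print(name_language({"name": "Santa Fe"}, "en"))
```

In the second call the result is "en" only because of the regional default; a text-to-speech tool could use the "guessed" marker to treat that answer with less confidence.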

So:
- Yes: Software developers should support guessing the natural languages (where that's necessary)
- No: Mappers should NOT delete localized name tags even if these are equal to the local one out of the assumption of redundancy.
- No: Mappers should NOT be told to never add localized tags where only one single name tag exists.

regards
Peter

[1] http://www.openstreetmap.org/?lat=35.68022&lon=-105.94028&zoom=17&layers=M

_______________________________________________
talk mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk
