On 2015-02-14 20:45, Marko Mäkelä wrote:
On Sat, Feb 14, 2015 at 03:57:21PM +0100, Colin Smale wrote:
What about multi-lingual countries such as Belgium or Switzerland?
Or multi-lingual cities, such as Montréal in Canada?
But, is this really an issue? Street signs may be in two or more
languages, saying "Foo Street" and "Rue Foo" for example. Can anyone
name a multi-lingual area where a stopword in one language would be a
non-stopword in the other language?
"de" is "the" in Dutch, "of" in French - both (candidate) stopwords in
their own way, but you would want different rules for keeping or
omitting "de" in street names.
It also means "South" in Welsh, which you probably would not want to
omit in most cases...
For example, could there be a highway=* with name="Rue Street" in a
French/English area? I would not think so.
For what it is worth, there are a lot of bilingual street signs in
Finland, using Finnish (name:fi), Swedish (name:sv) or in the north,
Sámi (name:se). It depends on the share of the minority population
whether multiple languages are used. The majority language appears
first in the signs. So, usually it is Finnish first, then Swedish, or
Swedish first, then Finnish. Sometimes the signs could be Finnish or
Swedish only.
How about this (sorry the abbreviations are wrong but it is only to
illustrate my point): mkgmap:country=POL {set mkgmap:lang=polish;}
AFAIU, your suggestion wrongly assumes that only one language will be
used in a given region. And I think it should be based on
administrative regions, not necessarily countries.
I intended to suggest that each area would have a single "default"
language. Main reason is to select the correct stopword treatment in the
absence of explicit name:xx tags. In most cases roads are just tagged
with "name=*" - so this mechanism would define the mapping of "name" to
a language. Then you only need a single stopword treatment for the
language, which can be shared by all territories which use that
language.
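
A minimal sketch of that mapping, in Python rather than mkgmap style
syntax. The region codes, the DEFAULT_LANG table and the function name
are made up for illustration and are not part of mkgmap: explicit
name:xx tags carry their own language, and a bare "name" falls back to
the default language of the area.

```python
# Hypothetical table: one "default" language per area (illustrative values).
DEFAULT_LANG = {"POL": "pl", "DEU": "de", "AUT": "de", "FIN": "fi"}


def name_languages(region, tags):
    """Map each name tag on a road to the language whose stopword rules apply."""
    result = {}
    for key in tags:
        # Explicit name:xx tags state their language themselves.
        if key.startswith("name:"):
            result[key] = key[len("name:"):]
    # A bare "name" is assumed to be in the default language of the area.
    if "name" in tags and region in DEFAULT_LANG:
        result["name"] = DEFAULT_LANG[region]
    return result


if __name__ == "__main__":
    print(name_languages("AUT", {"name": "Hauptstraße"}))
    print(name_languages("FIN", {"name": "Mannerheimintie",
                                 "name:sv": "Mannerheimvägen"}))
```

Because DEU and AUT both map to "de", a single German stopword list
would serve both territories.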
How would you represent an area that has multiple official languages
that can appear on street signs? I think that the OSM convention would
be something like this:
{ set mkgmap:lang:fi=yes; set mkgmap:lang:sv=yes; }
or the (more tricky for our style rules)
{ set mkgmap:lang='fi;sv' }
Well, I assume that the maps produced by mkgmap are targeted to a
language (or ordered list of languages) chosen by the mkgmap user. I
can't imagine someone wanting all the languages in the map at the same
time. Can the Garmin format even handle that?
If the stopwords were also defined to be regular expressions, then it
could also handle prefixes and suffixes as well as whole words.
I agree that defining stopwords as regular expressions would provide
some necessary flexibility. Like someone said, we do not want to omit
Straße (or other stopwords) at the start of a street name in languages
that usually put the stopword at the end of the name. But, in French
and Spanish the stopword is always at the start of the name. An
anchored regexp (Straße$ or ^Calle) would nicely express this.
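
To illustrate the anchoring idea, here is a rough sketch in Python. The
pattern table and the function name are hypothetical, nothing here is
existing mkgmap code, and the patterns are only examples rather than
complete stopword lists.

```python
import re

# Hypothetical per-language patterns. The anchor decides where the
# stopword may match: "$" for the end of the name, "^" for the start.
STOPWORD_PATTERNS = {
    "de": re.compile(r"\s*(?:straße|strasse)$", re.IGNORECASE),
    "es": re.compile(r"^calle\s+", re.IGNORECASE),
    "fr": re.compile(r"^rue\s+", re.IGNORECASE),
}


def strip_stopword(name, lang):
    """Remove the anchored stopword for lang, if the name contains it."""
    pattern = STOPWORD_PATTERNS.get(lang)
    if pattern is None:
        return name
    stripped = pattern.sub("", name).strip()
    # Never return an empty index entry: keep the name if nothing is left.
    return stripped or name


if __name__ == "__main__":
    for name, lang in [("Hauptstraße", "de"),
                       ("Straße des 17. Juni", "de"),
                       ("Calle Mayor", "es")]:
        print(name, "->", strip_stopword(name, lang))
```

With the end anchor, "Straße des 17. Juni" is left alone because the
stopword is at the start of the name, while "Hauptstraße" loses its
suffix.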
Maybe the regexp could also facilitate a rewriting system for
abbreviating the index entries, such as replacing "street" with "st" in
English, "Straße" with "Str" in German, "puiestee" with "pst" in
Estonian, "katu" with "k" in Finnish and so on.
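
A small sketch of such a rewriting table, again in Python and purely
illustrative. The table layout and the function name are assumptions,
the rules only cover the examples above, and the letter case of the
replacement is not adapted to the original name.

```python
import re

# Hypothetical rewrite rules: (anchored pattern, abbreviation) per language.
ABBREVIATIONS = {
    "en": [(re.compile(r"\bstreet$", re.IGNORECASE), "st")],
    "de": [(re.compile(r"straße$", re.IGNORECASE), "str")],
    "et": [(re.compile(r"\bpuiestee$", re.IGNORECASE), "pst")],
    "fi": [(re.compile(r"katu$", re.IGNORECASE), "k")],
}


def abbreviate(name, lang):
    """Apply every rewrite rule of lang to name and return the result."""
    for pattern, short in ABBREVIATIONS.get(lang, []):
        name = pattern.sub(short, name)
    return name


if __name__ == "__main__":
    for name, lang in [("Baker Street", "en"),
                       ("Hauptstraße", "de"),
                       ("Pärnu puiestee", "et"),
                       ("Aleksanterinkatu", "fi")]:
        print(name, "->", abbreviate(name, lang))
```

The same table could hold an empty replacement for plain stopword
removal, so stripping and abbreviating would share one mechanism.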
Marko
_______________________________________________
mkgmap-dev mailing list
[email protected]
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev