I think it would be best to make the tools we use (JOSM, Overpass API, iD, etc.) Unicode-aware, so they can handle this correctly.
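
To make that a bit more concrete, here is a minimal Python sketch of what "Unicode-aware" could mean when comparing or displaying names. It is only an illustration under my own assumptions, not how JOSM, iD or Overpass actually work internally; the helper names (fold_spaces, names_match, show_invisibles) are made up for this example.

```python
import unicodedata

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def fold_spaces(value: str) -> str:
    """Replace every Unicode space separator (category Zs) with a plain U+0020.

    Hypothetical helper: the stored value is left untouched, only the
    copy used for comparison is folded.
    """
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in value)


def names_match(a: str, b: str) -> bool:
    """Compare two names while ignoring which kind of space they use."""
    return fold_spaces(a) == fold_spaces(b)


def show_invisibles(value: str) -> str:
    """Make no-break spaces visible, e.g. for display in an editor."""
    return value.replace(NBSP, "<NBSP>")


stored = "V\u00a0jámě"   # name as tagged, with a no-break space
typed = "V jámě"         # what a user would type into a search box

print(stored == typed)            # naive comparison misses the element
print(names_match(stored, typed)) # folded comparison finds it
print(show_invisibles(stored))
```

With something like this, a search could still find names that contain no-break spaces, and an editor could show where they are, without anyone having to remove them from the data.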

Polyglot

2018-01-26 16:50 GMT+01:00 Matej Lieskovský <[email protected]>:

> @marc: I just realized - I'm not talking about breaking words between
> syllables but about breaking lines between words. It is not adding a
> character, just using a nonbreakable version of a space. Sorry if I'm
> not being clear.
>
> On 26 January 2018 at 16:47, Matej Lieskovský
> <[email protected]> wrote:
> > In Czech, a nonbreakable space should follow any single-letter
> > preposition or conjunction and academic or military titles. A
> > nonbreakable space should also be used due to some common
> > contractions, between a number and a unit, and around some punctuation
> > marks.
> >
> > I noticed that some Overpass queries were not returning some elements
> > - that is how I found out that we actually have a rather large number
> > of nonbreakable spaces in the data.
> >
> > Nonbreakable spaces are currently quite troublesome - not all
> > consumers actually use Unicode collation, they are invisible in JOSM
> > and they are not exactly easy to input. Also, the chance that we
> > convince all contributors to use them correctly is exactly zero. Along
> > with this potentially being "tagging for the renderer", there are many
> > calls for a mass-removal.
> >
> > On the other hand, there is software that actually handles Unicode
> > collation well and it does make the correct rendering of names an
> > order of magnitude easier. Leaving this up to the renderer sounds
> > logical, but imagine forcing every renderer to figure out what
> > language any given name is in and then running the appropriate
> > subprogram to fill in the nonbreakable spaces. This could require
> > semantic analysis due to the need to add a nonbreakable space after
> > the "V" in "V jámě" (preposition) but before the "V" in "Jiří V."
> > (Roman ordinal number) and after the "V." in "V. Špidla" (contraction
> > of a name (and yes, there are cases when you should use a
> > contraction)).
> >
> > Nonbreakable spaces are strange - you cannot reliably tell if they are
> > used OTG (but in some cases you can), official documents often ignore
> > them (leaving them up to the automated systems in office software, so
> > they do occur sometimes) and the rules governing them are older than
> > computers, so asking if they are a rule or a character is... dubious.
> >
> > And yes, we do have really long names of things. Names of POIs named
> > after people are a common use case.
> >
> > Matej
> >
> > On 26 January 2018 at 16:11, marc marc <[email protected]>
> > wrote:
> >> On 26.01.18 at 15:48, Matej Lieskovský wrote:
> >>> Several Slavic languages have rather formal rules about line breaks.
> >>
> >> It depends on whether it is a grammar rule or a "char".
> >> In French, there is a rule for how to split a word at the end of a line.
> >> Since it's a grammar rule, I don't see any point in adding a character
> >> between syllables to describe it. It's up to the renderer
> >> to know when it can do it, if people want this feature.
> >> I know nothing about your language, but I feel it looks the same.
> >> If my understanding is correct, I am in favour of not putting
> >> this "nonbreakable" information into a value and moving it to the app
> >> code that needs it (which one? do you have values so long that they
> >> need to be broken across several lines?)
> >>
> >> Regards,
> >> Marc
> >> _______________________________________________
> >> Tagging mailing list
> >> [email protected]
> >> https://lists.openstreetmap.org/listinfo/tagging
