Good catch. Once again, this reflects a lot of misconceptions from someone who wrote it without looking at the conformance requirements in these standards. The so-called "standard United States character set (437)" is in fact a proprietary legacy charset that was widely used in the US but was never adopted as a US standard. It should have been named "IBM/MS-DOS code page 437", without any reference to the US (in fact it was used worldwide as the default charset on many PCs).
But basically what this says is that UN/LOCODE works only with the subset of characters found in both ISO 8859-1 and CP437, and this is what "diacritic signs, when practicable" means. Of course it is *interoperable* with ISO 10646-1, but only via a transcoding conversion. CP437 ***was*** widely used in trade data interchange, but this has not been true for a long time: ISO 8859-1 was adopted much more widely, and now ISO 10646-1 is preferred (most of the time using UTF-8). There still exist some old files for dBase II/III (as used in the 1980s by old software running on MS-DOS) or similar that are encoded in CP437, but those old files are not being updated with the changes needed in 2015. Modern databases run on SQL engines with interfaces exposing ISO 10646-1 (UTF-8), or only ISO 8859-1 in the US and Western Europe.

UN/LOCODE should not target just the US or Western Europe. It should work as a worldwide standard, so it has to accept names in languages such as Czech or Polish, which need Latin letters with diacritics that are not found in ISO 8859-1 but are found in other legacy ISO 8859-* charsets. Those languages are not transliterated to simpler forms, unlike names in Russian, Chinese, Thai, Hebrew or Arabic, which define their own standard romanizations that also require characters not found in ISO 8859-1. For UN/LOCODE, the romanizations should preferably be the international romanizations defined for toponyms. But there is not even a reference to those existing standards (widely used in Russia, China, Japan, Israel, and Arabic countries). This omission is not forgivable. My opinion is that this paragraph has in fact not been updated for a very long time, as it should have been in this 2015-2 version.
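To make the "subset found in both ISO 8859-1 and CP437" concrete, here is a minimal Python sketch. It relies only on the `cp437`, `latin-1` and `iso8859-2` codecs shipped with Python; the helper names `repertoire` and `encodable` and the sample place names are my own illustrations, not anything taken from the UN/LOCODE Manual.

```python
def repertoire(codec):
    """Return the set of characters a single-byte codec maps bytes 0x00-0xFF to."""
    chars = set()
    for b in range(256):
        try:
            chars.add(bytes([b]).decode(codec))
        except UnicodeDecodeError:
            pass  # byte is unassigned in this codec
    return chars


def encodable(text, codec):
    """True if every character of text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


cp437 = repertoire("cp437")
latin1 = repertoire("latin-1")

# Characters usable without loss in BOTH charsets.
common = cp437 & latin1

# Non-ASCII letters in the common subset: the only diacritics
# that are "practicable" under the Manual's wording.
common_letters = sorted(c for c in common if ord(c) > 0x7F and c.isalpha())
print("".join(common_letters))

# Illustrative place names (hypothetical sample, chosen by me).
for name in ["Málaga", "São Paulo", "Łódź", "Plzeň"]:
    print(name,
          "cp437:", encodable(name, "cp437"),
          "latin-1:", encodable(name, "latin-1"),
          "iso8859-2:", encodable(name, "iso8859-2"),
          "utf-8:", encodable(name, "utf-8"))
```

The point of the sketch is that a name needs to fail in only one of the two legacy charsets to fall outside the common subset, while UTF-8 accepts all of them without transcoding loss.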
Because of all this, the names listed in UN/LOCODE are very questionable (and anyway the location codes in UN/LOCODE are largely deprecated in favor of ISO 3166-* codes, where available, or names used by IATA or ICAO, or postal codes in countries that have defined them, or region codes defined by their national or regional statistics institutes).

2015-12-17 22:19 GMT+01:00 Doug Ewell <[email protected]>:

> UN/LOCODE version 2015-2 has been released [1], and the Manual still
> contains the following about character sets:
>
> "27. Place names in UN/LOCODE are given in their national language
> versions as expressed in the Roman alphabet using the 26 characters of
> the character set adopted for international trade data interchange, with
> diacritic signs, when practicable (cf. Paragraph 3.2.2 of the UN/LOCODE
> Manual). International ISO Standard character sets are laid down in ISO
> 8859-1 (1987) and ISO10646-1 (1993). (The standard United States
> character set (437), which conforms to these ISO standards, is also
> widely used in trade data interchange)."
>
> Spot the errors.
>
> [1] http://www.unece.org/cefact/codesfortrade/codes_index.html
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸

