> You speak as if date or number formats had nothing to do with language. I
> very much disagree. If I have message that says: "The date of the last
> version of this document was 2003年3月20日", nobody in their right mind
> would say that that is correct English.
I never said they would. The correct analysis of that content is that it has two runs
that are in different languages. (So, AFAICT, your example does not prove anything.)
> The core of what anyone means by locale is the language -- and that means,
> in our context, written language, thus including script (Cyrl vs Latn) and
> variants (such as US vs UK spelling).
I have been putting "language" in quotation marks because the category types involved
include writing system and orthography -- you've heard my presentation on that, so you
know that I agree with you on that particular point.
As for "language" being the core of what anyone means by locale, I have most certainly
said that "language" is one of the defining components of a locale. There may even be
situations (translation software being an example) in which the processing mode does
not care about anything else. But in general, locales -- software processing modes
tailored for cultural user preferences -- *do* involve other non-linguistic
components. Even in an example like translation software where such non-linguistic
components are not needed, the infrastructure for managing the processing mode is
working in terms of parameter bundles that *do* include non-linguistic components. And
I cannot think of any situation in which distinctions among such non-linguistic
components are useful things to declare about linguistic documents.
> The choice of language affects most of what people traditionally associate
> with software globalization, including date, time, number, currency,
> formatting & parsing; segmentation (words, lines); collation and searching;
> resource bundle choice for translated text & appropriate icons, etc.
C'mon, Mark. Certainly a choice of language affects how something like a date is
displayed, but it is not the only factor. If I tell you that my language is English,
even English with US spelling, that does *not* tell you how I want my numbers, dates,
times, etc. formatted. It may give you a hint, and that hint may even lead you to do
what I want; but it also might not. (IIRC, you yourself prefer to use a date format
that is *not* what most systems would guess at from being told that your language
preference is US English.) Therefore it is plainly *not* the case that "language" is
all that anybody means by locale. Thus, the premise of your statement
> So if that is all of what someone means by locale, then there is little
> point in distinguishing between "locale IDs" and "language IDs".
is not established, and thus the implication is not established.
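To make the point concrete, here is a minimal sketch in Python. The FormattingPrefs
class, its fields and the two sample users are hypothetical, invented purely for
illustration; they are not taken from any real locale API. It assumes only that
formatting preferences are stored alongside the language tag rather than derived
from it:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FormattingPrefs:
    """Hypothetical parameter bundle: language plus non-linguistic choices."""
    language: str      # language tag, e.g. "en-US"
    date_pattern: str  # strftime pattern the user actually wants
    decimal_sep: str   # decimal separator
    group_sep: str     # digit-grouping separator


def format_date(d: date, prefs: FormattingPrefs) -> str:
    # The pattern comes from the preferences, not from the language tag.
    return d.strftime(prefs.date_pattern)


def format_number(value: float, prefs: FormattingPrefs) -> str:
    # Format with "," grouping and "." decimal first, then swap separators
    # through a placeholder so the two replacements cannot collide.
    text = f"{value:,.2f}"
    return (
        text.replace(",", "\0")
        .replace(".", prefs.decimal_sep)
        .replace("\0", prefs.group_sep)
    )


# Two users who declare exactly the same language...
user_a = FormattingPrefs("en-US", "%m/%d/%Y", ".", ",")
user_b = FormattingPrefs("en-US", "%Y-%m-%d", ".", " ")

# ...and still want different output for the same values.
d = date(2003, 3, 20)
print(format_date(d, user_a), format_number(1234567.891, user_a))
print(format_date(d, user_b), format_number(1234567.891, user_b))
```

Both users have the same "language"; nothing in that value distinguishes their
date or number preferences.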
You are making broad, general comments without considering carefully enough how things
are really used. To repeat something I said earlier, it would not be a good idea to
design a transaction-processing system that makes assumptions about how to interpret
formatted number or currency strings from a language preference, or even from being
told what locale was set on the originating system; I need to know exactly what
determined the formatting of the string I received. *That* is the level of scenario
analysis that needs to happen before any meaningful statements can be made about what
a "language" or "locale" ID is and how it should be used. It simply is not good
enough to say "people traditionally associate [language] with ... date [etc.]". You
are trying to justify wrong (IMO) conclusions using inadequate analysis.
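The transaction-processing scenario can be sketched the same way. This is only an
illustration under stated assumptions: parse_amount and its parameters are
hypothetical names, and the sketch assumes the sender declares the separators that
actually produced the string instead of leaving the receiver to guess them from a
language or locale ID:

```python
from decimal import Decimal


def parse_amount(text: str, decimal_sep: str, group_sep: str) -> Decimal:
    """Parse a formatted number using separators declared by the sender.

    Hypothetical helper: nothing here is inferred from a language or
    locale identifier.
    """
    if decimal_sep == group_sep:
        raise ValueError("decimal and grouping separators must differ")
    cleaned = text
    if group_sep:
        # Drop grouping separators first so they cannot be mistaken
        # for the decimal separator.
        cleaned = cleaned.replace(group_sep, "")
    cleaned = cleaned.replace(decimal_sep, ".")
    return Decimal(cleaned)


received = "1.234"

# The same string is a different amount under different conventions.
as_comma_decimal = parse_amount(received, decimal_sep=",", group_sep=".")
as_point_decimal = parse_amount(received, decimal_sep=".", group_sep=",")
print(as_comma_decimal, as_point_decimal)
```

The received string is identical in both calls; only an explicit statement of what
determined its formatting tells the receiver which amount was meant.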
Locales in general *do* involve things beyond "language", and it is wrong to put
declarations specifically for such non-linguistic things into an attribute like
xml:lang, and therefore (for instance) entirely unhelpful to refer to RFC 3066 tags as
locale tags, as though there were no difference.
I think 20 years of practice in software design have gotten many people stuck in a
rut, but the fact that people have thought in a given way for twenty years doesn't
make it right or desirable.
Peter Constable