Hi, Doug:

Because scripts are used in so many different ways, and because of look-alike symbols, some sort of classification is unavoidable.
For Latin, you are correct:

> Spanish or Italian? Should we care?

No, we don't care, because the Latin script has already taken care of some of the spoken-language differences for us. So have the Arabic, Cyrillic and Chinese scripts.

However, the problem is a lot more complex for mixed scripts, such as Japanese and Vietnamese. Japanese is not only phonetically different from Chinese, it is grammatically completely different from Chinese too. Even though we are dealing with structured data, the difference is so great that we have to consider treating them differently, hence the use of "language tags" or "script tags". The mixed use of Mongolian and Han is only another example of such a case, which the Chinese group has not raised before this group - they have too much on their hands already :-(

I have used "language tag" instead of "script tag" because:

1) Different languages can use the same script, as in the CJK case.

2) "Language tag" is already in [ISO639], so we don't need to argue about whether Cantonese needs a tag or not. That issue has already been solved by [ISO639].

3) We can use the tags already defined, but the IETF doesn't need to implement every language tag defined in [ISO639]. It is up to engineering consideration to cover all the cases well enough to facilitate communication over DNS.

If this tag issue is raised in IDN down the line, for example regarding the different uses of diacritic marks in French and Dutch, then it is a challenge to the design of the DNS tag coverage we are doing now, since we should have taken care of this issue among Latin users when DNS tag coverage was discussed.

So for the language tags we have to settle at this stage, I would suggest:

    CJK
    Latin
    Cyrillic
    Arabic
    Bengali
    Greek

Although Greek does not necessarily cover a lot of native users, it is familiar to many Latin users, and serves as a good case study for discussion.
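
To make the coverage question a bit more concrete, here is a rough Python sketch of how a label could be sorted into the six tags suggested above. It is only an illustration, not a proposal for how DNS should do it: the names PREFIXES, tag_of and tags_of_label are made up for this example, the matching relies on Unicode character names rather than on any IETF or [ISO639] definition, and lumping kana and Hangul under "CJK" is my own assumption.

```python
import unicodedata

# Hypothetical mapping from the suggested coarse tags to Unicode
# character-name prefixes. Putting kana and Hangul under "CJK" is an
# assumption made for this sketch only.
PREFIXES = {
    "CJK": ("CJK", "HIRAGANA", "KATAKANA", "HANGUL"),
    "Latin": ("LATIN",),
    "Cyrillic": ("CYRILLIC",),
    "Arabic": ("ARABIC",),
    "Bengali": ("BENGALI",),
    "Greek": ("GREEK",),
}


def tag_of(ch):
    """Return the coarse tag for one character, or None if uncovered."""
    name = unicodedata.name(ch, "")  # "" for unnamed code points
    for tag, prefixes in PREFIXES.items():
        if name.startswith(prefixes):
            return tag
    return None


def tags_of_label(label):
    """Return the set of coarse tags used by the letters in a label."""
    tags = {tag_of(ch) for ch in label if ch.isalpha()}
    tags.discard(None)  # letters outside the six tags are ignored here
    return tags
```

With this sketch, tags_of_label("teoma") gives a single tag, Latin, while a label mixing Latin and Cyrillic letters gives two tags, which is exactly the kind of case the classification is meant to expose.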
Liana

On Mon, 26 Nov 2001 11:40:52 EST [EMAIL PROTECTED] writes:
> In a message dated 2001-11-26 0:31:52 Pacific Standard Time,
> [EMAIL PROTECTED] writes:
>
> > Have you thought about " Mixed language URLs "
> > with language tags, for example:
> >
> > www.zh-china/mo-mogolia/zh-county/mybusiness.com
> >
> > shall be able to work?
>
> I thought one of the fundamental characteristics of domain names,
> host names, URLs, etc. is that they were identifiers, not true names,
> and hence they were not intended to be language-tagged.
>
> Just as an example, two popular search engines are teoma.com and
> altavista.com. What language is "Teoma"? Is "Alta Vista" supposed to
> be Spanish or Italian? Should we care?
>
> -Doug Ewell
> Fullerton, California
