N2441
Proposal to add NEGATIVE CIRCLED DIGIT ZERO
Eric Muller
2002-05-04
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2441.doc
N2506
Request for subdivision of work - merge 10646 parts 1 and 2 into
single part as edition 3
Ksar
2002-09-24
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2506.doc
N2512
Proposal
Hmm, one way forward could be to add the 4 letters in question to the
Latin script. There are examples of an analogue to this, namely adding
Latin letters to the Cyrillic script.
Best regards
keld
On Fri, Nov 15, 2002 at 11:17:57AM -0600, [EMAIL PROTECTED] wrote:
One of the Unicode design
A new WG2 document:
N2498 Shaping behaviour of six Syriac letters for Sogdian and Persian
by Michael Everson and Nicholas Sims-Williams
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2498.pdf
Best regards
Michael, Mike and Keld
On Tue, Oct 29, 2002 at 09:07:16PM +0100, Marco Cimarosti wrote:
Kent Karlsson wrote:
Marco,
Keld, please allow me to begin with the end of your post:
I really have not contributed much to this thread, I think you mean
Kent.
Best regards
keld
On Thu, Oct 10, 2002 at 07:14:57AM -0400, Winkler, Arnold F wrote:
Tex,
Here is my recollection:
Sometime around 1991 in an IEEE P1003.1 (POSIX) meeting, Gary Miller (IBM)
was writing on the blackboard. After having spelled out
Internationalization a few times, he first abbreviated it to
On Thu, Oct 03, 2002 at 08:58:47AM -0700, Doug Ewell wrote:
Kenneth Whistler kenw at sybase dot com wrote:
Attempting to extend the system to Greek, Cyrillic, Hebrew, and Arabic
just (in my opinion) results in mnemonics that are harder to remember
than the character names, even. What is
On Wed, Oct 02, 2002 at 02:47:42PM -0400, John Cowan wrote:
Mark Davis scripsit:
Those mnemonics in (http://www.faqs.org/rfcs/rfc1345.html) are pretty
useless in practice, as well as being misnamed. From Websters: assisting or
intended to assist memory. So what about the combination ;S
On Fri, Aug 09, 2002 at 11:44:40PM +0100, Anto'nio Martins-Tuva'lkin wrote:
Hm. But middle dot is not only a letter symbol. It's also used as a
bullet, a tab filling, even a box-drawing char. Shouldn't Unicode
provide a way to separate this duality?
· has traditionally been used e.g. in word
On Mon, Jul 29, 2002 at 04:44:35PM -0700, Kenneth Whistler wrote:
Keld wrote:
In Linux,
*Which* Linux? :-) Caldera OpenLinux, Corel Linux, Debian GNU/Linux,
Elfstone Linux, Libranet Linux, Linux-Mandrake, Phat Linux, Red Hat Linux,
Slackware Linux, Stampede GNU/Linux, Storm Linux, SuSE
On Mon, Jul 29, 2002 at 07:45:18PM -0700, Curtis Clark wrote:
Keld Jørn Simonsen wrote:
I don't think using @ in a new orthography is a good idea.
This was indeed my surmise, and I'm glad to see agreement.
On the other hand, @ is creeping into old orthographies.
For Danish like in netc@fé
On Mon, Jul 29, 2002 at 01:05:49PM -0500, [EMAIL PROTECTED] wrote:
On 07/26/2002 10:46:16 PM Addison Phillips [wM] wrote:
That does leave you with the much less happy problem of finding a platform
with user defined locales (approximately no platforms conveniently do this).
Indeed, a
On Mon, Jul 29, 2002 at 03:18:55PM -0500, [EMAIL PROTECTED] wrote:
On 07/29/2002 03:06:41 PM Keld Jørn Simonsen wrote:
That does leave you with the much less happy problem of finding a
platform with user defined locales (approximately no platforms conveniently do this).
Indeed
On Sat, Jul 27, 2002 at 09:52:25PM -0400, Jungshik Shin wrote:
On Sat, 27 Jul 2002, David Starner wrote:
At 08:46 PM 7/26/02 -0700, Addison Phillips [wM] wrote:
That does leave you with the much less happy problem of finding a platform
with user defined locales (approximately no
On Mon, Jul 29, 2002 at 03:21:03PM -0700, Kenneth Whistler wrote:
It's *much* easier -- and, in the long term, safer -- for them to
select from the extensive inventory of characters available in Unicode and
to avoid using ASCII punctuation characters with redefined word-building
On Mon, Jul 29, 2002 at 03:04:56PM -0700, Addison Phillips [wM] wrote:
Keld wrote:
It's *much* easier -- and, in the long term, safer -- for them to
select from the extensive inventory of characters available in
Unicode and
to avoid using ASCII punctuation characters with redefined
On Thu, Jun 27, 2002 at 11:59:14AM +0200, Lars Marius Garshol wrote:
This list has previously told me that the characters 0x80 - 0x9F in
ISO 8859-1 are a particular set of control characters from ISO 6429.
I also see that the ISO 8859-1 mapping published on unicode.org maps
these characters
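A minimal Python sketch of the mapping being discussed. It assumes Python's built-in `latin-1` codec, which maps every byte 0x00-0xFF to the code point of the same value; the helper name `c1_categories` is hypothetical. It only shows that bytes 0x80-0x9F land on U+0080-U+009F, which Unicode classifies as control characters (general category Cc). It says nothing about which control-function standard gives them their meaning.

```python
import unicodedata


def c1_categories():
    """Decode bytes 0x80-0x9F with the latin-1 codec.

    Returns a dict mapping each byte value to a tuple of
    (code point, Unicode general category).
    """
    result = {}
    for byte in range(0x80, 0xA0):
        ch = bytes([byte]).decode("latin-1")
        result[byte] = (ord(ch), unicodedata.category(ch))
    return result


if __name__ == "__main__":
    for byte, (code_point, category) in c1_categories().items():
        print(f"0x{byte:02X} -> U+{code_point:04X} ({category})")
```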
On Thu, Jun 27, 2002 at 08:03:05AM -0400, Thomas E. Dickey wrote:
On Thu, 27 Jun 2002, Keld Jørn Simonsen wrote:
What people usually use is ISO 6429, this is eg what is used in
IETF charset definitions for the iso-8859 series.
6429 isn't the character-set definition - it's the control
On Thu, Jun 27, 2002 at 03:54:00PM -0400, Thomas Dickey wrote:
On Thu, Jun 27, 2002 at 03:43:30PM +0200, Keld Jørn Simonsen wrote:
On Thu, Jun 27, 2002 at 08:03:05AM -0400, Thomas E. Dickey wrote:
On Thu, 27 Jun 2002, Keld Jørn Simonsen wrote:
What people usually use is ISO 6429
On Wed, Jun 26, 2002 at 12:04:28PM -0400, Tex Texin wrote:
Hi Keld,
The livelink page had a link to proceed to public areas without going
thru the password.
That is how I got to the URL to the zip I mentioned below.
So, we can access the zip on your site now without passwords? If so that
On Wed, Mar 27, 2002 at 11:28:09AM +0200, Theo Veenker wrote:
Suppose I want to enable mnemonic input in my software. Using
mnemonics allows one to write e' (of course embedded in some
escape sequence) instead of \u00e9 or eacute; Which sets of
mnemonics are being used or should I use? I
On Wed, Mar 27, 2002 at 11:59:10AM -0500, Jungshik Shin wrote:
On Wed, 27 Mar 2002, Dan Kogai wrote:
On Wednesday, March 27, 2002, at 11:22 , Jungshik Shin wrote:
IMHO, you're also misusing the term 'charset' here. MIME charset
can be used synonymously with 'encodings' (or
On Tue, Mar 26, 2002 at 03:32:16PM +, Kevin Bracey wrote:
In message p0510100ab8c630b43d7b@[193.120.113.138]
Michael Everson [EMAIL PROTECTED] wrote:
Dear colleagues,
I have written an open letter to the Minister for Finance, Charlie
McCreevy, about the serious problem
On Thu, Mar 14, 2002 at 04:30:59PM +, Michael Everson wrote:
At 07:59 -0800 2002-03-14, Doug Ewell wrote:
Since
that time, three institutions in Norway and Finland helped fund a
small project team to sort out characters needed to complete support
for the Uralic Phonetic Alphabet
On Wed, Mar 06, 2002 at 09:09:27PM +, Michael Everson wrote:
At 12:19 -0800 2002-06-03, Asmus Freytag wrote:
At 01:20 PM 3/6/02 +, Michael Everson wrote:
I am writing with Eudora version 5.1b16 for Mac OS X, using the Mac
Roman character set because Eudora doesn't support Unicode.
On Sat, Mar 02, 2002 at 03:52:55PM -0500, Tex Texin wrote:
Doug says at the bottom of this missive:
But this is all very OT and I'd better stop now, because I know how
quickly this discussion can devolve into Operating System Wars
But the locales group would very much like to see more
On Fri, Mar 01, 2002 at 08:49:27AM -0800, Doug Ewell wrote:
Locale systems that force you to pick one immutable set of conventions
for a given country are broken in general I remember having to tell
MS-DOS that I was in South Africa or someplace, just to get my directory
listing the way I
A related issue is whether to use initials at all. I really
do not like it when people write my name with initials, and
I think it is commonplace in Denmark not to use initials.
A quick glance in the local telephone book shows that
most people do not give just initials there (while some do)
while e.g.
On Thu, Feb 14, 2002 at 03:15:57PM +, David Hopwood wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Keld Jørn Simonsen wrote:
On Thu, Feb 14, 2002 at 03:57:34PM +, Juliusz Chroboczek wrote:
MK What we are trying to establish is the exact meaning that UNICODE
MK ought to have
On Thu, Feb 14, 2002 at 03:57:34PM +, Juliusz Chroboczek wrote:
MK What we are trying to establish is the exact meaning that UNICODE
MK ought to have - that is, if it can have one at all.
In the Unix-like world, the term ``UTF-8'' has been used quite
consistently, and most documentation
On Mon, Jan 21, 2002 at 06:14:55PM +0100, Stefan Persson wrote:
- Original Message -
From: Lars Marius Garshol [EMAIL PROTECTED]
To: Unicoders [EMAIL PROTECTED]
Sent: den 21 januari 2002 15:16
Subject: Re: Norwegian sorting
I doubt that there is an official standard for this,
On Mon, Jan 21, 2002 at 11:11:43AM -0500, Tex Texin wrote:
Thanks Keld, that was one of the sources I checked first.
I saw that it was based on a Norwegian standard, but it didn't say what
the standard was used for. So I didn't know if this was a collation that
dictionaries or phone books
On Mon, Jan 21, 2002 at 07:27:07AM -0500, Tex Texin wrote:
I gave a course in internationalization last week, and one of the slides
I used indicated that in Norwegian u-umlaut sorts with Y between X and
Z. Some Norwegians attending disputed this. I see this is referenced
elsewhere as well and
On Wed, Oct 10, 2001 at 07:53:51PM +0100, Michael Everson wrote:
The Roadmaps to Unicode and ISO/IEC 10646 have been maintained by the
ad-hoc committee on the Roadmap, which consists of Michael Everson,
Rick McGowan, and Ken Whistler. They were hosted on Michael Everson's
site, and this
On Wed, Oct 10, 2001 at 02:25:17PM -0700, Carl W. Brown wrote:
I disagree with ISO's policy of charging for all of the public standards.
I would also rather that they be available for free. Some are.
I actually have some ideas to make all ISO IT-standards available for free,
with the
On Tue, Sep 11, 2001 at 06:27:20PM +0200, Stefan Persson wrote:
- Original Message -
From: Keld Jørn Simonsen [EMAIL PROTECTED]
To: Stefan Persson [EMAIL PROTECTED]
Cc: Mark Davis [EMAIL PROTECTED]; Michael (michka) Kaplan
[EMAIL PROTECTED]; Keld Jørn Simonsen [EMAIL PROTECTED
On Mon, Sep 10, 2001 at 11:09:28AM +0200, Marco Cimarosti wrote:
Asmus Freytag wrote:
But if you do this, all compound words starting with data
and continuing
with another word starting with a will be sorted incorrectly!
To achieve this effect, you would have to mark which AAs are
On Mon, Sep 10, 2001 at 03:58:05PM +0200, Marco Cimarosti wrote:
On Mon, Sep 10, 2001 at 11:09:28AM +0200, Marco Cimarosti wrote:
Asmus Freytag wrote:
But if you do this, all compound words starting with data
and continuing
with another word starting with a will be sorted
Where is this done for Swedish? I have read both the TN and the SIS
standard, and I don't believe these say anything about sorting
ü according to either German or Dutch sounds. Rolf Gavare does not
say anything along these lines either, as far as I can remember.
Kind regards
keld
On Mon, Sep 10, 2001
On Sat, Sep 08, 2001 at 06:38:57PM -0700, Carl W. Brown wrote:
Asmus,
If you are entering Danish city names then enter it as Ålborg. You should
only use Aalborg where the font does not support Å. For matching logic you
can equate Å to Aa then the issue of compound words goes away.
well,
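The matching idea quoted above (equate Å with Aa) can be sketched roughly as follows. This is only an illustration under stated assumptions: `match_key` is a hypothetical helper, it is meant for equality matching rather than collation, and it deliberately treats every Å as Aa without trying to detect compound-word boundaries.

```python
import unicodedata


def match_key(name: str) -> str:
    """Hypothetical matching key that makes 'Ålborg' and 'Aalborg' compare equal.

    The input is first normalized to NFC so that a decomposed
    A + COMBINING RING ABOVE becomes the single character Å.
    """
    composed = unicodedata.normalize("NFC", name)
    folded = composed.replace("Å", "Aa").replace("å", "aa")
    return folded.casefold()


if __name__ == "__main__":
    print(match_key("Ålborg") == match_key("Aalborg"))
```

Because the key is only used for matching, the two spellings can still be stored and displayed exactly as entered.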
On Sun, Sep 09, 2001 at 06:04:30PM +0200, Stefan Persson wrote:
- Original Message -
From: Keld Jørn Simonsen [EMAIL PROTECTED]
To: Carl W. Brown [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: den 9 september 2001 14:21
Subject: Re: [OT] o-circumflex
On Sat, Sep 08, 2001 at 06
On Fri, Jul 13, 2001 at 02:14:25AM +0100, David Starner wrote:
As someone involved in the service I often wish there was some
form of compressed Unicode encoding. The 3-byte penalty that
Ethiopic bears under UTF-8 turns into higher bandwidth that web
hosting services meter and charge for
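To make the size penalty concrete, here is a small Python sketch that counts the encoded bytes of a short Ethiopic string. The helper `encoded_sizes` and the sample word are assumptions chosen for illustration; the codec names are Python's standard ones.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes of text under UTF-8 and UTF-16 (no BOM)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-be")}


# Four Ethiopic syllables (U+12A0 U+121B U+122D U+129B), written as escapes
# so the example does not depend on the encoding of this file.
SAMPLE = "\u12a0\u121b\u122d\u129b"

if __name__ == "__main__":
    print(encoded_sizes(SAMPLE))  # {'utf-8': 12, 'utf-16-be': 8}
```

Each Ethiopic syllable lies between U+0800 and U+FFFF, so it takes three bytes in UTF-8 and two in UTF-16.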
On Wed, May 30, 2001 at 02:31:19AM -0400, [EMAIL PROTECTED] wrote:
In a message dated 2001-05-29 4:28:09 Pacific Daylight Time,
[EMAIL PROTECTED] writes:
The goal is to improve an existing program I wrote which automatically
detects the encoding form of Cyrillic text (8-bit character
On Mon, May 28, 2001 at 12:11:45PM -0400, John Cowan wrote:
[EMAIL PROTECTED] scripsit:
I am trying to build a Unicode-based transliteration table from Cyrillic to
7-bit ASCII and would like to request the assistance of the Unicode list
members.
Note that what you are doing is
On Fri, Apr 13, 2001 at 11:32:16AM -0700, Markus Scherer wrote:
It looks to me like the "Cp" names might be IBM CCSIDs. For those, have a look at
the "ibm-" names in ICU's alias table at
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/data/convrtrs.txt
Note that ICU uses "cp" to mean
On Fri, Apr 13, 2001 at 09:54:25PM +0430, Roozbeh Pournader wrote:
On Thu, 12 Apr 2001, Kenneth Whistler wrote:
"Installing a glibc-2.2 prerelease or release replaces the C library
every program on your system uses. Therefore it has some risks, in
particular your Linux system may
On Thu, Apr 12, 2001 at 10:34:02AM +0100, Neil Shadrach wrote:
What implementations of this are ( freely ) available?
Is there a Perl version around?
( I'm aware of the Java demo )
The ISO 14651 standard is implemented in glibc 2.2
Keld
My email reply to a previous posting on UCS use
in the ISO C and C++ programming languages was unfortunately
bounced because the mailer software thought that I was citing too much.
I think that was not productive, and would like to ask if the
list maintainers would have a look at the parameters
On Fri, Apr 06, 2001 at 02:51:16PM -0400, Dennis L. Goyette wrote:
What would really be nice is for glibc-2.2 or any other Unicode-enabled
library to display Unicode characters, etc. by just using the "escape"
sequence \uXXXX, where X represents a hexadecimal value.
Well, this is
On Fri, Mar 09, 2001 at 10:56:30AM -0800, Yves Arrouye wrote:
Since the U in UTF stands for Unicode, UTF-32 cannot represent more than
what Unicode encodes, which is 1+ million code points. Otherwise, you're
talking about UCS-4. But I thought that one of the latest revs of ISO 10646
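The "1+ million" figure can be checked with a little arithmetic. The sketch below assumes the Unicode code space of 17 planes of 65,536 code points each (U+0000 to U+10FFFF, surrogates included); the names `MAX_CODE_POINT` and `is_unicode_code_point` are hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point of plane 16
PLANES = 17
PLANE_SIZE = 0x10000  # 65,536 code points per plane


def code_point_count() -> int:
    """Number of code points in the range U+0000..U+10FFFF."""
    return MAX_CODE_POINT + 1


def is_unicode_code_point(value: int) -> bool:
    """True if value lies inside the Unicode code space."""
    return 0 <= value <= MAX_CODE_POINT


if __name__ == "__main__":
    print(code_point_count())  # 1114112
    print(code_point_count() == PLANES * PLANE_SIZE)  # True
```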
On Mon, Mar 05, 2001 at 03:26:49PM -0800, Kenneth Whistler wrote:
Mike Sykes asked:
Can anyone tell me whether there is any prospect of terminology being
harmonised or reconciled between Unicode and ISO 10646?
Gradually--over the long run. The Unicode Glossary has already added some
On Wed, Feb 28, 2001 at 01:11:20PM -0800, Frank da Cruz wrote:
The idea behind UTF-8 is to be able to use it in non-Unicode-aware UNIX
versions: It lets you have Unicode filenames, Unicode directory names,
Unicode file contents, Unicode email, etc. But what it does not do is let
you *type*
On Mon, Feb 26, 2001 at 01:02:43PM -0800, Tex Texin wrote:
Perhaps the real question is what are the criteria for including or
excluding a fictional script. I have deleted John's mail, but
his criteria applied more broadly than Klingon if I recall.
Should we worry about elvish communication
On Tue, Feb 20, 2001 at 05:53:47PM -0800, Tex Texin wrote:
Hi,
I am updating my information on Slovak collation. I notice that
Croatian collation is different from what I had formerly been
using as Czech collation. I am not sure if Slovak collation
should be considered the same as
On Sun, Feb 18, 2001 at 01:46:25PM -0800, Carl W. Brown wrote:
I noticed that ISO has created the iso-8859-15 to replace the iso-8859-1
code page with one with the Euro symbol. I cannot find a Greek code page
with the Euro symbol. I think that it is an issue since Greece is one of the
12
The email program I am using, mutt, can do this.
Kind regards
keld Simonsen
On Mon, Feb 12, 2001 at 02:55:41PM -0800, Michael (michka) Kaplan wrote:
What mail program are you using?
Many of them (Exchange, Outlook, etc.) do not support this. Some do not even
support international text in
On Fri, Dec 29, 2000 at 07:23:11PM -0800, Patrick Andries wrote:
However, the questions -- as I see them -- are : should they all speak
only English as a foreign language, why do they learn only one foreign
language (just next to them there are 100 millions native German
speakers...) and
On Thu, Nov 30, 2000 at 05:18:59AM -0800, Brendan Murray/DUB/Lotus wrote:
Branislav Tichy [EMAIL PROTECTED] wrote:
b) there are compound words, which have these sequences on a word border,
and in this case, they stand for two separate graphemes and _are_ sorted
as c+h, d+z a.s.f.
the
On Thu, Nov 30, 2000 at 07:52:37AM -0800, Brendan Murray/DUB/Lotus wrote:
Keld Jørn Simonsen [EMAIL PROTECTED] wrote:
I have no examples off the top of my head of Danish names
where "aa" actually means two a-s, pronounced as two sounds.
I know of at least one - what about "Ha
On Thu, Nov 30, 2000 at 09:22:54AM -0800, Brendan Murray/DUB/Lotus wrote:
Keld Jørn Simonsen [EMAIL PROTECTED] wrote:
Anyway, you may have been fooled by the "g" which may be silent,
or pronounced like a short "u". so it is:
Haa-ge-man
Hå ue man
Nope
On Thu, Nov 30, 2000 at 03:44:00AM -0800, Branislav Tichy wrote:
hello,
this subject (or similar) has probably already been discussed, but let me
ask one more question about it: sequences vs. collating
i have recently read the page http://www.unicode.org/unicode/standard/where/
and i basically
Why don't you just use the notation in ISO/IEC 15897
- the cultural registry - for this, or the Open Group (UNIX)
convention? I think there is no need to reinvent the wheel.
Kind regards
Keld
On Fri, Sep 15, 2000 at 05:56:39PM -0800, Carl W. Brown wrote:
I am working on a new locale proposal.
On Sat, Sep 02, 2000 at 04:56:23PM -0800, Doug Ewell wrote:
David Starner [EMAIL PROTECTED] wrote:
the Euro currency symbol. A separate "fr_FR_Euro" locale would be
fr_FR@EURO is the way Sun does it.
OK, now how does Sun represent my modified "en_US" locale with
2000-09-02 date
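For readers unfamiliar with the `fr_FR@EURO` form, here is a rough Python sketch that splits a locale name of the general POSIX-style shape `language[_territory][.codeset][@modifier]`. The regular expression and the `parse_locale` helper are assumptions for illustration only, not any vendor's actual parser.

```python
import re

# Assumed shape: language[_territory][.codeset][@modifier]
LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_locale(name: str) -> dict:
    """Split a locale name into its parts; absent parts are None."""
    match = LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised locale name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_locale("fr_FR@EURO"))
```

Under this assumed shape, a name such as "fr_FR_Euro" is rejected, because the extra underscore-separated field has no slot of its own.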
On Fri, Sep 01, 2000 at 08:11:02PM -0800, Doug Ewell wrote:
/|/|ike Ayers [EMAIL PROTECTED] wrote:
BTW, I've gotten confused during this thread over the naming of
country codes, etc. There are ISO specs, RFCs, POSIX specs (and
more?)... Is this information conveniently summarized
On Sat, Sep 02, 2000 at 10:50:25AM -0800, Doug Ewell wrote:
Keld Jørn Simonsen [EMAIL PROTECTED] wrote:
The standard for two-letter language codes is ISO 639-1. There is
also an ISO 639-2 (actually, there are two variants) that specifies
three-letter language codes.
Well, ISO 639-1
On Thu, Aug 31, 2000 at 07:25:43AM -0800, Paul Deuter wrote:
Does someone know the full locale string for Norwegian - Bokmal and
Norwegian Nynorsk?
In Windows the LCID is different for the two (0x414 and 0x814 respectively).
However in Internet Explorer - the locale id is set to "no" for
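The two LCIDs quoted above differ only in their sublanguage field. A small Python sketch, assuming the usual Windows layout in which the low 16 bits of an LCID are the language identifier, with the primary language in its low 10 bits and the sublanguage in the 6 bits above them; `split_lcid` is a hypothetical helper name.

```python
def split_lcid(lcid: int) -> tuple:
    """Return (primary language id, sublanguage id) for a Windows LCID."""
    langid = lcid & 0xFFFF      # low 16 bits: language identifier
    primary = langid & 0x3FF    # low 10 bits: primary language
    sublang = langid >> 10      # remaining 6 bits: sublanguage
    return primary, sublang


if __name__ == "__main__":
    for lcid in (0x414, 0x814):
        primary, sublang = split_lcid(lcid)
        print(f"0x{lcid:X}: primary=0x{primary:X}, sublanguage={sublang}")
```

Both values share primary language 0x14, with sublanguage 1 for 0x414 and 2 for 0x814.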
On Thu, Aug 31, 2000 at 08:19:50AM -0800, Lars Marius Garshol wrote:
* Paul Deuter
|
| I was wondering if anyone knew that actual fully qualified string
| for these two. Is it "no-bokmal" and "no-nynorsk"?
If you are looking for the RFC 1766 identification tags, those are
no-bok and
On date and time formatting:
The forthcoming ISO TR 14652 can handle date formats in various ways,
including non-Gregorian calendar systems like Japanese, Chinese, Hebrew
and Arabic calendars.
Check it out at http://www.dkuug.dk/jtc1/sc22/wg20 and see under
the current draft for 14652. There is
There is a new ISO standard coming out for a default collation,
namely ISO 14651, and a Unicode technical report too, which
should be equivalent technically. This should also be applicable
to subsets of 10646, like the one you are indicating (which I
read as 8859-1-ish). Nowadays I would
68 matches