Resent with a non-renegade email address... (^8=
At 14:10 2004-07-09, Jony Rosenne wrote:
I think the problem is with the concept of default in this case. The default
should be the basis for a specific tailoring, and as a last resort for
scripts and letters that do not have specific weights, but
Checking the DIN 5007, it indeed says that letters with diacritics are
sorted with the same primary weight (Section 5.1.1.3) and explicitly lists
in 6.2.3.1.1 "overstrikes" as being diacritics, and gives Å as an example of
that.
Mark
- Original Message -
From: "Markus Scherer" <[EMAIL P
António Martins-Tuválkin wrote:
On 2004.07.12, 15:36, busmanus <[EMAIL PROTECTED]> wrote:
O, yes, and rough transcriptions in brackets do no harm (e.g. at the
first occurrence in the given text), at least if such are available.
This would be (very roughly) something like "Benkó (pron. Benkoh)"
a
The next version of Common Locale Data Repository (v1.2) will be coming out
in mid-October. To manage the work load more effectively in this release,
bug reports or requests for feature enhancements (RFEs) after the start of
September will not be considered for the v1.2 release, except insofar as
t
Mark Davis wrote:
So the question is whether Sybase tailorings, such as German, will be
affected positively or negatively, and to what degree. If a German customer
is accessing a database full of European names, and expects to find Ę with
E, and Ą with A and Ż with Z and Ł with L, then he will be r
On 2004.07.12, 20:55, Doug Ewell <[EMAIL PROTECTED]> wrote:
> Interesting: António's message was encoded in CP1251, but his usual
> CP1252 signature was still inserted at the bottom, and looked quite
> different with all those Cyrillic characters.
Yep, forgot about that. My mailer's developers t
Elaine Keown
Tucson
Hi,
Michael Everson wrote:
>6) the Latin alphabet has a lot more than 26 letters
>in it.
I agree with Michael! And Roman/Latin is a growing
script; it's already the 2nd-most used script in the
world (after Hanzi).
Last year an SIL script expert told me th
Jony Rosenne wrote:
For example, an Israeli-oriented tailoring would cause Hebrew to sort first,
Arabic, Latin and Cyrillic to follow in whatever order is desired by the
user, and other scripts would follow in the default ordering. I am not sure
that the current default makes this task possible or e
António Martins-Tuválkin wrote:
> «Soviet official Хрущёв (Qruxëv, pron. Hrueshawf, a.s.a. Krushchov
> etc.) used his shoe in a quite unexpected manner this morning»
Interesting: António's message was encoded in CP1251, but his usual
CP1252 signature was still inserted at the bottom, and looke
Title: RE: Problems Reading Saved Files With Unicode Names
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Dominikus Scherkl (MGW)
> Sent: Monday, July 12, 2004 9:51 AM
> > Because the filename was automatically converted with "?"
> > characters, it is considered an inva
On 2004.07.12, 15:36, busmanus <[EMAIL PROTECTED]> wrote:
> O, yes, and rough transcriptions in brackets do no harm (e.g. at the
> first occurrence in the given text), at least if such are available.
> This would be (very roughly) something like "Benkу (pron. Benkoh)"
> and "Benko (pron. Benkur)"
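The Cyrillic "у" in the quoted "Benkу" is what one would expect if a CP1252 "ó" were read as CP1251, which fits the encoding mix-up mentioned in this thread. A minimal Python sketch, assuming the original text was "Benkó" encoded as CP1252 (the variable names are illustrative only):

```python
# Assumption: the text was written as CP1252 and later decoded as CP1251.
original = "Benkó"
raw = original.encode("cp1252")      # ó becomes the single byte 0xF3
misread = raw.decode("cp1251")       # in CP1251, 0xF3 is Cyrillic small letter u
print(misread)

# Undoing the wrong decode recovers the original text.
repaired = misread.encode("cp1251").decode("cp1252")
print(repaired)
```

The round trip only works for bytes that are defined in both code pages; characters such as ő have no CP1252 byte at all, so they cannot be recovered this way.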
The native character set for Mac OS X is Unicode. Earlier versions of
Mac OS used Apple-proprietary character sets, and some applications
still use those character sets on Mac OS X, though their use is
deprecated.
The mappings for Apple's old character sets are available at:
http://www.unicode.
Title: RE: User Expectations for collation (was Re: Looking for transcription or transliteration standards latin->arabic)
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Mark Davis
> Sent: Monday, July 12, 2004 9:21 AM
> These provide good examples. It would be interestin
I missed Mark's change in subject - so I replied to Marcin's message right
now under the old subject line:
- Original Message -
From: "Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, July 10, 2004 01:02
Subject: Re: Looking for transcription or trans
At 01:02 AM 7/10/2004, Marcin 'Qrczak' Kowalczyk wrote:
But there are cases when I would prefer to fold Polish diacritics in
searches.
It's basically every case when you are not sure that all stored data is
using diacritics,
Or when you are unsure how it is spelled, for example, looking up a
perso
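Folding diacritics for a search, as described in the message above, could be sketched roughly as follows in Python. This is one possible approach rather than anything proposed in the thread: it decomposes the text (NFD) and drops combining marks. The helper names `fold_diacritics` and `matches` are hypothetical.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches(query: str, stored: str) -> bool:
    # Compare the folded, case-insensitive forms of both strings.
    return fold_diacritics(query).casefold() == fold_diacritics(stored).casefold()

print(fold_diacritics("Górski"))
print(matches("gorski", "Górski"))
```

Letters such as ł have no canonical decomposition, so this sketch leaves them unchanged; a real search would need an extra mapping for them or a collation-based comparison.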
William Tay wrote:
> I'd like to understand what character encoding an application that
> runs on MacOS uses. Just as Windows applications generally use code
> pages and UNIX applications use ISO-8859-X character set, what about
> MacOS applications?
>
> Is there any website that shows the encodi
> Because the filename was automatically converted with "?"
> characters, it is considered an invalid file name and I
> can't open this.
How about opening them on a system without this problem (like Un*x)?
But be sure to refer to the file with quotes, something like
cp CDROM/'my??file' Windows/D/my_
>I am positive that all of my
tailorings for Sybase will be *affected*, for example. I don't think
they will be *substantially* affected, in the sense of any complete
redefinition of how the tailoring itself is defined.
These provide good examples. It would be interesting to see, of the people
on the [EMAIL PROTECTED] list, how many non-Poles would expect to find the
following orders:
Ab < Ąb < Ac
Eb < Ęb < Ec
Ob < Ób < Oc
Ce < Će < Cy
Ne < Ńe < Ny
Sa < Śa < Sy
Za < Źa < Zy
Za < Ża < Zy
and either (a) or (b):
> John [Cowan]'s list is not "a few characters".
Let's take Latin, for starters. There are 1870 entries in the UCA for Latin.
If you subtract from John's list the ones that are already interleaved -- as
I did in my email -- then you get 78 values, or about 4%.
I'll repeat that list again below, s
Nicholas,
1. Use CD ripping software to create an ISO image of the CD-ROM. Roxio's Easy CD
Creator, Nero, etc. can all do this.
Either:
2a. Use a quality ISO image editor tool to rename the files. You'll have to do a bit
of research to find one but they do exist - it's just been a while sinc
Hi,
I think I have a problem that is related to Unicode translation. I have
some files with filenames saved in Unicode with special characters. This is
fine as I can open them. The problem began when I had to reconfigure my
computer system and backed up all my files. To do this, I backed it up t
Hi,
I'd like to understand what character encoding an application that runs
on MacOS uses. Just as Windows applications generally use code pages
and UNIX applications use ISO-8859-X character set, what about MacOS
applications?
Is there any website that shows the encoding of characters
busmanus wrote:
Mike Ayers wrote:
Interesting case, and one reason why diacritic stripping,
although brutal, may be desirable - it doesn't pretend to be accurate.
An even funnier example than Törőcsik's name would be
Benkó /bɛnkoː/ and Benkő /bɛnkøː/, two famous musicians of
Hungary.
Alain LaBonté wrote:
It would be much better to make sorting, matching and searching
consistent with tailored tables of either the UCA or ISO/IEC 14651.
Unfortunately that is not what happens in most products, except in
some good search engines (Google, Altavista and the like, which are
s