Alain LaBonté wrote:
It would be much better to make sorting, matching and searching
consistent with tailored tables of either the UCA or ISO/IEC 14651.
Unfortunately that is not what happens in most products, except in
some good search engines (Google, Altavista and the like, which are
busmanus wrote:
Mike Ayers wrote:
Interesting case, and one reason why diacritic stripping,
although brutal, may be desirable - it doesn't pretend to be accurate.
An even funnier example than Trcsik's name, would be
Benk /bnko/ and Benk /bnk/, two famous musicians of
Hungary.
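A minimal sketch of the "brutal" stripping discussed above, assuming Python and its standard `unicodedata` module. The function name `strip_diacritics` and the sample names are illustrative, not taken from the thread. It decomposes each character (NFD) and drops the combining marks, so two names that differ only in an accent collapse to the same string:

```python
import unicodedata

def strip_diacritics(text):
    """Return text with combining marks removed after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks such as acute accents
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Illustrative inputs: both spellings fold to the same ASCII form.
print(strip_diacritics("Benkő"), strip_diacritics("Benkó"))
```

Note that letters with no canonical decomposition, such as Polish ł, pass through unchanged, so this approach alone does not fold every letter a user might think of as "a letter with a diacritic".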
Hi,
I'd like to understand what character encoding an application that runs
on MacOS uses. Just as Windows applications generally use code pages
and UNIX applications use ISO-8859-X character set, what about MacOS
applications?
Is there any website that shows the encoding of characters of the
Hi,
I think I have a problem that is related to unicode translation. I have
some files with filenames saved in unicode with special characters. This is
fine as I can open it. The problem began when I had to reconfigure my
computer system and backed up all my files. To do this, I backed it up
Nicholas,
1. Use CD ripping software to create an ISO image of the CDROM. Roxio's Easy CD
Creator, Nero, etc - all can do this.
Either:
2a. Use a quality ISO image editor tool to rename the files. You'll have to do a bit
of research to find one but they do exist - it's just been a while
John [Cowan]'s list is not a few characters.
Let's take Latin, for starters. There are 1870 entries in the UCA for Latin.
If you subtract from John's list the ones that are already interleaved -- as
I did in my email -- then you get 78 values, or about 4%.
I'll repeat that list again below,
These provide good examples. It would be interesting to see, of the people
on the [EMAIL PROTECTED] list, how many non-Poles would expect to find the
following orders:
Ab Ąb Ac
Eb Ęb Ec
Ob Ób Oc
Ce Će Cy
Ne Ńe Ny
Sa Śa Sy
Za Źa Zy
Za Ża Zy
and either (a) or (b):
a) La Ła Ly//
I am positive that all of my
tailorings for Sybase will be *affected*, for example. I don't think
they will be *substantially* affected, in the sense of any complete
redefinition of how the tailoring itself is defined.
Because the filename was automatically converted with ?
characters, it is considered an invalid file name and I
can't open this.
How about opening them on a system without this problem (like Un*x)?
But be sure to refer to the file with quotes. Something like
cp CDROM/'my??file'
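As a rough sketch of the same idea without a shell, assuming Python on a system that accepts `?` in filenames: the helper below copies every file out of a directory and replaces each literal `?` in the name with a placeholder. The function name `copy_with_safe_names` and the placeholder `_` are hypothetical choices for illustration, and the sketch does not guard against two names colliding after replacement.

```python
import shutil
from pathlib import Path

def copy_with_safe_names(src_dir, dst_dir, placeholder="_"):
    """Copy regular files from src_dir to dst_dir, replacing '?' in each name.

    Returns a dict mapping each original name to the name it was copied under.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = {}
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            # No shell is involved, so '?' is an ordinary character here,
            # not a wildcard that needs quoting.
            safe = entry.name.replace("?", placeholder)
            shutil.copy2(entry, dst / safe)
            copied[entry.name] = safe
    return copied
```

Because the names are handled as plain strings, there is nothing to quote; the quoting in the shell command above is only needed to stop the shell from treating `?` as a wildcard.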
William Tay wrote:
I'd like to understand what character encoding an application that
runs on MacOS uses. Just as Windows applications generally use code
pages and UNIX applications use ISO-8859-X character set, what about
MacOS applications?
Is there any website that shows the encoding of
At 01:02 AM 7/10/2004, Marcin 'Qrczak' Kowalczyk wrote:
But there are cases when I would prefer to fold Polish diacritics in
searches.
It's basically every case when you are not sure that all stored data is
using diacritics,
Or when you are unsure how it is spelled, for example, looking up a
I missed Mark's change in subject - so I replied to Marcin's message right
now under the old subject line:
- Original Message -
From: Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, July 10, 2004 01:02
Subject: Re: Looking for transcription or
Title: RE: User Expectations for collation (was Re: Looking for transcription or transliteration standards latin-arabic)
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Davis
Sent: Monday, July 12, 2004 9:21 AM
These provide good examples. It would be interesting to
The native character set for Mac OS X is Unicode. Earlier versions of
Mac OS used Apple-proprietary character sets, and some applications
still use those character sets on Mac OS X, though their use is
deprecated.
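To illustrate why the legacy encoding matters, here is a small sketch assuming Python, whose standard library ships a `mac_roman` codec for the classic Mac OS Roman character set. The helper name `decode_legacy_mac` is hypothetical. The same byte decodes to different characters depending on which legacy character set is assumed:

```python
def decode_legacy_mac(data, codec="mac_roman"):
    """Decode bytes written by a classic Mac OS application into a Unicode str."""
    return data.decode(codec)

raw = b"caf\x8e"
# Byte 0x8E is 'é' in Mac OS Roman but 'Ž' in Windows code page 1252.
print(decode_legacy_mac(raw))   # decoded as Mac OS Roman
print(raw.decode("cp1252"))     # the same bytes misread as CP1252
```

Python also provides codecs for several other Apple character sets (for example `mac_cyrillic` and `mac_greek`), which can be passed as the `codec` argument.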
The mappings for Apple's old character sets are available at:
Title: RE: Problems Reading Saved Files With Unicode Names
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Dominikus Scherkl (MGW)
Sent: Monday, July 12, 2004 9:51 AM
Because the filename was automatically converted with ?
characters, it is considered an invalid file
António Martins-Tuválkin (antonio at tuvalkin dot web dot pt) wrote:
Soviet official (Qruxv, pron. Hrueshawf, a.s.a. Krushchov
etc.) used his shoe in a quite unexpected manner this morning
Interesting: António's message was encoded in CP1251, but his usual
CP1252 signature was still inserted at
Jony Rosenne wrote:
For example, an Israeli-oriented tailoring would cause Hebrew to sort first,
Arabic, Latin and Cyrillic to follow in whatever order is desired by the
user, and other scripts would follow in the default ordering. I am not sure
that the current default makes this task possible or
Mark Davis wrote:
So the question is whether Sybase tailorings, such as German, will be
affected positively or negatively, and to what degree. If a German customer
is accessing a database full of European names, and expects to find with
E, and with A and with Z and with L, then he will be
The next version of Common Locale Data Repository (v1.2) will be coming out
in mid-October. To manage the work load more effectively in this release,
bug reports or requests for feature enhancements (RFEs) after the start of
September will not be considered for the v1.2 release, except insofar as
Anto'nio Martins-Tuva'lkin wrote:
On 2004.07.12, 15:36, busmanus [EMAIL PROTECTED] wrote:
O, yes, and rough transcriptions in brackets do no harm (e.g. at the
first occurrence in the given text), at least if such are available.
This would be (very roughly) something like Benk (pron. Benkoh)
and
Checking the DIN 5007, it indeed says that letters with diacritics are
sorted with the same primary weight (Section 5.1.1.3) and explicitly lists
in 6.2.3.1.1 overstrikes as being diacritics, and gives as an example of
that.
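A rough sketch of what "same primary weight" means in practice, assuming Python. This is not an implementation of DIN 5007 or of the UCA: the names `OVERSTRIKE_BASE`, `primary_key` and `sort_names` are hypothetical, the overstrike table is a small illustrative sample, and the tie-break by code point is a simplification. The primary key ignores accents and overstrikes; the original spelling is consulted only when two primary keys are equal:

```python
import unicodedata

# Overstruck letters have no canonical decomposition, so they need an
# explicit mapping to their base letter (illustrative sample only).
OVERSTRIKE_BASE = str.maketrans({
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
    "đ": "d", "Đ": "D",
})

def primary_key(text):
    """Base letters only: overstrikes mapped, combining marks dropped, case folded."""
    folded = text.translate(OVERSTRIKE_BASE)
    decomposed = unicodedata.normalize("NFD", folded)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()

def sort_names(names):
    """Sort by primary key first; fall back to the raw string to break ties."""
    return sorted(names, key=lambda name: (primary_key(name), name))

print(sort_names(["Lz", "Ła", "La", "Ly"]))
```

With this key, a name starting with an overstruck letter is interleaved with names starting with its base letter instead of being pushed after the rest of the alphabet.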
Mark
- Original Message -
From: Markus Scherer [EMAIL
Resent with a non-renegade email address... (^8=
At 14:10 2004-07-09, Jony Rosenne wrote:
I think the problem is with the concept of default in this case. The default
should be the basis for a specific tailoring, and as a last resort for
scripts and letters that do not have specific weights,