Re: Unicode Normalization on Mac OS X (HFS+ filesystem)

2008-06-07 Thread Chet Ramey

Stephan Kleisinger wrote:

Configuration Information [Automatically generated, do not change]:
Machine: i386
OS: darwin9.3.0
Compiler: /usr/bin/gcc-4.0
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i386' 
-DCONF_OSTYPE='darwin9.3.0' -DCONF_MACHTYPE='i386-apple-darwin9.3.0' 
-DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale' 
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX   -I.  -I. -I./include 
-I./lib  -I/opt/local/include -O2
uname output: Darwin cicero.lan 9.3.0 Darwin Kernel Version 9.3.0: Fri 
May 23 00:49:16 PDT 2008; root:xnu-1228.5.18~1/RELEASE_I386 i386

Machine Type: i386-apple-darwin9.3.0

Bash Version: 3.2
Patch Level: 39
Release Status: release

Description:
The Mac OS X filesystem HFS+ reports filenames in Unicode NFD
normalization form. As arguments (e.g. to open()), all normalization
forms are accepted.
Input to bash from Terminal.app is usually in NFC.
This mismatch causes problems.
Example: the German directory name Bücher (Buecher/Books):
1. Bash completion does not work
   BüTAB - Nothing
   BuTAB - Works
   (globbing behaves the same way: Bü* fails, Bu* works)
2. If \w is included in $PS1, the display length is calculated wrongly,
   so the display is disrupted when the arrow keys are used to recall
   history
3. When an argument is completed (BuTAB - Bücher), the argument
   is in NFD. Deleting the argument leaves the cursor in the wrong
   position


The problem is that there is no way using standard interfaces to
distinguish or convert between the two forms on Mac OS X, at least
none that I have found (and I've been looking at this for some time).
I'm not particularly interested in using Mac OS-specific APIs; I
have no experience with them.
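
To make the two missing operations concrete (distinguishing the forms and
converting between them), here is a minimal sketch in Python. It relies on
Python's unicodedata module, which is not an interface available to bash's
C code; the function name normalization_form is hypothetical and exists
only for this illustration.

```python
import unicodedata

def normalization_form(s):
    """Report which normalization form(s) the string s is already in."""
    nfc = unicodedata.normalize("NFC", s)  # composed form
    nfd = unicodedata.normalize("NFD", s)  # decomposed form
    if s == nfc and s == nfd:
        return "both"      # e.g. pure ASCII, nothing to compose or decompose
    if s == nfc:
        return "NFC"
    if s == nfd:
        return "NFD"
    return "neither"       # mixed composed and decomposed sequences

print(normalization_form("Bucher"))        # both
print(normalization_form("B\u00fccher"))   # NFC
print(normalization_form("Bu\u0308cher"))  # NFD
```

Converting is then a single call, unicodedata.normalize("NFC", name); the
open question in this thread is how to get the equivalent from C using only
standard interfaces.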

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer

Chet Ramey, ITS, CWRU    [EMAIL PROTECTED]    http://cnswww.cns.cwru.edu/~chet/




Unicode Normalization on Mac OS X (HFS+ filesystem)

2008-06-06 Thread Stephan Kleisinger

Configuration Information [Automatically generated, do not change]:
Machine: i386
OS: darwin9.3.0
Compiler: /usr/bin/gcc-4.0
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i386'
-DCONF_OSTYPE='darwin9.3.0' -DCONF_MACHTYPE='i386-apple-darwin9.3.0'
-DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale'
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX   -I.  -I. -I./include
-I./lib  -I/opt/local/include -O2
uname output: Darwin cicero.lan 9.3.0 Darwin Kernel Version 9.3.0: Fri  
May 23 00:49:16 PDT 2008; root:xnu-1228.5.18~1/RELEASE_I386 i386

Machine Type: i386-apple-darwin9.3.0

Bash Version: 3.2
Patch Level: 39
Release Status: release

Description:
The Mac OS X filesystem HFS+ reports filenames in Unicode NFD
normalization form. As arguments (e.g. to open()), all normalization
forms are accepted.
Input to bash from Terminal.app is usually in NFC.
This mismatch causes problems.
Example: the German directory name Bücher (Buecher/Books):
1. Bash completion does not work
   BüTAB - Nothing
   BuTAB - Works
   (globbing behaves the same way: Bü* fails, Bu* works)
2. If \w is included in $PS1, the display length is calculated wrongly,
   so the display is disrupted when the arrow keys are used to recall
   history
3. When an argument is completed (BuTAB - Bücher), the argument
   is in NFD. Deleting the argument leaves the cursor in the wrong
   position
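
The mismatch described above can be reproduced outside the shell. The
following is a small sketch in Python, not bash's own code; the helper
hexbytes and the variable names are made up for the illustration. It
shows that the two forms of "Bücher" are different byte sequences, so a
plain prefix comparison of the typed text against the name on disk fails.

```python
import unicodedata

def hexbytes(s):
    """UTF-8 encoding of s as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in s.encode("utf-8"))

# As typed in Terminal.app: NFC, with precomposed U+00FC.
typed = "B\u00fccher"
# As reported by HFS+: NFD, "u" followed by combining diaeresis U+0308.
on_disk = unicodedata.normalize("NFD", typed)

print(hexbytes(typed))    # 42 c3 bc 63 68 65 72
print(hexbytes(on_disk))  # 42 75 cc 88 63 68 65 72

# A prefix test against the on-disk name fails for the typed "Bü"
# but succeeds for plain "Bu".
print(on_disk.startswith("B\u00fc"))  # False
print(on_disk.startswith("Bu"))       # True
```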

Repeat-By:
Use the Standard Terminal.app

[EMAIL PROTECTED]:/tmp $ echo Bücher | hd
  42 c3 bc 63 68 65 72 0a      |Bücher.|     -NFC
0008
[EMAIL PROTECTED]:/tmp $ mkdir Bücher | hd
[EMAIL PROTECTED]:/tmp $ ls -d B* | hd
  42 75 cc 88 63 68 65 72 0a   |Bu?.cher.|   -NFD
0009
[EMAIL PROTECTED]:/tmp $ cd Bü*
bash: cd: Bü*: No such file or directory
[EMAIL PROTECTED]:/tmp $ cd Bu*
[EMAIL PROTECTED]:/tmp/Bücher $
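
The glob behaviour in this transcript can be imitated with Python's
fnmatch module. This is only a sketch of the idea that normalizing both
sides to one form before matching would make the typed pattern work; it
is not how bash implements globbing, and the helper nfc is hypothetical.

```python
import fnmatch
import unicodedata

def nfc(s):
    """Return s in composed (NFC) form."""
    return unicodedata.normalize("NFC", s)

name_on_disk = unicodedata.normalize("NFD", "B\u00fccher")  # as listed by HFS+
pattern_typed = "B\u00fc*"                                   # as typed, NFC

# NFC pattern against the raw NFD name: no match, like "cd Bü*" above.
print(fnmatch.fnmatchcase(name_on_disk, pattern_typed))            # False
# With both sides normalized to NFC, the same pattern matches.
print(fnmatch.fnmatchcase(nfc(name_on_disk), nfc(pattern_typed)))  # True
# "Bu*" matches the raw name because NFD really starts with "B", "u".
print(fnmatch.fnmatchcase(name_on_disk, "Bu*"))                    # True
```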

using the history:
[EMAIL PROTECTED]:/tmp/Bücher $
[EMAIL PROTECTED]:/tmp/Büchrenampstree -s Ter
[EMAIL PROTECTED]:/tmp/Bücher $ cd ..

using backspace to delete the completed filename:
cd BuTAB Backspace
- the cursor is one character to the left of its intended position
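
An off-by-one of this kind is consistent with the completed NFD name
having one more code point than it occupies in screen columns. The
sketch below, in Python, illustrates that difference; it is an
assumption about the cause, not a diagnosis of the readline code, and
display_columns is a hypothetical helper that ignores wide and control
characters.

```python
import unicodedata

def display_columns(s):
    """Count screen columns, treating combining marks as zero width."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

# The completed argument, in NFD as returned by the filesystem.
completed = unicodedata.normalize("NFD", "B\u00fccher")

print(len(completed))              # 7 (code points, including U+0308)
print(display_columns(completed))  # 6 (columns actually shown)
```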




