Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-23 Thread Zoran Vasiljevic
On Friday 22 November 2002 16:38, you wrote: Zoran, Here's a reproducible example of what I'm talking about:
wats:nscp 75 encoding system iso8859-1
wats:nscp 76 set u ¾ÆÆ®¹Ìµð¾î ¾ÆÆ®¹Ìµð¾î
wats:nscp 77 set u ¾ÆÆ®¹Ìµð¾î
wats:nscp 78 regexp {^(.*)$} $u junk m

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-23 Thread Dossy
On 2002.11.23, Zoran Vasiljevic [EMAIL PROTECTED] wrote: Using my encoding-aware nscp: [...] $m is the same as $u Very cool! Is there a reason why we wouldn't want your changes to nscp checked into CVS? Any idea what I'm doing wrong? As already posted to the list: the nscp trashes

[AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Dossy
(The following is a message I sent to Zoran off-list, but I figured folks from the list might already know the answer, so I'm sending it to the list as well.) Zoran, Here's a reproducible example of what I'm talking about:
wats:nscp 75 encoding system iso8859-1
wats:nscp 76 set u

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Dossy
Another interesting behavior:
wats:nscp 20 encoding system utf-8
wats:nscp 21 set u ¾ÆÆ®¹Ìµð¾î
wats:nscp 22 set m ¾ÆƮ¹̵ð¾î
wats:nscp 23 string compare $u $m 0
Not what I would have expected. -- Dossy
On 2002.11.22, Dossy [EMAIL PROTECTED] wrote: (The

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Rob Mayoff
+-- On Nov 22, Dossy said: Any idea what I'm doing wrong? You're typing iso8859-1 into nscp. nscp doesn't use a Tcl channel for input, so it does no charset translation on that input. Hence the system encoding is irrelevant. You must only send UTF-8 to nscp, and you'll only get UTF-8
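The byte-level difference behind Rob's point can be seen in a plain tclsh session, outside nscp. This is only a minimal sketch: the sample character (U+00E9) and the variable names are assumptions chosen for illustration and do not come from the thread.

```tcl
# Hypothetical sample: the single character U+00E9 (e-acute).
set ch "\u00e9"

# Bytes a UTF-8 client would send for that character.
set utf8Bytes [encoding convertto utf-8 $ch]

# Bytes an iso8859-1 client would send for the same character.
set latin1Bytes [encoding convertto iso8859-1 $ch]

# Dump both byte sequences as hex to compare them.
binary scan $utf8Bytes H* utf8Hex
binary scan $latin1Bytes H* latin1Hex
puts "utf-8: $utf8Hex  iso8859-1: $latin1Hex"
```

The two encodings produce different byte sequences for the same character, so a reader that performs no translation has to be given the bytes it actually expects.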

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Dossy
On 2002.11.22, Rob Mayoff [EMAIL PROTECTED] wrote: +-- On Nov 22, Dossy said: Any idea what I'm doing wrong? You're typing iso8859-1 into nscp. nscp doesn't use a Tcl channel for input, so it does no charset translation on that input. Hence the system encoding is irrelevant. You

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Rob Mayoff
+-- On Nov 22, Dossy said: This doesn't make sense. How do you explain this: [deletia] $u is getting set to what I'd expect it to, but $m isn't. Tcl stores strings internally in UTF-8. Sometimes it converts strings to UCS-2 (16-bit characters), for example to do regexp matching, and

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Zoran Vasiljevic
On Friday 22 November 2002 16:38, you wrote: Any idea what I'm doing wrong? I will double-check this here but I have to agree with Rob. The nscp channel is NOT encoding-aware. You should not interpret (test/make_conclusion/etc) based on typing into the nscp alone. I have an encoding-aware

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Jim Davidson
In a message dated 11/22/2002 11:26:08 AM Eastern Standard Time, [EMAIL PROTECTED] writes: Any idea what I'm doing wrong? I will double-check this here but I have to agree with Rob. The nscp channel is NOT encoding-aware. You should not interpret (test/make_conclusion/etc) based on typing into

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Jim Davidson
In a message dated 11/22/2002 11:22:20 AM Eastern Standard Time, [EMAIL PROTECTED] writes: BTW, this is exactly the same problem that I described in http://dqd.com/~mayoff/encoding-doc.html two years ago. ...which, btw, is the guide I used to add encoding support to AOLserver 3.4 and 4.0. It's a

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread No Name
In a message dated 11/22/02 11:36:17 AM Eastern Standard Time, [EMAIL PROTECTED] writes: Agree - it's nscp's very simple read code which evals strings directly without converting to utf8. I suppose we could assume latin1 input and convert to utf8 or perhaps provide a command to set the nscp

Re: [AOLSERVER] high ASCII in regexp (AOLserver 3.5.1 tcl8.4.1)

2002-11-22 Thread Jeff Hobbs
The nscp channel is NOT encoding-aware. You should not Agree - it's nscp's very simple read code which evals strings directly without converting to utf8. I suppose we could assume latin1 input and convert to utf8 or perhaps provide a command to set the nscp encoding. Tcl has APIs for this
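At the script level, the "assume latin1 input and convert" idea can be sketched with Tcl's standard encoding command. The helper name decodeLine and its default encoding are hypothetical; this is not nscp's code or any fix proposed in the thread, only an illustration of decoding raw input bytes before using them as a Tcl string.

```tcl
# Hypothetical helper (not part of AOLserver or nscp): decode raw
# input bytes using an assumed source encoding, iso8859-1 by default.
proc decodeLine {rawBytes {enc iso8859-1}} {
    return [encoding convertfrom $enc $rawBytes]
}

# Hypothetical input: the single byte 0xE9, as a Latin-1 terminal
# would send it for e-acute.
set raw [binary format H* e9]
set line [decodeLine $raw]

puts [string length $line]          ;# 1
puts [string equal $line "\u00e9"]  ;# 1
```

The decoded value is a one-character Tcl string, so commands such as regexp and string compare operate on the character rather than on untranslated bytes.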