I would maintain that, according to the ANSI standard, the output of the
program on the *nix systems is wrong. The standard defines the length function
as "Length returns a count of the number of characters in the argument." So,
independent of character set, the result should be 1. I only skimmed the
standard, and the only qualifier I saw was that a character must be a multiple
of 8 bits. (I've worked on machines that use 9-bit bytes, so even this
restriction would break Rexx on some machines.)

Of course, I realize that length has always been used to determine the number
of bytes, so changing it to conform to the standard would break just about
every program.

I guess I'm hoping that someone can point to a place in the standard that says
characters are exactly 8 bits long.
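
To make the character-versus-byte distinction concrete, here is a minimal
sketch in Python (not Rexx; the helper name describe is made up for
illustration). It assumes the two encodings discussed in this thread, codepage
1250 and UTF-8, and reports how many characters and how many bytes the same
"ß" occupies in each:

```python
# Illustration only, not ooRexx: compare the character count of a string
# with the number of bytes it occupies in a given encoding.
# "describe" is a hypothetical helper name chosen for this sketch.

def describe(s, encoding):
    """Return (character count, byte count, uppercase hex of the bytes)."""
    raw = s.encode(encoding)
    return len(s), len(raw), raw.hex().upper()

text = "\u00df"  # LATIN SMALL LETTER SHARP S

for enc in ("cp1250", "utf-8"):
    chars, nbytes, hexval = describe(text, enc)
    print(f"{enc}: characters={chars} bytes={nbytes} hex={hexval}")
```

Under these assumptions the character count stays at 1 in both cases, while
the byte count and hex value match the two ooRexx results quoted below (DF
versus C39F).
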
Bruce
On Feb 15, 2011, at 3:40 AM, Rony G. Flatscher wrote:
> Hi there,
>
> Having stumbled over a surprising length of two for the single German
> character "ß" on Mac OS X, I looked into the situation on Ubuntu Linux.
>
> It turns out that the console on both of these systems returns non-English
> characters encoded as UTF-8.
>
> As a consequence, the single German character "sharp s" ("ß"), for example,
> is encoded in UTF-8 as 16 bits (two bytes) with the hexadecimal value
> "C39F"x (cf.
> <http://www.fileformat.info/info/unicode/char/df/index.htm>).
>
> Running the following Rexx program on Windows (codepage 1250) in a shell:
> parse version v
> say v
> a='ß'
> say a "length:" a~length
> say a "a~c2x: " a~c2x
>
> yields:
> REXX-ooRexx_4.1.0(MT) 6.03 23 Aug 2010
> ß length: 1
> ß a~c2x: DF
>
> Running the same ooRexx program under Ubuntu (and MacOSX) yields:
> REXX-ooRexx_4.1.0(MT) 6.03 23 Aug 2010
> ß length: 2
> ß a~c2x: C39F
>
> Note that the single character "ß" suddenly has a length of "2" instead of
> "1", as under Windows (and as has been the case for the past 30 years).
>
> This is a totally different result, one that suddenly makes Rexx programs
> behave inconsistently across platforms. It will break quite a few Rexx
> programs in countries that have needed non-English characters for the past
> 30 years (practically everyone living in a country where English is not the
> main or only language, i.e. everyone outside of the US, GB, Australia)! So
> non-English Rexx programs running on ooRexx on those systems will most
> likely break, possibly in a very subtle manner.
>
> Although this problem has been known for some time, it is now materializing
> on non-Windows platforms and needs to be addressed ASAP, IMHO!
>
> ---rony
>
> P.S.: Out of curiosity, I wrote the following NetRexx program, which the
> NetRexx compiler translates to Java (Java uses UTF-16, where "ß" is
> represented by 16 bits with a value of "00df"x), and ran it under Windows
> with the codepage set to 1250:
> parse version v
> say v
> a="ß"
> say a "length:" a.length
> say a "a.c2x: " a.c2x
>
> output:
> E:\java\scriptJars>java test
> NetRexx 2.05 14 Jan 2005
> ß length: 1
> ß a.c2x: DF
>
> Oorexx-devel mailing list
> Oorexx-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oorexx-devel