CVTUTF.C bug question

2000-08-23 Thread Oliver Steinau

I have a question concerning the CVTUTF.C file that is on the CD in the
Unicode 3.0 book. There's a piece of code which I don't think is correct...

Function ConvertUTF8toUTF16 contains the following piece of code (line 210):
[ch is the result of the conversion so far]:

    if (ch <= kMaximumUCS2) {
        *target++ = ch;
    } else if (ch > kMaximumUCS4) {
        *target++ = kReplacementCharacter;
    } else {
        if (target + 1 >= targetEnd) {
            result = targetExhausted; break;
        };
        ch -= halfBase;
        *target++ = (ch >> halfShift) + kSurrogateHighStart;
        *target++ = (ch & halfMask) + kSurrogateLowStart;
    };

with kMaximumUCS2 = 0xffff and kMaximumUCS4 = 0x7fffffff.

Shouldn't the first comparison read "if (ch <= kMaximumUTF16)..."?
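
For reference, here is a minimal self-contained sketch of the three-way branch under discussion, with the out-of-range test written against kMaximumUTF16 rather than kMaximumUCS4. The constant values are assumed to match those in the published CVTUTF.C (in particular kMaximumUTF16 = 0x0010FFFF), and the helper name encode_utf16 is hypothetical; it is not part of the original file. Like the quoted code, it does not reject surrogate code points in the input.

```c
#include <stddef.h>
#include <stdint.h>

/* Constant values assumed from the published CVTUTF.C. */
#define kReplacementCharacter 0x0000FFFDUL
#define kMaximumUCS2          0x0000FFFFUL
#define kMaximumUTF16         0x0010FFFFUL
#define kSurrogateHighStart   0xD800UL
#define kSurrogateLowStart    0xDC00UL

static const int      halfShift = 10;
static const uint32_t halfBase  = 0x0010000UL;
static const uint32_t halfMask  = 0x3FFUL;

/* Hypothetical helper: encode one code point as UTF-16 into target, which
   has room for cap code units. Returns the number of units written, or 0
   if the target is too small (the targetExhausted case). */
size_t encode_utf16(uint32_t ch, uint16_t *target, size_t cap)
{
    if (ch <= kMaximumUCS2) {
        /* Fits in a single 16-bit unit. */
        if (cap < 1) return 0;
        target[0] = (uint16_t)ch;
        return 1;
    }
    if (ch > kMaximumUTF16) {
        /* Beyond the range a surrogate pair can represent. */
        if (cap < 1) return 0;
        target[0] = (uint16_t)kReplacementCharacter;
        return 1;
    }
    /* 0x10000..0x10FFFF: split into a high and a low surrogate. */
    if (cap < 2) return 0;
    ch -= halfBase;
    target[0] = (uint16_t)((ch >> halfShift) + kSurrogateHighStart);
    target[1] = (uint16_t)((ch & halfMask) + kSurrogateLowStart);
    return 2;
}
```

With these assumptions, U+10000 encodes as D800 DC00 and U+10FFFF as DBFF DFFF, while any value above 0x10FFFF is written as a single U+FFFD.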

In addition, function ConvertUTF8toUCS4 is a **COPY** of ConvertUTF8toUTF16,
which sure isn't what it's intended to be. To correct this, would it be
correct to just replace the above code with

    if (ch > kMaximumUCS4) {
        ch = kReplacementCharacter;
    }
    *target++ = ch;

? It would be great if someone could comment on this...
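
The proposed UCS-4 replacement can be tried in isolation with a sketch like the one below. It assumes kMaximumUCS4 = 0x7FFFFFFF and kReplacementCharacter = 0xFFFD, as in the published CVTUTF.C; the helper name clamp_ucs4 is hypothetical and only wraps the comparison so that it can be checked on its own.

```c
#include <stdint.h>

/* Constant values assumed from the published CVTUTF.C. */
#define kReplacementCharacter 0x0000FFFDUL
#define kMaximumUCS4          0x7FFFFFFFUL

/* Hypothetical helper: return ch unchanged if it is a valid UCS-4 value,
   otherwise the replacement character. No surrogate pairs are produced,
   since a UCS-4 target stores each code point in one 32-bit unit. */
uint32_t clamp_ucs4(uint32_t ch)
{
    if (ch > kMaximumUCS4) {
        ch = (uint32_t)kReplacementCharacter;
    }
    return ch;
}
```

In the conversion loop this would be used as `*target++ = clamp_ucs4(ch);`, after checking that target has room for one more 32-bit unit.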

Thanks a lot,

/oliver




Re: CVTUTF.C bug question

2000-08-23 Thread Mark Davis

Thanks. That code does need to be fixed, once we get the time.

Oliver Steinau wrote:

> I have a question concerning the CVTUTF.C file that is on the CD in the
> Unicode 3.0 book. There's a piece of code which I don't think is correct...
>
> Function ConvertUTF8toUTF16 contains the following piece of code (line 210):
> [ch is the result of the conversion so far]:
>
>     if (ch <= kMaximumUCS2) {
>         *target++ = ch;
>     } else if (ch > kMaximumUCS4) {
>         *target++ = kReplacementCharacter;
>     } else {
>         if (target + 1 >= targetEnd) {
>             result = targetExhausted; break;
>         };
>         ch -= halfBase;
>         *target++ = (ch >> halfShift) + kSurrogateHighStart;
>         *target++ = (ch & halfMask) + kSurrogateLowStart;
>     };
>
> with kMaximumUCS2 = 0xffff and kMaximumUCS4 = 0x7fffffff.
>
> Shouldn't the first comparison read "if (ch <= kMaximumUTF16)..."?
>
> In addition, function ConvertUTF8toUCS4 is a **COPY** of ConvertUTF8toUTF16,
> which sure isn't what it's intended to be. To correct this, would it be
> correct to just replace the above code with
>
>     if (ch > kMaximumUCS4) {
>         ch = kReplacementCharacter;
>     }
>     *target++ = ch;
>
> ? It would be great if someone could comment on this...
>
> Thanks a lot,
>
> /oliver
