Hi Mark,
A big-endian (Motorola) Unicode character will be in the form: msb
lsb, so if the character falls within the ASCII range, say "A",
then it will be <numToChar(0) numToChar(65)>.
If it's in little-endian (Intel) format, the same char will be
<numToChar(65) numToChar(0)>.
uniDecode simply removes the most significant byte of each Unicode
char/pair, so on Intel, that's the second byte, and on Motorola
that's the first byte.
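To make the byte order concrete, here is a small sketch in Python (not Transcript, only because the raw bytes are easy to inspect there). The helper name drop_high_byte is made up for illustration; it only imitates the behaviour described above and is not Rev's actual uniDecode implementation.

```python
# Illustration only: byte layout of "A" (U+0041) in the two UTF-16 byte orders.
big_endian = "A".encode("utf-16-be")     # most significant byte first
little_endian = "A".encode("utf-16-le")  # least significant byte first


def drop_high_byte(data: bytes, little_endian_data: bool) -> bytes:
    """Hypothetical stand-in for the behaviour described above:
    keep only the least significant byte of every two-byte pair."""
    # In little-endian data the low byte is the first of each pair,
    # in big-endian data it is the second.
    start = 0 if little_endian_data else 1
    return data[start::2]


print(list(big_endian))                     # [0, 65]
print(list(little_endian))                  # [65, 0]
print(drop_high_byte(big_endian, False))    # b'A'
print(drop_high_byte(little_endian, True))  # b'A'
```

If the wrong byte order is assumed, the null bytes are kept and the letters are thrown away, which is the symptom you would expect when big-endian data is decoded on Intel without swapping.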
Yep, that's what I read in the docs.
But the docs also say:
"The ability to handle double-byte characters on "little-endian"
processors was added in version 2.0. In previous versions, the
uniDecode function always removed the second byte of each pair of
bytes, regardless of platform."
This gives me the impression that the function itself will take care
of the differences between the processors -> "...regardless of
platform"!
Maybe I am wrong?
So the upshot is that if your data is big-endian (Motorola), then
to work with uniDecode on Intel, you'll need to swap each pair of
bytes.
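function swapBytes pString
  -- Swap each pair of bytes (big-endian <-> little-endian).
  -- A trailing unpaired byte, if any, is dropped.
  put empty into swappedString
  repeat with n = 1 to length(pString) - 1 step 2
    put char (n + 1) of pString & char n of pString after swappedString
  end repeat
  return swappedString
end swapBytes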
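For anyone who wants to sanity-check the swap outside Rev, here is roughly the same idea as a Python sketch. The name swap_bytes is only illustrative, and like the handler above it ignores a trailing unpaired byte.

```python
def swap_bytes(data: bytes) -> bytes:
    """Swap each pair of bytes; a trailing unpaired byte is dropped."""
    swapped = bytearray()
    for i in range(0, len(data) - 1, 2):
        swapped.append(data[i + 1])  # second byte of the pair goes first
        swapped.append(data[i])      # first byte of the pair goes second
    return bytes(swapped)


# Big-endian UTF-16 text becomes little-endian after the swap.
big_endian_text = "Rev".encode("utf-16-be")
swapped_text = swap_bytes(big_endian_text)
print(swapped_text == "Rev".encode("utf-16-le"))  # True
print(swapped_text.decode("utf-16-le"))           # Rev
```

Swapping twice gives back the original data, so the same routine works in either direction.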
Thanks a lot, will try this (well maybe... ;-)
I'm hoping that we'll get a complete revamp of Rev's Unicode
handling one of these days, but we're stuck with this sort of
thing for now. :(
Best,
Mark
Regards from Germany
Klaus Major
http://www.major-k.de