Steve,
You're right and I'm wrong. I was confused by the Unicode code point
numbers, which differ from the actual byte encodings used for UTF-8.
Indeed, every byte of a multi-byte character is stuffed into 128-255
in the UTF-8 encoding, so a straight byte-swap would work (for UTF-8
and the various one-byte Latin code pages, that is).
Paul
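To make the difference between code point numbers and encoded bytes concrete, here is a small Python sketch (Python is used purely for illustration; nothing in the thread prescribes it). It encodes a few sample characters as UTF-8 and checks that every resulting byte falls in 128-255. The sample characters and the helper name utf8_bytes are assumptions chosen for the example.

```python
# Sketch: Unicode code point numbers vs. the bytes UTF-8 actually emits.
# The sample characters below are arbitrary picks for illustration.
samples = ["\u010d", "\u20ac", "\U0001f600"]  # 2-, 3- and 4-byte characters


def utf8_bytes(ch: str) -> list:
    """Return the UTF-8 encoding of one character as a list of byte values."""
    return list(ch.encode("utf-8"))


for ch in samples:
    encoded = utf8_bytes(ch)
    # Show the code point next to its encoded bytes.
    print(f"U+{ord(ch):04X} -> {[hex(b) for b in encoded]}")
    # Every byte of a multi-byte sequence has the high-order bit set.
    assert all(0x80 <= b <= 0xFF for b in encoded)
```

Note that U+010D has 0x0D in its code point number, yet neither of its two UTF-8 bytes is 13.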
On 21-Jun-07, at 10:30 AM, Stephen Woodbridge wrote:
Hmmmm, I am probably wrong on this, but I thought 0x0 - 0x7f are
standard UTF8 characters with a constant meaning that is the same
as ASCII for those bytes, and that every byte of a multi-byte
character had to have the high-order bit set to indicate it was part
of a multibyte sequence.
I was not under the impression that you could have 0x0 - 0x7f as
a part of a multi-byte sequence. I am not an expert in this area
and probably just know enough to mislead you ;) but I think it is
worthwhile getting some additional insight into this. I for one
would like to see a multi-byte UTF8 sequence with \r embedded in it.
-Steve
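One way to look for such a sequence is a brute-force search. The Python sketch below (an illustration, not anything from the PL/R code base) encodes every Unicode code point as UTF-8 and collects those whose multi-byte encoding contains a given byte value. The function name multibyte_sequences_containing is hypothetical.

```python
import sys

CR = 0x0D  # byte value of \r


def multibyte_sequences_containing(byte_value: int) -> list:
    """Return code points whose multi-byte UTF-8 form contains byte_value."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        encoded = chr(cp).encode("utf-8")
        # Only multi-byte sequences are of interest here.
        if len(encoded) > 1 and byte_value in encoded:
            hits.append(cp)
    return hits


hits = multibyte_sequences_containing(CR)
print(len(hits))  # number of multi-byte sequences that embed \r
```

The search comes back empty for \r, which is consistent with the rule that every byte of a multi-byte UTF-8 sequence has the high-order bit set.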
Paul Ramsey wrote:
Danger, Will Robinson. All values are fair game in bytes 2,3,4 of
the UTF encodings, so yes, it's possible you'll wreck multi-byte
characters by doing a simple replacement on the byte array.
Better to use an encoding-aware string replace function (not
knowing C, I don't know what that would be, but there must be some
in the PgSQL code base).
P
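For readers wondering what an encoding-aware replace looks like, here is a rough Python sketch of the idea: decode the bytes with the known encoding, replace on characters, then re-encode. It is not the PgSQL function Paul alludes to; the name replace_cr_encoding_aware is hypothetical, and UTF-16 is used only as an assumed example of an encoding in which a byte value of 13 can sit inside a character.

```python
def replace_cr_encoding_aware(raw: bytes, encoding: str) -> bytes:
    # Decode first so the replacement operates on characters, not bytes.
    text = raw.decode(encoding)
    return text.replace("\r", "\n").encode(encoding)


# U+010D encodes as the bytes 0x0D 0x01 in little-endian UTF-16.
source = "x <- '\u010d'\r\n"
raw = source.encode("utf-16-le")

aware = replace_cr_encoding_aware(raw, "utf-16-le").decode("utf-16-le")
naive = raw.replace(b"\r", b"\n").decode("utf-16-le")

print(repr(aware))  # the character between the quotes is untouched
print(repr(naive))  # the byte-level replace turned it into U+010A
```

In this assumed UTF-16 case the byte-level replace silently changes one character into another, while the decode-replace-encode version only touches real \r characters.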
On 21-Jun-07, at 7:03 AM, Joe Conway wrote:
Obe, Regina wrote:
Joe,
Can you take a look at it again? It was messed up in my
Firefox too. I think originally I had it looking right in
Firefox, but then in IE it didn't look right, so I changed it to
look right in IE but forgot to check back in Firefox.
Hopefully this time I have made all browser masters happy.
http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut02
The tutorial looks perfect now in Firefox on Fedora Core 7.
BTW, I have confirmed on the R-devel list that the R engine is
expecting \n for EOL, and \r will cause a syntax error, on all
platforms. I will probably fix this by simply replacing \r with
\n in PL/R functions. My only reservation is whether this might
cause issues for installations with multibyte characters. Does
anyone know if it is possible for multibyte characters to include
a byte = 13 (\r), i.e. is the simple replacement of \r safe in
all locales?
Thanks,
Joe
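The simple replacement Joe describes can be sketched as follows. This is a Python illustration under the assumption that the function source arrives as UTF-8 bytes; the actual PL/R fix would be written in C, and the name replace_cr_bytes is hypothetical.

```python
def replace_cr_bytes(source: bytes) -> bytes:
    # Plain byte-level swap of \r (13) for \n (10); no decoding involved.
    return source.replace(b"\r", b"\n")


# A function body with Windows line endings and some multi-byte characters.
body = "x <- 'caf\u00e9 \u20ac'\r\nprint(x)\r\n".encode("utf-8")
fixed = replace_cr_bytes(body)

assert b"\r" not in fixed
# The result still decodes cleanly and the non-ASCII characters survive.
print(repr(fixed.decode("utf-8")))
```

Each \r\n pair becomes \n\n here, so the text gains an empty line per line ending; this sketch does not try to collapse those.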
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users