Turns out that all of that was unnecessary, and the program would have saved and restored the UTF-8 fine if I just hadn't tried to blindly untaint the data with...

sub untaint_blind {
    $_[0] =~ /^(.*)$/;
    my $ret = $1;
    $ret;
}

This is perl5.6.1

-----Original Message-----
From: Jarkko Hietaniemi [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 10, 2003 1:39 PM
To: Merijn van den Kroonenberg
Cc: Narins, Josh; [EMAIL PROTECTED]
Subject: Re: beginniner's 5.6.1 latin1<->utf8 question

On Fri, Jan 10, 2003 at 07:28:00PM +0100, Merijn van den Kroonenberg wrote:
> You might be looking for these:
>
> # ISO 8859-1 to UTF-8
> s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
>
> # UTF-8 to ISO 8859-1
> s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
>
> I think that will work (they are not mine, so don't blame me if not ;-)

They are mine :-) so I feel free to say that they don't do &#NNN;
conversion... but they certainly could be changed to work so.

> Greetings, Merijn
>
> ----- Original Message -----
> From: "Narins, Josh" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, January 10, 2003 6:54 PM
> Subject: beginniner's 5.6.1 latin1<->utf8 question
>
> > At one point I had a regex which perfectly converts the string A below
> > into a series of &#234; strings.
> > This is nice for me, because I just sling them out on the web, and as
> > entities, they always seem to work.
> >
> > I've lost the regex, can't seem to find it. I know it had chr or ord
> > in it.
> >
> > I've been reading the perl-unicode archives, and googling, but I just
> > don't see it.
> >
> > This is for perl5.6.1 with Sun's (reputedly?) sick iconv.
> >
> > If someone could tap me in the right direction...
> >
> > Thx in advance
> >
> > ------------------------------------------------------------------------------
> > This message is intended only for the personal and confidential use of the
> > designated recipient(s) named above.
> > If you are not the intended recipient of this message you are hereby
> > notified that any review, dissemination, distribution or copying of this
> > message is strictly prohibited. This communication is for information
> > purposes only and should not be regarded as an offer to sell or as a
> > solicitation of an offer to buy any financial product, an official
> > confirmation of any transaction, or as an official statement of Lehman
> > Brothers. Email transmission cannot be guaranteed to be secure or
> > error-free. Therefore, we do not represent that this information is
> > complete or accurate and it should not be relied upon as such. All
> > information is subject to change without notice.

--
Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'. It is 'dead'."
-- Jack Cohen
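
For anyone finding this in the archives: the &#NNN; conversion Jarkko mentions could look roughly like the sketch below. It is only a sketch, assuming the data are plain byte strings (no utf8 flag set), and the sub names latin1_to_entities and utf8_to_entities are made up for illustration; they are not from the thread. The UTF-8 variant only handles the two-byte sequences for U+0080..U+00FF, the same range as the regexes quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): replace every ISO 8859-1
# high byte with its numeric character reference, e.g. \xEA -> &#234;
sub latin1_to_entities {
    my ($bytes) = @_;
    $bytes =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/eg;
    return $bytes;
}

# Hypothetical helper (name is an assumption): decode two-byte UTF-8
# sequences (lead byte \xC2 or \xC3) with the same bit arithmetic as the
# UTF-8 to ISO 8859-1 regex, but emit the code point as an entity.
sub utf8_to_entities {
    my ($bytes) = @_;
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/
        '&#' . (((ord($1) << 6) & 0xC0) | (ord($2) & 0x3F)) . ';'/egx;
    return $bytes;
}

print latin1_to_entities("f\xEAte"), "\n";     # prints f&#234;te
print utf8_to_entities("f\xC3\xAAte"), "\n";   # prints f&#234;te
```

Plain ASCII passes through both subs untouched, so the output is safe to sling onto a page regardless of the charset the server announces.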