It turns out all of that was unnecessary: the program would have saved
and restored the UTF-8 fine if I just hadn't tried to blindly untaint the
data with...

sub untaint_blind {
   $_[0] =~ /^(.*)$/;
   my $ret = $1;
   $ret;
}

This is perl5.6.1
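
For anyone who finds this in the archives: a minimal sketch of a slightly less fragile "accept anything" untaint, under assumptions. The name untaint_all is made up for illustration. It differs from the sub above only in that /s lets "." match newlines and \A...\z anchor at the true start and end, so a multi-line value no longer fails the match and a trailing newline is no longer dropped. It does not by itself explain the UTF-8 behaviour seen on 5.6.1, and blindly untainting everything still defeats the purpose of taint mode.

```perl
use strict;
use warnings;

# Hypothetical replacement for untaint_blind (the name is an assumption).
# Same "accept anything" policy, but:
#   /s     lets . match newlines, so multi-line values still match
#   \A \z  anchor at the real start and end, so a trailing "\n" is kept
# The capture is only used when the match actually succeeded.
sub untaint_all {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /\A(.*)\z/s ? $1 : undef;
}

my $clean = untaint_all("line one\nline two\n");
print length($clean), "\n";
```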



-----Original Message-----
From: Jarkko Hietaniemi [mailto:[EMAIL PROTECTED]] 
Sent: Friday, January 10, 2003 1:39 PM
To: Merijn van den Kroonenberg
Cc: Narins, Josh; [EMAIL PROTECTED]
Subject: Re: beginniner's 5.6.1 latin1<->utf8 question


On Fri, Jan 10, 2003 at 07:28:00PM +0100, Merijn van den Kroonenberg wrote:
> You might be looking for these:
> 
> 
>     # ISO 8859-1 to UTF-8
>     s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
> 
>     # UTF-8 to ISO 8859-1
>     s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
> 
> I think that will work (they are not mine, so don't blame me if not ;-)

They are mine :-) so I feel free to say that they don't do &#NNN; conversion...
but they certainly could be changed to do so.
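
A minimal sketch of how the two substitutions quoted above might be changed to emit &#NNN; entities instead of converting between encodings. The helper names latin1_to_entities and utf8_bytes_to_entities are made up for illustration. The second one assumes the input is a string of UTF-8 bytes whose non-ASCII characters are all in the U+0080..U+00FF range (two-byte sequences starting with 0xC2 or 0xC3); anything else is passed through untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: ISO 8859-1 bytes -> ASCII with &#NNN; entities.
# Each high byte is its own code point, so ord() gives the entity number.
sub latin1_to_entities {
    my ($s) = @_;
    $s =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/eg;
    return $s;
}

# Hypothetical helper: two-byte UTF-8 sequences -> &#NNN; entities.
# The code point is rebuilt with the same bit arithmetic as the
# UTF-8 to ISO 8859-1 substitution, then printed as a decimal entity.
sub utf8_bytes_to_entities {
    my ($s) = @_;
    $s =~ s/([\xC2\xC3])([\x80-\xBF])/'&#' . (((ord($1) << 6) & 0xC0) | (ord($2) & 0x3F)) . ';'/eg;
    return $s;
}

print latin1_to_entities("f\xEAte"), "\n";            # f&#234;te
print utf8_bytes_to_entities("f\xC3\xAAte"), "\n";    # f&#234;te
```

Both calls print f&#234;te: 0xEA is 234 in Latin-1, and the UTF-8 pair 0xC3 0xAA decodes to the same code point.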

> Greetings, Merijn
> 
> ----- Original Message -----
> From: "Narins, Josh" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, January 10, 2003 6:54 PM
> Subject: beginniner's 5.6.1 latin1<->utf8 question
> 
> 
> >
> > At one point I had a regex which perfectly converts the string A
> > below into a series of &#234; strings.
> > This is nice for me, because I just sling them out on the web, and
> > as entities, they always seem to work.
> >
> > I've lost the regex, can't seem to find it. I know it had chr or ord
> > in it.
> >
> > I've been reading the perl-unicode archives, and googling, but I
> > just don't see it.
> >
> > This is for perl5.6.1 with Sun's (reputedly?) sick iconv.
> >
> > If someone could tap me in the right direction...
> >
> > Thx in advance
> >
> > ------------------------------------------------------------------------------
> > This message is intended only for the personal and confidential use
> > of the designated recipient(s) named above.  If you are not the
> > intended recipient of this message you are hereby notified that any
> > review, dissemination, distribution or copying of this message is
> > strictly prohibited.  This communication is for information purposes
> > only and should not be regarded as an offer to sell or as a
> > solicitation of an offer to buy any financial product, an official
> > confirmation of any transaction, or as an official statement of
> > Lehman Brothers.  Email transmission cannot be guaranteed to be
> > secure or error-free.  Therefore, we do not represent that this
> > information is complete or accurate and it should not be relied upon
> > as such.  All information is subject to change without notice.
> >
> >

-- 
Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen


