Re: [OT] Unicode vs URI escaping

Andy_Bach Mon, 18 Nov 2002 11:49:40 -0800

"Niels Poppe" <[EMAIL PROTECTED]>
11/18/02 01:09 PM

 
        To:     <[EMAIL PROTECTED]>
        cc:     <[EMAIL PROTECTED]>
        Subject:        Re: [OT] Unicode vs URI escaping



Ok, ok, I'm not offended.

And even this last unpack/map/join attempt is beaten by a wide margin on
speed when tested with strings longer than 4 characters. But the point was
to get correct output independent of perl version or context. If you go
for the faster variants, make sure to 'use bytes' or 'no utf8' in those
escape functions, otherwise things might break, which was the reason the
whole issue came up on [EMAIL PROTECTED] in the first place.

So, the 'could be improved' still stands: I'd like to see an escape
function that works under perl 5.00n and also under 5.6+ independent of
use/no utf8 and/or bytes, emits no warnings when run with -w, produces
correct output when fed UTF-8 strings containing character values larger
than 255, and is faster than the following:

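# Lookup table: every value 0..255 maps to its %XX escape ...
my %ESCMAP = ();
for ( 0 .. 255 ) { $ESCMAP{ $_ } = sprintf("%%%02X", $_); }
# ... except the characters that are safe to leave as-is.
for ('a'..'z', 'A'..'Z', '0'..'9', '_', '.', '-') {
  $ESCMAP{ord($_)} = $_;
}

# Unpack the string into unsigned char values and look each one up.
sub escape {
  join '', map { $ESCMAP{$_} } unpack 'C*', shift
}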
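
For comparison, here is a rough sketch of the same table-driven idea with
the encoding step made explicit. It is not a drop-in answer to the
challenge above: it assumes perl 5.8.1 or later (for utf8::is_utf8 and
utf8::encode), so it does not cover 5.00n, and the name escape_utf8 is
made up for illustration. Strings that perl has flagged as character
strings are converted to UTF-8 bytes first; anything else is treated as
bytes already and passed through untouched.

```perl
use strict;
use warnings;

# Same table as in the post: one entry per byte value 0..255.
my %ESCMAP = ();
for ( 0 .. 255 ) { $ESCMAP{$_} = sprintf( "%%%02X", $_ ); }
for ( 'a' .. 'z', 'A' .. 'Z', '0' .. '9', '_', '.', '-' ) {
    $ESCMAP{ ord($_) } = $_;
}

# Hypothetical helper, assuming perl 5.8.1+.
sub escape_utf8 {
    my $str = shift;

    # Only character strings get encoded; a string that is already
    # a sequence of UTF-8 bytes is left alone to avoid double encoding.
    utf8::encode($str) if utf8::is_utf8($str);

    # After the step above every unpacked value is in 0..255,
    # so every lookup hits the table.
    return join '', map { $ESCMAP{$_} } unpack 'C*', $str;
}

print escape_utf8("a b"),       "\n";    # a%20b
print escape_utf8("\x{263A}"), "\n";    # %E2%98%BA
```

The explicit encode is the design choice here: it costs an extra pass
over the string, but it means the result no longer depends on whether
'use bytes' or 'use utf8' happens to be in effect at the call site.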

N.



Andy Bach, Sys. Mangler
Internet: [EMAIL PROTECTED] 
VOICE: (608) 261-5738  FAX 264-5030

Wire telegraph is a kind of a very, very long cat. You pull his tail in
New York and his head is meowing in Los Angeles. And radio operates exactly
the same way. The only difference is that there is no cat.
    --Albert Einstein (explaining radio)
