"Niels Poppe" <[EMAIL PROTECTED]> 11/18/02 01:09 PM To: <[EMAIL PROTECTED]> cc: <[EMAIL PROTECTED]> Subject: Re: [OT] Unicode vs URI escaping
Ok, ok, I'm not offended.

And even this last unpack/map/join attempt is beaten by a wide margin on speed when tested with strings longer than 4 characters. But the point was to get correct output independent of perl version or context.

For speed, make sure to 'use bytes' or 'no utf8' in those escape functions, otherwise things might break, which was the reason the whole issue came up on [EMAIL PROTECTED] in the first place.

So, the 'could be improved' still stands, for an escape function that works under perl 5.00n and also under 5.6+ independent of use/no utf8 and/or bytes, emits no warnings when run with -w, produces correct output when fed UTF-8 strings containing character values larger than 255, and is faster than the following:

    my %ESCMAP = ();
    # Default: every value 0..255 maps to its %XX escape.
    for ( 0 .. 255 ) { $ESCMAP{ $_ } = sprintf("%%%02X", $_); }
    # Letters, digits, '_', '.' and '-' map to themselves.
    for ('a'..'z', 'A'..'Z', '0'..'9', '_', '.', '-') { $ESCMAP{ord($_)} = $_; }
    # Unpack the argument into unsigned char values and look each one up.
    sub escape { join '', map { $ESCMAP{$_} } unpack 'C*', shift }

N.

Andy Bach, Sys. Mangler
Internet: [EMAIL PROTECTED]
VOICE: (608) 261-5738  FAX 264-5030

Wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. And radio operates exactly the same way. The only difference is that there is no cat.
--Albert Einstein (explaining radio)