On Thu, Jul 22, 2004 at 12:09:14AM +0200, Jesse Houwing wrote:
> This is the rule in question:
>
> uri SARE_URI_EQUALS
> m{^(?:(?:h|%[46]8)(?:t|%[57]4){2}(?:p|%[57]0)(?:s|%[57]3)?(?::|%3a)?(?:%5c|\\|%2f|/){0,2})[^/\?;]+=(?!(?:..)?$).*}i

Hrm.  I have no idea what this is actually trying to match.  The first
(?: bit isn't necessary, btw.

Looks like a URL with a = somewhere in the host section?  ie: something
like 'http://penistone=2eopoloveok=2ecom/3/' in a quoted-printable part?
(this is the only set of matches I could find with your RE)

If not, please post an example and I'll be happy to help debug.  (I don't
think this is a 3.0 bug though.  See below.)

If so, however: yeah, that'll be different.  In

2.6: http://penistone=2eopoloveok=2ecom/3/
vs
3.0: http://penistone.opoloveok.com/3/

which is caused by 2.6 doing a very half-assed attempt at decoding the
quoted-printable part, so you get the QP bits in the URI.  3.0 does the
decoding properly (thanks total MIME parser rewrite!), so you end up with
the URI you're supposed to get, properly decoded.

Specifically, in PerMsgStatus::get_decoded_body_text_array(), which 2.6x
uses to get the uri list from, the un-quoted-printable code is:

    s/\=([0-9A-F]{2})/chr(hex($1))/ge;

which clearly has one flaw: the match is case-sensitive, so it only
accepts uppercase A-F!  D'oh!  Therefore, it doesn't match the URI above
(which uses lowercase hex digits).  3.0 does the right thing here. :)

-- 
Randomly Generated Tagline:
Historically Tcl has always stored all intermediate results as
strings.  (With 8.0 they're rethinking that.  Of course, Perl rethought
that from the start.)
        -- Larry Wall in <[EMAIL PROTECTED]>
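
For anyone who wants to reproduce the case-sensitivity problem described above, here is a minimal sketch in Python that mirrors the Perl substitution. The function names `decode_qp_26` and `decode_qp_fixed` are made up for illustration and are not SpamAssassin code. The "fixed" variant only widens the character class; it is not a full quoted-printable decoder (no soft line breaks, for example) and is not meant to represent what 3.0 actually does internally.

```python
import re

def decode_qp_26(text):
    # Hypothetical stand-in for the 2.6x substitution:
    # only uppercase hex digits (A-F) are recognised after "=".
    return re.sub(r"=([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), text)

def decode_qp_fixed(text):
    # Same substitution, but lowercase a-f is accepted as well.
    return re.sub(r"=([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)

uri = "http://penistone=2eopoloveok=2ecom/3/"

print(decode_qp_26(uri))     # "=2e" is left alone because "e" is lowercase
print(decode_qp_fixed(uri))  # "=2e" is decoded to "."
```

With the uppercase-only pattern the "=" survives in the host part, which is why a rule keyed on "=" in that position can hit under 2.6 and not under 3.0.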
