On Sat, Aug 31, 2002 at 08:20:33PM +0100, Matthew Toseland wrote:
> Looking at Parser.flex...
> /* Non whitespace and not close of tag (right angle bracket).  I.e.
>  * chars that
>  * would not cause an unquoted attribute to end */
> NONSEP=[^>\n\r\ \t\b\012:?]
> NONSEP_NOQUOTE=[^>\n\r\ \t\b\012:?"]
> This I don't understand... "?" or ":" do not terminate the attribute
> (meaning the URL in an a href=<unquoted URL>). Presumably it is to reduce
> backtracking? Anyway, the proposed modifications are:
> NONSEP=[^>\n\r\ \t\b\012:]
> NONSEP_NOQUOTE=[^>\n\r\ \t\b\012:"]
> ......
> /* Catch any colon or ?htl= within the URL */
> LINK_PATTERNS1={LINK_ATTRS}{WS}={WS}["][^":]*[:][^"]*
> LINK_PATTERNS3={LINK_ATTRS}{WS}={WS}["][^"?]*?htl=
JFlex's handling of double quotes has changed, so the patterns above are
wrong as written. I have a fixed version with every " escaped, even inside
[] character classes, which is apparently what JFlex 1.5.3 wants.
> This should achieve the functionality we want: block all colons (if we
> want to change the port, we should encode it as
> __CHECKED_HTTP_hostname_port__ or something), allow ? unless it's part
> of a ?htl=... However, I could be grossly mistaken. Comments?
