On Tue, 24 Apr 2001, mouss wrote:

> The correct way to do it is something like this:
>    assume you want to match a URL "u" with a blacklisted URL "b".
> then first decompose each of them to the base components:
>          scheme (http, ftp, ...)

I believe the spec. has a user identification component in here.  It's
important because it's extremely poorly separated and can contain almost
anything- it makes things like http://www.microsoft.com@realURL possible
for some sets of browsers- the ultimate in URLized social engineering :(

(I'm not sure the separator is an at symbol, but I'm just not in a good
enough mood to traipse through the HTTP spec. again.)

[snip]
> [server]
> for the server, there are 4 cases:
> 
>    2. both "u" and "b" use hostname-based expressions. then the usual
> regex matching is used. so www.playboy.com matches *.playboy.com
> 

I believe you've missed the %dd conversion step, which can be
per-character.  That's what makes HTTP so much fun and pattern matching on
URIs so much pain...
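(A sketch of what I mean, in Python; the blacklist entry and helper
name are hypothetical.  The point is that you must percent-decode and
normalize case *before* matching, or trivially obfuscated URLs sail
through:)

```python
from urllib.parse import unquote

# Any character in a URL can be %-escaped per character, so "playboy"
# can arrive as "%70layboy", "pla%79boy", etc.  Matching against a
# blacklist without decoding first misses these forms.
blocked_suffix = ".playboy.com"  # hypothetical blacklist entry

def host_is_blocked(host: str) -> bool:
    # Normalize: percent-decode, then lowercase, then compare.
    decoded = unquote(host).lower()
    return (decoded == blocked_suffix.lstrip(".")
            or decoded.endswith(blocked_suffix))

print(host_is_blocked("www.playboy.com"))       # True
print(host_is_blocked("www.%70layboy.com"))     # True - decodes first
print(host_is_blocked("www.playboy.com.evil"))  # False
```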

> so all this has been known for a long time, and has been coded and
> documented. but still, people write flawed software.

No doubt in part caused by protocol design specifications that don't take
into account downstream usage issues.  HTTP *sucks* as a protocol- no
length restrictions, no code normalizations, no structure worth
mentioning, poor tokenization....

You know, I bet that everyone on the list passes HTTP and probably no more
than three people have even done a cursory protocol evaluation on it.  I
wonder who the third person is? 

When you look at specs like HTTP and FTP, and then at the list of
protocols "supported" by most firewalls, it's not comforting.

Paul
-----------------------------------------------------------------------------
Paul D. Robertson      "My statements in this message are personal opinions
[EMAIL PROTECTED]      which may have no basis whatsoever in fact."

