On Mon, 22 Apr 2002, Tony Lewis wrote:

> I think all of the expressions proposed thus far are too fragile. Consider
> the following URL:
> 
> http://www.google.com/search?num=100&q=%2Bwget+-GNU
> 
> The regular expression needs to account for multiple arguments separated by
> ampersands. It also needs to account for any valid URI character between an
> equals sign and either the end of the string or an ampersand.

 I'm not sure what you are referring to.  We are discussing a common
problem with "static" pages generated by default by Apache as "index.html" 
objects for the server's filesystem directories that provide no default page. 
Any dynamic content should probably be protected by "robots.txt" and
otherwise dealt with by the user specifically, depending on the content. 

 BTW, wget's accept/reject rules are not regular expressions but simple
shell globbing patterns. 
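 For what it's worth, the glob/regex distinction matters for the Apache
sort-link variants above: a reject pattern like "index.html?*" drops the
generated variants while keeping the plain index page. A minimal sketch
using Python's fnmatch, which follows the same shell-glob rules (*, ?,
[...]); the pattern list is illustrative, not wget's actual code:

```python
# Illustrative only: wget implements its own matcher in C, but Python's
# fnmatch applies the same shell-glob semantics (not regular expressions).
from fnmatch import fnmatch

# Patterns as one might pass to wget's -R (reject) option; hypothetical list.
reject_patterns = ["index.html?*", "*.gif"]

def rejected(filename):
    """Return True if the filename matches any reject glob."""
    return any(fnmatch(filename, p) for p in reject_patterns)

print(rejected("index.html?C=M;O=A"))  # True: an Apache column-sort variant
print(rejected("index.html"))          # False: the plain index page is kept
```

Note that the glob "?" matches exactly one character, so "index.html?*"
happens to match the literal "?" that starts the query string.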

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available        +