If I remember correctly, ^ negates only when it is the first character
inside a character class, e.g. [^a-z]. In this regex it sits outside any
brackets, so it means beginning of the line, just as $ means end of the
line (input).
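
To make the two meanings of ^ concrete, here is a minimal sketch in Python. The two rule patterns are copied from the quoted message; the helper name `first_match_from_start` and the sample URLs are made up for illustration, and this is not how Nutch itself evaluates its filter file, only a demonstration of the regex behaviour.

```python
import re

# '^' outside brackets is an anchor: the match must begin at the start of the input.
accept_rule = re.compile(r"^http://([a-z0-9]*\.)*pilat.free.fr/")
reject_rule = re.compile(r"^(file|ftp|mailto)")

# '^' as the first character inside [...] complements the set instead.
not_lowercase = re.compile(r"[^a-z]")


def first_match_from_start(pattern, text):
    """Return True if the anchored pattern matches at the start of text (hypothetical helper)."""
    return pattern.search(text) is not None


# Anchored patterns only match when the URL starts with the expected prefix.
print(first_match_from_start(accept_rule, "http://www.pilat.free.fr/index.html"))
print(first_match_from_start(accept_rule, "see http://pilat.free.fr/"))
print(first_match_from_start(reject_rule, "ftp://example.org/file.txt"))
print(first_match_from_start(reject_rule, "http://example.org/ftp"))

# The negated class finds the first character that is NOT a lowercase letter.
print(not_lowercase.search("abC").group())
```

So both quoted rules use ^ the same way, as an anchor; neither of them contains a negated character class.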



On Mon, 2006-01-30 at 21:57 +0800, 盖世豪侠 wrote:
> +^http://([a-z0-9]*\.)*pilat.free.fr/
> As far as I know, '^' means matching the characters not within a range by
> *complementing* the set, so why is it an accepted pattern for crawl URLs?
> 
> The same question applies to
> -^(file|ftp|mailto)
> 
> Any differences?
> 
> 
> --
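> 《盖世豪侠》 drew rave reviews and kept TVB's ratings high, yet TVB, pleased as it was, still did not give him major roles. Stephen Chow was never one to stay in a small pond: once his comedic talent had shown itself, he was not willing to be sidelined, so he moved into film and showed what he could do on the big screen. TVB had gained a prize horse and then lost it, and regretted it when it was too late.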




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
