Actually, ^ means start of line. It acts as a negation only when it is the
first character inside a bracketed set, e.g., [^0-9].
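
To make the difference concrete, here is a minimal Python sketch (not the
Nutch code itself). It uses Python's re module rather than Java's regex
classes; the two treat ^ the same way in these cases. The helper name
first_match_filter, the RULES list, and the sample URLs are made up for
illustration, and the "first matching rule decides" behaviour is my
assumption about how the +/- lines are applied.

```python
import re

# Outside brackets, '^' anchors the match at the start of the string.
anchored = re.compile(r"^http://([a-z0-9]*\.)*pilat.free.fr/")

# As the first character inside brackets, '^' negates the set:
# this matches any single character that is NOT a digit.
non_digit = re.compile(r"[^0-9]")


def first_match_filter(rules, url):
    """Hypothetical helper: apply '+'/'-' prefixed regex rules in order.

    Returns True if the first matching rule starts with '+', False if it
    starts with '-', and None if no rule matches (assumed semantics).
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return None


# The two patterns from the question, in the order they were quoted.
RULES = [
    r"+^http://([a-z0-9]*\.)*pilat.free.fr/",
    r"-^(file|ftp|mailto)",
]

if __name__ == "__main__":
    for url in ("http://www.pilat.free.fr/index.html",
                "ftp://pilat.free.fr/pub/",
                "see http://pilat.free.fr/"):
        print(url, first_match_filter(RULES, url))
```

In both rules the ^ sits outside any brackets, so it only pins the pattern
to the beginning of the URL; the leading + or - is what says whether a
matching URL is accepted or rejected.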

Thanks,

Steve Betts
[EMAIL PROTECTED]
937-477-1797

-----Original Message-----
From: 盖世豪侠 [mailto:[EMAIL PROTECTED]
Sent: Monday, January 30, 2006 8:58 AM
To: [email protected]
Subject: puzzle about regex of url pattern

+^http://([a-z0-9]*\.)*pilat.free.fr/
As far as I know, '^' matches the characters not within a range by
*complementing* the set, so why is it an accepted pattern for crawl URLs?

The same goes for
-^(file|ftp|mailto)

Is there any difference?


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general