On Monday, 2 August 2010, at 20:14:32, brad wrote:
>  I do have about 10
> entries in the regex-urlfilter.txt file, but they are mainly to exclude
> sites.  For Example:

I have this problem with Nutch 1.1 too: it often hangs at util.regexp... 
forever.
It hangs if I just use something like this in the regex filter property files:

http://www.mydomain.local/

If I change this to:

http://www\.mydomain\.local/

it works. I have no clue why I have to escape the "." to make it a literal 
period, since an unescaped "." should match a period too. For me, however, it 
solved this annoying hang in the Java util pattern matching. Maybe you can 
give this a try; maybe it helps, maybe not :-).
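
To illustrate the difference between the two rules, here is a minimal sketch 
using plain java.util.regex. It does not reproduce the hang and does not call 
any Nutch code; the class name, the helper method, and the lookalike URL are 
made up for the example, and using Matcher.find() is an assumption about how 
the rule is applied. It only shows that the unescaped "." accepts any 
character, while "\." accepts only a literal period.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Nutch.
class DotEscapeDemo {

    // Returns true if the regex is found anywhere in the URL.
    static boolean found(String regex, String url) {
        return Pattern.compile(regex).matcher(url).find();
    }

    public static void main(String[] args) {
        String unescaped = "http://www.mydomain.local/";
        String escaped = "http://www\\.mydomain\\.local/"; // "\\." in Java source is "\." in the regex

        String real = "http://www.mydomain.local/index.html";
        String lookalike = "http://wwwXmydomainYlocal/index.html"; // made-up URL

        System.out.println(found(unescaped, real));      // true
        System.out.println(found(unescaped, lookalike)); // true: "." matches "X" and "Y"
        System.out.println(found(escaped, real));        // true
        System.out.println(found(escaped, lookalike));   // false: "\." needs a literal period
    }
}
```

So the escaped form is the stricter rule; both forms match the real URL.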

You can find out which regex Nutch hangs on if you override the extension 
point or the plugin code and add a debugging line just before the match call. 
You can then look for another regex that matches and does not hang ;-).
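
The debugging idea could look roughly like the sketch below. It is standalone 
java.util.regex code, not the actual Nutch plugin: the class name, the method 
name, and the rule list are hypothetical, and in a real setup the log line 
would go just before the match call in the overridden filter. Logging before 
the match means the last line printed names the rule that hangs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch.
class RegexTimingSketch {

    // Tries every rule against one URL and reports the result and the time taken.
    static List<String> timeRules(List<String> rules, String url) {
        List<String> report = new ArrayList<>();
        for (String rule : rules) {
            Pattern pattern = Pattern.compile(rule);

            // Log BEFORE matching: if the match never returns, this is the last line seen.
            System.err.println("about to match: " + rule + " against " + url);

            long start = System.nanoTime();
            boolean found = pattern.matcher(url).find();
            long micros = (System.nanoTime() - start) / 1_000;

            report.add(rule + " -> " + found + " (" + micros + " us)");
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> rules = new ArrayList<>();
        rules.add("http://www\\.mydomain\\.local/");
        rules.add("\\.(gif|jpg|png)$"); // made-up example rule
        for (String line : timeRules(rules, "http://www.mydomain.local/index.html")) {
            System.out.println(line);
        }
    }
}
```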

Torsten


-- 
Please do not send me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.de.html

"Really, I'm not out to destroy Microsoft. That will just be a 
completely unintentional side effect."
        -- Linus Torvalds
