Dennis Kubes wrote:
> The thread dumps pointed me to the Regex URL Filter and greedy pattern
> matching. It seems that there is a standing "error" in the JVM where
> the "wrong" regular expression will cause the program to hang and the
> cpu to go to 100%. Basically the behaviors that we are seeing. And
> this would make sense as this error wouldn't appear unless the "right"
> url came up. See this link for a complete explanation.

Ah, that would explain why I don't see this behavior: one of the first changes I make in my installations is to remove regex-urlfilter and replace it with a suitable combination of prefix/suffix-urlfilter, or a custom one ...

Of course, we should solve this issue in our code, if possible, but using different urlfilters is a quick workaround.

-- 
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
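
To illustrate the workaround described in the reply above, here is a minimal sketch of a filter that relies only on plain prefix and suffix comparisons, so there is no regex backtracking to hang on. The class name PrefixSuffixFilter, its constructor arguments, and the example URLs are all hypothetical; this is not the code of the Nutch prefix or suffix urlfilter plugins. The convention of returning the URL to accept it and null to reject it is assumed here for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/**
 * Hypothetical stand-alone URL filter built on String.startsWith/endsWith.
 * Each check is linear in the URL length, so no input can trigger the
 * runaway backtracking a regular expression might.
 */
class PrefixSuffixFilter {
    private final List<String> allowedPrefixes;
    private final List<String> blockedSuffixes; // expected in lower case

    PrefixSuffixFilter(List<String> allowedPrefixes, List<String> blockedSuffixes) {
        this.allowedPrefixes = allowedPrefixes;
        this.blockedSuffixes = blockedSuffixes;
    }

    /** Returns the URL unchanged if it is accepted, or null if it is rejected. */
    String filter(String url) {
        if (url == null) {
            return null;
        }
        // Suffixes are compared case-insensitively (".GIF" is treated like ".gif").
        String lower = url.toLowerCase(Locale.ROOT);
        for (String suffix : blockedSuffixes) {
            if (lower.endsWith(suffix)) {
                return null;
            }
        }
        // Prefixes are compared case-sensitively, exactly as configured.
        for (String prefix : allowedPrefixes) {
            if (url.startsWith(prefix)) {
                return url;
            }
        }
        return null; // no allowed prefix matched
    }

    public static void main(String[] args) {
        PrefixSuffixFilter f = new PrefixSuffixFilter(
                Arrays.asList("http://example.org/"),
                Arrays.asList(".gif", ".zip"));
        System.out.println(f.filter("http://example.org/index.html"));
        System.out.println(f.filter("http://example.org/logo.GIF"));
    }
}
```

The trade-off is expressiveness: rules that need real pattern matching cannot be written this way, which is presumably where a custom filter would come in.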
