[Nutch-dev] [jira] Commented: (NUTCH-233) wrong regular expression hang reduce process for ever

2006-11-28 Thread Sean Dean (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12453919 ] Sean Dean commented on NUTCH-233: Could I suggest that this change, from .*(/.+?)/.*?\1/.*?\1/ to .*(/[^/]+)/[^/]+\1/[^/]+\1/ be committed to at least trunk for
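
To make the difference between the two expressions quoted above concrete, here is a minimal sketch that compiles both with java.util.regex and runs them against a few made-up URLs. The class name, method name, and example.com URLs are hypothetical and are not taken from Nutch; the sketch also assumes the filter treats a URL as matched when the pattern is found anywhere in it.

```java
import java.util.regex.Pattern;

// Illustrative sketch only: class, method, and URLs are hypothetical, not Nutch code.
final class LoopPatternSketch {
    // Original expression from the comment: lazy but unbounded gaps (.+? and .*?).
    static final Pattern OLD = Pattern.compile(".*(/.+?)/.*?\\1/.*?\\1/");
    // Proposed replacement: the repeated piece and each gap are a single path segment.
    static final Pattern NEW = Pattern.compile(".*(/[^/]+)/[^/]+\\1/[^/]+\\1/");

    // Assumption: a URL counts as matched if the pattern occurs anywhere in it.
    static boolean hits(Pattern p, String url) {
        return p.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://example.com/a/b/a/b/a/b/",   // segment "a" repeats with one segment between
            "http://example.com/a/b/c/d/",       // no repeated segment
            "http://example.com/a/x/y/a/x/y/a/"  // "a" repeats with two segments between
        };
        for (String url : urls) {
            System.out.println(url + " old=" + hits(OLD, url) + " new=" + hits(NEW, url));
        }
    }
}
```

Because [^/]+ cannot cross a slash, the proposed expression gives the matcher far fewer ways to split a URL than .+? and .*? do, which is the property the change relies on. It is also narrower: in this sketch the third URL is matched only by the original expression. These inputs are short, so both patterns return quickly here; the sketch does not reproduce the reported hang.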

[Nutch-dev] [jira] Commented: (NUTCH-233) wrong regular expression hang reduce process for ever

2006-08-16 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12428542 ] Stefan Groschupf commented on NUTCH-233: Hi Otis, yes, for a serious whole-web crawl I need to change this regex first. It only hangs with some random URLs

[Nutch-dev] [jira] Commented: (NUTCH-233) wrong regular expression hang reduce process for ever

2006-07-25 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12423438 ] Stefan Groschupf commented on NUTCH-233: I think this should be fixed in 0.8 too, since everybody who does a real whole-web crawl with over 100 million pages

[Nutch-dev] [jira] Commented: (NUTCH-233) wrong regular expression hang reduce process for ever

2006-03-16 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12370685 ] Jerome Charron commented on NUTCH-233: Stefan, I have created a small unit test for urlfilter-regexp and I don't notice any incompatibility in java.util.regex with