Hi Sebastian,
Thank you sir!
Two things you provided solved the problem for me! One was the correct
syntax for the regex; the other was the info on the indexchecker
command. Part of what I was dealing with was not having much
to go on when debugging and that command helped
Hi Dave,
I'm by no means an expert in the JEXL syntax (cf.
http://commons.apache.org/proper/commons-jexl/reference/syntax.html),
but after a few trials the expression must be
doc.getFieldValue('url')=~'.*/englishnews/.*'
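To illustrate why the leading and trailing .* are there, here is a minimal Java sketch. It assumes the =~ operator does a whole-string match, the way matches() in java.util.regex does; I have not checked this against the JEXL source. The class name, the method name, and the example.com URLs are made up for illustration.

```java
import java.util.regex.Pattern;

public class UrlRegexSketch {
    // Same pattern as in the JEXL expression above.
    private static final Pattern ENGLISH_NEWS = Pattern.compile(".*/englishnews/.*");

    // Returns true only if the WHOLE url matches the pattern
    // (assumed to mirror how =~ behaves for strings).
    static boolean matchesEnglishNews(String url) {
        return ENGLISH_NEWS.matcher(url).matches();
    }

    public static void main(String[] args) {
        // Hypothetical URLs, for illustration only.
        String news = "http://example.com/englishnews/story.html";
        String other = "http://example.com/sports/story.html";

        System.out.println(matchesEnglishNews(news));   // true
        System.out.println(matchesEnglishNews(other));  // false

        // Without the surrounding .* a whole-string match fails,
        // even though the URL contains /englishnews/.
        System.out.println(Pattern.matches("/englishnews/", news)); // false
    }
}
```

If the assumption about whole-string matching holds, an expression with a bare '/englishnews/' would never match a full URL, which would explain why the .* on both sides was needed.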
It's easy to test using the indexchecker, e.g.
% bin/nutch indexchecker
Ryan and Roannel,
Thank you guys so much for your replies. I didn't realize it but I was not
seeing all of the emails from you.
Roannel, you sent some really helpful replies that never came in as an
email. I found your replies when I browsed the web-based archives on the
Apache site. I wanted