Markus Jelsma created NUTCH-2231:
------------------------------------

             Summary: Jexl support in generator job
                 Key: NUTCH-2231
                 URL: https://issues.apache.org/jira/browse/NUTCH-2231
             Project: Nutch
          Issue Type: Improvement
    Affects Versions: 1.11
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
             Fix For: 1.12


The generator should support Jexl expressions. This would make it much easier 
to implement focused crawlers that rely on information stored in the CrawlDB. 
With the HostDB it is already possible to restrict the generator to selecting 
only interesting records, but doing so is very cumbersome and involves 
domainblacklist-urlfiltering.

With Jexl support, it is no hassle!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
