[ https://issues.apache.org/jira/browse/NUTCH-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-2231:
---------------------------------
    Attachment: NUTCH-2231.patch

Patch for trunk! It adds a JexlUtil class where the expression parsing is done. 
CrawlDbReader has been updated accordingly.

> Jexl support in generator job
> -----------------------------
>
>                 Key: NUTCH-2231
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2231
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.11
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.12
>
>         Attachments: NUTCH-2231.patch
>
>
> Generator should support Jexl expressions. This would make it much easier to 
> implement focused crawlers that rely on information stored in the CrawlDB. 
> With the HostDB it is already possible to restrict the generator to selecting 
> only interesting records, but that is very cumbersome and involves 
> domainblacklist-urlfiltering.
> With Jexl support, it is no hassle!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
