[ 
https://issues.apache.org/jira/browse/NUTCH-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150483#comment-13150483
 ] 

Ferdy Galema commented on NUTCH-1184:
-------------------------------------

Not sure, but my best bet would be something parser-related, mostly because
the fetcher is already pretty heavyweight at the moment. So that would be
either ParserOutputFormat (as a static method) or a dedicated utility class
in the parse package that is instantiated with filters/normalizers.
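
To illustrate the second option, here is a minimal sketch of such a utility.
The class name OutlinkProcessor is hypothetical, and the UnaryOperator and
Predicate types merely stand in for Nutch's URL normalizers and filters; none
of this is the project's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical helper, not an existing Nutch class: it is built once with
// the normalizers and filters and can then be reused by any caller.
class OutlinkProcessor {
    private final List<UnaryOperator<String>> normalizers;
    private final List<Predicate<String>> filters;

    OutlinkProcessor(List<UnaryOperator<String>> normalizers,
                     List<Predicate<String>> filters) {
        this.normalizers = new ArrayList<>(normalizers);
        this.filters = new ArrayList<>(filters);
    }

    // Normalizes each outlink, drops those rejected by any filter,
    // and removes duplicates while keeping first-seen order.
    List<String> process(List<String> outlinks) {
        Set<String> accepted = new LinkedHashSet<>();
        for (String url : outlinks) {
            String normalized = url;
            for (UnaryOperator<String> normalizer : normalizers) {
                normalized = normalizer.apply(normalized);
            }
            boolean keep = true;
            for (Predicate<String> filter : filters) {
                if (!filter.test(normalized)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                accepted.add(normalized);
            }
        }
        return new ArrayList<>(accepted);
    }
}
```

Keeping the filters and normalizers as constructor arguments means the fetcher
would only need to hold one instance and call process() per parsed page.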
                
> Fetcher to parse and follow Nth degree outlinks
> -----------------------------------------------
>
>                 Key: NUTCH-1184
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1184
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.5
>
>         Attachments: NUTCH-1184-1.5-1.patch, NUTCH-1184-1.5-2.patch, 
> NUTCH-1184-1.5-3.patch, NUTCH-1184-1.5-4.patch, 
> NUTCH-1184-1.5-5-ParseData.patch, NUTCH-1184-1.5-5.patch
>
>
> Improvements to fetcher to follow Nth degree outlinks of fetched items:
> - fetch
> - parse
> - normalize and filter outlinks
> - create new FetchItem and inject in the queue
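
The four steps above can be sketched roughly as follows. This is not the
patch attached to the issue: OutlinkFollower, its FetchItem, the maxDepth
parameter, and the duplicate check are all assumptions made for illustration,
and fetchAndParse is a stand-in for the real fetch and parse steps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch; the types only borrow their names from the issue text.
class OutlinkFollower {
    // A queued URL plus its number of hops from the seed.
    static final class FetchItem {
        final String url;
        final int depth;

        FetchItem(String url, int depth) {
            this.url = url;
            this.depth = depth;
        }
    }

    // Returns the URLs fetched, in fetch order, following outlinks up to maxDepth hops.
    static List<String> crawl(String seed, int maxDepth,
                              Function<String, List<String>> fetchAndParse,
                              UnaryOperator<String> normalizer,
                              Predicate<String> filter) {
        Queue<FetchItem> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();   // assumption: skip URLs already queued
        List<String> fetched = new ArrayList<>();
        queue.add(new FetchItem(seed, 0));
        seen.add(seed);
        while (!queue.isEmpty()) {
            FetchItem item = queue.poll();
            List<String> outlinks = fetchAndParse.apply(item.url); // fetch + parse
            fetched.add(item.url);
            if (item.depth >= maxDepth) {
                continue; // Nth degree reached, do not follow further
            }
            for (String raw : outlinks) {
                String url = normalizer.apply(raw);        // normalize
                if (filter.test(url) && seen.add(url)) {   // filter, drop duplicates
                    queue.add(new FetchItem(url, item.depth + 1)); // inject in the queue
                }
            }
        }
        return fetched;
    }
}
```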

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
