[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

nutch.newbie (JIRA) Fri, 09 Feb 2007 09:02:44 -0800

    [ 
https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471754
 ]


nutch.newbie commented on NUTCH-443:
------------------------------------

Gal:

Thanks for the feedback and the test you have done. If Nutch is going to be 
an open source version of Google, then maybe we should consider StAX. Could 
you please provide some info regarding your implementation, probably on the 
mailing list? My use case is going to be a lot more than a 100K-item feed, 
so I am interested to know more. I would also like to hear others' views of 
feedparser, Apache politics aside :-) The big question is: can anyone use 
Nutch to be a Technorati or Bloglines using feedparser? It seems not?

> allow parsers to return multiple Parse object, this will speed up the rss 
> parser
> --------------------------------------------------------------------------------
>
>                 Key: NUTCH-443
>                 URL: https://issues.apache.org/jira/browse/NUTCH-443
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>    Affects Versions: 0.9.0
>            Reporter: Renaud Richardet
>            Priority: Minor
>             Fix For: 0.9.0
>
>         Attachments: NUTCH-443-draft-v1.patch, NUTCH-443-draft-v2.patch, 
> parse-map-core-draft-v1.patch, parse-map-core-untested.patch, parsers.diff
>
>
> Allow Parser#parse to return a Map<String,Parse>. This way, the RSS parser 
> can return multiple Parse objects, which will all be indexed separately. 
> Advantage: no need to fetch all feed items separately.
> See the discussion at 
> http://www.nabble.com/RSS-fecter-and-index-individul-how-can-i-realize-this-function-tf3146271.html
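
For readers following the thread, the quoted proposal can be sketched roughly 
as below. This is only an illustration under stated assumptions, not the code 
in the attached patches: SimpleParse, FeedItem and parseFeed are hypothetical 
stand-ins and are not part of the Nutch API. It shows a parser returning one 
entry per feed item, keyed by the item URL, so that each item could be indexed 
without a separate fetch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiParseSketch {

    /** Hypothetical stand-in for a Nutch Parse: only a title and text. */
    static final class SimpleParse {
        final String title;
        final String text;

        SimpleParse(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /** Hypothetical feed item, as an RSS library might hand it over. */
    static final class FeedItem {
        final String link;
        final String title;
        final String description;

        FeedItem(String link, String title, String description) {
            this.link = link;
            this.title = title;
            this.description = description;
        }
    }

    /**
     * Builds one parse per feed item, keyed by the item's URL.
     * The feed is downloaded once; the items are not fetched again.
     */
    static Map<String, SimpleParse> parseFeed(List<FeedItem> items) {
        // LinkedHashMap keeps the items in feed order.
        Map<String, SimpleParse> parses = new LinkedHashMap<>();
        for (FeedItem item : items) {
            parses.put(item.link, new SimpleParse(item.title, item.description));
        }
        return parses;
    }

    public static void main(String[] args) {
        List<FeedItem> items = Arrays.asList(
            new FeedItem("http://example.com/a", "First post", "Text of the first post"),
            new FeedItem("http://example.com/b", "Second post", "Text of the second post"));

        // Each map entry would become its own document in the index.
        for (Map.Entry<String, SimpleParse> e : parseFeed(items).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().title);
        }
    }
}
```

An indexer consuming such a map would iterate over the entries and treat each 
key as a document URL in its own right, which is where the saving over 
fetching every item separately would come from.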

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
