FeedParser empty links for items
--------------------------------

                 Key: NUTCH-583
                 URL: https://issues.apache.org/jira/browse/NUTCH-583
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.0.0
            Reporter: Enis Soztutar
            Assignee: Enis Soztutar
             Fix For: 1.0.0


FeedParser in the feed plugin simply discards an item if it does not have a 
<link> element. However, RSS 2.0 does not require a <link> element for each 
<item>. 
Moreover, the link is sometimes given in the <guid> element, which is a 
globally unique identifier for the item. I think we can first search for a 
URL for the item, and if one is still not found, we can use the feed's URL, 
merging all the parse texts into one Parse object. 
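
The fallback order proposed above could be sketched roughly as follows. This 
is only an illustration, not the actual FeedParser code: the class 
ItemUrlResolver and its methods are hypothetical names, and treating a <guid> 
as usable only when it is an absolute http(s) URL is an assumption of this 
sketch. Merging the parse texts of items that fall back to the feed's URL is 
not shown.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Nutch: picks a URL for a feed item.
public class ItemUrlResolver {

    /** Returns true if s is a syntactically valid absolute http(s) URL. */
    static boolean isHttpUrl(String s) {
        if (s == null || s.trim().isEmpty()) {
            return false;
        }
        try {
            URI uri = new URI(s.trim());
            String scheme = uri.getScheme();
            return uri.getHost() != null
                && scheme != null
                && (scheme.equalsIgnoreCase("http")
                    || scheme.equalsIgnoreCase("https"));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Resolution order: the item's link, then its guid (assumption: only
     * when the guid looks like a URL), then the URL of the feed itself.
     */
    static String resolve(String link, String guid, String feedUrl) {
        if (isHttpUrl(link)) {
            return link.trim();
        }
        if (isHttpUrl(guid)) {
            return guid.trim();
        }
        return feedUrl;
    }
}
```

With this ordering, an item whose <guid> is an opaque identifier (for 
example a tag: URI) rather than a URL falls through to the feed's URL, so 
several items may end up sharing one URL, which is why their parse texts 
would need to be merged.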

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
