[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494789 ]
nutch.newbie commented on NUTCH-444:
------------------------------------

Hi,

If I may jump in :-)

I am very happy to see NUTCH-443 in the trunk. We have used the older NUTCH-443 patch together with NUTCH-444 in production with great success. However, we had to "freeze" Nutch, and we haven't updated the core since. Now that 443 is in the trunk I am very tempted to update the core code, but I am waiting to see what happens with 444. I was just about to write a mail to Doğacan.

Anyway, IMHO, I would vote for the Rome parser, and I am saying that strictly from an operational perspective. We used to run our site with feedparser, but things are much better now with Rome. So loads of thanks to Doğacan!

Cheers

> Possibly use a different library to parse RSS feed for improved performance and compatibility
> ---------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-444
>                 URL: https://issues.apache.org/jira/browse/NUTCH-444
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 0.9.0
>            Reporter: Renaud Richardet
>         Assigned To: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.0.0
>
>         Attachments: parse-feed-v2.tar.bz2, parse-feed.tar.bz2
>
>
> As discussed by Nutch Newbie, Gal, and Chris on NUTCH-443, the current library (feedparser) has the following issues:
> - OutOfMemory when parsing > 100k feeds, since it has to convert the feed to jdom first
> - no support for Atom 1.0
> - there has been no development in the last year
> Alternatives are:
> - Rome
> - Informa
> - custom implementation based on Stax
> - ??

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers