Andrzej Bialecki wrote:

Chris Mattmann wrote:

Hi Folks,

I just wanted to let you know that I've submitted the parse-rss plugin that
I was working on to the JIRA system under issue "NUTCH-30"
(http://issues.apache.org/jira/browse/NUTCH-30). The plugin includes a patch
file (svn diff), along with the zipped up source and runtime libraries. The
rss parser is based on the commons-feedparser out of the jakarta sandbox,
and fully supports all of the major rss formats (atom, rss 1.0, 2.0, etc.).
Additionally, I've included a junit test that runs the parser on an example
rss file and validates the outlinks and content extracted.


I hope that you will find it useful and vote to have it included in the
nutch distro.


+1, with some reservations (see jira).

I think it's a very useful contribution. Thank you, Chris!

Wow... that's GREAT. (I'm the author of the FeedParser).

BTW, it's in commons-proper now, but I just haven't had a chance to do a 0.5.0 release. We've had a release candidate, but I need to release another one WRT some feedback we've had.

If you're running from a sandbox build, I'd HIGHLY recommend getting a commons-proper build of 0.5.0RC1.

http://jakarta.apache.org/commons/feedparser/

Kevin

--

Use Rojo (RSS/Atom aggregator). Visit http://rojo.com. Ask me for an invite! Also see irc.freenode.net #rojo if you want to chat.

Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html

If you're interested in RSS, Weblogs, Social Networking, etc... then you should work for Rojo! If you recommend someone and we hire them you'll get a free iPod!
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412



