Hi Israel,

You should check out the parse-rss plugin in Nutch. It adds not only the RSS 
files themselves but also the outlinks found in the RSS. I think the feed 
plugin may do the same, but I'm not positive about that.

A lot of this has to do with MIME detection as well. Do you know if the URL you 
posted below is actually getting detected as RSS? Some questions:


 1.  What version of Nutch/JDK/OS are you using?
 2.  Do you have some log information that you can show to determine whether 
     the parse-rss or feed plugin is being called?
 3.  Have you activated those plugins (the plugin.includes property) in your 
     nutch-default.xml conf file?

Let me know on 1-3 and then maybe I can help more.
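
In case it helps with question 3, here is a rough sketch of what activating 
the plugin could look like. It assumes you put the override in 
conf/nutch-site.xml (whose values take precedence over nutch-default.xml). The 
list of other plugin ids in the value is only a placeholder, not a recommended 
setting: keep whatever your version already lists and add parse-rss (or feed) 
to it.

```xml
<!-- conf/nutch-site.xml: properties here override conf/nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Placeholder value for illustration only. Start from the value
         already in your nutch-default.xml and add "rss" to the parse-(...)
         group (or add "feed" as a separate entry). -->
    <value>protocol-http|urlfilter-regex|parse-(html|rss)|index-basic|scoring-opic</value>
  </property>
</configuration>
```

After a change like this, the logs from the next fetch/parse should show 
whether the plugin is actually being picked up for your feed URL.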

Cheers,
Chris


On 11/7/10 9:21 AM, "Israel" <[email protected]> wrote:

Hello, my problem is:

I want to search this rss:

http://www.merlot.org/merlot/materials.xml?sort.property=overallRating&rssTitle=Highest+Rated+Materials+In+MERLOT+

but I don't want the search results page to show me the RSS itself; I
only want to see the links inside it.

I can't configure the "regex-urlfilter" file. Please help me; I need a
step-by-step mini tutorial.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
