Hi, Chris added the RSS parser plugin a while back. I've never used it, so I'm not sure what it is really for. Can somebody explain?
Normally, fetching and indexing a single web page results in a single Document in the index. What happens when an RSS feed is encountered? If the feed is a full-content one, do we treat each item as its own page/Document, and if it's not, do we extract the item links and include them in some future fetchlist?

How does the link to an RSS feed make it into a fetchlist to begin with? Does one have to include it explicitly, or does some other parser also extract links to feeds from the HEAD > LINK element? (http://issues.apache.org/jira/browse/NUTCH-412 ?)

Thanks,
Otis

Simpy -- http://www.simpy.com/ - Tag - Search - Share

Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
