-----Original message-----
> From: Sourajit Basak <[email protected]>
> Sent: Tue 11-Jun-2013 14:50
> To: [email protected]
> Subject: RSS based crawl - how to crawl ref links in next round
>
> We are crawling RSS links using a custom plugin. That's working fine.
>
> Our intention is to crawl the discovered urls in the subsequent round.
> However, we notice that the links discovered have a status fetch_success &
> also have a signature.

This should not be true for NEWLY discovered outlinks. Parser plugins return a List<Outlink>, and outlinks do not carry signature or fetch status information at all. Are you sure you haven't already crawled them?

> Hence the generate phase in the subsequent round
> isn't producing any urls to fetch.
>
> We are setting a non-null empty string as parseText in the custom plugin.
>
> Any ideas on how to force the second round?
>
> ~ Sourajit

