Re: selective crawl

Markus Jelsma Tue, 19 Jul 2011 10:21:01 -0700

So you still want to crawl and parse (for outlinks) but not index. You could
use a parse filter to mark a page as interesting (perhaps by adding a flag to
its metadata) and write an indexing filter that conditionally indexes pages
based on that mark.
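
A rough sketch of the control flow, using plain Java stand-ins rather than the
real Nutch classes. Everything named here (ParsedPage, markIfInteresting,
toIndexDoc, the "interesting" key, the "product" rule) is made up for
illustration. In Nutch the two steps would live in an HtmlParseFilter plugin
and an IndexingFilter plugin respectively; as far as I know an indexing filter
can drop a document by returning null, but verify that against your version.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in types only: this is NOT the Nutch API, just the idea in isolation.
class SelectiveIndexSketch {

    // Hypothetical metadata key used to carry the mark from parse to index time.
    static final String INTERESTING_KEY = "interesting";

    /** Minimal stand-in for a parsed page: text, outlinks, and parse metadata. */
    static final class ParsedPage {
        final String url;
        final String text;
        final List<String> outlinks;
        final Map<String, String> metadata = new HashMap<>();

        ParsedPage(String url, String text, List<String> outlinks) {
            this.url = url;
            this.text = text;
            this.outlinks = outlinks;
        }
    }

    /** Parse-time step: mark the page, but leave its outlinks untouched. */
    static void markIfInteresting(ParsedPage page) {
        // Hypothetical rule; replace with whatever makes a page interesting.
        if (page.text.contains("product")) {
            page.metadata.put(INTERESTING_KEY, "true");
        }
    }

    /** Index-time step: build a document for marked pages, null for the rest. */
    static Map<String, String> toIndexDoc(ParsedPage page) {
        if (!"true".equals(page.metadata.get(INTERESTING_KEY))) {
            return null; // skip indexing; the page was still parsed for outlinks
        }
        Map<String, String> doc = new HashMap<>();
        doc.put("url", page.url);
        doc.put("content", page.text);
        return doc;
    }

    public static void main(String[] args) {
        ParsedPage hub = new ParsedPage("http://example.com/",
                "category listing", Arrays.asList("http://example.com/item"));
        ParsedPage item = new ParsedPage("http://example.com/item",
                "product details", Arrays.<String>asList());

        for (ParsedPage page : Arrays.asList(hub, item)) {
            markIfInteresting(page);
            boolean indexed = toIndexDoc(page) != null;
            System.out.println(page.url + " indexed=" + indexed
                    + " outlinks=" + page.outlinks.size());
        }
    }
}
```

The point of splitting it this way is that the decision is made once at parse
time and only consulted at index time, so uninteresting pages still contribute
their outlinks to the crawl.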


> Hello,
> 
> If I were to identify certain pages as pages of interest, in the
> parse-html plugin, how can I index only pages I mark as interesting,
> and exclude the rest? However I have to be able to extract outlinks
> from pages of non-interest.
> 
> What would be the correct approach to do that?
> 
> Best.
