Dean,

I think you could also use the PruneIndexTool to "prune" those parent
pages from the index. If you search the mailing list archive, you can
find some discussion of how the PruneIndexTool works.
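
If you do end up going the IndexSegment route that Jérôme describes
below, the check itself is small. Here is a rough sketch in plain Java.
Note that IndexSkipFilter is a hypothetical helper, not part of Nutch,
the example.com URLs are made up, and the regex is assumed to come from
your own configuration. Where exactly this would hook into IndexSegment
is something you would need to check in the source.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a Nutch class): decides whether a URL
// should be added to the index, based on a configurable regex.
class IndexSkipFilter {
    private final Pattern skipPattern;

    // skipRegex: pattern for URLs that should be crawled but NOT indexed.
    IndexSkipFilter(String skipRegex) {
        this.skipPattern = Pattern.compile(skipRegex);
    }

    // Returns true when the whole URL does not match the skip pattern.
    boolean shouldIndex(String url) {
        return !skipPattern.matcher(url).matches();
    }

    public static void main(String[] args) {
        // Assumed example pattern: skip every page under /parent/.
        IndexSkipFilter filter =
            new IndexSkipFilter("http://www\\.example\\.com/parent/.*");

        String[] urls = {
            "http://www.example.com/parent/list.html",
            "http://www.example.com/articles/one.html"
        };
        for (String url : urls) {
            String decision = filter.shouldIndex(url) ? "index" : "skip";
            System.out.println(url + " -> " + decision);
        }
    }
}
```

The idea would be to call something like shouldIndex(url) just before a
document is added to the index, and skip the add when it returns false.
The page is still fetched and its links are still followed; it just
never ends up in the index.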

-Bryan

On 11/18/05, Dean Elwood <[EMAIL PROTECTED]> wrote:
> Hi Jerome,
>
> Thanks. So essentially I need to customise the IndexSegment class and
> rebuild it. I guess that's the beauty of open-source software. The
> downside is I don't know any Java!
>
> I'll take a look into getting that done later.
>
> Many thanks,
>
> Dean
>
> ----- Original Message -----
> From: "Jérôme Charron" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Thursday, November 17, 2005 10:18 PM
> Subject: Re: Crawling a page for links, but not indexing it
>
>
> > Is there any way that I can do this from the Nutch side?
>
> Yes ... by modifying the IndexSegment class so that it does not add to
> the index the documents that match a configurable URL...
> ;-)
>
> Jérôme
>
> --
> http://motrech.free.fr/
> http://www.frutch.org/
>
>
