Franz,

Someone else will need to confirm this...

FYI... why not simply inject the URLs directly into Nutch?

./nutch inject db/ -urlfile seeds.txt
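
For what it's worth, here is a rough Python sketch (not Nutch code) of how I understand the rules in regex-urlfilter.txt to be evaluated: each rule is a '+' or '-' followed by a regex, and the first rule that matches a URL decides whether it is kept. The function name, the sample rules, and the intranet.example.com URLs are all made up for illustration. If my understanding is right, a page rejected by the filter is never fetched at all, so its links would not be read either.

```python
import re

def url_allowed(url, rules):
    """Return True if the first matching rule is a '+' rule.

    Each rule is a sign ('+' or '-') followed by a regular expression.
    A URL that matches no rule at all is rejected.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False

# Hypothetical rules: reject the link page, accept the rest of the intranet.
rules = [
    r"-^http://intranet\.example\.com/all-links\.html$",
    r"+^http://intranet\.example\.com/",
]

print(url_allowed("http://intranet.example.com/all-links.html", rules))
print(url_allowed("http://intranet.example.com/docs/page1.html", rules))
```

The order of the rules matters in this sketch: swapping the two lines would let the '+' rule match the link page first, and it would be accepted.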


At 03:49 PM 1/20/2006, you wrote:

Thank you, but if I do that, will the page still be read for URLs?
Cheers, Frank

On 1/20/06, Neal Whitley <[EMAIL PROTECTED]> wrote:
> Franz,
>
> I 'think' you could use the regex url filter to not index this page
> (regex-urlfilter.txt).
>
> Something like:  -^http://([a-z0-9]*\.)*tripod\.com/
>
> I am new to Nutch so I make no guarantee... :-)
>
> Neal
>
>
>
> At 05:23 AM 1/20/2006, you wrote:
>
> >Hello,
> >
> >We are trying to implement Nutch on an intranet and have set up a
> >special page which has links to all the other pages of the site, since
> >many are not linked together.
> >We will start with this special page and then go from there to all the
> >other pages, but we would like to not index the first page (so that it
> >doesn't show up in search results), just use it for its links.
> >Is there an easy way to do this?
> >
> >Thank you.
>
>
