Brilliant,
thanks for the help!
-Damon

----- Original Message ----- 
From: "Alexander Barkov" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; "Damon Tkoch" <[EMAIL PROTECTED]>
Sent: Monday, March 05, 2001 11:30 PM
Subject: Re: link validation config help!


> > Damon Tkoch wrote:
> > 
> > Hi,
> > I'm trying to use mnogosearch as a link validator for a large number
> > of sites, but I ran into a serious problem.
> > 
> > Here's my configuration, in its simplest form:
> > 
> > DBAddr ...
> > DeleteBad no
> > Index no
> > CheckOnly NoMatch Regex ^http://barracuda\.enhydra\.org/.*\.html$
> > Realm *
> > URL http://barracuda.enhydra.org/index.html
> > 
> > This works beautifully, checking the existence of links outside
> > barracuda.enhydra.org without following them.  Except that when indexer
> > gets to this link, it follows it and starts indexing the other site:
> > 
> > <A href="http://www.sys-con.com/java/readerschoice2001/">
> > 
> > So now indexer is following through that page, all of its links, etc,
> > and suddenly indexer is trying to check the whole world, ignoring the
> > CheckOnly parameter.
> > 
> > I've tried different versions of the CheckOnly, with or without regex,
> > splitting it into multiple lines, etc... nothing seems to help.  And
> > indexer doesn't ignore the CheckOnly for all sites, just a few.
> > 
> > Any ideas?
> > 
> > (I first tried a Server-based method,
> > 
> > DBAddr ..
> > DeleteBad no
> > Index no
> > Follow site
> > Server http://barracuda.enhydra.org/index.html
> > 
> > but this does not validate links from this site to another.)
> 
> 
> I think it should look like this (but I didn't check):
> 
> 
> 
> # do not build words index
> Index no
> 
> # The site itself
> Server http://barracuda.enhydra.org/index.html
> 
> # Other pages referenced from the site should be checked
> # but we don't want to follow further from them
> 
> Follow no
> Realm NoMatch http://barracuda.enhydra.org/*
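
One way to sanity-check which URLs the CheckOnly pattern from the original message covers is to run the regex by hand. The sketch below uses Python's re module purely for illustration (indexer uses its own regex engine, though a pattern this simple should behave the same way); the helper name is_check_only and the sample URL list are hypothetical, and the reading of "CheckOnly NoMatch Regex" as "only check URLs that do not match" is an assumption taken from the thread, not verified against indexer itself.

```python
import re

# Pattern copied verbatim from the CheckOnly line in the original message.
PATTERN = re.compile(r"^http://barracuda\.enhydra\.org/.*\.html$")


def is_check_only(url: str) -> bool:
    """Return True if the URL does NOT match the pattern.

    Assumed meaning of 'CheckOnly NoMatch Regex': non-matching URLs
    are only checked for existence, not indexed or followed.
    """
    return PATTERN.search(url) is None


# Hypothetical sample: the start URL and the problem link from the thread.
sample_urls = [
    "http://barracuda.enhydra.org/index.html",
    "http://www.sys-con.com/java/readerschoice2001/",
]

if __name__ == "__main__":
    for url in sample_urls:
        label = "check only" if is_check_only(url) else "index and follow"
        print(f"{label:16}  {url}")
```

The sys-con.com link does not match the pattern, so under the assumed semantics it already falls on the check-only side. That suggests the pattern itself is not what lets indexer wander off.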
