Hi,
I'm trying to use mnogosearch as a link validator
for a large number of sites, but I ran into a serious problem.
Here's my configuration, in its simplest
form:
DBAddr ...
DeleteBad no
Index no
CheckOnly NoMatch Regex ^http://barracuda\.enhydra\.org/.*\.html$
Realm *
This works beautifully, checking the existence of
links outside barracuda.enhydra.org without following them. Except
that when indexer gets to this link, it follows it and
starts indexing the other site:
<A href="http://www.sys-con.com/java/readerschoice2001/">
So now indexer is following through that page and all
of its links, and suddenly it is trying to check the whole world,
ignoring the CheckOnly parameter.
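
As a sanity check outside of indexer, the pattern can be tested against the offending URL directly. This is only a sketch: it assumes the CheckOnly regex behaves like an ordinary regular expression as implemented by Python's re module (mnogosearch's own matcher may differ), and the function name is_check_only and the on-site example URL are made up for illustration.

```python
import re

# Pattern copied verbatim from the CheckOnly line in the configuration.
PATTERN = re.compile(r"^http://barracuda\.enhydra\.org/.*\.html$")


def is_check_only(url: str) -> bool:
    """Return True when the URL does NOT match the pattern.

    This mirrors the intent of "CheckOnly NoMatch Regex": anything that
    fails to match should only be checked, not followed. It is an
    assumption about the intended semantics, not mnogosearch's code.
    """
    return PATTERN.search(url) is None


urls = [
    "http://barracuda.enhydra.org/index.html",  # hypothetical on-site page
    "http://www.sys-con.com/java/readerschoice2001/",  # the problem link
]

for url in urls:
    verdict = "check only" if is_check_only(url) else "follow"
    print(f"{url} -> {verdict}")
```

Under these assumptions the sys-con.com link does not match the pattern, so it falls into the NoMatch case and should only be checked. That suggests the regex itself is not at fault.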
I've tried different versions of the CheckOnly line,
with and without regex, splitting it into multiple lines, etc., but nothing seems to
help. And indexer doesn't ignore CheckOnly for all sites, just a
few.
Any ideas?
(I first tried a Server-based method:
DBAddr ..
DeleteBad no
Index no
Follow site
but this does not validate links from this site to
another.)
-Damon
