On Thu, Jan 23, 2003 at 05:23:48PM -0800, Glenn Little wrote:
> I have two different directories for the two databases, and
> when I do a search with the search form that points htsearch
> to the alternate config, files in *both* database directories
> are accessed (as reported by unix "ls -lu").

That's whacked. The only explanation I can think of is the
"include" that someone else mentioned.

> In addition, I get results from both.  I am using "restrict"
> right now, just as a hack to filter the results down to something
> reasonable.  Of course, that's not foolproof because the pattern
> I'm using for "restrict" does in fact appear in the main web site
> as well in a couple of places.

What about using "exclude_urls" in your config file? You could then
exclude the top-level directories of your main site (assuming you
don't have hundreds of them, or even tens for that matter). I've
successfully used exclude_urls to strip out URLs that carry formatting
parameters, e.g. font_size=100 or font_size=200. You know, silly stuff
that shouldn't be there...

I also just noticed this one: limit_urls_to
It might help you (although the default value is the start_url).
More info here: http://htdig.org/attrs.html#limit_urls_to
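
To make the interaction of the two attributes concrete, here is a rough
sketch in Python. It assumes plain substring matching, which is how I read
the attribute documentation. The function name, patterns, and URLs are made
up for illustration and are not ht://Dig code or your actual settings.

```python
def url_allowed(url, limit_urls_to, exclude_urls):
    """Return True if url passes both filters (illustrative only)."""
    # limit_urls_to: the URL must contain at least one of the patterns.
    if not any(pattern in url for pattern in limit_urls_to):
        return False
    # exclude_urls: the URL must contain none of the patterns.
    return not any(pattern in url for pattern in exclude_urls)


# Hypothetical values, not taken from the original config files.
limit = ["http://www.example.com/docs/"]
exclude = ["font_size=", "/cgi-bin/"]

urls = [
    "http://www.example.com/docs/index.html",
    "http://www.example.com/docs/page.html?font_size=200",
    "http://www.example.com/main/index.html",
]
kept = [u for u in urls if url_allowed(u, limit, exclude)]
```

With these made-up values only the first URL survives: the second is
dropped by the exclude pattern and the third never matches the limit.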


> Oh, also, the database is one of the few changes between the
> two config files, and when I use the default config file only the
> default database is searched but when I use the alternate config
> file, then both.

As someone else already suggested, look for "include" in your new
config file. The other thing would be to rerun the crawl with the new
config file and use -i to create your databases from scratch. Maybe
there's leftover data from a previous gaffe that you haven't dumped yet.
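
If the config file is long, a small script can list the lines worth
checking. This is only a sketch: it assumes the usual "name: value" layout
with "#" comments, ignores backslash line continuations, and the sample
contents and paths are invented rather than taken from your setup.

```python
# Attribute names to flag; adjust this set to taste.
WATCHED = {"include", "database_dir", "database_base"}


def find_suspects(config_text):
    """Return (line number, name, value) for each watched attribute."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        name, sep, value = stripped.partition(":")
        if sep and name.strip().lower() in WATCHED:
            hits.append((lineno, name.strip().lower(), value.strip()))
    return hits


# Made-up config contents for illustration.
sample = """\
# alternate config
include: /etc/htdig/htdig.conf
database_dir: /var/lib/htdig/alt
start_url: http://www.example.com/docs/
"""
hits = find_suspects(sample)
```

Any "include" line it reports is worth following, since the included file
could be setting the database location for the default site as well.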

emma

-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]

