You'll notice another post I have here that has yet to be answered:
Subject: HELP - Disaster with index!!! How to recover???
It's been a week now without an answer, but I have to do something. Since I
can't seem to get an answer to that one, maybe I can get an answer to this
one, which will hopefully let me drop the above problem altogether.
The problem with the above is NOT a database size limit. I have checked all
table sizes and we are currently using only about 50% of the allowed MySQL
table size limit.
I've written a Perl script to access urlword and fetch all URLs with a
status of 200 and have written this list of URLs to a simple text file with
one URL per line. This consists of 3,112,768 unique URLs.
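
A minimal sketch, in Python rather than the Perl used above, of how a file
like this could be sanity-checked before it is handed to index. It assumes
only the one-URL-per-line format described above; the function names are
hypothetical and nothing here is part of aspseek.

```python
from collections import Counter


def summarize_url_lines(lines):
    """Return (total, unique, duplicates) for an iterable of URL lines.

    Blank lines are ignored; surrounding whitespace is stripped.
    """
    counts = Counter()
    total = 0
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip empty lines so they are not counted as URLs
        total += 1
        counts[url] += 1
    # URLs that appear more than once, sorted for stable output
    duplicates = sorted(url for url, n in counts.items() if n > 1)
    return total, len(counts), duplicates


def summarize_url_file(path):
    """Run the same check on a file (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return summarize_url_lines(fh)


if __name__ == "__main__":
    import sys

    # Usage: python check_urls.py ./myurls.txt
    if len(sys.argv) > 1:
        total, unique, dups = summarize_url_file(sys.argv[1])
        print(f"{total} lines, {unique} unique, {len(dups)} duplicated URLs")
```

If the total and unique counts match the expected number of URLs and no
duplicates are reported, the file is at least consistent with what was
exported.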
Now because aspseek has been corrupted for some unknown reason and the
"index -H" simply aborts, I obviously have to start the entire index over
again and keep my fingers crossed. So my question is, can I insert this HUGE
file of URLs using:
./index -i -f ./myurls.txt
and expect all 3,112,768 to be inserted? I'm sure they will be, but the big
question is what happens when I run index to fetch these documents:
./index -N 80 -R 64
Will index handle all this? Will index eat up all the available memory (2GB)
trying to load all these URLs into memory? I've had problems with aspseek
eating up all memory and eventually thrashing the disk cache after inserting
as few as 250,000 URLs. No problems running search, but index is a memory
hog. I understand that the aspseek.conf file has this directive:
NextDocLimit 1000
which is the default, but I don't know if that has anything to do with this
or not. What I can say is that I have found other directives, like
MaxBandwidth, do NOT work as stated (reported bug #26), so I'm afraid that if
NextDocLimit does what I think it does and it has bugs too, I may be wasting
my time on this whole project.
If anyone has any suggestions for an alternative indexing and search program,
please let me know. We are using aspseek on our intranet and not as a public
search, but fetching the entire document and providing a "cached" version is
what is important to us.
Thanks,
Karen
