At 9:46 PM +0100 12/1/01, Ralph Ballier wrote:
>I have a big problem using htdig-3.2.0b3.

First off, I'd suggest grabbing a snapshot of 3.2.0b4, not least 
because it includes a variety of bug fixes and an important security 
fix.

>Last night there was a big trouble: I got a big mail (2 GByte !!!),
>containing 37.006.820 lines(!!!) with the same content:
>             WordKey::Compare: key length for a or b < info.num_length

I don't know what your max_doc_size attribute is set to, but one 
reason for this attribute is to prevent problems arising from "mail 
bombing" and the like.

In your case, I wonder how large your databases are. Remember that on 
some operating systems (Linux on Intel in particular), files are 
limited to 2GB in size. So if your word database gets larger than 
this size, there's little htdig can do when indexing--it gets strange 
error messages back from the OS because the file is too large and 
things come to a halt.
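
As a rough way to check whether you are near that limit, here is a 
minimal Python sketch that lists files in the database directory at or 
above a chosen fraction of 2GB. The function name, the 90% threshold 
and the example path are my own assumptions, not part of htdig; point 
it at whatever your database_dir is.

```python
import os

# 2 GiB; assumed here to be the file size limit discussed above.
TWO_GIB = 2**31


def files_near_limit(db_dir, threshold=0.9, limit=TWO_GIB):
    """Return (name, size) for regular files at or above threshold * limit."""
    hits = []
    for name in sorted(os.listdir(db_dir)):
        path = os.path.join(db_dir, name)
        if os.path.isfile(path):
            size = os.path.getsize(path)
            if size >= threshold * limit:
                hits.append((name, size))
    return hits


if __name__ == "__main__":
    # Hypothetical path: replace with your own database directory.
    db_dir = "/var/lib/htdig/db"
    if os.path.isdir(db_dir):
        for name, size in files_near_limit(db_dir):
            print(f"{name}: {size} bytes")
```

If db.words.db shows up in that list, the size limit is a likely 
suspect.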

You may also have somewhat corrupted databases from your earlier 
problems. One thing you can do with the 3.2 betas is run the "htdump" 
program to write the databases out to ASCII text files. You can then 
delete the binary files (db.words.db, db.docs.index, db.docdb and 
db.excerpts) and run the "htload" program to rebuild them from the 
text archives.

(Please note that the ASCII files are almost always significantly 
larger than the old databases--so if your databases are large and you 
face the 2GB limit, this won't help. Nor will it help if you don't 
have enough free disk space.)
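
The dump-and-rebuild steps could be scripted along these lines. This 
is only a sketch under several assumptions: that htdump and htload 
accept "-c configfile" like the other ht://Dig programs (check their 
usage output first), that the binary files carry the names listed 
above, and that a safety factor of 2 is a reasonable guess for how 
much room the ASCII dump needs. The function names are hypothetical. 
It defaults to a dry run that only reports what it would do.

```python
import os
import shutil
import subprocess

# File names as listed in the message above; adjust to your setup.
BINARY_DBS = ["db.words.db", "db.docs.index", "db.docdb", "db.excerpts"]


def has_room_for_dump(db_dir, safety_factor=2.0):
    """Rough check: free space >= safety_factor * size of the binary files."""
    total = 0
    for name in BINARY_DBS:
        path = os.path.join(db_dir, name)
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return shutil.disk_usage(db_dir).free >= safety_factor * total


def dump_and_rebuild(db_dir, config, dry_run=True):
    """Run htdump, remove the binary databases, then run htload.

    Returns the list of actions taken (or planned, when dry_run is True).
    """
    if not has_room_for_dump(db_dir):
        raise RuntimeError("not enough free disk space for the ASCII dump")

    actions = []

    # Assumed invocation; verify the options against your htdump build.
    dump_cmd = ["htdump", "-c", config]
    actions.append(" ".join(dump_cmd))
    if not dry_run:
        subprocess.run(dump_cmd, check=True)

    # Only delete the binary files once the dump has succeeded.
    for name in BINARY_DBS:
        path = os.path.join(db_dir, name)
        actions.append("remove " + path)
        if not dry_run and os.path.exists(path):
            os.remove(path)

    # Assumed invocation; verify the options against your htload build.
    load_cmd = ["htload", "-c", config]
    actions.append(" ".join(load_cmd))
    if not dry_run:
        subprocess.run(load_cmd, check=True)

    return actions
```

Calling dump_and_rebuild(db_dir, config) with the default dry_run=True 
returns the planned steps without touching anything, which is worth 
reviewing before running it for real.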

-- 
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
