According to Peter Derr:
> First I'll admit to working with htdig for only a few weeks, so I apologize
> if this has been asked before (I tried searching the mail list archives and
> didn't find anything -- the search function for the 'general' list is
> broken).
>
> I've ported htdig 3.2.0b3 to Compaq Tru64 UNIX (with difficulty -- I'll be
> submitting diffs once I get b4 to build).
>
> Using the provided rundig script, I was having a problem indexing a certain
> set of web pages (but not others). The first time indexing, no problem, but
> on subsequent runs htpurge would dump core with messages like this:
>
> FATAL ERROR:WordDBPage::~WordDBPage: page not empty
> FATAL ERROR at file:WordDBPage.h line:484 !!!

It's hard to say for sure whether this is a 3.2.0b3 bug or a problem
specific to your platform. Certainly you should try b4 first, and see
if the problem goes away. b3 has lots of bugs which are fixed in the
latest b4 snapshots.

> Debugging I found that it seemed to be reading bogus data from the database.
>
> I notice that in htdig.cc, if the -i (initial) option is set it unlinks all
> the database files *except* for db.words.db_weakcmpr ! Why? Is this an
> oversight? I found that if I remove this file before rerunning htdig (or
> rundig) everything works fine. Also, if I modify main() in htdig.cc to
> unlink() it in the initial case, everything works fine, too. Context diff
> of this change is included below.

Thanks for the patch. Yes, this is an oversight. The plan is to merge
in the latest version of mifluz, which handles database compression a
little differently, and supposedly does away with this file. However,
until we do merge in the latest DB code, we should try to deal
consistently with this annoying weakcmpr file.

> Could someone tell me what db.words.db_weakcmpr is for? I haven't been able
> to figure it out yet.

It's a file that contains overflow records from the words database in
cases where the compression is weak and the DB code needs more room.
It's a pretty inelegant solution to a sticky problem.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
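
For anyone hitting the same core dump, the idea discussed above (removing
the weakcmpr file whenever the word database itself is removed) might be
sketched roughly as follows. This is not the submitted patch and not the
actual code in htdig.cc: the helper name remove_word_db_files is
hypothetical, the "_weakcmpr" suffix is inferred only from the file name
db.words.db_weakcmpr mentioned in the thread, and std::remove is used here
as a portable stand-in for the unlink() call the original message refers to.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper, not taken from htdig.cc: remove the word database
// and its companion "_weakcmpr" overflow file together, so that a fresh
// (initial) dig never sees stale overflow records from a previous run.
// Returns the number of files actually removed (0, 1, or 2).
int remove_word_db_files(const std::string &word_db)
{
    int removed = 0;

    // The word database itself, e.g. "db.words.db".
    if (std::remove(word_db.c_str()) == 0)
        ++removed;

    // The overflow file that sits next to it, e.g. "db.words.db_weakcmpr".
    // The suffix is an assumption based on the file name in this thread.
    const std::string weakcmpr = word_db + "_weakcmpr";
    if (std::remove(weakcmpr.c_str()) == 0)
        ++removed;

    return removed;
}
```

Calling remove_word_db_files("db.words.db") in the directory holding the
databases would have the same effect as deleting both files by hand before
rerunning htdig or rundig, which is the workaround reported to work above.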

