Hi,

First, I'll admit I've been working with htdig for only a few weeks, so I
apologize if this has been asked before. (I tried searching the mailing list
archives and didn't find anything, but the search function for the 'general'
list is broken.)

I've ported htdig 3.2.0b3 to Compaq Tru64 UNIX (with difficulty -- I'll be
submitting diffs once I get b4 to build).

Using the provided rundig script, I was having a problem indexing a certain
set of web pages (but not others).  The first indexing run works without
problems, but on subsequent runs htpurge dumps core with messages like this:

FATAL ERROR:WordDBPage::~WordDBPage: page not empty
FATAL ERROR at file:WordDBPage.h line:484 !!!

While debugging, I found that it seemed to be reading bogus data from the
database.

I notice that in htdig.cc, if the -i (initial) option is set, it unlinks all
the database files *except* db.words.db_weakcmpr.  Why?  Is this an
oversight?  I found that if I remove this file before rerunning htdig (or
rundig), everything works fine.  Likewise, if I modify main() in htdig.cc to
unlink() it in the initial case, everything works fine.  A context diff of
this change is included below.

Could someone tell me what db.words.db_weakcmpr is for?  I haven't been able
to figure it out yet.


Thanks,
Peter

============
Peter Derr
Compaq Tru64 UNIX Internet Engineering Group
Tel: 01.603.884.2977
[EMAIL PROTECTED]


htdig/htdig.cc diffs:


***************
*** 255,263 ****
                         filename.get()));
      }

!     const String              word_filename = config["word_db"];
      if (initial)
         unlink(word_filename);

      // Initialize htword
      WordContext::Initialize(config);
--- 255,267 ----
                         filename.get()));
      }

!     String            word_filename = config["word_db"];
      if (initial)
+     {
         unlink(word_filename);
+        word_filename += (const char *)"_weakcmpr";
+        unlink(word_filename);
+     }

      // Initialize htword
      WordContext::Initialize(config);
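
The same cleanup can also be sketched as a standalone helper, outside of
htdig.  This is only an illustration under stated assumptions: the function
name remove_word_db is hypothetical, std::string stands in for htdig's String
class, and the "_weakcmpr" suffix is simply taken from the file name observed
above.  It removes the word database and its companion file, and treats a
file that is already missing as harmless.

```cpp
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical helper, not part of htdig: removes the word database file
// and its "_weakcmpr" companion.  Returns the number of files actually
// removed, or -1 if unlink() fails for a reason other than the file not
// existing (ENOENT).
int remove_word_db(const std::string &word_db)
{
    const std::string files[] = { word_db, word_db + "_weakcmpr" };
    int removed = 0;

    for (const std::string &f : files) {
        if (unlink(f.c_str()) == 0)
            ++removed;          // file existed and was removed
        else if (errno != ENOENT)
            return -1;          // some other failure, e.g. permissions
    }
    return removed;
}
```

Calling remove_word_db() with the path configured as word_db before a fresh
indexing run would have the same effect as removing both files by hand.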



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
