According to Wanrong Qiu:
> We have been using Geoff Hutchison's multidig to index
> our intranet sites for about one year. It usually works
> fine. But in very rare case, one of our local site database
> will be corrupted, then the main intranet database will be
> corrupted too. Yesterday, this bad thing happened again. I
> got some error message like this:

I'm not all that familiar with Geoff's multidig script, so I don't
know if I can help pinpoint the problem(s).  I did notice a few things,
though.  First of all, the version of Geoff's multidig script that ships
with the contributed files in htdig doesn't print the message
"Merging db ... of collect ...", so I'm assuming that was added locally.
This raises the question of what other modifications may have been made
to the script.

Secondly, there seem to be multiple failures happening, and from the
error messages it doesn't seem all that likely that the first error
is the cause of all the others.

> Merging db rnd of collect htdig
> DB2 problem...: /home/web/htdig/db/rnd/db.docdb: page 1919837230 doesn't
> exist, create flag not set
> BAD TAG IN SERIALIZED DATA: 115
> BAD TAG IN SERIALIZED DATA: 114
> BAD TAG IN SERIALIZED DATA: 38
> BAD TAG IN SERIALIZED DATA: 38
> DB2 problem...: page 155296: illegal page type or format
> DB2 problem...: PANIC: Invalid argument
> DB2 problem...: missing or empty key value specified
> DB2 problem...: missing or empty key value specified

This does seem to be the most serious problem of them all, and suggests
database corruption.  A few things to look out for are:  1) make sure
you have lots of extra disk space available on the partition that holds
the databases - if you're running low, or if you have a disk quota that
you're exceeding, it can lead to truncation of a database file which would
corrupt the database; and 2) make sure that each database file is under
2 GB, as most systems don't support files larger than that, so this too
could cause corruption.
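
As a rough sketch of those two checks, something like the following could
be run before the merge step.  The function names, the 1900 MB warning
threshold and the minimum free space you pass in are all made up for
illustration; none of this comes from multidig itself, and it doesn't
check disk quotas at all.

```shell
# Hypothetical pre-merge checks; not part of the stock multidig script.

# Print the available kilobytes on the partition holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# List files under $1 larger than about 1900 MB, i.e. getting close to
# the 2 GB limit.  find counts -size in 512-byte blocks by default,
# so 1900 MB is 1900 * 2048 = 3891200 blocks.
big_files() {
    find "$1" -type f -size +3891200 -print
}

# Warn on stderr and return non-zero if directory $1 has less than
# $2 KB free or holds a file close to 2 GB.
check_db_dir() {
    dir=$1
    min_kb=$2
    status=0
    avail=$(free_kb "$dir")
    case "$avail" in
        ''|*[!0-9]*)
            echo "WARNING: could not read free space for $dir" >&2
            return 2
            ;;
    esac
    if [ "$avail" -lt "$min_kb" ]; then
        echo "WARNING: only $avail KB free on partition holding $dir" >&2
        status=1
    fi
    big=$(big_files "$dir")
    if [ -n "$big" ]; then
        echo "WARNING: files approaching 2 GB:" >&2
        echo "$big" >&2
        status=1
    fi
    return $status
}
```

For example, "check_db_dir /home/web/htdig/db 1048576 || exit 1" would stop
the script when less than 1 GB is free or a database file is nearing 2 GB.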

> Merging db sales of collect htdig
> Merging db stock of collect htdig
> Merging db testops of collect htdig
> Merging db traffic of collect htdig
> Merging db intranet_disti of collect htdig
> Merging db pinatubo of collect htdig
> Merging db oregon of collect htdig
> Merging db penpal of collect htdig
> htmerge: Unable to open word list file
> '/home/web/htdig/db/htdig/db.wordlist.work'
> 
> Merging db europa of collect htdig
> htmerge: Unable to open word list file
> '/home/web/htdig/db/htdig/db.wordlist.work'
> 
> Moving files htdig at: Mon Dec 10 11:36:14 PST 2001
> mv: cannot access /home/web/htdig/db/htdig/db.words.db.work
> chmod: WARNING: can't access /home/web/htdig/db/htdig/db.words.db
> Done with htdig at: Mon Dec 10 11:36:14 PST 2001

When htmerge can't open db.wordlist (or db.wordlist.work) it usually
means htdig didn't index anything (see FAQ 5.16), but in this context
I can't tell if the problem is with the penpal and europa databases,
or if the htdig collection's wordlist disappeared after merging oregon
and before merging penpal.  If it's the latter, it would suggest a bug
in your script, or perhaps some bugs in htmerge.
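
One way to narrow this down might be to have the script report on the
word list files just before each merge.  This is only a sketch: the
check_wordlist helper is hypothetical, and where to call it depends on
how your modified script is laid out.

```shell
# Hypothetical helper; not part of the stock multidig script.
# Report whether the word list file $1 exists and is non-empty.
check_wordlist() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "MISSING OR EMPTY: $1"
        return 1
    fi
}
```

Calling it on both the collection's db.wordlist.work and the site's own
db.wordlist ahead of each htmerge run should show whether the collection's
file vanishes between the oregon and penpal merges, or whether the penpal
and europa word lists were never there to begin with.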

You don't mention which version of htdig you're running, but if it's
3.1.5 or older, I suggest you try the 3.1.6 development snapshot at
http://www.htdig.org/files/snapshots/

In any case, it's also a good idea to add "LC_COLLATE=C ; export LC_COLLATE"
to your multidig script to prevent mis-sorting of the wordlist by htmerge.
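
To illustrate what that setting changes, here is a small sketch.  The
c_sort function is just a made-up name for the demonstration, and it
unsets LC_ALL on the assumption that your environment might have it set,
since LC_ALL overrides LC_COLLATE when both are present.

```shell
# Hypothetical demonstration helper; not part of multidig.
# Sort stdin in plain byte order, as the C collation locale does.
c_sort() {
    (
        # LC_ALL, when set, takes precedence over LC_COLLATE.
        unset LC_ALL
        LC_COLLATE=C
        export LC_COLLATE
        sort
    )
}

# In byte order, uppercase letters sort ahead of lowercase ones:
printf 'b\nA\na\nB\n' | c_sort
# prints A, B, a, b (one per line)
```

Other locales may interleave upper and lower case or ignore punctuation
when sorting, which is the kind of mis-sorting the setting is meant to
avoid.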

> Generating it_eng at: Mon Dec 10 11:36:14 PST 2001
> Merging db eng of collect it_eng
> /home/web/htdig/db/eng/db.wordlist: No such file or directory
> htmerge: Sorting...
> Bus Error - core dumped
> htmerge: Word sort failed

Now these last two failures are very strange indeed.  The error about
db.wordlist above is not the one that htmerge usually reports (which
is more like the ones further above), so I don't know which program is
reporting this "No such file or directory" error.  To have the sort
program die on a Bus error is truly odd.  I'm at a loss to explain
this one.  Even really weird data in db.wordlist shouldn't cause sort
to crash, but in this case it seems the db.wordlist isn't even there,
as though it disappeared between when htmerge checked for it and when
it called "sort".  Maybe you have a buggy sort program on your system.

What type of system are you running this on anyway?

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
