Hello,
We're running HTDIG 3.1.5 on an ix86 Red Hat 6.1 system.
Htdig currently indexes about 50,000 documents (HTML and PDF)
from http://www.unesco.org/.
A cron job runs an incremental indexing every day from Monday to
Saturday. On Sunday, a full initial indexing is done instead.
The listing below shows the database directory after an initial (or
incremental) indexing run.
# ls -al /usr/local/htdig/db
total 2948548
drwxr-xr-x 2 root root 110592 May 7 03:21 .
drwxr-xr-x 6 root root 4096 May 3 17:52 ..
-rw-r--r-- 1 root root 480542720 May 7 03:23 db.docdb
-rw-r--r-- 1 root root 480542720 May 7 03:21 db.docdb.work
-rw-r--r-- 1 root root 263168 May 7 03:23 db.docs.index
-rw-r--r-- 1 root root 263168 May 7 03:21 db.docs.index.work
-rw-r--r-- 1 root root 568581620 May 7 03:24 db.wordlist
-rw-r--r-- 1 root root 568581620 May 7 03:21 db.wordlist.work
-rw-r--r-- 1 root root 458707968 May 7 03:25 db.words.db
-rw-r--r-- 1 root root 458707968 May 7 03:21 db.words.db.work
Both incremental and initial indexing runs frequently report the
following problems:
...
BAD TAG IN SERIALIZED DATA: 110
...
DB2 problem...: /usr/local/htdig/db/db.docdb.work: page 1191531215
doesn't exist, create flag not set
...
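
A rough sanity check on the second error can be sketched in Python. This
is only an illustration and not part of htdig: the candidate page sizes
are assumptions (the actual page size of db.docdb.work is not known
here), and the helper name pages_in_file is hypothetical. It compares
the page number from the error message with the number of pages a file
of that size could hold.

```python
# Sizes and page number copied from the listing and error message above.
FILE_SIZE = 480542720        # bytes, size of db.docdb.work
REPORTED_PAGE = 1191531215   # page number from the "DB2 problem" message

# Assumed candidate page sizes (powers of two); the real one is unknown.
CANDIDATE_PAGE_SIZES = [512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]


def pages_in_file(file_size, page_size):
    """Return the number of whole pages that fit in file_size bytes."""
    return file_size // page_size


for page_size in CANDIDATE_PAGE_SIZES:
    pages = pages_in_file(FILE_SIZE, page_size)
    status = "plausible" if REPORTED_PAGE < pages else "out of range"
    print(f"page size {page_size:>5}: {pages:>7} pages, "
          f"reported page is {status}")
```

Even with the smallest assumed page size the file holds fewer than a
million pages, so the reported page number looks like a corrupted value
rather than a reference to a real page.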
As a consequence, searching doesn't work properly.
So far, the only way I have found to fix it is to reindex from scratch
(i.e. run an initial indexing).
I had a look at the Htdig discussion list. The problem is clearly
reported there, but unfortunately I could not find out how to solve it
for good.
Best regards,
--
Alain FORCIOLI
-------------------------------------------------------------------
RISC Technology http://www.risc.fr/ [EMAIL PROTECTED]
APRIL http://www.april.org/ [EMAIL PROTECTED]
Debian GNU/Linux http://www.debian.org/