I have an (almost) standard installation of htdig. I have changed two
variables: max_head_length and max_doc_size, both of which are set to
150000. This is because some of the index files that point to the data are
around 150k (the data files themselves tend to sit at around 8k max). If
there is any way of reducing these values while still letting htdig find
all of the linked files, then this would probably be a start.

I have indexed around 97k files, with a total size of 787MB. Here are the
contents of the db dir:

drwxrwxr-x   2 root     root         1024 Mar 18 17:14 .
drwxr-xr-x  18 root     root         1024 Mar 18 14:02 ..
-rw-rw-r--   1 root     root     348661760 Mar 18 17:31 db.docdb
-rw-rw-r--   1 root     root     11640832 Mar 18 17:31 db.docs.index
-rw-rw-r--   1 root     root      4623360 Mar 18 17:47 db.metaphone.db
-rw-rw-r--   1 root     root      3689472 Mar 18 17:47 db.soundex.db
-rw-rw-r--   1 root     root     376777998 Mar 18 17:14 db.wordlist
-rw-rw-r--   1 root     root     291355648 Mar 18 17:14 db.words.db

As you can see, this is quite large :)
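
To put a number on "quite large", here is a minimal sketch that totals the sizes from the listing above and shows each file's share. The byte counts are copied from the listing; the names db_files and total_bytes are just illustrative and have nothing to do with htdig itself.

```python
# Sizes in bytes, copied by hand from the db directory listing above.
db_files = {
    "db.docdb": 348661760,
    "db.docs.index": 11640832,
    "db.metaphone.db": 4623360,
    "db.soundex.db": 3689472,
    "db.wordlist": 376777998,
    "db.words.db": 291355648,
}


def total_bytes(sizes):
    """Return the combined size in bytes of all listed files."""
    return sum(sizes.values())


total = total_bytes(db_files)
print(f"total: {total} bytes")

# Largest files first, with each file's share of the total.
for name, size in sorted(db_files.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {size:>12d}  {size / total:6.1%}")
```

Almost all of the space is in db.docdb, db.wordlist and db.words.db; the fuzzy databases (metaphone and soundex) are small by comparison.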

Currently, searches take 25-30 seconds to perform, and I was wondering
whether there is any way to tune this to improve performance somewhat. I'm
willing to trade off some functionality, but I simply don't know where to
start!

Answers on a postcard (well, by email would be nice too :))

Barry Zubel
Technical Manager
City Mutual Ltd
www.citymutual.com
