Hello, I'm using htdig to index an intranet site: several gigabytes of
documents (almost all .doc and .pdf), several tens of thousands of files.
So far the sizes of the db files are:

161809408 Nov 23 11:10 db.docdb
 13569024 Nov 23 11:10 db.docs.index
295576571 Nov 23 11:05 db.wordlist
238107648 Nov 23 11:05 db.words.db

and I'm at 20% of the work. The PC is a [EMAIL PROTECTED] with 128 MB of
RAM. Can this hardware do the job?
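
To give an idea of where this is heading, here is a rough projection in
Python. It is only a sketch: it assumes (my assumption, nothing htdig
guarantees) that the index files keep growing linearly with the number of
documents processed. The byte counts are copied from the listing above; the
names `sizes`, `fraction_done` and `project_total` are just for illustration.

```python
# Index file sizes in bytes, copied from the directory listing above.
sizes = {
    "db.docdb": 161809408,
    "db.docs.index": 13569024,
    "db.wordlist": 295576571,
    "db.words.db": 238107648,
}

# Progress so far, as stated above (20% of the work).
fraction_done = 0.20


def project_total(current_bytes, fraction):
    """Linear extrapolation: assumes size grows in proportion to progress."""
    return current_bytes / fraction


current_total = sum(sizes.values())
projected_total = project_total(current_total, fraction_done)

print(f"current:   {current_total / 2**20:.0f} MiB")
print(f"projected: {projected_total / 2**30:.2f} GiB")
```

Under that assumption the finished index would be a bit over 3 GiB, which is
far more than the 128 MB of RAM in this machine.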

I see that I can use MySQL with htdig instead of Berkeley DB; would that make
the search faster? Where can I find information on using MySQL with htdig?

Another question. The directory tree is:

year
+-->month_1
    +--> day_1
    +--> day_2
...
    +--> day_n
+-->month_2
    +--> day_1
    +--> day_2
...
    +--> day_n
...
+-->month_n
    +--> day_1
    +--> day_2
...
    +--> day_n

Every day_x directory contains the same set of subdirectories, so when I search
for the word "giustizia" I get one entry for every directory called "giustizia"
and for every file in those directories (right now, after more than a minute of
work, it returns 12,000 results). How can I manage this situation?

Why does htdig show me only 10 pages with 10 results per page? How can I see
all the results if I have more than 100 (not a strange situation when I scan
50,000 files), apart from telling htdig to show more than 10 documents per page?

Thanks, Pietro.
-- 
I will build myself a copper tower
With four ways out and no way in
But mine the glory, mine the power
(So I chose Amiga and GNU/Linux)


_______________________________________________
ht://Dig general mailing list: <htdig-general@lists.sourceforge.net>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
