Dear all,

after many unsuccessful attempts I really hope that the htdig community can help me. My problem is as follows: I have a fairly large server with more than 100,000 PDFs on it. For indexing I create an HTML file with links to all PDFs and use this file as start_url. It now seems that I have hit a magical 2 GByte limit, because indexing (an htdig run) stops as soon as db.docdb reaches a size of 2147483647 (2^31 - 1) bytes. I can see in the log files (htdig -vv) that htdig simply stops and does not process the remaining PDFs.
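As a side note, the size at which db.docdb stops growing is exactly the largest value a signed 32-bit integer can hold. The following is only a minimal sketch in Python, not anything taken from htdig itself: the helper name can_exceed_2gib is made up for illustration, and it assumes the directory is writable and that the file system supports sparse files. It checks the arithmetic and probes whether a given directory accepts a file that crosses the 2^31 - 1 byte boundary.

```python
import os
import tempfile

# Largest signed 32-bit value; this is the size at which db.docdb stops growing.
INT32_MAX = 2**31 - 1  # 2147483647 == 0x7FFFFFFF


def can_exceed_2gib(directory):
    """Hypothetical helper: return True if a file larger than INT32_MAX bytes
    can be created in `directory` (written as a sparse file where supported)."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.seek(INT32_MAX + 1)  # move just past the 2^31 - 1 boundary
            f.write(b"\0")         # a single byte makes the file INT32_MAX + 2 bytes long
        return os.path.getsize(path) == INT32_MAX + 2
    except (OSError, OverflowError):
        # The file system, kernel, or runtime refused the large offset.
        return False
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("2^31 - 1 =", INT32_MAX)
    print("large files possible here:", can_exceed_2gib("."))
```

If this probe succeeds in the db directory while db.docdb still stops at 2147483647 bytes, that would suggest the limit is not imposed by the file system alone; the probe does not say where else it comes from.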
My unsuccessful attempts so far:
- installation of htdig 3.1.5/3.1.6 (self-compiled, i.e. no package)
- db directory on an ext2/ext3/reiser partition
- kernel 2.4.10 (Suse 7.3)
- kernel 2.4.21 (Suse 9)

I've read in http://www.geocrawler.com/mail/msg.php3?msg_id=9056546&list=8822 that ReiserFS could be an option, but it didn't work for me. Besides, I can easily create files bigger than 2 GByte even on an ext2 partition (I verified that with a shell script). Htdig 3.2.0b5 is not really an option, since digging is slower than with 3.1.6 by a factor of ten (which would mean a full ten days of indexing), despite the possible optimisations described in the FAQ.

I know that file size limits can depend on the architecture (I run x86), but also on the kernel (older kernels had this 2 GByte limit, but mine is brand new?!). What puzzles me is that the author of the message linked above could overcome his problem simply by changing his file system, but I can't...

Any help is really appreciated.

Thanks,
Anton

_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.): https://lists.sourceforge.net/lists/listinfo/htdig-general

