According to StR:
> I have an list.php that generates a list of all sites in my domain
>
> like this
>
> <a href="dir1/files.php"> Dir 1</a>
> <a href="dir2/files.php"> Dir 2</a>
> <a href="dir3/files.php"> Dir 3</a>
>
> and dir1/files.php looks like:
>
> <a href="dir1/file1.php"> 1 File 1 </a>
> <a href="dir1/file2.php"> 1 File 2 </a>
> <a href="dir1/file3.php"> 1 File 3 </a>
> <a href="dir1/file4.php"> 1 File 4 </a>
> ...

Well, first of all, if you have a relative href like dir1/file1.php inside
the file dir1/files.php, then the web client (htdig, or a web browser) would
piece the URL together as dir1/dir1/file1.php. Is that what you want? If
not, i.e. if file1.php is at the same level as files.php in dir1, then the
URLs within dir1/files.php shouldn't also contain the "dir1/" portion of the
path.

> File 1, 2,3,4... are the files i want htdig to index..
>
> each dir has like 500 links
>
> but if i search a word of dir2.. it only finds me matches from file1 to
> file...250... from 251 to 500 it does not find them...
>
> and if i search words of the files in dir3 it does find them.. .why is that?
>
> Thanks every1...
>
> PD: max_head_length: 10000
> max_doc_size: 2000000

Well, this behaviour is certainly consistent with document truncation, as
described in http://www.htdig.org/FAQ.html#q5.1 . However, at approximately
40 bytes per link line in dir1/files.php, for 500 files you'd only have about
20 KB plus overhead for that file. Even if htdig wasn't picking up your
max_doc_size setting above, for whatever reason, the default value should
still be adequate.

A couple of other things to look into:

- point your web browser to the dir1/files.php page, and do a View File Info
  to see the total size. It may be that there's a lot of "padding" in it,
  making it bigger than it should be.

- have a close look at the max_doc_size setting in your htdig.conf, perhaps
  using "cat -v", to make sure there isn't some control character or
  something that slipped into the value for it.

If neither of these provides any relief, try running htdig -vvv to see what
htdig sees when it indexes one of these URL list files.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
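
P.S. To make the relative-link point concrete, here is a small Python sketch of how a client joins an href to the URL of the page it appears on. It uses the standard library's urljoin rather than htdig's own code, and the host name www.example.com is a placeholder (the original post does not give one); only the paths come from the listing quoted above.

```python
from urllib.parse import urljoin

# Placeholder host; only the path part is taken from the post.
PAGE_URL = "http://www.example.com/dir1/files.php"


def resolve(page_url, href):
    """Join a (possibly relative) href to the URL of the page containing it."""
    return urljoin(page_url, href)


# The href as written in the quoted dir1/files.php: "dir1/" ends up doubled.
print(resolve(PAGE_URL, "dir1/file1.php"))

# The href without the directory prefix: resolves next to files.php.
print(resolve(PAGE_URL, "file1.php"))
```

If the first form is what dir1/files.php really contains, every link in it points into dir1/dir1/, which is worth ruling out before chasing size limits.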
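
P.P.S. If "cat -v" isn't handy, the same two checks on max_doc_size can be roughed out in Python. This is only a sketch: the helper names are made up for illustration, the 40 bytes and 500 links are the estimates from the reply above, and the sample text stands in for a real htdig.conf, with a stray carriage return added deliberately.

```python
def estimated_list_size(links, bytes_per_link=40):
    """Rough size in bytes of a page that is nothing but link lines."""
    return links * bytes_per_link


def find_control_chars(text):
    """Return (line_number, line) for each line holding a control character.

    Tabs are allowed; anything else below 0x20, or DEL (0x7f), is flagged.
    Lines are split on "\n" only, so a stray carriage return stays visible.
    """
    hits = []
    for number, line in enumerate(text.split("\n"), start=1):
        if any((ord(ch) < 32 and ch != "\t") or ord(ch) == 127 for ch in line):
            hits.append((number, line))
    return hits


# Stand-in for the two settings quoted in the post, with a "\r" slipped in.
sample = "max_head_length: 10000\nmax_doc_size: 2000000\r\n"

print(estimated_list_size(500))             # 500 links at about 40 bytes each
print(estimated_list_size(500) < 2000000)   # compared with the posted max_doc_size
for number, line in find_control_chars(sample):
    print(number, repr(line))
```

To run it on a real configuration, read your own htdig.conf (its location depends on the installation) as bytes, decode it as latin-1 so that nothing fails to decode, and pass the result to find_control_chars.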

