Hi,
To clarify my earlier problem with indexing files:
I have trawled through the FAQ and the ht://Dig documentation, but to no avail. I am
indexing a small URL directory (approx. 20 .html files), just to test the index
that is created (since I am having problems getting it running).
Running rundig with -vvv gives me a more in-depth analysis of what is going on.
Out of the 20-odd files, only three or four appear to end up in the index, even though
all the .html files are similar in HTML code, structure and content. Looking at the output
from rundig, all the .html files seem to be indexed, but when I search the index
afterwards using htsearch, only 3 or 4 are "searchable".
For example, mpg_6-08.html and mpg_6-09.html are almost identical files. In the output
from rundig, both have the same associated lines:
pushing http://10.5.1.199/marketing/mpg_6-08.html
href: http://10.5.1.199/marketing/mpg_6-08.html (mpg_6-08.html)
resolving 'http://10.5.1.199/marketing/mpg_6-08.html
'
pushing http://10.5.1.199/marketing/mpg_6-09.html
href: http://10.5.1.199/marketing/mpg_6-09.html (mpg_6-09.html)
resolving 'http://10.5.1.199/marketing/mpg_6-09.html
'
But when I search the index, only mpg_6-09.html comes up in the results. mpg_6-08.html
is not returned no matter what I enter in the search engine!
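
To make the comparison easier to repeat, the pushed URLs can be pulled out of a saved copy of the rundig output and checked against what htsearch returns. The following is only a rough sketch: the function name pushed_urls is hypothetical, and the only thing it assumes about the log is the "pushing <url>" line format shown above.

```python
import re

# Matches the "pushing <url>" lines seen in the rundig -vvv output above.
PUSH_LINE = re.compile(r"^\s*pushing\s+(\S+)")


def pushed_urls(log_text):
    """Return the URLs from 'pushing <url>' lines, in order, without duplicates."""
    urls = []
    for line in log_text.splitlines():
        match = PUSH_LINE.match(line)
        if match and match.group(1) not in urls:
            urls.append(match.group(1))
    return urls


# Sample taken from the log excerpt above (hypothetical saved copy).
sample = """pushing http://10.5.1.199/marketing/mpg_6-08.html
href: http://10.5.1.199/marketing/mpg_6-08.html (mpg_6-08.html)
pushing http://10.5.1.199/marketing/mpg_6-09.html
href: http://10.5.1.199/marketing/mpg_6-09.html (mpg_6-09.html)
"""

for url in pushed_urls(sample):
    print(url)
```

Any URL in that list that never shows up in the search results is one of the "missing" files.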
Any ideas or help?
Thanks,
Shams