According to [EMAIL PROTECTED]:
> to clarify my earlier problem with indexing files.
> 
> I have trawled through the FAQ and htDig documentation, but to no avail.
> I am indexing a short URL directory (of approx 20 .html files).
> Just to test the index that is created (since I am having problems
> getting it running).
> 
> Running rundig using -vvv gives me a more in-depth analysis of what is
> going on.
> 
> Out of the 20-odd files only three or four are being indexed.
> Even though all the .html files are similar in HTML code, structure
> and content.  Looking at the output from rundig, all the .html files
> seem to be indexed.  But when I search the index afterwards, using
> htsearch, only 3 or 4 are "searchable".
> 
> e.g. mpg_6-08.html and mpg_6-09.html are both almost identical files.
> On the output from rundig, both have the associated output:
> 
> pushing http://10.5.1.199/marketing/mpg_6-08.html
> href: http://10.5.1.199/marketing/mpg_6-08.html (mpg_6-08.html)
> resolving 'http://10.5.1.199/marketing/mpg_6-08.html
> '
> 
> pushing http://10.5.1.199/marketing/mpg_6-09.html
> href: http://10.5.1.199/marketing/mpg_6-09.html (mpg_6-09.html)
> resolving 'http://10.5.1.199/marketing/mpg_6-09.html
> '
> 
> But by searching the index, only mpg_6-09.html comes up in the results.
> mpg_6-08.html won't be returned no matter what I enter in the search
> engine!
> 
> Any ideas or help?

Wade through the voluminous htdig -vvv output to see where htdig is
actually fetching and indexing the mpg_6-08.html document.  Any clues
there?  If not, try -vvvv (4 v's) to see the words that htdig parses
from the document.
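
If the verbose output is too long to read by eye, one option is to save
it to a file and pull out only the lines that mention the problem
document. The following is a minimal sketch in Python, not part of
ht://Dig. The function names and the log file name (rundig.log) are
hypothetical, and it assumes nothing about the log format beyond the
document name appearing as plain text on the relevant lines.

```python
def find_mentions(lines, needle, context=0):
    """Return (line_number, text) pairs for every line containing `needle`,
    plus `context` lines before and after each match (1-based numbering)."""
    lines = [line.rstrip("\n") for line in lines]
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    return [(i + 1, lines[i]) for i in sorted(keep)]


def scan_file(path, needle, context=2):
    """Print matching lines from a saved log file (the path is an assumption)."""
    with open(path, errors="replace") as fh:
        for number, text in find_mentions(fh, needle, context):
            print(f"{number}: {text}")
```

For example, scan_file("rundig.log", "mpg_6-08.html") would print each
line naming that document with two lines of context either side.
Running it again for mpg_6-09.html and comparing the two results may
show where the handling of the two files diverges.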

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

