According to scottb:
> Well, I've found something.
>
> The file in question (the one that was not getting picked up by htdig) had a filename
> of "test_index.html". If I renamed this file to "foo.html" it works fine. Renaming
> it to "test_index_2.html" worked fine. Renaming it to "foo_index.html" failed.
>
> It appears that if the filespec is "???index.html", htdig will not "push" it onto the
> queue - it sees the HREF to it, it just never resolves it....

D'oh!!! This bug was introduced by some changes that Leo and I worked on
back in December. Strange that with all the testing and use since then,
no one else reported this problem, and today, both you and Robin Bowes
ran into the very same problem!

The problem happens during URL normalization. The removeIndex() function
strips a trailing index.html (or any other name listed in
remove_default_doc) from the URL. Unfortunately, it uses the wrong test.

Around line 443 in htlib/URL.cc, it does:

    if (defaultdoc->hasPattern() &&
        defaultdoc->FindFirstWord(path.sub(filename)) >= 0)

It really should use CompareWord rather than FindFirstWord, so that it
compares against the filename as a whole instead of searching for a
match somewhere inside it. That search is why "test_index.html" gets
treated as a default document and reduced to its directory.
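
To make the difference concrete, here is a rough standalone sketch. It is not the htlib code and does not use the StringMatch class: matchesAnywhere(), matchesWhole() and removeIndexSketch() are hypothetical names, and the two tests are only simplified models of "search within the filename" versus "compare the whole filename". Details such as case-insensitive matching are left out.

```cpp
#include <string>
#include <vector>

using Names = std::vector<std::string>;
using Test = bool (*)(const std::string &, const Names &);

// Simplified model of the buggy test (hypothetical, not StringMatch):
// true if any default-document name occurs anywhere inside the filename.
bool matchesAnywhere(const std::string &filename, const Names &defaults)
{
    for (const std::string &name : defaults)
        if (filename.find(name) != std::string::npos)
            return true;
    return false;
}

// Simplified model of the intended test (hypothetical, not StringMatch):
// true only if the filename is exactly one of the default-document names.
bool matchesWhole(const std::string &filename, const Names &defaults)
{
    for (const std::string &name : defaults)
        if (filename == name)
            return true;
    return false;
}

// Strip the filename from the path when the supplied test says it is a
// default document; otherwise return the path unchanged.
std::string removeIndexSketch(std::string path, const Names &defaults, Test test)
{
    std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;                      // no directory part, nothing to strip
    std::string::size_type filename = slash + 1;
    if (test(path.substr(filename), defaults))
        path.erase(filename);             // keep everything up to the last '/'
    return path;
}
```

With "index.html" as the only default name, the substring test turns "/docs/test_index.html" into "/docs/", so the document itself never gets queued, while the whole-name test leaves it alone. Both tests leave "/docs/test_index_2.html" untouched, which fits the renaming results reported above.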
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930