On Fri, 20 Oct 2000, Melon, Jack wrote:
> I seem to be missing something to get it to drill down the directories. The
> index page on localhost is 3 frames and it's indexing the header.htm,
> main.htm, footer.htm - but that's all. The "header" and "main" frames both
> contain links to vast documentation in *.html, *.doc, *.txt, etc.; but it's
> not being retrieved. I've left the "limit_urls_to:" attribute set to the
> default ${start_url}
I'd run htdig with the -vvv flag and take a look at the debugging output.
It will give you specific reasons for rejecting links.
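
Something along these lines should capture the debugging output so you can
search it. This is only a sketch: the config path and the log file name are
assumptions, so adjust them for your install. It searches for main.htm
(one of your frame files) rather than for a particular message, since the
exact wording of the debug messages may vary between versions.

```shell
# Assumed locations; override via the environment or edit to match your setup.
CONF="${CONF:-/etc/htdig/htdig.conf}"
LOG="${LOG:-htdig-debug.log}"

if command -v htdig >/dev/null 2>&1; then
    # -vvv raises verbosity; -c selects the config file.
    # The debugging output is captured in the log file along with any errors.
    htdig -vvv -c "$CONF" > "$LOG" 2>&1
else
    echo "htdig not found in PATH" > "$LOG"
fi

# Show every logged line that mentions the frame page whose links go missing.
grep -n "main.htm" "$LOG" || echo "no mention of main.htm in $LOG"
```

The log gets large quickly at this verbosity, so paging through it with
less around the matching line numbers is usually easier than reading it
top to bottom.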
> Also I'm getting an "Invalid Header- WWW Authenticate NTLM / Not Authorized"
> from the non-secured Microsoft IIS Server here. I've tried the -u
> username:password with no success. I understand that htdig is using BASIC
> encryption and the IIS server is using "NT Challenge Response" so that's
> probably the problem, but I don't know a fix.
Wait. I thought you said you were using a "non-secured" server. Or are you
saying that it has password authorization but not HTTPS?
There's really nothing you can do on the ht://Dig side. Unless someone
steps forward to offer support for NT authentication, it's unlikely to
happen. I don't know of any of the active developers with access to an
NT-IIS machine and I wouldn't have the faintest idea how you'd support
that authentication.
You can always do a few things:
1) Convince the admin of the web box to make a "special case" in your
server config for the indexer. Keep in mind it's probably coming from a
particular IP and can have a specific user-agent field.
2) Grab the Cygwin package for running UNIX apps on Win32 machines and try
using local_filesystem digging. (No, I don't know if this works.)
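
For what it's worth, the ht://Dig side of those two options might look
roughly like the fragment below. It is written as a shell heredoc only so
it can be saved to a scratch file and pasted into htdig.conf. The
user-agent string, the scratch file name, and the Cygwin path to the IIS
document root are all assumptions, and like option 2 itself this is
untested.

```shell
# Write a hypothetical htdig.conf fragment to a scratch file.
cat > htdig-local.snippet <<'EOF'
# Option 1: a distinctive user-agent the web admin can special-case.
# "htdig-indexer" is a made-up value.
user_agent: htdig-indexer

# Option 2: map URLs to local files so they are read from disk, not via HTTP.
# Format is URL-prefix=local-path; the path assumes the default IIS
# document root as seen from Cygwin.
local_urls: http://localhost/=/cygdrive/c/Inetpub/wwwroot/
EOF

cat htdig-local.snippet
```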
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/