Hi, I am having a little trouble indexing a database-backed document management system, which I am not allowed to modify because it is supported by another company. There is no single index page, so I have created a hard-coded one that links to every ID that is possible on the system. Not surprisingly, many of these URLs don't lead to a valid document, but rather than getting a 404, or even a valid HTML page, I just get a plain-text error message, with no HTML header or anything.
For example: http://cpol.edinburgh.gov.uk/getdoc_ext.asp?DocID=200000

The only thing that I can see to latch onto is that this response is always 140 bytes, but otherwise I am at a loss to think of a way of keeping these documents from being indexed by htdig 3.1.6.

Does anyone have any ideas?

Thanks,
Mike

_______________________________________________
ht://Dig general mailing list: <htdig-general@lists.sourceforge.net>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.) https://lists.sourceforge.net/lists/listinfo/htdig-general
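
One possible workaround, sketched here only as an illustration, is to filter the dead IDs out when the hard-coded index page is generated, so that htdig never sees them. The sketch relies on the 140-byte size mentioned above and assumes that no valid document is exactly 140 bytes of non-HTML text, which has not been verified. The names `fetch`, `is_error_stub` and `build_index` are hypothetical helpers, not part of ht://Dig.

```python
from html import escape
from typing import Callable, Iterable
from urllib.request import urlopen

# URL pattern taken from the example in the post.
BASE_URL = "http://cpol.edinburgh.gov.uk/getdoc_ext.asp?DocID={doc_id}"
# Size of the plain-text error response reported in the post.
ERROR_STUB_BYTES = 140


def fetch(url: str) -> bytes:
    """Download the raw response body for one URL."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def is_error_stub(body: bytes) -> bool:
    """Heuristic (assumption): exactly 140 bytes and no <html tag."""
    return len(body) == ERROR_STUB_BYTES and b"<html" not in body.lower()


def build_index(doc_ids: Iterable[int],
                fetcher: Callable[[str], bytes] = fetch) -> str:
    """Return an HTML page linking only to IDs that look like real documents."""
    links = []
    for doc_id in doc_ids:
        url = BASE_URL.format(doc_id=doc_id)
        try:
            body = fetcher(url)
        except OSError:
            # Network or HTTP failure: leave this ID out of the index.
            continue
        if is_error_stub(body):
            continue
        links.append(f'<a href="{escape(url)}">{doc_id}</a><br>')
    return "<html><body>\n" + "\n".join(links) + "\n</body></html>\n"
```

The output of `build_index(range(first_id, last_id))` could be written to a file and used as the `start_url` for htdig in place of the unfiltered list. The cost is one request per ID each time the index page is rebuilt.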