According to Philippe Carbonnier:
> the situation : redhat linux 7.0, glibc-2.2.4-18.7.0.3, htdig 3.1.6
> compiled from scratch.
> the problem : htdig can't index a file with .T suffix (Ex: a.T) !
> a.T is a simple ASCII file containing :   echo toto
> If I copy this file with the name a.S, it works !
> 
> Here is the trace :
> the directory /client/vivian/e contains only a.T and a.S
> 
> [root@intranet conf]# /opt/www/htdig/bin/rundig -v
> 
> New server: intranet.vif.tm.fr, 80
> 0:0:0:http://intranet.vif.tm.fr/client/vivian/e/: ++++-++ size = 787
> 1:1:1:http://intranet.vif.tm.fr/client/vivian/e/?N=D: +***-** size = 787
> 
> 2:2:1:http://intranet.vif.tm.fr/client/vivian/e/?M=A: *+**-** size = 787
> 
> 3:3:1:http://intranet.vif.tm.fr/client/vivian/e/?S=A: **+*-** size = 787
> 
> 4:4:1:http://intranet.vif.tm.fr/client/vivian/e/?D=A: ***+-** size = 787

See http://www.htdig.org/FAQ.html#q4.23 for tips on suppressing the extra
sorted directory listings generated by Apache's FancyIndexing feature.

> 5:5:1:http://intranet.vif.tm.fr/client/vivian/e/a.S:  size = 10
> 6:6:1:http://intranet.vif.tm.fr/client/vivian/e/a.T:  not HTML

The "not HTML" error message is a bit of a misnomer, carried over from
the early days when htdig only parsed HTML.  Now, a more appropriate
error message would be "unknown Content-type".  (That error message is
on line 526 of htdig/Retriever.cc if anyone cares to fix it.)

The trick is to figure out why your web server is returning different
Content-type headers for .S and .T files.  If you run htdig -i -vvv,
you'll see which headers the server returns for each document.
Chances are your server is configured to recognize .S files as assembly
language text, while .T may be treated as unknown and so default to
a non-text content-type.
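
If you'd rather check the headers without running a full dig, something
like the following Python sketch should show the same thing.  It is not
part of htdig; the function names (media_type, fetch_content_type,
report) are made up for illustration, and it assumes the server answers
HEAD requests for these URLs.

```python
from urllib.request import Request, urlopen


def media_type(content_type_header):
    """Return the bare media type, e.g. 'text/plain' from 'text/plain; charset=x'."""
    return content_type_header.split(";", 1)[0].strip().lower()


def fetch_content_type(url, timeout=10):
    """Send a HEAD request and return the Content-Type header ('' if absent)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


def report(base_url, names):
    """Print the media type the server reports for each file under base_url."""
    for name in names:
        header = fetch_content_type(base_url + name)
        print(name, "->", media_type(header) or "(no Content-Type header)")
```

Calling report("http://intranet.vif.tm.fr/client/vivian/e/", ["a.S", "a.T"])
should print one line per file.  If the two types differ, the fix belongs
in the web server's MIME type configuration rather than in htdig.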

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
