Cannot handle incorrectly cased Content-Type
--------------------------------------------
Key: NUTCH-24
URL: http://issues.apache.org/jira/browse/NUTCH-24
Project: Nutch
Type: Bug
Components: fetcher
Reporter: Stefan Groschupf
Priority: Minor
transferred from:
http://sourceforge.net/tracker/index.php?func=detail&aid=1014459&group_id=59548&atid=491356
submitted by:
boconnor
Not really a bug, but in many cases web servers send the
header as "Content-type" instead of "Content-Type"
(note the lowercase "t").
This is particularly true of IIS. As a result the content type
cannot be resolved (in fact it resolves to nothing), and the
document does not get indexed.
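
HTTP header field names are case-insensitive, so a lookup that ignores case
would accept both spellings. The following is only a minimal sketch of that
idea, not the fetcher's actual code: the class HeaderLookup and the method
caseInsensitive are hypothetical names made up for illustration, and it
assumes the response headers are available as a plain Map of strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class HeaderLookup {

    // Hypothetical helper (not part of Nutch): copies the raw headers into a
    // map whose keys are compared without regard to case.
    static Map<String, String> caseInsensitive(Map<String, String> raw) {
        Map<String, String> headers =
            new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(raw);
        return headers;
    }

    public static void main(String[] args) {
        // Header as the report describes a server sending it: lowercase "t".
        Map<String, String> raw = new HashMap<String, String>();
        raw.put("Content-type", "text/html");

        Map<String, String> headers = caseInsensitive(raw);

        // The lookup uses the canonical spelling and still finds the value.
        System.out.println(headers.get("Content-Type"));
    }
}
```

With a map like this, "Content-type", "Content-Type" and "content-type" all
resolve to the same entry, so the content type would no longer come back
empty for servers that use a different capitalization.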