[ http://issues.apache.org/jira/browse/NUTCH-135?page=comments#action_12360025 ]
Stefan Groschupf commented on NUTCH-135:
----------------------------------------

Andrzej,
that is easy to add to the ContentProperties object, and sure, I can do that.
However, I would first love to get an OK for this patch before I invest more time in it, since I have spent too much time writing stuff just for the issue archive.
As soon as this patch is in the sources I will write a small new patch (as Doug suggested, doing it in small steps) to solve NUTCH-3.

> http header meta data are case insensitive in the real world (e.g. Content-Type or content-type)
> ------------------------------------------------------------------------------------------------
>
>          Key: NUTCH-135
>          URL: http://issues.apache.org/jira/browse/NUTCH-135
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.7, 0.7.1
>     Reporter: Stefan Groschupf
>     Priority: Critical
>      Fix For: 0.8-dev, 0.7.2-dev
>  Attachments: contentProperties_patch.txt
>
> As described in issue NUTCH-133, some web servers return HTTP header names
> whose case does not follow the standard spelling.
> This has many negative side effects. For example, querying the content type
> from the metadata returns null even when the web server does return a content
> type, because the key is not in the standard form (e.g. it is lower case).
> This also affects the PDF parser, which queries the content length, etc.
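
To make the symptom in the issue description concrete, here is a minimal sketch of a case-insensitive header lookup. It is only an illustration: the class name CaseInsensitiveHeaders and its set/get methods are hypothetical and are not taken from contentProperties_patch.txt or from the ContentProperties class mentioned above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is NOT the ContentProperties class from the patch.
public class CaseInsensitiveHeaders {

    // A TreeMap ordered by String.CASE_INSENSITIVE_ORDER treats
    // "Content-Type" and "content-type" as the same key.
    private final Map<String, String> headers =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // Store a header value under the name exactly as the server sent it.
    public void set(String name, String value) {
        headers.put(name, value);
    }

    // Look up a header value, ignoring the case of the name.
    public String get(String name) {
        return headers.get(name);
    }

    public static void main(String[] args) {
        CaseInsensitiveHeaders h = new CaseInsensitiveHeaders();
        // A server that sends the header name in lower case.
        h.set("content-type", "application/pdf");
        // Lookup with the conventional spelling still finds the value.
        System.out.println(h.get("Content-Type")); // prints "application/pdf"
    }
}
```

With a plain HashMap keyed by the raw header name, the same lookup would return null, which matches the behaviour the issue describes for the content type and content length queries.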
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
