+1 too. This is very useful.
John Xing (JIRA) wrote:
[ http://issues.apache.org/jira/browse/NUTCH-33?page=comments#action_62877 ]
John Xing commented on NUTCH-33:
--------------------------------
My +1 vote for this contribution. If there are no objections, I will commit it over the weekend.
John
MIME content type detector (using magic char sequences)
-------------------------------------------------------
Key: NUTCH-33
URL: http://issues.apache.org/jira/browse/NUTCH-33
Project: Nutch
Type: New Feature
Reporter: Jerome Charron
Assignee: John Xing
Priority: Minor
Attachments: NUTCH-33-050415.patch, NUTCH-33.patch, mime-types-050415.tar.gz, mime-types.tar.gz
An extension-based content-type detector is not sufficient in some cases. The solution is to add a content-type detector based on magic character sequences, as Apache httpd does, for instance. (Note: I created this issue only to keep a trace; I am currently working on it.)
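To illustrate the idea, here is a minimal sketch of magic-based detection with a fallback to the extension-based guess. It is not the code from the NUTCH-33 patch: the rule table, the function names (`detect_by_magic`, `detect`) and the default type are assumptions chosen for this example only.

```python
# Hypothetical rule table: (offset, magic byte sequence, MIME type).
# These entries are well-known file signatures, not the contents of
# the mime-types files attached to NUTCH-33.
MAGIC_RULES = [
    (0, b"%PDF-", "application/pdf"),
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
    (0, b"PK\x03\x04", "application/zip"),
]


def detect_by_magic(data):
    """Return the MIME type of the first rule whose bytes match, else None."""
    for offset, magic, mime in MAGIC_RULES:
        if data[offset:offset + len(magic)] == magic:
            return mime
    return None


def detect(data, extension_guess=None):
    """Prefer the magic match, then the extension guess, then a generic type."""
    return detect_by_magic(data) or extension_guess or "application/octet-stream"
```

With this sketch, a PDF served under a misleading extension is still recognised: `detect(b"%PDF-1.4", "text/html")` yields `application/pdf`, while content with no known signature keeps whatever the extension suggested.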
--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com
Contact: info at sigram dot com