[ http://issues.apache.org/jira/browse/NUTCH-33?page=history ]

Jerome Charron updated NUTCH-33:
--------------------------------

    Attachment: NUTCH-33-050415.patch
                mime-types-050415.tar.gz

Here are new attachments that address this issue, taking the previous comments into account.

Changes since the previous comments:
* No more dependencies on Nutch code (neither the logger nor the conf)
* Added a mime.type.magic flag in the conf (to enable or disable the magic resolver)
* protocol-ftp, protocol-http and protocol-file are now based on this mime-util code.
* index-more is activated by default in build.xml (no more dependency on JAF)
* <pathelement location="${test.src.dir}"/> is added to the unit test classpath 
so that tests can use external files (my unit tests use some reference 
png, gif, ... files to test detection).
* ...
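
To illustrate the idea behind the magic resolver and the mime.type.magic 
flag, here is a minimal sketch. It is not the code from the attached patch: 
the class name MagicMimeSketch, the detect() method and its boolean 
parameter (standing in for the mime.type.magic setting) are all 
hypothetical, and the signature table is a small illustrative subset.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of magic-based detection; not the NUTCH-33 patch code. */
public class MagicMimeSketch {

    // Leading byte signatures for a few well-known formats (illustrative subset).
    private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();
    // Extension table used as the fallback resolver (illustrative subset).
    private static final Map<String, String> EXTENSIONS = new LinkedHashMap<>();

    static {
        MAGIC.put("image/png",
                new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        MAGIC.put("image/gif", new byte[] {'G', 'I', 'F', '8'});
        MAGIC.put("application/pdf", new byte[] {'%', 'P', 'D', 'F', '-'});

        EXTENSIONS.put("png", "image/png");
        EXTENSIONS.put("gif", "image/gif");
        EXTENSIONS.put("pdf", "application/pdf");
        EXTENSIONS.put("html", "text/html");
    }

    /**
     * Returns a MIME type for the given resource.
     * magicEnabled is an assumed stand-in for the mime.type.magic flag:
     * when true, the content's leading bytes are checked before the extension.
     */
    public static String detect(String name, byte[] content, boolean magicEnabled) {
        if (magicEnabled && content != null) {
            for (Map.Entry<String, byte[]> e : MAGIC.entrySet()) {
                if (startsWith(content, e.getValue())) {
                    return e.getKey();
                }
            }
        }
        // Fall back to the file extension, then to a generic binary type.
        int dot = (name == null) ? -1 : name.lastIndexOf('.');
        if (dot >= 0 && dot < name.length() - 1) {
            String type = EXTENSIONS.get(name.substring(dot + 1).toLowerCase());
            if (type != null) {
                return type;
            }
        }
        return "application/octet-stream";
    }

    // True if data begins with every byte of prefix, in order.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With the flag enabled, PNG content served under a misleading name such as 
photo.gif is reported as image/png; with the flag disabled, the same input 
falls back to the extension and is reported as image/gif.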

Jerome


> MIME content type detector (using magic char sequences)
> -------------------------------------------------------
>
>          Key: NUTCH-33
>          URL: http://issues.apache.org/jira/browse/NUTCH-33
>      Project: Nutch
>         Type: New Feature
>     Reporter: Jerome Charron
>     Assignee: John Xing
>     Priority: Minor
>  Attachments: NUTCH-33-050415.patch, NUTCH-33.patch, 
> mime-types-050415.tar.gz, mime-types.tar.gz
>
> Extension-based content-type detection is not sufficient in some cases.
> The solution is to add a content-type detector based on magic character 
> sequences, as in Apache httpd for instance.
> (Note: I created this issue only to keep a trace; I am currently working 
> on it.)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira
