[ http://issues.apache.org/jira/browse/NUTCH-33?page=comments#action_62341 ]
     
Jerome Charron commented on NUTCH-33:
-------------------------------------

[John]
Though not ideal, a system-wide property is probably the easiest way to ensure
consistent behavior among tools and plugins.

[Jerome]
OK, I will add a system-wide property in nutch-default.xml (the calling code can
choose to do magic resolution by calling the right method).
What is your opinion on this point:
1. Should the calling code check the mime.magic property and call either
getMimeType(String) or getMimeType(String, byte[]) depending on its value,
2. or should the getMimeType(String, byte[]) method itself check the
mime.magic property and use magic resolution only if the flag is true?
The second one is better for consistency.
But the second one is also strange: if you call getMimeType(String, byte[])
instead of getMimeType(String), you expect the magic analysis to be
performed...
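
To make the two options concrete, here is a minimal sketch. It is not the
patch itself: the class name MimeMagicSketch, the helper names (byExtension,
byMagic, callerDecides, methodDecides), the tiny extension table, and the
default of "true" for mime.magic are all assumptions made for illustration.
Only the property name mime.magic and the two getMimeType signatures come
from the discussion above; byExtension stands in for getMimeType(String) and
byMagic for getMimeType(String, byte[]).

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical illustration only; not the actual org.apache.nutch.util.mime code. */
final class MimeMagicSketch {
    // A PDF file starts with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Stand-in for getMimeType(String): resolution by file extension only. */
    static String byExtension(String name) {
        if (name.endsWith(".pdf")) return "application/pdf";
        if (name.endsWith(".txt")) return "text/plain";
        return "application/octet-stream";
    }

    /** Stand-in for getMimeType(String, byte[]): always tries magic first. */
    static String byMagic(String name, byte[] data) {
        return startsWith(data, PDF_MAGIC) ? "application/pdf" : byExtension(name);
    }

    /** Reads the flag; the default value "true" is an assumption. */
    static boolean magicEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty("mime.magic", "true"));
    }

    /** Option 1: every caller checks mime.magic and picks the method itself. */
    static String callerDecides(Properties conf, String name, byte[] data) {
        return magicEnabled(conf) ? byMagic(name, data) : byExtension(name);
    }

    /**
     * Option 2: the two-argument method checks mime.magic internally, so it
     * may silently ignore the data it was given when the flag is false.
     */
    static String methodDecides(Properties conf, String name, byte[] data) {
        if (!magicEnabled(conf)) {
            return byExtension(name); // data is ignored here
        }
        return byMagic(name, data);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Both variants return the same result for the same inputs. The difference is
where the check lives: with option 1 each tool or plugin has to remember to
read the flag, while with option 2 the flag is honored in one place, at the
cost of a two-argument method that does not always look at its second
argument.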

[John]
Yes, it's better to follow JAF's API.

[Jerome]
That's done:
* Refactored to org.apache.nutch.util.mime
* Uses a public MimeType object with parsing capabilities (using Hari
Kodungallu's code)
* New patch version for the protocol-file and protocol-ftp plugins
* Added a new patch for the protocol-http and index-more plugins (index-more
no longer needs JAF)
* Unit regression tests are OK

Todo:
* Add the mime.magic property
* Perform some functional tests



> MIME content type detector (using magic char sequences)
> -------------------------------------------------------
>
>          Key: NUTCH-33
>          URL: http://issues.apache.org/jira/browse/NUTCH-33
>      Project: Nutch
>         Type: New Feature
>     Reporter: Jerome Charron
>     Assignee: John Xing
>     Priority: Minor
>  Attachments: NUTCH-33.patch, mime-types.tar.gz
>
> An extension-based content-type detector is not sufficient in some cases.
> The solution is to add a content-type detector based on magic char
> sequences, as in Apache httpd, for instance.
> (Note: I created this issue only to keep a trace, but I'm currently working 
> on it)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


