[ https://issues.apache.org/jira/browse/JCR-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469564 ]

Paco Avila commented on JCR-728:
--------------------------------

Why is the LGPL troublesome? Source code that uses an LGPL library does not 
have to be licensed under the LGPL or GPL. A port of libmagic to Java would be 
nice because there are lots of MIME definitions in its format.

And yes, I think it is more useful to add more functionality to 
jackrabbit-index-filters. By the way, some MS Office files throw errors when 
they are indexed. I know this is a POI issue, but is the POI project abandoned? 
There have been no updates since 04-08-2004 :(

> Automatic MIME type detection
> -----------------------------
>
>                 Key: JCR-728
>                 URL: https://issues.apache.org/jira/browse/JCR-728
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Currently only the jcr:mimeType property is used to determine the MIME type 
> and thus the applicable text extractor to use for indexing a document. If the 
> jcr:mimeType property is not available or is set to a generic value like 
> "application/octet-stream", then the indexer could also use some heuristics 
> based on the node name or magic numbers within the binary stream to determine 
> the type of the document.
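
The fallback described in the issue could be sketched roughly as follows. This is only an illustration, not Jackrabbit code: the class name MimeSniffer, the detect method, and the small signature and extension tables are all hypothetical, and a real implementation would need a far larger set of definitions (for example, ones ported from libmagic's format).

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit: illustrates the heuristic only.
final class MimeSniffer {

    private static final String GENERIC = "application/octet-stream";

    // A few well-known magic numbers (assumed subset, far from complete).
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    private MimeSniffer() {
    }

    /** Returns true if data begins with the given magic bytes. */
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Picks a MIME type: the declared jcr:mimeType if it is specific,
     * otherwise magic numbers, otherwise the node name extension.
     */
    static String detect(String declared, String nodeName, byte[] head) {
        // 1. Trust the declared type unless it is missing or generic.
        if (declared != null && !declared.isEmpty() && !GENERIC.equals(declared)) {
            return declared;
        }
        // 2. Look at the first bytes of the binary stream.
        if (startsWith(head, PDF)) {
            return "application/pdf";
        }
        if (startsWith(head, PNG)) {
            return "image/png";
        }
        if (startsWith(head, ZIP)) {
            return "application/zip";
        }
        // 3. Fall back to the extension of the node name.
        if (nodeName != null) {
            int dot = nodeName.lastIndexOf('.');
            if (dot >= 0 && dot < nodeName.length() - 1) {
                String ext = nodeName.substring(dot + 1).toLowerCase(Locale.ROOT);
                switch (ext) {
                    case "txt":  return "text/plain";
                    case "html": return "text/html";
                    case "pdf":  return "application/pdf";
                    case "doc":  return "application/msword";
                    case "xls":  return "application/vnd.ms-excel";
                    default:     break;
                }
            }
        }
        // 4. Nothing matched: keep the generic type.
        return GENERIC;
    }

    public static void main(String[] args) {
        byte[] head = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(GENERIC, "report.bin", head));
    }
}
```

The sketch checks magic numbers before the node name, on the assumption that file content is a more reliable signal than a file extension; the issue itself does not specify an order.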
