[ 
https://issues.apache.org/jira/browse/TIKA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting updated TIKA-317:
-------------------------------

    Attachment: TIKA-317.patch

The attached patch introduces the following new Parser method:

    /**
     * Returns the set of media types supported by this parser when used
     * with the given parse context.
     *
     * @since Apache Tika 0.7
     * @param context parse context
     * @return immutable set of media types
     */
    Set<MediaType> getSupportedTypes(ParseContext context);

An explicit method is better than static annotations since it lets parsers 
adapt to situations where optional functionality, such as certain parser 
libraries, is not available. This approach also works for things like parser 
compositions and decorations.
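
As a rough illustration of the point above (not code from the attached patch), 
the sketch below shows a parser that advertises its types only when a backing 
library can be loaded, and a composite that reports the union of its delegates' 
types. The names SupportedTypesSketch, Context, TypeAwareParser, 
OptionalLibraryParser, CompositeParser and isLoadable are hypothetical; plain 
strings stand in for MediaType and Context for ParseContext so that the example 
compiles without Tika on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SupportedTypesSketch {

    /** Hypothetical stand-in for Tika's ParseContext. */
    static final class Context { }

    /** Stand-in for the new Parser method; media types are plain strings here. */
    interface TypeAwareParser {
        Set<String> getSupportedTypes(Context context);
    }

    /** Advertises its types only if the named library class can be loaded. */
    static final class OptionalLibraryParser implements TypeAwareParser {
        private final String requiredClass;
        private final Set<String> types;

        OptionalLibraryParser(String requiredClass, Set<String> types) {
            this.requiredClass = requiredClass;
            this.types = Collections.unmodifiableSet(new LinkedHashSet<>(types));
        }

        @Override
        public Set<String> getSupportedTypes(Context context) {
            // An empty set signals that the parser cannot handle anything right now.
            return isLoadable(requiredClass) ? types : Collections.<String>emptySet();
        }
    }

    /** Composition: reports the union of whatever its delegates support. */
    static final class CompositeParser implements TypeAwareParser {
        private final List<TypeAwareParser> delegates;

        CompositeParser(List<TypeAwareParser> delegates) {
            this.delegates = delegates;
        }

        @Override
        public Set<String> getSupportedTypes(Context context) {
            Set<String> union = new LinkedHashSet<>();
            for (TypeAwareParser delegate : delegates) {
                union.addAll(delegate.getSupportedTypes(context));
            }
            return Collections.unmodifiableSet(union);
        }
    }

    /** Checks for a class without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, SupportedTypesSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.lang.String is always present; the second class name is made up.
        TypeAwareParser text = new OptionalLibraryParser(
                "java.lang.String", Collections.singleton("text/plain"));
        TypeAwareParser pdf = new OptionalLibraryParser(
                "com.example.missing.PdfLibrary", Collections.singleton("application/pdf"));
        TypeAwareParser all = new CompositeParser(Arrays.asList(text, pdf));
        System.out.println(all.getSupportedTypes(new Context()));
    }
}
```

A static annotation would have to list application/pdf unconditionally; with a 
method, the answer can depend on what is actually available at runtime.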

The patch modifies the configuration mechanism so that the getSupportedTypes() 
method is used whenever a <parser/> entry without embedded <mime/> elements is 
encountered. This should maintain reasonable backwards compatibility with 
existing config files until Tika 1.0.
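
The fallback rule described above could look roughly like the following. This 
is a simplified model rather than the patch's actual configuration code: 
ConfigFallbackSketch, ParserEntry and buildRegistry are hypothetical names, 
media types are plain strings, and the ParseContext argument is left out for 
brevity.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConfigFallbackSketch {

    /** Stand-in for a parser that can report its own supported types. */
    interface TypeAwareParser {
        Set<String> getSupportedTypes();
    }

    /** Hypothetical model of one <parser/> entry and its embedded <mime/> values. */
    static final class ParserEntry {
        final TypeAwareParser parser;
        final List<String> mimeElements;

        ParserEntry(TypeAwareParser parser, List<String> mimeElements) {
            this.parser = parser;
            this.mimeElements = mimeElements;
        }
    }

    /** Maps each media type to a parser, preferring explicit <mime/> values. */
    static Map<String, TypeAwareParser> buildRegistry(List<ParserEntry> entries) {
        Map<String, TypeAwareParser> registry = new LinkedHashMap<>();
        for (ParserEntry entry : entries) {
            // Old-style entries keep their explicit types; entries without
            // <mime/> elements fall back to asking the parser itself.
            Collection<String> types = entry.mimeElements.isEmpty()
                    ? entry.parser.getSupportedTypes()
                    : entry.mimeElements;
            for (String type : types) {
                registry.put(type, entry.parser);
            }
        }
        return registry;
    }

    public static void main(String[] args) {
        TypeAwareParser plain = () -> Collections.singleton("text/plain");
        TypeAwareParser xml = () -> Collections.singleton("application/xml");
        Map<String, TypeAwareParser> registry = buildRegistry(Arrays.asList(
                new ParserEntry(plain, Collections.<String>emptyList()),
                new ParserEntry(xml, Arrays.asList("text/html"))));
        System.out.println(registry.keySet());
    }
}
```

In this model an entry with embedded <mime/> elements behaves exactly as 
before, which is the backwards-compatibility property the patch aims for.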


> Annotation-based Tika configuration
> -----------------------------------
>
>                 Key: TIKA-317
>                 URL: https://issues.apache.org/jira/browse/TIKA-317
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>         Attachments: TIKA-317.patch
>
>
> I'd like to simplify Tika configuration and make it easier to customize by 
> pushing the information in tika-config.xml to Parser annotations and Java SPI 
> service files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
