[ 
https://issues.apache.org/jira/browse/TIKA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026526#comment-14026526
 ] 

Nick Burch commented on TIKA-411:
---------------------------------

I'd suggest just using the Tika App, as the --list-<foo> type methods on that 
should provide most of what you need. Or ask the Tika server nicely: it offers 
the list as plain text, HTML or JSON, and the latter should be fairly easy to 
process in code!

However, I'm not sure about generating all of the page automatically. The 
current formats page has quite a lot of manually written text in it around the 
support for each format, and manually groups related formats together along 
with links to the relevant parsers.

Maybe it would be better to have something which calls the Tika App list 
parsers method, then warns you if that parser doesn't get mentioned in the 
formats page?
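
As a rough illustration of that idea, here is a minimal sketch of such a 
check. It assumes the parser listing has already been captured as text (for 
example from the Tika App's list parsers output), that parsers appear in it as 
fully qualified Java class names, and that the formats page mentions a parser 
by its simple class name. The function names are hypothetical and not part of 
Tika.

```python
import re

# Assumption: each parser shows up in the listing as a fully qualified Java
# class name, e.g. org.apache.tika.parser.pdf.PDFParser. Lower-case package
# segments followed by a capitalised class name.
CLASS_NAME = re.compile(r"\b(?:[a-z_][a-z0-9_]*\.)+[A-Z][A-Za-z0-9_]*\b")


def parser_names(listing):
    """Return the distinct class names found in the listing, in first-seen order."""
    seen = []
    for name in CLASS_NAME.findall(listing):
        if name not in seen:
            seen.append(name)
    return seen


def unmentioned_parsers(listing, formats_page):
    """Return the parsers whose simple class name never appears in the page text."""
    missing = []
    for name in parser_names(listing):
        simple = name.rsplit(".", 1)[-1]  # PDFParser from org...pdf.PDFParser
        if simple not in formats_page:
            missing.append(name)
    return missing


def warnings_for(listing, formats_page):
    """Build one human-readable warning line per unmentioned parser."""
    return [
        "WARNING: %s is not mentioned in the formats page" % name
        for name in unmentioned_parsers(listing, formats_page)
    ]
```

To use it, read the saved listing and the formats page source into two 
strings and print whatever warnings_for() returns. Composite wrappers that 
show up in the listing would probably need an ignore list, since the formats 
page is unlikely to mention them.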

> Generate list of supported and detected types automatically
> -----------------------------------------------------------
>
>                 Key: TIKA-411
>                 URL: https://issues.apache.org/jira/browse/TIKA-411
>             Project: Tika
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Currently we edit the list of supported types 
> (http://lucene.apache.org/tika/0.7/formats.html) manually, which is bound to 
> leave the list outdated and incomplete. It would be better if the list was 
> automatically generated from the tika-mimetypes.xml file and the 
> getSupportedTypes() response of the AutoDetectParser class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
