[ https://issues.apache.org/jira/browse/TIKA-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343665#comment-14343665 ]
Tim Allison commented on TIKA-944:
----------------------------------
We are inconsistent about whether content is extracted from embedded documents:
* The tika-app command line with -t extracts embedded content
* The tika-app GUI does not
* The tika-server /tika endpoint does not
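
To check the /tika behaviour against a container file (for example, a document with attachments), a minimal Python sketch along these lines could be used. It assumes a tika-server instance on localhost:9998 that accepts a PUT of the raw file bytes on /tika and returns plain text when asked via the Accept header; the helper names build_tika_request and extract_text are hypothetical and not part of Tika.

```python
import urllib.request

# Assumption: a tika-server instance is listening on this host and port.
TIKA_SERVER = "http://localhost:9998"


def build_tika_request(payload: bytes, server: str = TIKA_SERVER) -> urllib.request.Request:
    """Build (but do not send) a PUT request for the /tika endpoint."""
    return urllib.request.Request(
        url=f"{server}/tika",
        data=payload,
        method="PUT",
        # Ask for plain text rather than XHTML.
        headers={"Accept": "text/plain"},
    )


def extract_text(payload: bytes, server: str = TIKA_SERVER) -> str:
    """Send the document bytes to the server and return the extracted text."""
    with urllib.request.urlopen(build_tika_request(payload, server)) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Running extract_text on the bytes of a container file and comparing the result with the tika-app -t output for the same file should show whether the embedded content is missing from the server response.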
> Extend tika-server API to be consistent with tika-app CLI
> ---------------------------------------------------------
>
> Key: TIKA-944
> URL: https://issues.apache.org/jira/browse/TIKA-944
> Project: Tika
> Issue Type: New Feature
> Components: server
> Affects Versions: 1.1
> Environment: Any
> Reporter: Jason Judge
> Assignee: Chris A. Mattmann
> Labels: exposed-functionality, tika-server
>
> The tika-server API (web service) provides a limited set of functionality
> compared to the tika-app command-line version. Notable things missing are:
> 1. Language recognition.
> 2. Output in various formats (JSON for metadata, XHTML for the extracted
> text).
> Those are the two main things that would be useful to me, but ideally the
> server should be able to provide all the functionality that the command-line
> app does, taking the command-line as the model to follow.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)