[ https://issues.apache.org/jira/browse/CONNECTORS-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036005#comment-14036005 ]
Karl Wright commented on CONNECTORS-954:
----------------------------------------
Created branches/CONNECTORS-954 to work on this.
The implementation actually looks straightforward. The only issue is that Tika
pushes extracted content to a Writer, so a buffer and a secondary thread will
be needed to run the actual Tika invocation.
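
A minimal sketch of the buffer-plus-thread arrangement described above, not the code on the branch. It assumes a hypothetical Extractor interface standing in for the Tika call (in real use the lambda would presumably hand the Writer to Tika, for example through BodyContentHandler), and the class name, buffer size, and thread name are made up for illustration. The worker thread pushes text into a PipedWriter while the caller pulls it from the paired Reader:

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.concurrent.atomic.AtomicReference;

class WriterToReaderBridge {

    /** Hypothetical stand-in for the Tika invocation: anything that pushes text to a Writer. */
    interface Extractor {
        void extractTo(Writer out) throws IOException;
    }

    /** Runs the extractor on a secondary thread and exposes its output as a Reader. */
    static Reader open(Extractor extractor) throws IOException {
        // The pipe is the bounded buffer: the worker blocks when it is full.
        final PipedReader pipe = new PipedReader(8192);
        final PipedWriter sink = new PipedWriter(pipe);
        final AtomicReference<IOException> failure = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            try {
                extractor.extractTo(sink);
            } catch (IOException e) {
                // Record the failure before the pipe is closed in the finally block,
                // so the reading side sees it when it reaches end of stream.
                failure.set(e);
            } finally {
                try {
                    sink.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if closing the pipe fails.
                }
            }
        }, "extractor-worker");
        worker.setDaemon(true);
        worker.start();

        return new Reader() {
            @Override
            public int read(char[] buf, int off, int len) throws IOException {
                int n = pipe.read(buf, off, len);
                IOException e = failure.get();
                if (n < 0 && e != null) {
                    throw e; // surface an extraction error instead of a silent EOF
                }
                return n;
            }

            @Override
            public void close() throws IOException {
                // Closing early makes the worker's next write fail, which ends the worker.
                pipe.close();
            }
        };
    }

    /** Drains a Reader into a String (helper for the demo only). */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        try (Reader r = open(out -> out.write("extracted text"))) {
            System.out.println(readAll(r));
        }
    }
}
```

The pipe keeps memory use bounded regardless of document size, at the cost of one extra thread per document being processed.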
> Amazon Cloud Search connector's use of Tika should be revisited after
> pipelines are added
> -----------------------------------------------------------------------------------------
>
> Key: CONNECTORS-954
> URL: https://issues.apache.org/jira/browse/CONNECTORS-954
> Project: ManifoldCF
> Issue Type: Task
> Components: Amazon CloudSearch output connector
> Affects Versions: ManifoldCF 1.7
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.7
>
>
> Amazon Cloud Search connector uses Tika to extract content from binaries.
> When the pipeline support in CONNECTORS-946 is committed to trunk, we should
> do two things:
> (a) Create a Transformation Connection that extracts binary data into
> metadata, and
> (b) Remove the Tika dependency from the Amazon connector