[ https://issues.apache.org/jira/browse/TIKA-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147339#comment-13147339 ]
Jukka Zitting commented on TIKA-775:
------------------------------------

I'd like to have a concrete use case for introducing a new concept like this. What exact need are you addressing? Also, are there existing tools that could be used instead of coming up with a new API? This seems like a pretty significant new feature, so it would be best if we did it right from the beginning.

Design-wise it would be better for the embed() method to write its results to an OutputStream given as an argument (just like the Parser interface takes a ContentHandler argument). Returning an InputStream brings up all sorts of issues about timing, error reporting, etc.

> Embed Capabilities
> ------------------
>
>                 Key: TIKA-775
>                 URL: https://issues.apache.org/jira/browse/TIKA-775
>             Project: Tika
>          Issue Type: Improvement
>          Components: general, metadata
>    Affects Versions: 1.0
>         Environment: The default ExternalEmbedder requires that sed be
> installed.
>            Reporter: Ray Gauss II
>              Labels: embed, patch
>             Fix For: 1.1
>
>         Attachments: tika-core-embed-patch.txt, tika-parsers-embed-patch.txt
>
>
> This patch defines and implements the concept of embedding tika metadata into
> a file stream, the reverse of extraction.
> In the tika-core project an interface defining an Embedder and a generic sed
> ExternalEmbedder implementation meant to be extended or configured are added.
> These classes are essentially a reverse flow of the existing Parser and
> ExternalParser classes.
> In the tika-parsers project an ExternalEmbedderTest unit test is added which
> uses the default ExternalEmbedder (calls sed) to embed a value placed in
> Metadata.DESCRIPTION then verify the operation by parsing the resulting
> stream.
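
To make the design suggestion in the comment concrete, here is a minimal sketch of an embed() method that writes its result to a caller-supplied OutputStream instead of returning an InputStream. Everything in it is an assumption for illustration: the names StreamEmbedder, CommentLineEmbedder and embedToString, the plain Map standing in for Tika's Metadata class, and the "# description:" header line are hypothetical and are not taken from the attached patches or from the Tika API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

public class EmbedSketch {

    /** Hypothetical interface: the caller supplies the stream that receives the result. */
    interface StreamEmbedder {
        void embed(Map<String, String> metadata, InputStream original, OutputStream result)
                throws IOException;
    }

    /** Toy embedder: writes a "# description: ..." line, then copies the input unchanged. */
    static class CommentLineEmbedder implements StreamEmbedder {
        @Override
        public void embed(Map<String, String> metadata, InputStream original, OutputStream result)
                throws IOException {
            String description = metadata.get("description");
            if (description != null) {
                String header = "# description: " + description + "\n";
                result.write(header.getBytes(StandardCharsets.UTF_8));
            }
            byte[] buffer = new byte[4096];
            int n;
            while ((n = original.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
            // All work is done before embed() returns, so any failure
            // surfaces here as an IOException rather than on a later read.
            result.flush();
        }
    }

    /** Small helper for the example: runs the toy embedder entirely in memory. */
    static String embedToString(String description, String content) throws IOException {
        Map<String, String> metadata = Collections.singletonMap("description", description);
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new CommentLineEmbedder().embed(metadata, in, out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(embedToString("a test file", "hello\n"));
    }
}
```

With this shape the embedder has finished, or thrown, by the time embed() returns, which sidesteps the timing and error-reporting questions that come up when the caller has to read the result from a returned InputStream.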