[ https://issues.apache.org/jira/browse/TIKA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663527#action_12663527 ]
Babak Farhang commented on TIKA-153:
------------------------------------

I suggest java.nio.channels.FileChannel be used as the random access abstraction. This would allow implementations such as Skwish [ http://skwish.sourceforge.net/ ] to be used as the source of a document.

Ignoring certain of its niche capabilities (such as its map method), FileChannel, it turns out, allows one to slice and dice a document and to construct filters (facades) in the same way Java uses FilterInputStream and FilterOutputStream. As this idea is fleshed out a bit in Skwish [see http://skwish.sourceforge.net/doc/com/faunos/util/io/package-summary.html ], I thought I'd share.

-Babak

> Allow passing of files or memory buffers to parsers
> ---------------------------------------------------
>
>                 Key: TIKA-153
>                 URL: https://issues.apache.org/jira/browse/TIKA-153
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Some of our parsers need to be able to go back and forth within a source
> document, so need either a file or (for smaller documents) an in-memory
> buffer that contains the full document. Currently we use temporary files for
> such cases, which in some cases means doing an extra copy of a file before it
> gets parsed. We should come up with some way for clients to pass in a file or
> a memory buffer if one is available.
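
As a rough illustration of the facade idea in the comment above (this is not Skwish's or Tika's actual code), the sketch below wraps a FileChannel in a read-only view of a byte range, much as FilterInputStream wraps an InputStream. The names SliceChannel, SliceDemo and readSlice are hypothetical. To keep it short, the sketch implements java.nio.channels.SeekableByteChannel (an interface FileChannel itself implements since Java 7) instead of subclassing FileChannel, which would require many more methods.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical read-only view of bytes [offset, offset + length) of a FileChannel. */
final class SliceChannel implements SeekableByteChannel {
    private final FileChannel base;
    private final long offset;
    private final long length;
    private long position;       // relative to the start of the slice
    private boolean open = true;

    SliceChannel(FileChannel base, long offset, long length) throws IOException {
        if (offset < 0 || length < 0 || length > base.size() - offset) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        this.base = base;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int read(ByteBuffer dst) throws IOException {
        if (!isOpen()) throw new ClosedChannelException();
        long remaining = length - position;
        if (remaining <= 0) return -1;           // end of the slice
        int oldLimit = dst.limit();
        if (dst.remaining() > remaining) {
            // Shrink the limit temporarily so we never read past the slice end.
            dst.limit(dst.position() + (int) remaining);
        }
        try {
            // Absolute read: leaves the base channel's own position untouched.
            int n = base.read(dst, offset + position);
            if (n > 0) position += n;
            return n;
        } finally {
            dst.limit(oldLimit);
        }
    }

    @Override public int write(ByteBuffer src) { throw new NonWritableChannelException(); }
    @Override public SeekableByteChannel truncate(long size) { throw new NonWritableChannelException(); }
    @Override public long position() { return position; }
    @Override public long size() { return length; }
    @Override public boolean isOpen() { return open && base.isOpen(); }

    @Override
    public SeekableByteChannel position(long newPosition) {
        if (newPosition < 0) throw new IllegalArgumentException("negative position");
        position = newPosition;
        return this;
    }

    /** Closes only this view; the caller remains responsible for the base channel. */
    @Override public void close() { open = false; }
}

final class SliceDemo {
    /** Reads the given byte range of a file as ASCII text through a SliceChannel. */
    static String readSlice(Path file, long offset, int length) throws IOException {
        try (FileChannel base = FileChannel.open(file, StandardOpenOption.READ);
             SliceChannel slice = new SliceChannel(base, offset, length)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            while (buf.hasRemaining() && slice.read(buf) >= 0) {
                // keep reading until the buffer is full or the slice is exhausted
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.US_ASCII);
        }
    }
}
```

Because the view only uses absolute reads on the base channel, several slices could in principle share one FileChannel, and a parser could seek back and forth within its slice via position(long) without a temporary copy. Whether that fits Tika's parser API is left open here.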