PeterAlfredLee edited a comment on pull request #356: URL: https://github.com/apache/tika/pull/356#issuecomment-696618151
> Do we have to reset the stream before reprocessing?

+1. The stream should be `reset`, i.e. repositioned to the beginning of the file. I think this is complicated here, because the stream may not be seekable or resettable. It seems we would have to write the stream to a temp file, which may cause other problems, since zip archives can be pretty huge. Maybe you have a better idea? @tballison

> Can you create a unit test with a small document that shows that this works?

+1. I can help out with this. I will update when it is done.
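
To make the temp-file idea concrete, here is a rough sketch using only the JDK. It is not the code in this PR. The class name `RereadableStream` and its methods are made up for illustration, and Tika's own `TikaInputStream` may already cover this case. The stream is copied to a temp file once, and every later pass opens a fresh stream positioned at the start of that file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical helper: spools a one-shot stream to disk so it can be re-read.
final class RereadableStream implements AutoCloseable {
    private final Path tempFile;

    private RereadableStream(Path tempFile) {
        this.tempFile = tempFile;
    }

    // Copies the whole stream to a temp file. The cost is disk space equal to
    // the stream's size, which is the concern for huge zip archives.
    static RereadableStream spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".tmp");
        try {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leak the file on a failed copy
            throw e;
        }
        return new RereadableStream(tmp);
    }

    // Each call returns a new stream positioned at the beginning of the data.
    InputStream open() throws IOException {
        return Files.newInputStream(tempFile);
    }

    // Removes the temp file once all processing passes are done.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(tempFile);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "not-a-real-zip".getBytes(StandardCharsets.UTF_8);
        try (RereadableStream s = spool(new ByteArrayInputStream(data))) {
            byte[] first;
            byte[] second;
            try (InputStream a = s.open()) {
                first = a.readAllBytes();
            }
            try (InputStream b = s.open()) {
                second = b.readAllBytes();
            }
            System.out.println(Arrays.equals(first, second));
        }
    }
}
```

When the incoming stream reports `markSupported()`, `mark`/`reset` would avoid the disk copy, but the read limit passed to `mark` then has to cover the whole archive, so for large zips the temp file is probably the safer default.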