Hi,

On Thu, Feb 18, 2010 at 8:41 PM, Ronan KERDUDOU - VirageGroup
<r...@viragegroup.com> wrote:
> Can we solve this in Tika or do you think it's a VFS bug and i should tell
> them instead of you ?

IMHO it's a VFS bug: a reset() call should restore the stream to the
state it was in when mark() was called (assuming the read limit wasn't
exceeded, etc.). Otherwise there is no way for a client to really rely
on the reset() method.

> To solve the issue, i actually add the following code before calling it :
>
> stream = new BufferedInputStream(stream);

Yep. BufferedInputStream does restore the stream state correctly on reset().

> I had this idea when reading this in the AutoDetectParser.parse() :
>
> if (!stream.markSupported()) {
>     stream = new BufferedInputStream(stream);
> }

Perhaps Tika should be more defensive and simply always wrap the stream
in a BufferedInputStream, regardless of whether the original stream
claims to support the mark feature. This way we'd avoid the trouble you
encountered.

BR,

Jukka Zitting
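
PS. To make the "always wrap" idea concrete, here is a minimal sketch.
It is not actual Tika code: the class name and the defensiveWrap() and
peek() helpers are made up for illustration, and a ByteArrayInputStream
stands in for the VFS stream. Only the java.io calls are real.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MarkResetSketch {

    // Hypothetical helper (not part of Tika): wrap unconditionally, so
    // mark()/reset() behaviour comes from BufferedInputStream instead of
    // from whatever the underlying stream claims to support.
    static InputStream defensiveWrap(InputStream stream) {
        return new BufferedInputStream(stream);
    }

    // Hypothetical helper: read up to n leading bytes, then rewind the
    // stream so the next reader starts from the same position.
    static byte[] peek(InputStream stream, int n) throws IOException {
        stream.mark(n);
        try {
            byte[] buf = new byte[n];
            int total = 0;
            while (total < n) {
                int read = stream.read(buf, total, n - total);
                if (read == -1) {
                    break; // end of stream before n bytes were available
                }
                total += read;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            stream.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.4 example".getBytes(StandardCharsets.US_ASCII);
        InputStream in = defensiveWrap(new ByteArrayInputStream(data));

        byte[] head = peek(in, 4);
        System.out.println(new String(head, StandardCharsets.US_ASCII)); // %PDF
        System.out.println((char) in.read()); // % (stream was rewound)
    }
}
```

The cost of wrapping unconditionally is one extra buffer per stream,
which seems a fair price for not depending on each stream
implementation getting reset() right.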