Thanks, Jukka. Yes, which basically means that Detector.detect should be passed a stream that supports mark/reset: either one that supports it inherently, or one wrapped in a mark-supporting stream such as a TikaInputStream or a BufferedInputStream. This is explicitly stated in its API documentation as well, but it's easy to miss, which can lead to hard-to-debug issues later. I think this method should not proceed with processing if the passed stream doesn't support mark, either by returning application/octet-stream at the start of the method or by throwing an IllegalArgumentException.
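
To make the suggestion concrete, here is a rough sketch of the kind of guard I have in mind. It uses only java.io and is not Tika code: MarkGuardSketch, its detect method, ensureMarkSupported, and the PDF magic-byte check are all hypothetical stand-ins for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a detector; names and logic are assumptions, not Tika's API.
final class MarkGuardSketch {
    static final String OCTET_STREAM = "application/octet-stream";
    private static final byte[] PDF_MAGIC = "%PDF".getBytes(StandardCharsets.US_ASCII);

    static String detect(InputStream input) throws IOException {
        if (input == null) {
            return OCTET_STREAM;
        }
        // Fail fast instead of silently consuming bytes from the caller's stream.
        if (!input.markSupported()) {
            throw new IllegalArgumentException(
                    "detect() requires a stream that supports mark/reset");
        }
        input.mark(PDF_MAGIC.length);
        try {
            byte[] head = new byte[PDF_MAGIC.length];
            int total = 0;
            while (total < head.length) {
                int n = input.read(head, total, head.length - total);
                if (n < 0) {
                    break; // end of stream before the full prefix was read
                }
                total += n;
            }
            boolean isPdf = total == head.length && Arrays.equals(head, PDF_MAGIC);
            return isPdf ? "application/pdf" : OCTET_STREAM;
        } finally {
            // Rewind so the caller sees the stream exactly as it was passed in.
            input.reset();
        }
    }

    // Caller-side helper: wrap only when the stream cannot mark/reset on its own.
    static InputStream ensureMarkSupported(InputStream input) {
        return input.markSupported() ? input : new BufferedInputStream(input);
    }
}
```

With a guard like this, a caller who forgets to wrap gets an immediate exception rather than a partially consumed stream. A caller who wraps must keep reading from the wrapper, not the original stream, since only the wrapper can be rewound.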

On Fri, Jun 19, 2015 at 7:36 PM, Jukka Zitting <[email protected]> wrote:
> You can make the test pass by changing the assertion to:
>
>     assertTrue(IOUtils.contentEquals(stream, originalStream));
>
> Wrapping a stream with TikaInputStream doesn't magically add
> mark/reset support to the original stream; only the wrapper instance
> has this feature.
