On Sun, 23 Sep 2012, naskoo wrote:
Probably TikaInputStream just adds some metadata to include the file extension in the detection.
Nope, as I said:
Try wrapping your InputStream as a TikaInputStream. For full container detection, Tika needs to be able to read the whole file, but still have it available for the parser.
TikaInputStream provides this buffering. It lets a detector read the whole file to identify what it contains (which container formats require), whilst still letting a parser get at the whole contents to process them.
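
In Tika itself the wrapping is just TikaInputStream.get(yourInputStream). To show why the buffering matters, here is a rough sketch of the idea using only the JDK. It is not Tika's code: the class SpoolingSketch and its methods (detect, parse, detectThenParse, sampleDocx) are names I made up for illustration, and the "detector" only checks for one ZIP entry. The one-shot stream is spooled to a temp file, the detector looks at the whole file, and the parser then reads everything again from the start.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class SpoolingSketch {

    // Hypothetical detector: it needs the whole file, because the ZIP
    // central directory that lists the entries sits at the end.
    public static String detect(Path file) {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            return zip.getEntry("word/document.xml") != null ? "docx" : "zip";
        } catch (IOException e) {
            return "unknown"; // not a readable ZIP container
        }
    }

    // Hypothetical parser: it only counts the bytes it can still read.
    public static long parse(InputStream in) throws IOException {
        long total = 0;
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
            total += n;
        }
        return total;
    }

    // Spool the one-shot stream to a temp file, detect on the file,
    // then hand the parser a fresh stream over the same bytes.
    public static String detectThenParse(InputStream in) {
        try {
            Path tmp = Files.createTempFile("spool", ".bin");
            try {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                String type = detect(tmp);
                try (InputStream fresh = Files.newInputStream(tmp)) {
                    return type + ":" + parse(fresh);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny in-memory ZIP that looks like a .docx to detect().
    public static byte[] sampleDocx() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry("word/document.xml"));
                zip.write("<doc/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = sampleDocx();
        // Prints the detected type and the number of bytes the parser saw.
        System.out.println(detectThenParse(new ByteArrayInputStream(data)));
    }
}
```

With a plain InputStream and no spooling, the detector would either have to stop after the first few bytes or use up the stream before the parser got to it.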
Nick
