All,
Before I reinvent the wheel... is there an alternative to Java's
DigestInputStream that handles mark(), reset() and skip() correctly? If I
read this JDK bug [0] correctly, Java's DigestInputStream won't be fixed
until Java 9.
Over on TIKA-1701, we found that pre-digesting an InputStream (reading it
through to compute the digest) and then resetting it can lead to fewer
attachments being extracted from truncated (corrupt) package files: the
digester hits the EOF exception on the package component before the
still-intact child documents can be extracted.
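To make the question concrete, here is a rough sketch of the wheel I would
otherwise reinvent. It is only a sketch under assumptions: the class name
MarkAwareDigestInputStream and its fields are hypothetical (nothing with that
name exists in the JDK or Tika as far as I know), it assumes the wrapped
stream itself supports mark/reset, and it is not thread-safe. The idea is to
keep a high-water mark of bytes already digested, so bytes re-read after
reset() are not digested twice, and to implement skip() by reading, so
skipped bytes still reach the digest.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;

/** Hypothetical sketch; not part of the JDK or Tika. */
class MarkAwareDigestInputStream extends FilterInputStream {
    private final MessageDigest digest;
    private long pos = 0;       // current offset within the wrapped stream
    private long digested = 0;  // high-water mark: bytes already fed to the digest
    private long markPos = -1;  // offset at the last mark(), or -1 if never marked

    MarkAwareDigestInputStream(InputStream in, MessageDigest digest) {
        super(in);
        this.digest = digest;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) {
            if (pos >= digested) {  // only digest bytes we have not seen before
                digest.update((byte) b);
                digested++;
            }
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            // Bytes below the high-water mark were digested on an earlier pass.
            int alreadySeen = (int) Math.min(n, digested - pos);
            int fresh = n - alreadySeen;
            if (fresh > 0) {
                digest.update(buf, off + alreadySeen, fresh);
                digested += fresh;
            }
            pos += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Read instead of skipping so the skipped bytes are still digested.
        byte[] scratch = new byte[4096];
        long total = 0;
        while (total < n) {
            int r = read(scratch, 0, (int) Math.min(scratch.length, n - total));
            if (r <= 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    @Override
    public void mark(int readlimit) {
        in.mark(readlimit);
        markPos = pos;
    }

    @Override
    public void reset() throws IOException {
        if (markPos < 0) {
            throw new IOException("mark() was never called");
        }
        in.reset();  // throws if the wrapped stream cannot reset
        pos = markPos;
    }
}
```

With something like this, marking, reading a header, and resetting should
leave the digest equal to a single pass over the bytes, provided the caller
eventually reads the stream to the end. The digest only ever covers bytes
that were actually read, so on a truncated file it would cover whatever was
readable rather than failing up front.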
Cheers,
Tim
[0] https://bugs.openjdk.java.net/browse/JDK-6587699