All,

We recently made TikaInputStream's skip() inherently strict, so that it throws an EOF exception if a parser tries to skip past the end of a file. We didn't notice any problems in our regression tests (aside from some likely truncated mp4s), but we recently received an issue [1] from a user for whom this is a problem with a tar file created by 7z [2].

Is this a valid tar, or are we right to throw an EOF?
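
To make the behavior in question concrete, here is a minimal sketch of what a strict skip() can look like. This is not the actual TikaInputStream code: the class name StrictSkipInputStream and the read()-based end-of-stream check are assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the TikaInputStream implementation.
class StrictSkipInputStream extends FilterInputStream {

    StrictSkipInputStream(InputStream in) {
        super(in);
    }

    // Skips exactly n bytes, or throws EOFException if the stream ends first.
    @Override
    public long skip(long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() may return 0 before the end of the stream,
            // so confirm the end with a single-byte read.
            if (in.read() == -1) {
                throw new EOFException("Tried to skip " + n
                        + " bytes, but hit EOF with " + remaining + " remaining");
            }
            remaining--;
        }
        return Math.max(n, 0);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new StrictSkipInputStream(
                new ByteArrayInputStream(new byte[10]));
        System.out.println(in.skip(4)); // within the 10-byte stream
        try {
            in.skip(100); // past the end of the stream
        } catch (EOFException e) {
            System.out.println("EOF: " + e.getMessage());
        }
    }
}
```

With a lenient skip(), the second call would simply return the number of bytes that were left. With the strict version, a parser that asks to skip more bytes than the file contains gets an exception instead.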

Thank you.

Best,
Tim

[1] https://issues.apache.org/jira/browse/TIKA-3110
[2] https://github.com/AlexOkayJ/apache-tika-tar-issue/blob/master/src/main/resources/7ztar.tar