On Mon, 6 Sep 2010, Ken Krugler wrote:
> I recently updated the Bixo project to use Tika 0.8-SNAPSHOT, and a number of documents now fail during parsing that previously passed.

Any chance you could create a new JIRA issue and upload one of the problem documents?

> Did the Tika-0.7 image parsers (JPEG, GIF, PNG) not extract metadata, and thus not run into these types of issues?

The image metadata stuff has changed dramatically since 0.7, and we're now processing a lot more of the files in search of useful metadata than we used to.

Nick
