Tim Allison created TIKA-3849:
---------------------------------
Summary: Throw UnsupportedVersionException or similar for really old mdb files
Key: TIKA-3849
URL: https://issues.apache.org/jira/browse/TIKA-3849
Project: Tika
Issue Type: Wish
Reporter: Tim Allison
In processing some digipres2022 bake-off files, I noticed that a number of mdb
files triggered the same exception:
{noformat}
java.io.IOException: Unrecognized map type: 75
	at com.healthmarketscience.jackcess.impl.UsageMap.initHandler(UsageMap.java:150)
	at com.healthmarketscience.jackcess.impl.UsageMap.read(UsageMap.java:136)
	at com.healthmarketscience.jackcess.impl.PageChannel.initialize(PageChannel.java:118)
	at com.healthmarketscience.jackcess.impl.DatabaseImpl.<init>(DatabaseImpl.java:579)
	at com.healthmarketscience.jackcess.impl.DatabaseImpl.open(DatabaseImpl.java:440)
	at com.healthmarketscience.jackcess.DatabaseBuilder.open(DatabaseBuilder.java:267)
	at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:94)
{noformat}
Googling this confirmed my suspicion that these are pre-97 versions of Access
databases. We should throw a more specific exception type, or at least label
this case more clearly, instead of surfacing the generic IOException.
ref: https://sourceforge.net/p/jackcess/bugs/101/
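One possible shape for this, as a minimal sketch only: catch the IOException coming out of Jackcess, check its message against the text seen in the trace above, and rethrow it as a more specific type. The class name OldMdbDetection and the UnsupportedVersionException defined here are hypothetical and exist only for illustration; they are not existing Tika or Jackcess classes. Matching on the message string is also an assumption, and it would break if Jackcess changes its wording.

```java
import java.io.IOException;

public class OldMdbDetection {

    /** Hypothetical exception type; a stand-in for whatever type Tika settles on. */
    public static class UnsupportedVersionException extends IOException {
        public UnsupportedVersionException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Message prefix copied from the stack trace in this issue (assumed stable).
    private static final String MAP_TYPE_PREFIX = "Unrecognized map type";

    /** Returns true if the exception message matches the failure reported above. */
    public static boolean looksLikePre97(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.startsWith(MAP_TYPE_PREFIX);
    }

    /**
     * Wraps a matching IOException in the more specific type, keeping the
     * original as the cause; any other IOException is returned unchanged.
     */
    public static IOException translate(IOException e) {
        if (looksLikePre97(e)) {
            return new UnsupportedVersionException(
                    "Possibly a pre-97 Access database: " + e.getMessage(), e);
        }
        return e;
    }
}
```

A parser could call this from a catch block around the database open call, for example `throw OldMdbDetection.translate(e);`, so that unrelated I/O failures still propagate as before.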