Gregor Lang created TIKA-4464:
---------------------------------

             Summary: Parsing IWork files results in unknown mimetype
                 Key: TIKA-4464
                 URL: https://issues.apache.org/jira/browse/TIKA-4464
             Project: Tika
          Issue Type: Bug
          Components: detector, parser
    Affects Versions: 3.2.1
            Reporter: Gregor Lang
         Attachments: sample-2.pages, sample.key, sample.numbers, sample.pages

When parsing *.pages or *.numbers files, the resulting MIME type is always "application/vnd.apple.unknown.13".

There seems to be a TODO in *IWork13PackageParser* at line 319, which is probably related:

{code:java}
// Is it the main document?
if (name.equals(IWORK13_MAIN_ENTRY)) {
    // TODO Decode the snappy stream, and check for the Message Type
    //  =     2 (TN::SheetArchive), it is a numbers file;
    //  = 10000 (TP::DocumentArchive), that's a pages file
    return null;
}
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
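
For illustration only, here is a rough sketch of the mapping the TODO describes, assuming the message type has already been decoded from the snappy stream (the decoding itself is not shown). The class name, the method name, and the two specific MIME type strings for Numbers and Pages are assumptions that follow the naming pattern of the reported "unknown.13" type; this is not the actual Tika implementation.

```java
import java.util.Map;

// Hypothetical helper, not part of Tika: maps an already-decoded
// iWork'13 message type to a MIME type string.
final class IWork13TypeSketch {

    // Fallback reported in this issue.
    static final String UNKNOWN = "application/vnd.apple.unknown.13";

    // Message type values taken from the TODO comment; the MIME type
    // strings for Numbers and Pages are assumed by analogy with UNKNOWN.
    static final Map<Integer, String> BY_MESSAGE_TYPE = Map.of(
            2, "application/vnd.apple.numbers.13",    // TN::SheetArchive
            10000, "application/vnd.apple.pages.13"   // TP::DocumentArchive
    );

    // Returns the mapped MIME type, or UNKNOWN for any other message type.
    static String mimeTypeFor(int messageType) {
        return BY_MESSAGE_TYPE.getOrDefault(messageType, UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(mimeTypeFor(2));
        System.out.println(mimeTypeFor(10000));
        System.out.println(mimeTypeFor(1));
    }
}
```

With something along these lines, the branch for IWORK13_MAIN_ENTRY could return a specific type instead of null once the message type is available. Message types not listed in the TODO would still fall back to the unknown type.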