Ugh. Thank you! https://issues.apache.org/jira/browse/TIKA-2886
On Wed, May 29, 2019 at 6:57 AM Tucker B <[email protected]> wrote:
>
> After upgrading to Tika 1.21 I have noticed several known XLSX files
> are detected by Tika as "application/x-tika-ooxml". I think I've
> narrowed it down to the new StreamingZipContainerDetector. After
> inspecting the "[Content_Types].xml" of these XLSX files there is no
> reference to any of the configured content types for XLSX in the
> OOXML_CONTENT_TYPES in StreamingZipContainerDetector. Specifically,
>
> "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"
> "application/vnd.ms-excel.sheet.macroEnabled.main+xml"
> "application/vnd.ms-excel.sheet.binary.macroEnabled.main"
>
> I do see a content type of
>
> "application/vnd.openxmlformats-officedocument.spreadsheetml.template.main+xml"
>
> in "[Content_Types].xml". Is the StreamingZipContainerDetector missing
> the XSSFRelation TEMPLATE_WORKBOOK in OOXML_CONTENT_TYPES?
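
For anyone who wants to run the same check on their own files, here is a minimal sketch that lists the content types declared in "[Content_Types].xml" of a zip-based file. It is not Tika's code and does not reproduce what StreamingZipContainerDetector does; the class name ContentTypesProbe and the method overrideContentTypes are made up for illustration, and it assumes Java 9+ (for InputStream.readAllBytes) and uses only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika or POI.
public class ContentTypesProbe {

    // Streams through the zip and returns the ContentType attribute of every
    // <Override> element found in "[Content_Types].xml" (empty list if absent).
    static List<String> overrideContentTypes(InputStream zipStream) throws Exception {
        List<String> types = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (!"[Content_Types].xml".equals(entry.getName())) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream.
                byte[] xml = zis.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                // "*" matches any namespace, so a prefixed or default namespace both work.
                NodeList overrides = doc.getElementsByTagNameNS("*", "Override");
                for (int i = 0; i < overrides.getLength(); i++) {
                    types.add(((Element) overrides.item(i)).getAttribute("ContentType"));
                }
                break;
            }
        }
        return types;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ContentTypesProbe some-file.xlsx
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String type : overrideContentTypes(in)) {
                System.out.println(type);
            }
        }
    }
}
```

If the output for an affected file contains the "...spreadsheetml.template.main+xml" type but none of the three types quoted above, that matches the situation described in the report.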
