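An MSI file is a Windows Installer package, but internally it uses the MS-CFB (Compound File Binary) format to store its data, the same container that the older MS Office formats use. That is why you get the generic application/x-tika-msoffice type. To detect MSI correctly, the detector would have to translate the encoded object names into human-readable names (7z can do this, if I remember correctly) and then search for the MSI-specific entries.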
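For what it's worth, here is a rough sketch of the name translation I mean. It is not Tika or 7z code, only an illustration under assumptions: MSI appears to pack one or two symbols of a 64-character alphabet into each UTF-16 code unit in the range U+3800..U+483F and to prefix table streams with U+4840. The class name MsiStreamNames, the '!' marker and the looksLikeMsi heuristic (looking for the _StringData or _StringPool table streams) are my own hypothetical choices, and the raw names would have to come from whatever code reads the CFB directory.

```java
// Sketch only: decodes the packed stream names found in an MSI compound file.
// The ranges and alphabet are assumptions, not taken from an official spec.
final class MsiStreamNames {
    // 64 symbols: digits, upper case, lower case, then '.' (62) and '_' (63).
    private static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._";

    private static final int BASE = 0x3800;       // start of two-symbol range
    private static final int SINGLE = 0x4800;     // start of one-symbol range
    private static final int TABLE_MARK = 0x4840; // prefix on table streams

    /** Turns a raw directory entry name into a readable one. */
    static String decode(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            int c = raw.charAt(i);
            if (c == TABLE_MARK) {
                out.append('!'); // hypothetical marker for "this is a table"
            } else if (c >= SINGLE && c < TABLE_MARK) {
                out.append(ALPHABET.charAt(c - SINGLE));
            } else if (c >= BASE && c < SINGLE) {
                int v = c - BASE;
                out.append(ALPHABET.charAt(v & 0x3F)); // low 6 bits first
                out.append(ALPHABET.charAt(v >> 6));   // then high 6 bits
            } else {
                out.append((char) c); // ordinary name, left untouched
            }
        }
        return out.toString();
    }

    /** Heuristic: an MSI database should contain its string table streams. */
    static boolean looksLikeMsi(Iterable<String> rawNames) {
        for (String raw : rawNames) {
            String name = decode(raw);
            if (name.equals("!_StringData") || name.equals("!_StringPool")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Packed form of the "_StringData" table stream name.
        String raw = "\u4840\u3F3F\u4577\u446C\u3B6A\u45E4\u4824";
        System.out.println(decode(raw)); // prints "!_StringData"
        System.out.println(looksLikeMsi(java.util.Arrays.asList(raw)));
    }
}
```

A real detector would also want a cheaper first check (for example the CLSID of the root entry, if that turns out to be reliable) before decoding every entry name.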

On Fri, Jun 15, 2012 at 10:31 PM, Vish Ramachandran <[email protected]> wrote:
> Hi,
>
> Download the following file, which is MSI installer for 7zip, a zip utility.
>
> http://downloads.sourceforge.net/sevenzip/7z920-x64.msi
>
> The following code:
>
> String detectedType = new Tika().detect(new File("7z920-x64.msi"));
>
> results in mime: application/x-tika-msoffice
>
> which is wrong.
>
> Is this expected, or am I missing something else?
>
> Thanks
> Vish

-- 
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott
