Package: file
Version: 5.04-5
Severity: normal

file detects an MS Outlook .msg file as a CDF v2 file.

An MS Outlook .msg file to test with can be found here:

 * http://jahudson.wiki.hoover.k12.al.us/file/detail/nuclear+chemsitry.msg

MS Outlook files start with the header

  d0 cf 11 e0 a1 b1 1a e1

which is the signature of the Microsoft Compound File Binary File Format (MS-CFB). This format seems to be used for most documents on Microsoft systems, such as Word, Excel, PowerPoint, ...
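To make the header check concrete, here is a minimal Python sketch that tests whether a buffer or file starts with the eight signature bytes quoted above. The names `CFB_SIGNATURE`, `has_cfb_signature` and `file_has_cfb_signature` are made up for this illustration and are not part of file or libmagic. A match only says that the container is a compound file; it cannot tell an Outlook message from a Word document.

```python
# Signature bytes as quoted in this report: d0 cf 11 e0 a1 b1 1a e1
CFB_SIGNATURE = bytes.fromhex("d0cf11e0a1b11ae1")


def has_cfb_signature(data: bytes) -> bool:
    """Return True if the buffer starts with the 8-byte compound file signature."""
    return data[:len(CFB_SIGNATURE)] == CFB_SIGNATURE


def file_has_cfb_signature(path: str) -> bool:
    """Read only the first 8 bytes of the file and compare them to the signature."""
    with open(path, "rb") as fh:
        return has_cfb_signature(fh.read(len(CFB_SIGNATURE)))
```

Called on the test .msg file, `file_has_cfb_signature` should return True if the file really starts with the header given above; the same would be true for an old-style .doc or .xls file, which is why a signature check alone is not enough to name the document type.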
Documentation about MS-CFB can be found here:

 * http://msdn.microsoft.com/en-us/library/dd942138(v=prot.13).aspx

So I think we have two distinct bugs here:

 1) MS-CFB files are detected as CDF v2.
 2) When CDF detection is deactivated using `-e cdf', an Outlook mail is
    detected as a Microsoft Word Document. The rule that matched is:

    [690 512 string,=R\000o\000o\000t\000 \000E\000n\000t\000r\000y,"Microsoft Word Document"]

    because the string "Root Entry" is present in MS-CFB files.

I could provide some help implementing an MS-CFB parser to detect the kind of file we are facing, but I don't know whether the MS patents allow me to do this...?

-- System Information:
Debian Release: 6.0.2
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages file depends on:
ii  libc6      2.11.2-10         Embedded GNU C Library: Shared lib
ii  libmagic1  5.04-5            File type determination library us
ii  zlib1g     1:1.2.3.4.dfsg-3  compression library - runtime

file recommends no packages.

file suggests no packages.

-- no debconf information

