On Mar 5, 2009, at 4:57 AM, Balazs Horvath wrote:

> http://gaarai.com/2009/02/14/generating-mime-type-in-php-is-not-magic/

My summary: "None of the hosting providers I use has the right fileinfo
software, nor can I install it."

> (Hey man, how do
> you know that? I couldn't find that info in any documentation!)

The man page for the "file" command has that documentation; I guess I
am just quite familiar with that command. I've used a PHP exec() call
to the file command for file type detection since PHP 3 (which kinda
dates me). I simply matched up the file command flags to the predefined
constants for the fileinfo options:

<http://www.php.net/manual/en/fileinfo.constants.php>

A few sample files and some sample PHP code produced the information I
posted earlier.

I just tried .docx, .odt, and .ods.

Using options "1046":

docx : application/xml compressed-encoding=application/zip
ods  : text/plain charset=us-ascii compressed-encoding=application/octet-stream
odt  : text/plain charset=us-ascii compressed-encoding=application/vnd.oasis.opendocument.text

Using options "38":

docx : XML document text (Zip archive data, at least v2.0 to extract)
ods  : ASCII text, with no line terminators (OpenDocument Spreadsheet)
odt  : ASCII text, with no line terminators (OpenDocument Text)

Those return strings seem to identify the file types fairly
conclusively. If you find the file type is a zip file using "normal"
1040 options, poke at it again with different options.

I find that opening the magic file with no options allows you to probe
the file multiple times using different options, but you have to
remember to specify the options at probe time instead of assuming the
options you want have been globally specified.

Maybe because I got burned on file type issues in the past, I am
sensitive to it (and was forced to learn about it in detail).

Looking at the upstream fileinfo mail list, newer versions might be
able to better determine Office 2007 file types.

<http://mx.gw.com/pipermail/file/2009/000311.html>

However, my test of a Fedora 11 rpm rebuilt on F10 didn't show any
improvement.
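
For reference, option numbers like 1046 and 38 are just bit flags added
together. Here is a rough sketch of the decoding, written in Python
rather than PHP to keep it short. It assumes the values listed on the
fileinfo constants page (FILEINFO_SYMLINK = 2, FILEINFO_COMPRESS = 4,
and so on). decode_options is a made-up helper name, not part of any
library, and the sketch does not check whether a given fileinfo build
actually honours each flag.

```python
# Bit values as listed for PHP's FILEINFO_* constants (assumed here;
# they mirror libmagic's MAGIC_* flags).
FILEINFO_FLAGS = {
    2: "FILEINFO_SYMLINK",
    4: "FILEINFO_COMPRESS",
    8: "FILEINFO_DEVICES",
    16: "FILEINFO_MIME_TYPE",
    32: "FILEINFO_CONTINUE",
    128: "FILEINFO_PRESERVE_ATIME",
    256: "FILEINFO_RAW",
    1024: "FILEINFO_MIME_ENCODING",
}


def decode_options(value):
    """Return the names of the flag bits set in value, lowest bit first."""
    names = [name for bit, name in sorted(FILEINFO_FLAGS.items()) if value & bit]
    # Report any bits that are not in the table instead of dropping them.
    leftover = value & ~sum(FILEINFO_FLAGS)
    if leftover:
        names.append(f"unknown bits {leftover:#x}")
    return names


if __name__ == "__main__":
    for options in (1040, 1046, 38):
        print(options, " | ".join(decode_options(options)))
```

Read that way, 1040 is MIME_TYPE plus MIME_ENCODING, 1046 adds SYMLINK
and COMPRESS to that, and 38 is SYMLINK, COMPRESS, and CONTINUE with no
MIME flags, which would explain why it returns the descriptive strings.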

Another interesting thread:

<http://mx.gw.com/pipermail/file/2008/000283.html>

BTW: OpenOffice.org uses a standard file format:

<http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm>

This is the same format used by Adobe for Mars files. If you explode
the zip, there is a "mimetype" file at the root level with the
mime-type inside. The fileinfo library can see that in an odt but for
some reason not in an ods (or a mars). Not sure if code to peek inside
the zip for that mimetype file is worthwhile.

MS uses something similar, but not the same, <snarky>typical of
MS</snarky>:

<http://en.wikipedia.org/wiki/Open_Packaging_Convention>

> If the user sends something bogus by playing with the extension,
> who cares?

I think passing the security buck to some other part of the system
isn't good practice. If you look at the OWASP site at all, the
preferred way is to validate and test all input _and_ output.

--
Charles Dostale
System Admin - Silver Oaks Communications
http://www.silveroaks.com/
824 17th Street, Moline IL 61265
_______________________________________________
List info: http://lists.roundcube.net/dev/
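
P.S. In case it helps to judge whether peeking inside the zip for the
"mimetype" file is worthwhile, here is a rough sketch of what that
could look like. It is in Python rather than PHP and uses only the
standard zipfile module. container_mimetype is a made-up name, and the
sketch assumes the entry is called exactly "mimetype" at the root of
the archive, as described above. It has not been tried against real
Mars files.

```python
import zipfile


def container_mimetype(path):
    """Return the text of the root-level 'mimetype' entry, or None.

    None is returned when the file is not a zip archive or when the
    archive has no entry named exactly 'mimetype'.
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        try:
            raw = archive.read("mimetype")
        except KeyError:
            # No such entry: probably a plain zip or an Office 2007 file.
            return None
    # The entry is expected to hold a short ASCII string.
    return raw.decode("ascii", errors="replace").strip()
```

A None result only means the shortcut does not apply, so the file
would still need the normal fileinfo probing. The returned string also
comes from the uploaded file itself, so it should be validated like
any other input.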
