MrC wrote:
> Leonardo Rodrigues Magalhães wrote:
>
>> MrC wrote:
>>
>>> Leonardo Rodrigues Magalhães wrote:
>>>
>>>
>>>> Hello,
>>>>
>>>> I have banned some file/MIME types in amavisd-2.6.0 using the 'old'
>>>> way of doing this, the $banned_filename_re.
>>>>
>>>> The banned-file admin and user notifications, which I enabled,
>>>> show me something like:
>>>>
>>>> Banned name: multipart/mixed |
>>>> application/vnd.ms-powerpoint,.doc,Chaplin.pps
>>>> Content type: Banned
>>>> Internal reference code for the message is 24067-22/NWynDBsexbaI
>>>>
>>>>
>>>> It's clearly a PowerPoint file, because of its extension (.pps) as
>>>> well as its MIME type 'application/vnd.ms-powerpoint'.
>>>>
>>>> What I'm trying to understand is where that '.doc' comes from!
>>>>
>>>>
>>>>
>>> What type of document does the file(1) utility indicate?
>>> What version of file is on your system?
>>>
>>>
>> I'm running on a Fedora 8 system, with:
>>
>> [EMAIL PROTECTED] ~]# rpm -qi file
>> Name : file Relocations: (not relocatable)
>> Version : 4.21 Vendor: Fedora Project
>> Release : 5.fc8 Build Date: Tue 29 Jan 2008 06:58:26 AM BRST
>>
>> which is the latest file package from F8 repositories.
>>
>> on this system, file returns for .doc and .ppt documents:
>>
>> [EMAIL PROTECTED] user]# file Defesa.doc
>> Defesa.doc: Microsoft Office Document
>> [EMAIL PROTECTED] user]# file Projeto\ Final\ II\ VPN\ -\ última.ppt
>> Projeto Final II VPN - última.ppt: Microsoft Office Document
>> [EMAIL PROTECTED] user]#
>>
>> Hmm, it seems file returns only 'Microsoft Office Document' for both.
>>
>>
>
> In the past several versions, the file(1) utility has changed its
> opinion many times regarding how to treat PPT documents:
>
> $ file -v
> file-4.21
> magic file from /usr/share/file/magic
>
> $ file test.ppt
> test.ppt: Microsoft Installer
>
> $ file -i test.ppt
> test.ppt: \012- application/msword
> ---
> $ file -v
> file-4.24
> magic file from /usr/local/share/file/magic
>
> $ file ~/test.ppt
> /home/cappella/test.ppt: Microsoft Office Document
>
> $ file -i ~/test.ppt
> /home/cappella/test.ppt: application/octet-stream
>
> Amavis uses a mapping of full type names to short names, which are then
> later referenced in the $banned_filename_re maps. You can see in the
> latest amavisd that Microsoft Office Document is mapped to short name
> type "doc". This is where ".doc" comes from.
>
> $map_full_type_to_short_type_re = [
> ...
> [qr/^Rich Text Format data\b/ => 'rtf'],
> [qr/^Microsoft Office Document\b/i => 'doc'], # OLE2: doc, ppt, xls, ...
>
> It is likely that the file formats for PPT, XLS, DOC, etc. have not been
> reverse engineered well enough to distinguish them from each other, so
> they are all mapped to Microsoft Office Document (in more recent
> file versions). The file formats became much more complex with later
> versions of Office.
>
> Previous versions of file(1), like 4.21, were very broken in terms of
> identification, and there were many false identifications:
>
> $ file -v
> file-4.21
> magic file from /usr/share/file/magic
>
> $ file -i test.xls
> test.xls: \012- application/msword
>
> $ file test.xls
> test.xls: Microsoft Installer
>
> Clearly, this Excel spreadsheet is not a Word document. Fedora adds
> its own patches to the file utility, and your package is based on 4.21.
> Since your PPT is identified as Microsoft Office Document, it is clear
> that Fedora has updated the magic database used for file identification,
> bringing it in line with the more recent 4.24/4.25 releases of file. Yet
> the problem still remains: file types from more recent Office packages
> are identified generically as Microsoft Office Document, and not as
> Excel, PowerPoint, etc. Given that, I'm not sure what more you can do to
> distinguish the types.
>
> Here's a thread which discusses some of the file(1) issues a while ago:
>
> http://groups.google.com/group/mailing.unix.amavis-user/browse_thread/thread/7147b0a90573690c/04ca5171867925c1?lnk=gst&q=powerpoint+file#04ca5171867925c1
>
MrC, I don't think the points could be put any more clearly than you
did :)
Unfortunately, based on the facts you laid out, it seems I cannot expect
exact identification of PowerPoint/Word/Excel files, but that is no
problem.
I have changed the 'doc' mapping to 'document' in
$map_full_type_to_short_type_re. I'm even thinking of changing it again
to 'msoffice'. With these small changes, I can get rid of the '.doc'
that appears in the blocked-message notification.
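For anyone reading this later, the change described above might look
roughly like the fragment below. It is only a sketch: it assumes the entry
has the form MrC quoted earlier in this thread, and 'msoffice' is just an
arbitrary label, not a name amavisd itself defines. Only the short name on
the right-hand side differs from the quoted entry.

```perl
# Entry inside $map_full_type_to_short_type_re (form taken from the
# excerpt quoted earlier in this thread; surrounding entries omitted).
# The short name 'msoffice' is a locally chosen label (assumption).
[qr/^Microsoft Office Document\b/i => 'msoffice'],  # OLE2: doc, ppt, xls, ...
```

With this in place the notification should list '.msoffice' instead of
'.doc'. One thing to check, as far as I understand the banning rules: any
$banned_filename_re entry that matches the short type '.doc' would
presumably need to be updated to the new name as well.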
Thank you very much for your points and clarifications.
--
Sincerely,
Leonardo Rodrigues
Solutti Tecnologia
http://www.solutti.com.br
[EMAIL PROTECTED]
My SPAMTRAP, do not email it
_______________________________________________
AMaViS-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/