Package: file
Version: 4.26-1
Severity: wishlist

The current magic data for ZIP files has a few small problems.

There are several formats, such as OpenDocument, that use zip files as
their packaging format, and have a "mimetype" file at the beginning of
the archive to help identify them.

The OpenDocument specification does not require a specific zip archive
version, yet the magic data will only match archives that use v2.0.

Also, if a zip v2.0 archive includes a "mimetype" file with unknown
contents, 'file' will not report it as a zip file. This is due to a hack
to prevent both "OpenDocument" and "Zip archive data" from being printed.
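
As a rough illustration of that hack (a sketch only, not code from
'file'), the old v2.0 entry can be approximated in Python. The helper
names make_header and old_v20_rule are hypothetical; the offsets and the
constant 0x6d696d65 are taken from the old magic entry.

```python
import struct

# "mime" read as a big-endian 32-bit integer, as in the old magic entry.
MIME_AS_UBELONG = 0x6D696D65


def make_header(version, name):
    """Build a minimal ZIP local file header (hypothetical helper).

    Sizes, CRC and timestamps are zeroed; only the layout matters here.
    """
    fixed = struct.pack("<4sHHHHHIIIHH", b"PK\x03\x04", version,
                        0, 0, 0, 0, 0, 0, 0, len(name), 0)
    return fixed + name


def old_v20_rule(buf):
    """Approximate the pre-patch generic entry for v2.0 archives."""
    if len(buf) < 34 or buf[:4] != b"PK\x03\x04" or buf[4] != 0x14:
        return None
    # '>>30 ubelong !0x6d696d65': print only if the first member's
    # name does NOT start with "mime".
    (first_four,) = struct.unpack_from(">I", buf, 30)
    if first_four == MIME_AS_UBELONG:
        return None  # nothing printed unless a more specific entry matches
    return "Zip archive data, at least v2.0 to extract"
```

With this sketch, a v2.0 archive whose first member is "mimetype" yields
no zip description at all from this entry, whatever the file contains.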

The correct approach to stop the matching is to start a new 'level 0' test.

I tried to fix those problems in the attached patch.

First, zip files that include a mimetype file are tested.
If the match succeeds, the program prints the match text and stops.
If the match fails, the program will move on to a new test for "regular"
zip files, and will print "Zip archive data".
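
The two-step flow above can be sketched in Python as follows. This is
only an approximation of the patched entries, not how 'file' is
implemented: local_header and classify are hypothetical names, only two
of the mimetype sub-tests are included, and the WinZIP check is omitted.

```python
import struct
import zlib

VERSION_TEXT = {0x09: "v0.9", 0x0A: "v1.0", 0x0B: "v1.1", 0x14: "v2.0"}


def local_header(version, name, data):
    """Build one stored ZIP member: local file header, name, then data."""
    fixed = struct.pack("<4sHHHHHIIIHH", b"PK\x03\x04", version,
                        0, 0, 0, 0, zlib.crc32(data),
                        len(data), len(data), len(name), 0)
    return fixed + name + data


def classify(buf):
    """Mimic the patched order: mimetype test first, generic test second."""
    if buf[:4] != b"PK\x03\x04":
        return None
    # First level-0 test: name at offset 30, contents at offset 38.
    if buf[30:38] == b"mimetype" and buf[38:50] == b"application/":
        subtype = buf[50:]
        if subtype.startswith(b"vnd.kde."):
            return "KOffice (>=1.2)"
        if subtype.startswith(b"epub+zip"):
            return "OCF container (EPUB)"
        # Unknown mimetype: nothing printed, fall through to the next test.
    # Second level-0 test: any ZIP archive, with the version as a suffix.
    text = "Zip archive data"
    needed = VERSION_TEXT.get(buf[4])
    if needed:
        text += ", at least %s to extract" % needed
    return text
```

For example, classify(local_header(0x0A, b"mimetype",
b"application/epub+zip")) returns "OCF container (EPUB)" even though the
archive is not v2.0, and an archive with an unrecognised mimetype is
still reported as "Zip archive data".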

After changing the matching technique, I added a test for a new file
format: "OCF container (EPUB)", which is used for ebooks and defined in:
<http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm>

-Ori
diff --git a/magic/Magdir/archive b/magic/Magdir/archive
index b75fac0..f2c92fc 100644
--- a/magic/Magdir/archive
+++ b/magic/Magdir/archive
@@ -560,28 +560,16 @@
 # [JW] see exe section for self-extracting version
 0	string		UC2\x1a		UC2 archive data
 
-# ZIP archives (Greg Roelofs, c/o [EMAIL PROTECTED])
+# ZIP archives with a mimetype file
+# (stops if match is found. Does not match regular ZIP archives)
 0	string		PK\003\004
->4	byte		0x00		Zip archive data
-!:mime	application/zip
->4	byte		0x09		Zip archive data, at least v0.9 to extract
-!:mime	application/zip
->4	byte		0x0a		Zip archive data, at least v1.0 to extract
-!:mime	application/zip
->4	byte		0x0b		Zip archive data, at least v1.1 to extract
-!:mime	application/zip
->0x161	string		WINZIP          Zip archive data, WinZIP self-extracting
-!:mime	application/zip
->4	byte		0x14
->>30	ubelong		!0x6d696d65	Zip archive data, at least v2.0 to extract
-!:mime	application/zip
+>30	string		mimetype
+>>38	string	application/
 
 # OpenOffice.org / KOffice / StarOffice documents
 # Listed here because they ARE zip files
 #
 # From: Abel Cheung <[EMAIL PROTECTED]>
->4	byte		0x14
->>30	string		mimetype
 
 # KOffice (1.2 or above) formats
 >>>50	string	vnd.kde.		KOffice (>=1.2)
@@ -623,7 +611,7 @@
 >>>>>77	string	-master			Master Document
 >>>>73	string	graphics		Drawing
 >>>>>81	string	-template		Template
->>>>73	string	presentation		Presentation
+>>>>73	string	presentation	Presentation
 >>>>>85	string	-template		Template
 >>>>73	string	spreadsheet		Spreadsheet
 >>>>>84	string	-template		Template
@@ -634,6 +622,28 @@
 >>>>73	string	database		Database
 >>>>73	string	image			Image
 
+# OCF container (EPUB)
+# http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm
+#
+# From: Ori Avtalion <[EMAIL PROTECTED]>
+>>>50	string	epub+zip	OCF container (EPUB)
+!:mime	application/epub+zip
+
+# ZIP archives (Greg Roelofs, c/o [EMAIL PROTECTED])
+0	string		PK\003\004	Zip archive data
+>4	byte		0x00	
+!:mime	application/zip
+>4	byte		0x09		\b, at least v0.9 to extract
+!:mime	application/zip
+>4	byte		0x0a		\b, at least v1.0 to extract
+!:mime	application/zip
+>4	byte		0x0b		\b, at least v1.1 to extract
+!:mime	application/zip
+>4	byte		0x14		\b, at least v2.0 to extract
+!:mime	application/zip
+>0x161	string		WINZIP  \b, WinZIP self-extracting
+!:mime	application/zip
+
 # Zoo archiver
 20	lelong		0xfdc4a7dc	Zoo archive data
 !:mime	application/x-zoo
