Hello community,

here is the log from the commit of package python-filetype for openSUSE:Factory checked in at 2022-11-08 11:48:09
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-filetype (Old)
 and      /work/SRC/openSUSE:Factory/.python-filetype.new.1597 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-filetype"

Tue Nov 8 11:48:09 2022 rev:5 rq:1034476 version:1.2.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-filetype/python-filetype.changes 2022-03-28 17:01:31.161047476 +0200
+++ /work/SRC/openSUSE:Factory/.python-filetype.new.1597/python-filetype.changes 2022-11-08 11:50:02.736482028 +0100
@@ -1,0 +2,87 @@
+Mon Nov 7 23:01:24 UTC 2022 - Yogalakshmi Arunachalam <yarunacha...@suse.com>
+
+- Update to 1.2.0
+  * Merge pull request #147 from sayanarijit/fix-146
+  * Add tests for m4a
+  * Try matching audio before video
+  * Merge pull request #145 from RSabet/master
+  * update README to include avif
+  * added image filetype avif
+  * Update __init__.py
+  * Merge pull request #141 from ferstar/master
+  * test: remove unused imported(F401)
+  * refactor: duck-typing reading magic string and try to restore the reader position if possible
+  * test: fix E275 missing whitespace after keyword
+  * test: Use tox pipeline instead of pytest
+  * test: ignore E501 error for flake8 check
+  * fix: CLI params parser
+  * Merge pull request #137 from ferstar/master
+  * fix: guess ".docx" func and add another doc file test case
+  * fix: guess ".doc" func and add another doc file test case
+  * test: skip benchmark test in tox config
+  * fix: restore reader position after retrieving signature bytes
+  * Merge pull request #136 from ferstar/master
+  * test: no need to skip zstd test case
+  * Merge pull request #135 from ferstar/master
+  * fix: regression for file-like obj file type detection
+  * Merge pull request #134 from babenek/actions
+  * Merge pull request #129 from ferstar/master
+
+- Update to 1.1.0
+  * Merge branch 'master' into master
+  * Merge pull request #133 from magbyr/master
+  * Merge pull request #131 from babenek/master
+  * CI workflow in github actions
+  * Changed to if statements in matching method
+  * Changed return method because of coverage calculation problems
+  * Apply suggestions from code review
+  * README changes
+  * Linter changes
+  * Added document filetypes for doc, docx, odt, xls, xlsx, ods, ppt, pptx and odp. Added tests and sample documents for document filetypes
+  * Fix undocumented exception
+  * style: Simplify binary to integer method
+  * feat: add zstd skippable frames support
+  * test: fix the tox config and missing test sample files
+  * test: fix the zst test sample file
+  * fix(readme): rst syntax wtf
+
+- Update to 1.0.13
+  * feat(history): update changes
+
+- Update to 1.0.12
+  * Merge pull request #127 from ferstar/master
+  * Merge pull request #123 from levrik/patch-1
+  * Merge pull request #126 from babenek/master
+  * docs: add zstd type
+  * fix: remove unnecessary duck-typing try
+  * feat: add zst(d) type
+  * chore: fix lint errors
+  * test: fix memoryview test cases
+  * BugFix for uncaught exceptions
+  * Support PDF with BOM
+
+- Update to 1.0.11
+  * chore(version): bump patch
+  * chore(version): bump patch
+  * refactor(apng)
+  * refactor(apng)
+  * Merge pull request #120 from CatKasha/apng
+  * fix typo
+  * add APNG support (part 3)
+  * add APNG support (part 2)
+  * add APNG support (part 1)
+  * chore(history): version notes
+  * Merge branch 'master' of https://github.com/h2non/filetype.py
+  * feat: version bump
+  * Merge pull request #118 from smasty/woff-flavors-support
+  * fix(font): minimum length check (woff)
+  * Update __init__.py
+  * Update setup.py
+  * Merge pull request #109 from fraang/master
+  * Add support for more WOFF/WOFF2 flavors
+  * Merge pull request #114 from andersk/m4a
+  * fix(base): remove property decorator
+  * Use correct audio/mp4 type for m4a.
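
As an illustration of the "feat: add zstd skippable frames support" entry above, here is a minimal standalone sketch of that kind of check. It is not the packaged matcher: the function name `looks_like_zstd` and the module-level constant names are hypothetical, and the sketch assumes the same magic values that the Zstd matcher in this update uses (a first byte in 0x22..0x28 followed by B5 2F FD for a regular frame, and little-endian magics 0x184D2A50 to 0x184D2A5F for skippable frames).

```python
import struct

# Assumed constants, mirroring the values used by the Zstd matcher in this update.
ZSTD_MAGIC_TAIL = b"\xb5\x2f\xfd"      # bytes 1..3 of a regular frame magic
SKIPPABLE_START = 0x184D2A50           # first skippable-frame magic
SKIPPABLE_MASK = 0xFFFFFFF0            # low nibble of the magic is free


def looks_like_zstd(buf):
    """Hypothetical helper: True if buf holds a Zstandard frame,
    optionally preceded by one or more skippable frames."""
    buf = bytes(buf)
    while True:
        # Regular frame: version byte 0x22..0x28, then B5 2F FD.
        if len(buf) > 3 and 0x22 <= buf[0] <= 0x28 and buf[1:4] == ZSTD_MAGIC_TAIL:
            return True
        # A skippable frame needs 4 bytes of magic plus 4 bytes of length.
        if len(buf) < 8:
            return False
        magic, user_data_len = struct.unpack("<LL", buf[:8])
        if magic & SKIPPABLE_MASK != SKIPPABLE_START:
            return False
        # The declared user data must be fully present in the buffer.
        if len(buf) < 8 + user_data_len:
            return False
        # Step over the skippable frame and inspect the next one.
        buf = buf[8 + user_data_len:]
```

The loop replaces the recursion used in the packaged code; this is a design choice of the sketch only, so that a long run of skippable frames cannot exhaust the call stack.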
+
+
+-------------------------------------------------------------------

Old:
----
  filetype-1.0.10.tar.gz

New:
----
  filetype-1.2.0.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-filetype.spec ++++++
--- /var/tmp/diff_new_pack.tcx5DT/_old 2022-11-08 11:50:03.164484191 +0100
+++ /var/tmp/diff_new_pack.tcx5DT/_new 2022-11-08 11:50:03.168484211 +0100
@@ -18,7 +18,7 @@
 %{?!python_module:%define python_module() python-%{**} python3-%{**}}
 Name: python-filetype
-Version: 1.0.10
+Version: 1.2.0
 Release: 0
 Summary: Infer file type and MIME type of any file/buffer. No external dependencies
 License: MIT

++++++ filetype-1.0.10.tar.gz -> filetype-1.2.0.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/History.md new/filetype-1.2.0/History.md
--- old/filetype-1.0.10/History.md 2022-02-03 20:34:19.000000000 +0100
+++ new/filetype-1.2.0/History.md 2022-11-02 18:30:57.000000000 +0100
@@ -1,4 +1,90 @@
+v1.2.0 / 2022-11-02
+===================
+
+  * chore(version): bump minor
+  * Merge pull request #147 from sayanarijit/fix-146
+  * Add tests for m4a
+  * Try matching audio before video
+  * Merge pull request #145 from RSabet/master
+  * update README to include avif
+  * added image filetype avif
+  * Update __init__.py
+  * Merge pull request #141 from ferstar/master
+  * test: remove unused imported(F401)
+  * refactor: duck-typing reading magic string and try to restore the reader position if possible
+  * test: fix E275 missing whitespace after keyword
+  * test: Use tox pipeline instead of pytest
+  * test: ignore E501 error for flake8 check
+  * fix: CLI params parser
+  * Merge pull request #137 from ferstar/master
+  * fix: guess ".docx" func and add another doc file test case
+  * fix: guess ".doc" func and add another doc file test case
+  * test: skip benchmark test in tox config
+  * fix: restore reader position after retrieving signature bytes
+  * Merge pull request #136 from ferstar/master
+  * test: no need to skip zstd test case
+  * Merge pull request #135 from ferstar/master
+  * fix: regression for file-like obj file type detection
+  * Merge pull request #134 from babenek/actions
+  * Merge pull request #129 from ferstar/master
+  * Merge branch 'master' into master
+  * Merge pull request #133 from magbyr/master
+  * Merge pull request #131 from babenek/master
+  * CI workflow in github actions
+  * Changed to if statements in matching method
+  * Changed return method because of coverage calculation problems
+  * Extra line at EOF
+  * Extra line at EOF
+  * Extra line at EOF
+  * Apply suggestions from code review
+  * README changes
+  * Linter changes
+  * Added document filetypes for doc, docx, odt, xls, xlsx, ods, ppt, pptx and odp. Added tests and sample documents for document filetypes
+  * Fix undocumented exception
+  * style: Simplify binary to integer method
+  * feat: add zstd skippable frames support
+  * test: fix the tox config and missing test sample files
+  * test: fix the zst test sample file
+  * fix(readme): rst syntax wtf
+
+v1.1.0 / 2022-07-12
+===================
+
+  * feat(version): bump minor
+  * Merge pull request #127 from ferstar/master
+  * Merge pull request #123 from levrik/patch-1
+  * Merge pull request #126 from babenek/master
+  * docs: add zstd type
+  * fix: remove unnecessary duck-typing try
+  * feat: add zst(d) type
+  * chore: fix lint errors
+  * test: fix memoryview test cases
+  * BugFix for uncaught exceptions
+  * Support PDF with BOM
+
+v1.0.13 / 2022-04-21
+====================
+
+  * chore(version): bump patch
+  * chore(version): bump patch
+  * refactor(apng)
+  * refactor(apng)
+  * Merge pull request #120 from CatKasha/apng
+  * fix typo
+  * add APNG support (part 3)
+  * add APNG support (part 2)
+  * add APNG support (part 1)
+
+v1.0.12 / 2022-04-19
+====================
+
+  * Merge branch 'master' of https://github.com/h2non/filetype.py
+  * feat: version bump
+  * Merge pull request #118 from smasty/woff-flavors-support
+  * fix(font): minimum length check (woff)
+  * Add support for more WOFF/WOFF2 flavors
+
 v1.0.10 / 2022-02-03
 ====================
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/PKG-INFO new/filetype-1.2.0/PKG-INFO
--- old/filetype-1.0.10/PKG-INFO 2022-02-03 20:35:45.000000000 +0100
+++ new/filetype-1.2.0/PKG-INFO 2022-11-02 18:33:56.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 1.1
 Name: filetype
-Version: 1.0.10
+Version: 1.2.0
 Summary: Infer file type and MIME type of any file/buffer. No external dependencies.
 Home-page: https://github.com/h2non/filetype.py
 Author: Tomas Aparicio
@@ -76,6 +76,7 @@
         - **jpg** - ``image/jpeg``
         - **jpx** - ``image/jpx``
         - **png** - ``image/png``
+        - **apng** - ``image/apng``
         - **gif** - ``image/gif``
         - **webp** - ``image/webp``
         - **cr2** - ``image/x-canon-cr2``
@@ -85,6 +86,7 @@
         - **psd** - ``image/vnd.adobe.photoshop``
         - **ico** - ``image/x-icon``
         - **heic** - ``image/heic``
+        - **avif** - ``image/avif``
         Video
         ^^^^^
@@ -106,7 +108,7 @@
         - **aac** - ``audio/aac``
         - **mid** - ``audio/midi``
         - **mp3** - ``audio/mpeg``
-        - **m4a** - ``audio/m4a``
+        - **m4a** - ``audio/mp4``
         - **ogg** - ``audio/ogg``
         - **flac** - ``audio/x-flac``
         - **wav** - ``audio/x-wav``
@@ -130,7 +132,6 @@
         - **pdf** - ``application/pdf``
         - **exe** - ``application/x-msdownload``
         - **swf** - ``application/x-shockwave-flash``
-         - **rtf** - ``application/rtf``
         - **eot** - ``application/octet-stream``
         - **ps** - ``application/postscript``
@@ -144,6 +145,20 @@
         - **lzo** - ``application/x-lzop``
         - **lz** - ``application/x-lzip``
         - **lz4** - ``application/x-lz4``
+        - **zstd** - ``application/zstd``
+
+        Document
+        ^^^^^^^^
+
+        - **doc** - ``application/msword``
+        - **docx** - ``application/vnd.openxmlformats-officedocument.wordprocessingml.document``
+        - **odt** - ``application/vnd.oasis.opendocument.text``
+        - **xls** - ``application/vnd.ms-excel``
+        - **xlsx** - ``application/vnd.openxmlformats-officedocument.spreadsheetml.sheet``
+        - **ods** - ``application/vnd.oasis.opendocument.spreadsheet``
+        - **ppt** - ``application/vnd.ms-powerpoint``
+        - **pptx** - ``application/vnd.openxmlformats-officedocument.presentationml.presentation``
+        - **odp** - ``application/vnd.oasis.opendocument.presentation``
         Font
         ^^^^
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/README.rst new/filetype-1.2.0/README.rst
--- old/filetype-1.0.10/README.rst 2021-12-20 14:33:10.000000000 +0100
+++ new/filetype-1.2.0/README.rst 2022-11-02 18:33:53.000000000 +0100
@@ -67,6 +67,7 @@
 - **jpg** - ``image/jpeg``
 - **jpx** - ``image/jpx``
 - **png** - ``image/png``
+- **apng** - ``image/apng``
 - **gif** - ``image/gif``
 - **webp** - ``image/webp``
 - **cr2** - ``image/x-canon-cr2``
@@ -76,6 +77,7 @@
 - **psd** - ``image/vnd.adobe.photoshop``
 - **ico** - ``image/x-icon``
 - **heic** - ``image/heic``
+- **avif** - ``image/avif``
 Video
 ^^^^^
@@ -97,7 +99,7 @@
 - **aac** - ``audio/aac``
 - **mid** - ``audio/midi``
 - **mp3** - ``audio/mpeg``
-- **m4a** - ``audio/m4a``
+- **m4a** - ``audio/mp4``
 - **ogg** - ``audio/ogg``
 - **flac** - ``audio/x-flac``
 - **wav** - ``audio/x-wav``
@@ -121,7 +123,6 @@
 - **pdf** - ``application/pdf``
 - **exe** - ``application/x-msdownload``
 - **swf** - ``application/x-shockwave-flash``
- - **rtf** - ``application/rtf``
 - **eot** - ``application/octet-stream``
 - **ps** - ``application/postscript``
@@ -135,6 +136,20 @@
 - **lzo** - ``application/x-lzop``
 - **lz** - ``application/x-lzip``
 - **lz4** - ``application/x-lz4``
+- **zstd** - ``application/zstd``
+
+Document
+^^^^^^^^
+
+- **doc** - ``application/msword``
+- **docx** - ``application/vnd.openxmlformats-officedocument.wordprocessingml.document``
+- **odt** - ``application/vnd.oasis.opendocument.text``
+- **xls** - ``application/vnd.ms-excel``
+- **xlsx** - ``application/vnd.openxmlformats-officedocument.spreadsheetml.sheet``
+- **ods** - ``application/vnd.oasis.opendocument.spreadsheet``
+- **ppt** - ``application/vnd.ms-powerpoint``
+- **pptx** - ``application/vnd.openxmlformats-officedocument.presentationml.presentation``
+- **odp** - ``application/vnd.oasis.opendocument.presentation``
 Font
 ^^^^
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/__init__.py new/filetype-1.2.0/filetype/__init__.py
--- old/filetype-1.0.10/filetype/__init__.py 2022-02-03 20:34:06.000000000 +0100
+++ new/filetype-1.2.0/filetype/__init__.py 2022-10-14 17:57:19.000000000 +0200
@@ -7,4 +7,4 @@
 from .match import * # noqa
 # Current package semver version
-__version__ = version = '1.0.10'
+__version__ = version = '1.2.0'
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/__main__.py new/filetype-1.2.0/filetype/__main__.py
--- old/filetype-1.0.10/filetype/__main__.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/__main__.py 2022-10-14 17:57:19.000000000 +0200
@@ -1,3 +1,5 @@
+import sys
+
 import filetype
@@ -12,17 +14,23 @@
 def main():
     import argparse
-    parser = argparse.ArgumentParser(description='Determine type of FILEs.')
-    parser.add_argument("file", nargs='+')
-    parser.add_argument('-v', '--version', action='store_true',
-                        help='output version information and exit')
+    parser = argparse.ArgumentParser(
+        prog='filetype', description='Determine type of FILEs.'
+    )
+    parser.add_argument('-f', '--file', nargs='+')
+    parser.add_argument(
+        '-v', '--version', action='version',
+        version='%(prog)s ' + filetype.version,
+        help='output version information and exit'
+    )
+
     args = parser.parse_args()
+    if len(sys.argv) < 2:
+        parser.print_help()
+        sys.exit(1)
-    if args.version:
-        print(filetype.version)
-    else:
-        for i in args.file:
-            guess(i)
+    for i in args.file:
+        guess(i)
 if __name__ == '__main__':
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/helpers.py new/filetype-1.2.0/filetype/helpers.py
--- old/filetype-1.0.10/filetype/helpers.py 2022-02-03 20:33:43.000000000 +0100
+++ new/filetype-1.2.0/filetype/helpers.py 2022-10-14 17:57:19.000000000 +0200
@@ -3,7 +3,7 @@
 from __future__ import absolute_import
 from .types import TYPES
 from .match import (
-    image_match, font_match,
+    image_match, font_match, document_match,
     video_match, audio_match, archive_match
 )
@@ -122,3 +122,19 @@
         TypeError: if obj is not a supported type.
     """
     return font_match(obj) is not None
+
+
+def is_document(obj):
+    """
+    Checks if a given input is a supported type document.
+
+    Args:
+        obj: path to file, bytes or bytearray.
+
+    Returns:
+        True if obj is a valid document. Otherwise False.
+
+    Raises:
+        TypeError: if obj is not a supported type.
+    """
+    return document_match(obj) is not None
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/match.py new/filetype-1.2.0/filetype/match.py
--- old/filetype-1.0.10/filetype/match.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/match.py 2022-10-14 17:57:19.000000000 +0200
@@ -5,6 +5,7 @@
 from .types import ARCHIVE as archive_matchers
 from .types import AUDIO as audio_matchers
 from .types import APPLICATION as application_matchers
+from .types import DOCUMENT as document_matchers
 from .types import FONT as font_matchers
 from .types import IMAGE as image_matchers
 from .types import VIDEO as video_matchers
@@ -135,3 +136,20 @@
         TypeError: if obj is not a supported type.
     """
     return match(obj, application_matchers)
+
+
+def document_match(obj):
+    """
+    Matches the given input against the available
+    document type matchers.
+
+    Args:
+        obj: path to file, bytes or bytearray.
+
+    Returns:
+        Type instance if matches. Otherwise None.
+
+    Raises:
+        TypeError: if obj is not a supported type.
+    """
+    return match(obj, document_matchers)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/__init__.py new/filetype-1.2.0/filetype/types/__init__.py
--- old/filetype-1.0.10/filetype/types/__init__.py 2021-12-20 14:33:10.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/__init__.py 2022-11-02 18:30:43.000000000 +0100
@@ -5,6 +5,7 @@
 from . import archive
 from . import audio
 from . import application
+from . import document
 from . import font
 from . import image
 from . import video
@@ -16,6 +17,7 @@
     image.Xcf(),
     image.Jpeg(),
     image.Jpx(),
+    image.Apng(),
     image.Png(),
     image.Gif(),
     image.Webp(),
@@ -27,6 +29,7 @@
     image.Ico(),
     image.Heic(),
     image.Dcm(),
+    image.Avif(),
 )
 # Supported video types
@@ -89,6 +92,7 @@
     archive.Lz(),
     archive.Elf(),
     archive.Lz4(),
+    archive.Zstd(),
 )
 # Supported archive container types
@@ -96,6 +100,19 @@
     application.Wasm(),
 )
+# Supported document types
+DOCUMENT = (
+    document.Doc(),
+    document.Docx(),
+    document.Odt(),
+    document.Xls(),
+    document.Xlsx(),
+    document.Ods(),
+    document.Ppt(),
+    document.Pptx(),
+    document.Odp(),
+)
+
 # Expose supported type matchers
-TYPES = list(VIDEO + IMAGE + AUDIO + FONT + ARCHIVE + APPLICATION)
+TYPES = list(IMAGE + AUDIO + VIDEO + FONT + DOCUMENT + ARCHIVE + APPLICATION)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/archive.py new/filetype-1.2.0/filetype/types/archive.py
--- old/filetype-1.0.10/filetype/types/archive.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/types/archive.py 2022-10-14 17:57:19.000000000 +0200
@@ -2,6 +2,8 @@
 from __future__ import absolute_import
+import struct
+
 from .base import Type
@@ -184,6 +186,13 @@
         )
     def match(self, buf):
+        # Detect BOM and skip first 3 bytes
+        if (len(buf) > 3 and
+            buf[0] == 0xEF and
+            buf[1] == 0xBB and
+            buf[2] == 0xBF): # noqa E129
+            buf = buf[3:]
+
         return (len(buf) > 3 and
                 buf[0] == 0x25 and
                 buf[1] == 0x50 and
@@ -628,3 +637,51 @@
     def match(self, buf):
         return buf[:4] == bytearray([0xed, 0xab, 0xee, 0xdb])
+
+
+class Zstd(Type):
+    """
+    Implements the Zstd archive type matcher.
+    https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
+    """
+    MIME = 'application/zstd'
+    EXTENSION = 'zst'
+    MAGIC_SKIPPABLE_START = 0x184D2A50
+    MAGIC_SKIPPABLE_MASK = 0xFFFFFFF0
+
+    def __init__(self):
+        super(Zstd, self).__init__(
+            mime=Zstd.MIME,
+            extension=Zstd.EXTENSION
+        )
+
+    @staticmethod
+    def _to_little_endian_int(buf):
+        # return int.from_bytes(buf, byteorder='little')
+        return struct.unpack('<L', buf)[0]
+
+    def match(self, buf):
+        # Zstandard compressed data is made of one or more frames.
+        # There are two frame formats defined by Zstandard:
+        # Zstandard frames and Skippable frames.
+        # See more details from
+        # https://tools.ietf.org/id/draft-kucherawy-dispatch-zstd-00.html#rfc.section.2
+        is_zstd = (
+            len(buf) > 3 and
+            buf[0] in (0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28) and
+            buf[1] == 0xb5 and
+            buf[2] == 0x2f and
+            buf[3] == 0xfd)
+        if is_zstd:
+            return True
+        # skippable frames
+        if len(buf) < 8:
+            return False
+        magic = self._to_little_endian_int(buf[:4]) & Zstd.MAGIC_SKIPPABLE_MASK
+        if magic == Zstd.MAGIC_SKIPPABLE_START:
+            user_data_len = self._to_little_endian_int(buf[4:8])
+            if len(buf) < 8 + user_data_len:
+                return False
+            next_frame = buf[8 + user_data_len:]
+            return self.match(next_frame)
+        return False
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/audio.py new/filetype-1.2.0/filetype/types/audio.py
--- old/filetype-1.0.10/filetype/types/audio.py 2021-12-20 14:33:10.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/audio.py 2022-04-19 22:21:02.000000000 +0200
@@ -56,7 +56,7 @@
     """
     Implements the M4A audio type matcher.
     """
-    MIME = 'audio/m4a'
+    MIME = 'audio/mp4'
     EXTENSION = 'm4a'
     def __init__(self):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/base.py new/filetype-1.2.0/filetype/types/base.py
--- old/filetype-1.0.10/filetype/types/base.py 2020-01-18 17:45:56.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/base.py 2022-04-19 22:21:02.000000000 +0200
@@ -19,11 +19,9 @@
     def extension(self):
         return self.__extension
-    @property
     def is_extension(self, extension):
         return self.__extension is extension
-    @property
     def is_mime(self, mime):
         return self.__mime is mime
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/document.py new/filetype-1.2.0/filetype/types/document.py
--- old/filetype-1.0.10/filetype/types/document.py 1970-01-01 01:00:00.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/document.py 2022-10-14 17:57:19.000000000 +0200
@@ -0,0 +1,256 @@
+# -*- coding: utf-8 -*-
+
+from __future__ import absolute_import
+
+from .base import Type
+
+
+class ZippedDocumentBase(Type):
+    def match(self, buf):
+        # start by checking for ZIP local file header signature
+        idx = self.search_signature(buf, 0, 6000)
+        if idx != 0:
+            return
+
+        return self.match_document(buf)
+
+    def match_document(self, buf):
+        raise NotImplementedError
+
+    def compare_bytes(self, buf, subslice, start_offset):
+        sl = len(subslice)
+
+        if start_offset + sl > len(buf):
+            return False
+
+        return buf[start_offset:start_offset + sl] == subslice
+
+    def search_signature(self, buf, start, rangeNum):
+        signature = b"PK\x03\x04"
+        length = len(buf)
+
+        end = start + rangeNum
+        end = length if end > length else end
+
+        if start >= end:
+            return -1
+
+        try:
+            return buf.index(signature, start, end)
+        except ValueError:
+            return -1
+
+
+class OpenDocument(ZippedDocumentBase):
+    def match_document(self, buf):
+        # Check if first file in archive is the identifying file
+        if not self.compare_bytes(buf, b"mimetype", 0x1E):
+            return
+
+        # Check content of mimetype file if it matches current mime
+        return self.compare_bytes(buf, bytes(self.mime, "ASCII"), 0x26)
+
+
+class OfficeOpenXml(ZippedDocumentBase):
+    def match_document(self, buf):
+        # Check if first file in archive is the identifying file
+        ft = self.match_filename(buf, 0x1E)
+        if ft:
+            return ft
+
+        # Otherwise check that the fist file is one of these
+        if (
+            not self.compare_bytes(buf, b"[Content_Types].xml", 0x1E)
+            and not self.compare_bytes(buf, b"_rels/.rels", 0x1E)
+            and not self.compare_bytes(buf, b"docProps", 0x1E)
+        ):
+            return
+
+        # Loop through next 3 files and check if they match
+        # NOTE: OpenOffice/Libreoffice orders ZIP entry differently, so check the 4th file
+        # https://github.com/h2non/filetype/blob/d730d98ad5c990883148485b6fd5adbdd378364a/matchers/document.go#L134
+        idx = 0
+        for i in range(4):
+            # Search for next file header
+            idx = self.search_signature(buf, idx + 4, 6000)
+            if idx == -1:
+                return
+
+            # Filename is at file header + 30
+            ft = self.match_filename(buf, idx + 30)
+            if ft:
+                return ft
+
+    def match_filename(self, buf, offset):
+        if self.compare_bytes(buf, b"word/", offset):
+            return (
+                self.mime
+                == "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+            )
+        if self.compare_bytes(buf, b"ppt/", offset):
+            return (
+                self.mime
+                == "application/vnd.openxmlformats-officedocument.presentationml.presentation"
+            )
+        if self.compare_bytes(buf, b"xl/", offset):
+            return (
+                self.mime
+                == "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
+            )
+
+
+class Doc(Type):
+    """
+    Implements the Microsoft Word (Office 97-2003) document type matcher.
+    """
+
+    MIME = "application/msword"
+    EXTENSION = "doc"
+
+    def __init__(self):
+        super(Doc, self).__init__(mime=Doc.MIME, extension=Doc.EXTENSION)
+
+    def match(self, buf):
+        if len(buf) > 515 and buf[0:8] == b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1":
+            if buf[512:516] == b"\xEC\xA5\xC1\x00":
+                return True
+            if (
+                len(buf) > 2142
+                and b"\x00\x0A\x00\x00\x00MSWordDoc\x00\x10\x00\x00\x00Word.Document.8\x00\xF49\xB2q"
+                in buf[2075:2142]
+            ):
+                return True
+
+        return False
+
+
+class Docx(OfficeOpenXml):
+    """
+    Implements the Microsoft Word OOXML (Office 2007+) document type matcher.
+    """
+
+    MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+    EXTENSION = "docx"
+
+    def __init__(self):
+        super(Docx, self).__init__(mime=Docx.MIME, extension=Docx.EXTENSION)
+
+
+class Odt(OpenDocument):
+    """
+    Implements the OpenDocument Text document type matcher.
+    """
+
+    MIME = "application/vnd.oasis.opendocument.text"
+    EXTENSION = "odt"
+
+    def __init__(self):
+        super(Odt, self).__init__(mime=Odt.MIME, extension=Odt.EXTENSION)
+
+
+class Xls(Type):
+    """
+    Implements the Microsoft Excel (Office 97-2003) document type matcher.
+    """
+
+    MIME = "application/vnd.ms-excel"
+    EXTENSION = "xls"
+
+    def __init__(self):
+        super(Xls, self).__init__(mime=Xls.MIME, extension=Xls.EXTENSION)
+
+    def match(self, buf):
+        if len(buf) > 520 and buf[0:8] == b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1":
+            if buf[512:516] == b"\xFD\xFF\xFF\xFF" and (
+                buf[518] == 0x00 or buf[518] == 0x02
+            ):
+                return True
+            if buf[512:520] == b"\x09\x08\x10\x00\x00\x06\x05\x00":
+                return True
+            if (
+                len(buf) > 2095
+                and b"\xE2\x00\x00\x00\x5C\x00\x70\x00\x04\x00\x00Calc"
+                in buf[1568:2095]
+            ):
+                return True
+
+        return False
+
+
+class Xlsx(OfficeOpenXml):
+    """
+    Implements the Microsoft Excel OOXML (Office 2007+) document type matcher.
+    """
+
+    MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
+    EXTENSION = "xlsx"
+
+    def __init__(self):
+        super(Xlsx, self).__init__(mime=Xlsx.MIME, extension=Xlsx.EXTENSION)
+
+
+class Ods(OpenDocument):
+    """
+    Implements the OpenDocument Spreadsheet document type matcher.
+    """
+
+    MIME = "application/vnd.oasis.opendocument.spreadsheet"
+    EXTENSION = "ods"
+
+    def __init__(self):
+        super(Ods, self).__init__(mime=Ods.MIME, extension=Ods.EXTENSION)
+
+
+class Ppt(Type):
+    """
+    Implements the Microsoft PowerPoint (Office 97-2003) document type matcher.
+    """
+
+    MIME = "application/vnd.ms-powerpoint"
+    EXTENSION = "ppt"
+
+    def __init__(self):
+        super(Ppt, self).__init__(mime=Ppt.MIME, extension=Ppt.EXTENSION)
+
+    def match(self, buf):
+        if len(buf) > 524 and buf[0:8] == b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1":
+            if buf[512:516] == b"\xA0\x46\x1D\xF0":
+                return True
+            if buf[512:516] == b"\x00\x6E\x1E\xF0":
+                return True
+            if buf[512:516] == b"\x0F\x00\xE8\x03":
+                return True
+            if buf[512:516] == b"\xFD\xFF\xFF\xFF" and buf[522:524] == b"\x00\x00":
+                return True
+            if (
+                len(buf) > 2096
+                and buf[2072:2096]
+                == b"\x00\xB9\x29\xE8\x11\x00\x00\x00MS PowerPoint 97"
+            ):
+                return True
+
+        return False
+
+
+class Pptx(OfficeOpenXml):
+    """
+    Implements the Microsoft PowerPoint OOXML (Office 2007+) document type matcher.
+    """
+
+    MIME = "application/vnd.openxmlformats-officedocument.presentationml.presentation"
+    EXTENSION = "pptx"
+
+    def __init__(self):
+        super(Pptx, self).__init__(mime=Pptx.MIME, extension=Pptx.EXTENSION)
+
+
+class Odp(OpenDocument):
+    """
+    Implements the OpenDocument Presentation document type matcher.
+    """
+
+    MIME = "application/vnd.oasis.opendocument.presentation"
+    EXTENSION = "odp"
+
+    def __init__(self):
+        super(Odp, self).__init__(mime=Odp.MIME, extension=Odp.EXTENSION)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/font.py new/filetype-1.2.0/filetype/types/font.py
--- old/filetype-1.0.10/filetype/types/font.py 2020-01-18 17:45:56.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/font.py 2022-04-19 22:25:27.000000000 +0200
@@ -24,10 +24,18 @@
                 buf[1] == 0x4F and
                 buf[2] == 0x46 and
                 buf[3] == 0x46 and
-                buf[4] == 0x00 and
-                buf[5] == 0x01 and
-                buf[6] == 0x00 and
-                buf[7] == 0x00)
+                ((buf[4] == 0x00 and
+                  buf[5] == 0x01 and
+                  buf[6] == 0x00 and
+                  buf[7] == 0x00) or
+                 (buf[4] == 0x4F and
+                  buf[5] == 0x54 and
+                  buf[6] == 0x54 and
+                  buf[7] == 0x4F) or
+                 (buf[4] == 0x74 and
+                  buf[5] == 0x72 and
+                  buf[6] == 0x75 and
+                  buf[7] == 0x65)))
 class Woff2(Type):
@@ -49,10 +57,18 @@
                 buf[1] == 0x4F and
                 buf[2] == 0x46 and
                 buf[3] == 0x32 and
-                buf[4] == 0x00 and
-                buf[5] == 0x01 and
-                buf[6] == 0x00 and
-                buf[7] == 0x00)
+                ((buf[4] == 0x00 and
+                  buf[5] == 0x01 and
+                  buf[6] == 0x00 and
+                  buf[7] == 0x00) or
+                 (buf[4] == 0x4F and
+                  buf[5] == 0x54 and
+                  buf[6] == 0x54 and
+                  buf[7] == 0x4F) or
+                 (buf[4] == 0x74 and
+                  buf[5] == 0x72 and
+                  buf[6] == 0x75 and
+                  buf[7] == 0x65)))
 class Ttf(Type):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/image.py new/filetype-1.2.0/filetype/types/image.py
--- old/filetype-1.0.10/filetype/types/image.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/types/image.py 2022-11-02 18:30:43.000000000 +0100
@@ -48,6 +48,45 @@
         )
+class Apng(Type):
+    """
+    Implements the APNG image type matcher.
+    """
+    MIME = 'image/apng'
+    EXTENSION = 'apng'
+
+    def __init__(self):
+        super(Apng, self).__init__(
+            mime=Apng.MIME,
+            extension=Apng.EXTENSION
+        )
+
+    def match(self, buf):
+        if (len(buf) > 8 and
+           buf[:8] == bytearray([0x89, 0x50, 0x4e, 0x47,
+                                 0x0d, 0x0a, 0x1a, 0x0a])):
+            # cursor in buf, skip already readed 8 bytes
+            i = 8
+            while len(buf) > i:
+                data_length = int.from_bytes(buf[i:i+4], byteorder="big")
+                i += 4
+
+                chunk_type = buf[i:i+4].decode("ascii", errors='ignore')
+                i += 4
+
+                # acTL chunk in APNG should appears first than IDAT
+                # IEND is end of PNG
+                if (chunk_type == "IDAT" or chunk_type == "IEND"):
+                    return False
+                elif (chunk_type == "acTL"):
+                    return True
+
+                # move to the next chunk by skipping data and crc (4 bytes)
+                i += data_length + 4
+
+        return False
+
+
 class Png(Type):
     """
     Implements the PNG image type matcher.
@@ -152,12 +191,12 @@
         )
     def match(self, buf):
-        return (len(buf) > 3 and
+        return (len(buf) > 9 and
                 ((buf[0] == 0x49 and buf[1] == 0x49 and
                   buf[2] == 0x2A and buf[3] == 0x0) or
                  (buf[0] == 0x4D and buf[1] == 0x4D and
                   buf[2] == 0x0 and buf[3] == 0x2A))
-                and not(buf[8] == 0x43 and buf[9] == 0x52))
+                and not (buf[8] == 0x43 and buf[9] == 0x52))
 class Bmp(Type):
@@ -317,3 +356,28 @@
     def match(self, buf):
         return buf[:10] == bytearray([0x67, 0x69, 0x6d, 0x70, 0x20,
                                       0x78, 0x63, 0x66, 0x20, 0x76])
+
+
+class Avif(IsoBmff):
+    """
+    Implements the AVIF image type matcher.
+    """
+    MIME = 'image/avif'
+    EXTENSION = 'avif'
+
+    def __init__(self):
+        super(Avif, self).__init__(
+            mime=Avif.MIME,
+            extension=Avif.EXTENSION
+        )
+
+    def match(self, buf):
+        if not self._is_isobmff(buf):
+            return False
+
+        major_brand, minor_version, compatible_brands = self._get_ftyp(buf)
+        if major_brand == 'avif':
+            return True
+        if major_brand in ['mif1', 'msf1'] and 'avif' in compatible_brands:
+            return True
+        return False
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/isobmff.py new/filetype-1.2.0/filetype/types/isobmff.py
--- old/filetype-1.0.10/filetype/types/isobmff.py 2020-01-18 17:45:56.000000000 +0100
+++ new/filetype-1.2.0/filetype/types/isobmff.py 2022-10-14 17:57:19.000000000 +0200
@@ -24,10 +24,10 @@
     def _get_ftyp(self, buf):
         ftyp_len = int(codecs.encode(buf[0:4], 'hex'), 16)
-        major_brand = buf[8:12].decode()
+        major_brand = buf[8:12].decode(errors='ignore')
         minor_version = int(codecs.encode(buf[12:16], 'hex'), 16)
         compatible_brands = []
         for i in range(16, ftyp_len, 4):
-            compatible_brands.append(buf[i:i+4].decode())
+            compatible_brands.append(buf[i:i+4].decode(errors='ignore'))
         return major_brand, minor_version, compatible_brands
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/types/video.py new/filetype-1.2.0/filetype/types/video.py
--- old/filetype-1.0.10/filetype/types/video.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/types/video.py 2022-07-12 16:48:24.000000000 +0200
@@ -67,20 +67,9 @@
         )
     def match(self, buf):
-        return ((len(buf) > 15 and
-                 buf[0] == 0x1A and buf[1] == 0x45 and
-                 buf[2] == 0xDF and buf[3] == 0xA3 and
-                 buf[4] == 0x93 and buf[5] == 0x42 and
-                 buf[6] == 0x82 and buf[7] == 0x88 and
-                 buf[8] == 0x6D and buf[9] == 0x61 and
-                 buf[10] == 0x74 and buf[11] == 0x72 and
-                 buf[12] == 0x6F and buf[13] == 0x73 and
-                 buf[14] == 0x6B and buf[15] == 0x61) or
-                (len(buf) > 38 and
-                 buf[31] == 0x6D and buf[32] == 0x61 and
-                 buf[33] == 0x74 and buf[34] == 0x72 and
-                 buf[35] == 0x6f and buf[36] == 0x73 and
-                 buf[37] == 0x6B and buf[38] == 0x61))
+        contains_ebml_element = buf.startswith(b'\x1A\x45\xDF\xA3')
+        contains_doctype_element = buf.find(b'\x42\x82\x88matroska') > -1
+        return contains_ebml_element and contains_doctype_element
 class Webm(Type):
@@ -97,11 +86,9 @@
         )
     def match(self, buf):
-        return (len(buf) > 3 and
-                buf[0] == 0x1A and
-                buf[1] == 0x45 and
-                buf[2] == 0xDF and
-                buf[3] == 0xA3)
+        contains_ebml_element = buf.startswith(b'\x1A\x45\xDF\xA3')
+        contains_doctype_element = buf.find(b'\x42\x82\x84webm') > -1
+        return contains_ebml_element and contains_doctype_element
 class Mov(IsoBmff):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype/utils.py new/filetype-1.2.0/filetype/utils.py
--- old/filetype-1.0.10/filetype/utils.py 2021-09-22 18:50:21.000000000 +0200
+++ new/filetype-1.2.0/filetype/utils.py 2022-10-14 17:57:19.000000000 +0200
@@ -7,19 +7,19 @@
     pass
-_NUM_SIGNATURE_BYTES = 262
+_NUM_SIGNATURE_BYTES = 8192
 def get_signature_bytes(path):
     """
-    Reads file from disk and returns the first 262 bytes
+    Reads file from disk and returns the first 8192 bytes
     of data representing the magic number header signature.
     Args:
         path: path string to file.
     Returns:
-        First 262 bytes of the file content as bytearray type.
+        First 8192 bytes of the file content as bytearray type.
     """
     with open(path, 'rb') as fp:
         return bytearray(fp.read(_NUM_SIGNATURE_BYTES))
@@ -27,14 +27,14 @@
 def signature(array):
     """
-    Returns the first 262 bytes of the given bytearray
+    Returns the first 8192 bytes of the given bytearray
     as part of the file header signature.
     Args:
         array: bytearray to extract the header signature.
     Returns:
-        First 262 bytes of the file content as bytearray type.
+        First 8192 bytes of the file content as bytearray type.
     """
     length = len(array)
     index = _NUM_SIGNATURE_BYTES if length > _NUM_SIGNATURE_BYTES else length
@@ -44,39 +44,41 @@
 def get_bytes(obj):
     """
-    Infers the input type and reads the first 262 bytes,
+    Infers the input type and reads the first 8192 bytes,
     returning a sliced bytearray.
     Args:
-        obj: path to readable, file, bytes or bytearray.
+        obj: path to readable, file-like object(with read() method), bytes,
+        bytearray or memoryview
     Returns:
-        First 262 bytes of the file content as bytearray type.
+        First 8192 bytes of the file content as bytearray type.
     Raises:
         TypeError: if obj is not a supported type.
     """
-    try:
-        obj = obj.read(_NUM_SIGNATURE_BYTES)
-    except AttributeError:
-        # duck-typing as readable failed - we'll try the other options
-        pass
-
-    kind = type(obj)
-
-    if kind is bytearray:
+    if isinstance(obj, bytearray):
         return signature(obj)
-    if kind is str:
+    if isinstance(obj, str):
         return get_signature_bytes(obj)
-    if kind is bytes:
+    if isinstance(obj, bytes):
         return signature(obj)
-    if kind is memoryview:
-        return signature(obj).tolist()
+    if isinstance(obj, memoryview):
+        return bytearray(signature(obj).tolist())
     if isinstance(obj, pathlib.PurePath):
         return get_signature_bytes(obj)
-    raise TypeError('Unsupported type as file input: %s' % kind)
+    if hasattr(obj, 'read'):
+        if hasattr(obj, 'tell') and hasattr(obj, 'seek'):
+            start_pos = obj.tell()
+            obj.seek(0)
+            magic_bytes = obj.read(_NUM_SIGNATURE_BYTES)
+            obj.seek(start_pos)
+            return get_bytes(magic_bytes)
+        return get_bytes(obj.read(_NUM_SIGNATURE_BYTES))
+
+    raise TypeError('Unsupported type as file input: %s' % type(obj))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/filetype.egg-info/PKG-INFO new/filetype-1.2.0/filetype.egg-info/PKG-INFO
--- old/filetype-1.0.10/filetype.egg-info/PKG-INFO 2022-02-03 20:35:45.000000000 +0100
+++ new/filetype-1.2.0/filetype.egg-info/PKG-INFO 2022-11-02 18:33:56.000000000 +0100
@@ -1,6 +1,6 @@
Metadata-Version: 1.1 Name: filetype -Version: 1.0.10 +Version: 1.2.0 Summary: Infer file type and MIME type of any file/buffer. No external dependencies. Home-page: https://github.com/h2non/filetype.py Author: Tomas Aparicio @@ -76,6 +76,7 @@ - **jpg** - ``image/jpeg`` - **jpx** - ``image/jpx`` - **png** - ``image/png`` + - **apng** - ``image/apng`` - **gif** - ``image/gif`` - **webp** - ``image/webp`` - **cr2** - ``image/x-canon-cr2`` @@ -85,6 +86,7 @@ - **psd** - ``image/vnd.adobe.photoshop`` - **ico** - ``image/x-icon`` - **heic** - ``image/heic`` + - **avif** - ``image/avif`` Video ^^^^^ @@ -106,7 +108,7 @@ - **aac** - ``audio/aac`` - **mid** - ``audio/midi`` - **mp3** - ``audio/mpeg`` - - **m4a** - ``audio/m4a`` + - **m4a** - ``audio/mp4`` - **ogg** - ``audio/ogg`` - **flac** - ``audio/x-flac`` - **wav** - ``audio/x-wav`` @@ -130,7 +132,6 @@ - **pdf** - ``application/pdf`` - **exe** - ``application/x-msdownload`` - **swf** - ``application/x-shockwave-flash`` - - **rtf** - ``application/rtf`` - **eot** - ``application/octet-stream`` - **ps** - ``application/postscript`` @@ -144,6 +145,20 @@ - **lzo** - ``application/x-lzop`` - **lz** - ``application/x-lzip`` - **lz4** - ``application/x-lz4`` + - **zstd** - ``application/zstd`` + + Document + ^^^^^^^^ + + - **doc** - ``application/msword`` + - **docx** - ``application/vnd.openxmlformats-officedocument.wordprocessingml.document`` + - **odt** - ``application/vnd.oasis.opendocument.text`` + - **xls** - ``application/vnd.ms-excel`` + - **xlsx** - ``application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`` + - **ods** - ``application/vnd.oasis.opendocument.spreadsheet`` + - **ppt** - ``application/vnd.ms-powerpoint`` + - **pptx** - ``application/vnd.openxmlformats-officedocument.presentationml.presentation`` + - **odp** - ``application/vnd.oasis.opendocument.presentation`` Font ^^^^ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' 
old/filetype-1.0.10/filetype.egg-info/SOURCES.txt new/filetype-1.2.0/filetype.egg-info/SOURCES.txt --- old/filetype-1.0.10/filetype.egg-info/SOURCES.txt 2022-02-03 20:35:45.000000000 +0100 +++ new/filetype-1.2.0/filetype.egg-info/SOURCES.txt 2022-11-02 18:33:56.000000000 +0100 @@ -25,6 +25,7 @@ filetype/types/archive.py filetype/types/audio.py filetype/types/base.py +filetype/types/document.py filetype/types/font.py filetype/types/image.py filetype/types/isobmff.py @@ -37,11 +38,28 @@ tests/test_types.py tests/test_utils.py tests/fixtures/LICENSE +tests/fixtures/sample.avif +tests/fixtures/sample.doc +tests/fixtures/sample.docx tests/fixtures/sample.gif tests/fixtures/sample.heic tests/fixtures/sample.jpg tests/fixtures/sample.jpx +tests/fixtures/sample.m4a tests/fixtures/sample.mov tests/fixtures/sample.mp4 +tests/fixtures/sample.odp +tests/fixtures/sample.ods +tests/fixtures/sample.odt tests/fixtures/sample.png -tests/fixtures/sample.tif \ No newline at end of file +tests/fixtures/sample.ppt +tests/fixtures/sample.pptx +tests/fixtures/sample.tar +tests/fixtures/sample.tif +tests/fixtures/sample.xls +tests/fixtures/sample.xlsx +tests/fixtures/sample.zip +tests/fixtures/sample.zst +tests/fixtures/sample_1.doc +tests/fixtures/sample_1.docx +tests/fixtures/sample_skippable.zst \ No newline at end of file diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/setup.py new/filetype-1.2.0/setup.py --- old/filetype-1.0.10/setup.py 2022-02-03 20:33:54.000000000 +0100 +++ new/filetype-1.2.0/setup.py 2022-11-02 18:30:43.000000000 +0100 @@ -6,7 +6,7 @@ setup( name='filetype', - version='1.0.10', + version='1.2.0', description='Infer file type and MIME type of any file/buffer. 
' 'No external dependencies.', long_description=codecs.open('README.rst', 'r', Binary files old/filetype-1.0.10/tests/fixtures/sample.avif and new/filetype-1.2.0/tests/fixtures/sample.avif differ Binary files old/filetype-1.0.10/tests/fixtures/sample.doc and new/filetype-1.2.0/tests/fixtures/sample.doc differ Binary files old/filetype-1.0.10/tests/fixtures/sample.docx and new/filetype-1.2.0/tests/fixtures/sample.docx differ Binary files old/filetype-1.0.10/tests/fixtures/sample.m4a and new/filetype-1.2.0/tests/fixtures/sample.m4a differ Binary files old/filetype-1.0.10/tests/fixtures/sample.odp and new/filetype-1.2.0/tests/fixtures/sample.odp differ Binary files old/filetype-1.0.10/tests/fixtures/sample.ods and new/filetype-1.2.0/tests/fixtures/sample.ods differ Binary files old/filetype-1.0.10/tests/fixtures/sample.odt and new/filetype-1.2.0/tests/fixtures/sample.odt differ Binary files old/filetype-1.0.10/tests/fixtures/sample.ppt and new/filetype-1.2.0/tests/fixtures/sample.ppt differ Binary files old/filetype-1.0.10/tests/fixtures/sample.pptx and new/filetype-1.2.0/tests/fixtures/sample.pptx differ Binary files old/filetype-1.0.10/tests/fixtures/sample.tar and new/filetype-1.2.0/tests/fixtures/sample.tar differ Binary files old/filetype-1.0.10/tests/fixtures/sample.xls and new/filetype-1.2.0/tests/fixtures/sample.xls differ Binary files old/filetype-1.0.10/tests/fixtures/sample.xlsx and new/filetype-1.2.0/tests/fixtures/sample.xlsx differ Binary files old/filetype-1.0.10/tests/fixtures/sample.zip and new/filetype-1.2.0/tests/fixtures/sample.zip differ Binary files old/filetype-1.0.10/tests/fixtures/sample.zst and new/filetype-1.2.0/tests/fixtures/sample.zst differ Binary files old/filetype-1.0.10/tests/fixtures/sample_1.doc and new/filetype-1.2.0/tests/fixtures/sample_1.doc differ Binary files old/filetype-1.0.10/tests/fixtures/sample_1.docx and new/filetype-1.2.0/tests/fixtures/sample_1.docx differ Binary files 
old/filetype-1.0.10/tests/fixtures/sample_skippable.zst and new/filetype-1.2.0/tests/fixtures/sample_skippable.zst differ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/tests/test_filetype.py new/filetype-1.2.0/tests/test_filetype.py --- old/filetype-1.0.10/tests/test_filetype.py 2020-01-18 17:45:56.000000000 +0100 +++ new/filetype-1.2.0/tests/test_filetype.py 2022-07-12 16:48:24.000000000 +0200 @@ -84,3 +84,13 @@ mime = filetype.guess_mime(buf) self.assertTrue(mime is not None) self.assertEqual(mime, 'image/jpeg') + + def test_guess_video_invalid(self): + buf = bytearray([0x0, 0x0, 0x0, 0x0, 0x66, 0x74, 0x79, 0x70, 0xf2, 0xf2, 0xf2, 0xf2, 0xf6, 0xf2, 0xf2, 0x90]) + mime = filetype.guess_mime(buf) + self.assertTrue(mime is None) + + def test_guess_image_invalid(self): + buf = bytearray([0x49, 0x49, 0x2a, 0x0]) + mime = filetype.guess_mime(buf) + self.assertTrue(mime is None) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/filetype-1.0.10/tests/test_types.py new/filetype-1.2.0/tests/test_types.py --- old/filetype-1.0.10/tests/test_types.py 2020-01-18 17:45:56.000000000 +0100 +++ new/filetype-1.2.0/tests/test_types.py 2022-11-02 18:30:43.000000000 +0100 @@ -13,10 +13,18 @@ class TestFileType(unittest.TestCase): def test_guess_jpeg(self): - kind = filetype.guess(FIXTURES + '/sample.jpg') - self.assertTrue(kind is not None) - self.assertEqual(kind.mime, 'image/jpeg') - self.assertEqual(kind.extension, 'jpg') + img_path = FIXTURES + '/sample.jpg' + with open(img_path, 'rb') as fp: + for obj in (img_path, fp): + kind = filetype.guess(obj) + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'image/jpeg') + self.assertEqual(kind.extension, 'jpg') + # reset reader position test + kind = filetype.guess(fp) + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'image/jpeg') + self.assertEqual(kind.extension, 'jpg') def test_guess_jpx(self): kind 
= filetype.guess(FIXTURES + '/sample.jpx') @@ -36,6 +44,19 @@ self.assertEqual(kind.mime, 'image/heic') self.assertEqual(kind.extension, 'heic') + def test_guess_avif(self): + kind = filetype.guess(FIXTURES + '/sample.avif') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'image/avif') + self.assertEqual(kind.extension, 'avif') + + def test_guess_m4a(self): + kind = filetype.guess(FIXTURES + '/sample.m4a') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'audio/mp4') + self.assertEqual(kind.extension, 'm4a') + + def test_guess_mp4(self): kind = filetype.guess(FIXTURES + '/sample.mp4') self.assertTrue(kind is not None) @@ -59,3 +80,66 @@ self.assertTrue(kind is not None) self.assertEqual(kind.mime, 'video/quicktime') self.assertEqual(kind.extension, 'mov') + + def test_guess_zstd(self): + for name in 'sample.zst', 'sample_skippable.zst': + kind = filetype.guess(FIXTURES + '/' + name) + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/zstd') + self.assertEqual(kind.extension, 'zst') + + def test_guess_doc(self): + for name in 'sample.doc', 'sample_1.doc': + kind = filetype.guess(os.path.join(FIXTURES, name)) + self.assertIsNotNone(kind) + self.assertEqual(kind.mime, 'application/msword') + self.assertEqual(kind.extension, 'doc') + + def test_guess_docx(self): + for name in 'sample.docx', 'sample_1.docx': + kind = filetype.guess(os.path.join(FIXTURES, name)) + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.openxmlformats-officedocument.wordprocessingml.document') + self.assertEqual(kind.extension, 'docx') + + def test_guess_odt(self): + kind = filetype.guess(FIXTURES + '/sample.odt') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.oasis.opendocument.text') + self.assertEqual(kind.extension, 'odt') + + def test_guess_xls(self): + kind = filetype.guess(FIXTURES + '/sample.xls') + self.assertTrue(kind is not None) + 
self.assertEqual(kind.mime, 'application/vnd.ms-excel') + self.assertEqual(kind.extension, 'xls') + + def test_guess_xlsx(self): + kind = filetype.guess(FIXTURES + '/sample.xlsx') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet') + self.assertEqual(kind.extension, 'xlsx') + + def test_guess_ods(self): + kind = filetype.guess(FIXTURES + '/sample.ods') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.oasis.opendocument.spreadsheet') + self.assertEqual(kind.extension, 'ods') + + def test_guess_ppt(self): + kind = filetype.guess(FIXTURES + '/sample.ppt') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.ms-powerpoint') + self.assertEqual(kind.extension, 'ppt') + + def test_guess_pptx(self): + kind = filetype.guess(FIXTURES + '/sample.pptx') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.openxmlformats-officedocument.presentationml.presentation') + self.assertEqual(kind.extension, 'pptx') + + def test_guess_odp(self): + kind = filetype.guess(FIXTURES + '/sample.odp') + self.assertTrue(kind is not None) + self.assertEqual(kind.mime, 'application/vnd.oasis.opendocument.presentation') + self.assertEqual(kind.extension, 'odp')
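For readers following the utils.py hunk above, the reader-position handling added to get_bytes can be sketched in isolation. This is a minimal, self-contained illustration that does not import filetype. The helper name `peek_signature` and the constant `NUM_SIGNATURE_BYTES` are hypothetical stand-ins, and the try/finally is an addition of this sketch, not something the diff contains. It approximates the behaviour shown in the hunk and is not the packaged implementation.

```python
import io

# Mirrors the value of _NUM_SIGNATURE_BYTES in the diff; this name is illustrative.
NUM_SIGNATURE_BYTES = 8192


def peek_signature(obj, size=NUM_SIGNATURE_BYTES):
    """Return the first `size` bytes of a file-like object as a bytearray.

    Hypothetical helper: when the object supports tell() and seek(), the
    read starts at offset 0 and the original position is restored after.
    """
    if not hasattr(obj, 'read'):
        raise TypeError('Unsupported type as file input: %s' % type(obj))

    if hasattr(obj, 'tell') and hasattr(obj, 'seek'):
        start_pos = obj.tell()
        obj.seek(0)
        try:
            data = obj.read(size)
        finally:
            # Restore the caller's position even if read() raises
            # (the try/finally is an assumption of this sketch).
            obj.seek(start_pos)
        return bytearray(data)

    # Non-seekable readers: bytes are consumed from the current position.
    return bytearray(obj.read(size))


if __name__ == '__main__':
    png_magic = b'\x89PNG\r\n\x1a\n'
    fp = io.BytesIO(png_magic + b'\x00' * 32)
    fp.seek(5)  # simulate a reader that has already been partly consumed
    head = peek_signature(fp)
    print(head[:8] == png_magic, fp.tell())
```

In this sketch the signature is read from the start of the stream even though the reader was at offset 5, and the reader is left at offset 5 afterwards. That is the property the "reset reader position test" in test_guess_jpeg exercises by calling filetype.guess(fp) twice on the same handle.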