Script 'mail_helper' called by obssrc
Hello community,

here is the log from the commit of package diffoscope for openSUSE:Factory checked in at 2021-11-27 23:42:36
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/diffoscope (Old)
 and      /work/SRC/openSUSE:Factory/.diffoscope.new.1895 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "diffoscope" Sat Nov 27 23:42:36 2021 rev:22 rq:934266 version:193 Changes: -------- --- /work/SRC/openSUSE:Factory/diffoscope/diffoscope.changes 2021-11-03 17:26:30.585345225 +0100 +++ /work/SRC/openSUSE:Factory/.diffoscope.new.1895/diffoscope.changes 2021-11-27 23:43:07.420335669 +0100 @@ -1,0 +2,44 @@ +Sat Nov 20 13:01:25 UTC 2021 - Sebastian Wagner <sebix+novell....@sebix.at> + +- - update to version 193: + - Don't duplicate file lists at each directory level. + (Closes: #989192, reproducible-builds/diffoscope#263) + - When pretty-printing JSON, mark the difference as such, additionally + avoiding including the full path. + (Closes: reproducible-builds/diffoscope#205) + - Codebase improvements: + - Update a bunch of %-style string interpolations into f-strings or + str.format. + - Import itertools top-level directly. + - Drop some unused imports. + - Use isinstance(...) over type(...) == + - Avoid aliasing variables if we aren't going to use them. + - Fix missing diff output on large diffs. + - Ignore a Python warning coming from a dependent library (triggered by + supporting Python 3.10) + - Document that support both Python 3.9 and 3.10. + +------------------------------------------------------------------- +Sun Nov 14 21:14:04 UTC 2021 - Sebastian Wagner <sebix+novell....@sebix.at> + +- update to version 192: + - Update .epub test methodology after improving XML file parsing. +- update to version 191: + - Detect XML files as XML files if either file(1) claims if they are XML + files, or if they are named .xml. + (Closes: #999438, reproducible-builds/diffoscope#287) + - Don't reject Debian .changes files if they contain non-printable + characters. (Closes: reproducible-builds/diffoscope#286) + - Continue loading a .changes file even if the referenced files inside it do + not exist, but include a comment in the diff as a result. + - Log the reason if we cannot load a Debian .changes file. 
+ - Fix inverted logic in the assert_diff_startswith() utility. +- update to version 190: + - Don't raise a traceback if we cannot de-marshal Python bytecode to support + Python 3.7 loading newer .pyc files. + (Closes: reproducible-builds/diffoscope#284) + - Fix Python tests under Python 3.7 with file 5.39+. + - Skip Python bytecode testing when "file" is older than 5.39. + - Detect whether the GNU_BUILD_ID field has been modified. + +------------------------------------------------------------------- Old: ---- diffoscope-189.tar.bz2 diffoscope-189.tar.bz2.asc New: ---- diffoscope-193.tar.bz2 diffoscope-193.tar.bz2.asc ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ diffoscope.spec ++++++ --- /var/tmp/diff_new_pack.4umysR/_old 2021-11-27 23:43:07.796334436 +0100 +++ /var/tmp/diff_new_pack.4umysR/_new 2021-11-27 23:43:07.800334423 +0100 @@ -17,7 +17,7 @@ Name: diffoscope -Version: 189 +Version: 193 Release: 0 Summary: In-depth comparison of files, archives, and directories License: GPL-3.0-or-later ++++++ diffoscope-189.tar.bz2 -> diffoscope-193.tar.bz2 ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/debian/changelog new/diffoscope-193/debian/changelog --- old/diffoscope-189/debian/changelog 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/debian/changelog 2021-11-19 16:35:12.000000000 +0100 @@ -1,3 +1,69 @@ +diffoscope (193) unstable; urgency=medium + + [ Chris Lamb ] + * Don't duplicate file lists at each directory level. + (Closes: #989192, reproducible-builds/diffoscope#263) + * When pretty-printing JSON, mark the difference as such, additionally + avoiding including the full path. + (Closes: reproducible-builds/diffoscope#205) + + * Codebase improvements: + - Update a bunch of %-style string interpolations into f-strings or + str.format. + - Import itertools top-level directly. + - Drop some unused imports. 
+    - Use isinstance(...) over type(...) ==
+    - Avoid aliasing variables if we aren't going to use them.
+
+  [ Brandon Maier ]
+  * Fix missing diff output on large diffs.
+
+  [ Mattia Rizzolo ]
+  * Ignore a Python warning coming from a dependent library (triggered by
+    supporting Python 3.10)
+  * Document that support both Python 3.9 and 3.10.
+
+ -- Chris Lamb <la...@debian.org>  Fri, 19 Nov 2021 07:35:10 -0800
+
+diffoscope (192) unstable; urgency=medium
+
+  * Update .epub test methodology after improving XML file parsing.
+
+ -- Chris Lamb <la...@debian.org>  Fri, 12 Nov 2021 08:17:14 -0800
+
+diffoscope (191) unstable; urgency=medium
+
+  [ Chris Lamb ]
+  * Detect XML files as XML files if either file(1) claims if they are XML
+    files, or if they are named .xml.
+    (Closes: #999438, reproducible-builds/diffoscope#287)
+  * Don't reject Debian .changes files if they contain non-printable
+    characters. (Closes: reproducible-builds/diffoscope#286)
+  * Continue loading a .changes file even if the referenced files inside it do
+    not exist, but include a comment in the diff as a result.
+  * Log the reason if we cannot load a Debian .changes file.
+
+  [ Zbigniew Jędrzejewski-Szmek ]
+  * Fix inverted logic in the assert_diff_startswith() utility.
+
+ -- Chris Lamb <la...@debian.org>  Fri, 12 Nov 2021 06:43:57 -0800
+
+diffoscope (190) unstable; urgency=medium
+
+  [ Chris Lamb ]
+  * Don't raise a traceback if we cannot de-marshal Python bytecode to support
+    Python 3.7 loading newer .pyc files.
+    (Closes: reproducible-builds/diffoscope#284)
+  * Fix Python tests under Python 3.7 with file 5.39+.
+
+  [ Vagrant Cascadian ]
+  * Skip Python bytecode testing when "file" is older than 5.39.
+
+  [ Roland Clobus ]
+  * Detect whether the GNU_BUILD_ID field has been modified.
+ + -- Chris Lamb <la...@debian.org> Fri, 05 Nov 2021 08:47:27 +0000 + diffoscope (189) unstable; urgency=medium [ Chris Lamb ] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/__init__.py new/diffoscope-193/diffoscope/__init__.py --- old/diffoscope-189/diffoscope/__init__.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/__init__.py 2021-11-19 16:35:12.000000000 +0100 @@ -17,4 +17,4 @@ # You should have received a copy of the GNU General Public License # along with diffoscope. If not, see <https://www.gnu.org/licenses/>. -VERSION = "189" +VERSION = "193" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/deb.py new/diffoscope-193/diffoscope/comparators/deb.py --- old/diffoscope-189/diffoscope/comparators/deb.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/deb.py 2021-11-19 16:35:12.000000000 +0100 @@ -184,7 +184,7 @@ with open(self.path, "r", encoding="utf-8") as f: for line in f: md5sum, path = re.split(r"\s+", line.strip(), maxsplit=1) - md5sums["./%s" % path] = md5sum + md5sums[f"./{path}"] = md5sum return md5sums except (UnicodeDecodeError, ValueError): logger.debug("Malformed md5sums, ignoring.") diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/debian.py new/diffoscope-193/diffoscope/comparators/debian.py --- old/diffoscope-189/diffoscope/comparators/debian.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/debian.py 2021-11-19 16:35:12.000000000 +0100 @@ -217,7 +217,10 @@ class DotChangesFile(DebControlFile): DESCRIPTION = "Debian .changes files" FILE_EXTENSION_SUFFIX = {".changes"} - FILE_TYPE_RE = re.compile(r"^(ASCII text|UTF-8 Unicode text)") + + # .changes files can be identified "data" if they contain non-printable + # characters 
(Re: reproducible-builds/diffoscope#286) + FILE_TYPE_RE = re.compile(r"^(ASCII text|UTF-8 Unicode text|data)") @classmethod def recognizes(cls, file): @@ -226,13 +229,19 @@ try: file._deb822 = Changes(filename=file.path) - except ChangesFileException: + except ChangesFileException as exc: + logger.warning( + f"Rejecting {file.path} as a Debian .changes file: {exc}" + ) return False try: + file.validation_msg = None file._deb822.validate("sha256", check_signature=False) - except FileNotFoundError: - return False + except FileNotFoundError as exc: + # Continue even though this .changes file may be invalid + file.validation_msg = f"{os.path.basename(file.path)} is missing referenced files: {exc}" + logger.warning(file.validation_msg) return True @@ -242,6 +251,10 @@ if differences is None: return None + for x in (self, other): + if x.validation_msg: + differences.add_comment(x.validation_msg) + other_deb822 = self._get_deb822(other) files = zip(self._deb822.get("Files"), other_deb822.get("Files")) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/decompile.py new/diffoscope-193/diffoscope/comparators/decompile.py --- old/diffoscope-189/diffoscope/comparators/decompile.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/decompile.py 2021-11-19 16:35:12.000000000 +0100 @@ -18,7 +18,6 @@ # along with diffoscope. If not, see <https://www.gnu.org/licenses/>. 
import re -import sys import abc import logging @@ -26,7 +25,6 @@ from .utils.operation import Operation from .utils.container import Container -from diffoscope.config import Config from diffoscope.difference import Difference from diffoscope.excludes import operation_excluded from diffoscope.tools import ( diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/device.py new/diffoscope-193/diffoscope/comparators/device.py --- old/diffoscope-189/diffoscope/comparators/device.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/device.py 2021-11-19 16:35:12.000000000 +0100 @@ -88,4 +88,4 @@ kind = "block" else: kind = "weird" - return "device:%s\nmajor: %d\nminor: %d\n" % (kind, major, minor) + return f"device:{kind}\nmajor: {major}\nminor: {minor}\n" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/directory.py new/diffoscope-193/diffoscope/comparators/directory.py --- old/diffoscope-189/diffoscope/comparators/directory.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/directory.py 2021-11-19 16:35:12.000000000 +0100 @@ -35,20 +35,6 @@ logger = logging.getLogger(__name__) -def list_files(path): - path = os.path.realpath(path) - all_files = [] - for root, dirs, names in os.walk(path): - all_files.extend( - [os.path.join(root[len(path) + 1 :], dir) for dir in dirs] - ) - all_files.extend( - [os.path.join(root[len(path) + 1 :], name) for name in names] - ) - all_files.sort() - return all_files - - if os.uname()[0] == "FreeBSD": class Stat(Command): @@ -174,7 +160,7 @@ try: stat1 = os.lstat(path1) stat2 = os.lstat(path2) - except Exception as e: + except Exception: return [] differences = [] @@ -267,6 +253,11 @@ def compare(self, other, source=None): differences = [] + # We don't need to recurse into subdirectories; DirectoryContainer will + # find 
them and do that for us. + def list_files(path): + return sorted(os.listdir(os.path.realpath(path))) + listing_diff = Difference.from_text( "\n".join(list_files(self.path)), "\n".join(list_files(other.path)), diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/elf.py new/diffoscope-193/diffoscope/comparators/elf.py --- old/diffoscope-189/diffoscope/comparators/elf.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/elf.py 2021-11-19 16:35:12.000000000 +0100 @@ -22,6 +22,7 @@ import logging import subprocess import collections +import hashlib from diffoscope.exc import OutputParsingError from diffoscope.tools import get_tool_name, tool_required @@ -34,7 +35,6 @@ from .decompile import DecompilableContainer from .utils.file import File from .utils.command import Command, our_check_output -from .utils.container import Container DEBUG_SECTION_GROUPS = ( "rawline", @@ -157,7 +157,7 @@ class ReadelfDebugDump(Readelf): def readelf_options(self): - return ["--debug-dump=%s" % self._debug_section_group] + return [f"--debug-dump={self._debug_section_group}"] READELF_DEBUG_DUMP_COMMANDS = [ @@ -460,6 +460,7 @@ ] output = our_check_output(cmd, shell=False, stderr=subprocess.DEVNULL) has_debug_symbols = False + has_build_id = False try: output = output.decode("utf-8").split("\n") @@ -481,6 +482,9 @@ if name.startswith(".debug") or name.startswith(".zdebug"): has_debug_symbols = True + if name == ".note.gnu.build-id" and type == "NOTE": + has_build_id = True + if _should_skip_section(name, type): continue @@ -515,6 +519,13 @@ if not has_debug_symbols: self._install_debug_symbols() + if has_build_id: + try: + self._verify_build_id() + except Exception: + # It is fine to skip the verification of the build_id + pass + @tool_required("objcopy") def _install_debug_symbols(self): if Config().use_dbgsym == "no": @@ -632,6 +643,36 @@ logger.debug("Installed debug symbols at 
%s", dest_path) + def _verify_build_id(self): + """ + Verify whether the NT_GNU_BUILD_ID field contains a sha1 checksum + that matches the binary. (#260) + """ + + with open(self.source.path, "rb") as f: + blob = f.read() + + # Magic value: length=0x14, owner length=3, owner='GNU', followed by the sha1 checksum + m = re.search( + b"\x14\x00\x00\x00\x03\x00\x00\x00\x47\x4e\x55\x00.{20}", blob + ) + build_id = blob[m.end() - 20 : m.end()].hex() + blob_with_reset_build_id = ( + blob[: m.end() - 20] + b"\x00" * 20 + blob[m.end() :] + ) + + if hashlib.sha1(blob_with_reset_build_id).hexdigest() != build_id: + logger.warning( + "The file (%s) has been modified after NT_GNU_BUILD_ID has been applied", + self.source.path, + ) + logger.debug( + "Expected value: %s Current value: %s", + hashlib.sha1(blob_with_reset_build_id).hexdigest(), + build_id, + ) + return + def get_member_names(self): decompiled_members = super().get_member_names() return list(decompiled_members) + list(self._sections.keys()) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/fit.py new/diffoscope-193/diffoscope/comparators/fit.py --- old/diffoscope-189/diffoscope/comparators/fit.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/fit.py 2021-11-19 16:35:12.000000000 +0100 @@ -81,7 +81,7 @@ dest_path, ) - output = command.our_check_output(cmd) + command.our_check_output(cmd) # Cannot rely on dumpimage returning a non-zero exit code on failure. 
if not os.path.exists(dest_path): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/json.py new/diffoscope-193/diffoscope/comparators/json.py --- old/diffoscope-189/diffoscope/comparators/json.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/json.py 2021-11-19 16:35:12.000000000 +0100 @@ -58,7 +58,11 @@ def compare_details(self, other, source=None): difference = Difference.from_text( - self.dumps(self), self.dumps(other), self.path, other.path + self.dumps(self), + self.dumps(other), + self.path, + other.path, + source="Pretty-printed", ) if difference: @@ -71,6 +75,7 @@ self.dumps(other, sort_keys=False), self.path, other.path, + source="Pretty-printed", comment="ordering differences only", ) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/missing_file.py new/diffoscope-193/diffoscope/comparators/missing_file.py --- old/diffoscope-189/diffoscope/comparators/missing_file.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/missing_file.py 2021-11-19 16:35:12.000000000 +0100 @@ -40,7 +40,7 @@ @classmethod def recognizes(cls, file): if isinstance(file, FilesystemFile) and not os.path.lexists(file.name): - assert Config().new_file, "%s does not exist" % file.name + assert Config().new_file, f"{file.name} does not exist" return True return False diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/python.py new/diffoscope-193/diffoscope/comparators/python.py --- old/diffoscope-189/diffoscope/comparators/python.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/python.py 2021-11-19 16:35:12.000000000 +0100 @@ -38,15 +38,19 @@ FILE_TYPE_RE = re.compile(r"^python .*byte-compiled$") def compare_details(self, other, source=None): - 
return [ - Difference.from_text( - describe_pyc(self.path), - describe_pyc(other.path), - self.path, - other.path, - source="Python bytecode", - ) - ] + try: + return [ + Difference.from_text( + describe_pyc(self.path), + describe_pyc(other.path), + self.path, + other.path, + source="Python bytecode", + ) + ] + except ValueError as exc: + self.add_comment("Could not decombile bytecode: {}".format(exc)) + return [] def describe_pyc(filename): @@ -87,7 +91,7 @@ yield f"{indent}consts" for const in code.co_consts: - if type(const) == types.CodeType: + if isinstance(const, types.CodeType): yield from show_code(const, f"{indent} ") else: yield f" {indent}{const!r}" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/rpm.py new/diffoscope-193/diffoscope/comparators/rpm.py --- old/diffoscope-189/diffoscope/comparators/rpm.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/rpm.py 2021-11-19 16:35:12.000000000 +0100 @@ -70,7 +70,7 @@ for rpmtag in sorted(rpm.tagnames): if rpmtag not in hdr: continue - s.write(u"%s: " % rpm.tagnames[rpmtag]) + s.write(u"{}: ".format(rpm.tagnames[rpmtag])) convert_header_field(s, hdr[rpmtag]) s.write(u"\n") return s.getvalue() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/squashfs.py new/diffoscope-193/diffoscope/comparators/squashfs.py --- old/diffoscope-189/diffoscope/comparators/squashfs.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/squashfs.py 2021-11-19 16:35:12.000000000 +0100 @@ -187,22 +187,20 @@ d["mode"] = SquashfsDevice.KIND_MAP[d["kind"]] del d["kind"] except KeyError: - raise SquashfsInvalidLineFormat( - "unknown device kind %s" % d["kind"] - ) + raise SquashfsInvalidLineFormat(f"unknown device kind {d['kind']}") try: d["major"] = int(d["major"]) except ValueError: raise SquashfsInvalidLineFormat( 
- "unable to parse major number %s" % d["major"] + f"unable to parse major number {d['major']}" ) try: d["minor"] = int(d["minor"]) except ValueError: raise SquashfsInvalidLineFormat( - "unable to parse minor number %s" % d["minor"] + f"unable to parse minor number {d['minor']}" ) return d diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/symlink.py new/diffoscope-193/diffoscope/comparators/symlink.py --- old/diffoscope-189/diffoscope/comparators/symlink.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/symlink.py 2021-11-19 16:35:12.000000000 +0100 @@ -41,7 +41,7 @@ def create_placeholder(self): with get_named_temporary_file("w+", delete=False) as f: - f.write("destination: %s\n" % self.symlink_destination) + f.write(f"destination: {self.symlink_destination}\n") f.flush() return f.name diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/utils/archive.py new/diffoscope-193/diffoscope/comparators/utils/archive.py --- old/diffoscope-189/diffoscope/comparators/utils/archive.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/utils/archive.py 2021-11-19 16:35:12.000000000 +0100 @@ -85,7 +85,7 @@ basename = os.path.basename(self.source.name) if not basename.endswith(expected_extension): - return "%s-content" % basename + return f"{basename}-content" return basename[: -len(expected_extension)] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/utils/compare.py new/diffoscope-193/diffoscope/comparators/utils/compare.py --- old/diffoscope-189/diffoscope/comparators/utils/compare.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/utils/compare.py 2021-11-19 16:35:12.000000000 +0100 @@ -171,5 +171,7 @@ hexdump = io.StringIO() with 
open(path, "rb") as f: for buf in iter(lambda: f.read(32), b""): - hexdump.write("%s\n" % binascii.hexlify(buf).decode("us-ascii")) + hexdump.write( + "{}\n".format(binascii.hexlify(buf).decode("us-ascii")) + ) return hexdump.getvalue() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/utils/file.py new/diffoscope-193/diffoscope/comparators/utils/file.py --- old/diffoscope-189/diffoscope/comparators/utils/file.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/utils/file.py 2021-11-19 16:35:12.000000000 +0100 @@ -113,7 +113,7 @@ self._container = container def __repr__(self): - return "<%s %s>" % (self.__class__, self.name) + return f"<{self.__class__} {self.name}>" # This should return a path that allows to access the file content @property @@ -567,8 +567,7 @@ if difference is None: return None difference.add_comment( - "Error parsing output of `%s` for %s" - % (e.operation, e.object_class) + f"Error parsing output of `{e.operation}` for {e.object_class}" ) except ContainerExtractionError as e: difference = self.compare_bytes(other, source=source) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/utils/libarchive.py new/diffoscope-193/diffoscope/comparators/utils/libarchive.py --- old/diffoscope-189/diffoscope/comparators/utils/libarchive.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/utils/libarchive.py 2021-11-19 16:35:12.000000000 +0100 @@ -260,7 +260,7 @@ for entry in archive: if entry.pathname == member_name: return self.get_subclass(entry) - raise KeyError("%s not found in archive" % member_name) + raise KeyError(f"{member_name} not found in archive") def get_filtered_members(self): try: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/comparators/xml.py 
new/diffoscope-193/diffoscope/comparators/xml.py --- old/diffoscope-189/diffoscope/comparators/xml.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/comparators/xml.py 2021-11-19 16:35:12.000000000 +0100 @@ -2,7 +2,7 @@ # diffoscope: in-depth comparison of files, archives, and directories # # Copyright © 2017 Juliana Rodrigues <juliana.o...@gmail.com> -# Copyright © 2017-2020 Chris Lamb <la...@debian.org> +# Copyright © 2017-2021 Chris Lamb <la...@debian.org> # # diffoscope is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by @@ -17,6 +17,8 @@ # You should have received a copy of the GNU General Public License # along with diffoscope. If not, see <https://www.gnu.org/licenses/>. +import re + from xml.parsers.expat import ExpatError from diffoscope.comparators.utils.file import File @@ -83,7 +85,7 @@ """ DESCRIPTION = "XML files" - FILE_EXTENSION_SUFFIX = {".xml"} + FILE_TYPE_RE = re.compile(r"^XML \S+ document") @classmethod def recognizes(cls, file): @@ -96,7 +98,9 @@ Returns: False if file is not a XML File, True otherwise """ - if not super().recognizes(file): + + # Emulate FALLBACK_FILE_EXTENSION_SUFFIX = {".xml"} + if not super().recognizes(file) and not file.name.endswith(".xml"): return False with open(file.path) as f: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/diff.py new/diffoscope-193/diffoscope/diff.py --- old/diffoscope-189/diffoscope/diff.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/diff.py 2021-11-19 16:35:12.000000000 +0100 @@ -24,6 +24,7 @@ import fcntl import hashlib import logging +import itertools import threading import subprocess @@ -99,7 +100,7 @@ found = DiffParser.RANGE_RE.match(line) if not found: - raise ValueError("Unable to parse diff headers: %r" % line) + raise ValueError(f"Unable to parse diff headers: {line!r}") self._diff.write(line 
+ b"\n") if found.group("len1"): @@ -136,7 +137,7 @@ elif self._remaining_hunk_lines == 0: return self.read_headers(line) else: - raise ValueError("Unable to parse diff hunk: %r" % line) + raise ValueError(f"Unable to parse diff hunk: {line!r}") self._diff.write(line + b"\n") @@ -551,11 +552,11 @@ if len(l0) + len(l1) > 750: # difflib.Differ.compare is at least O(n^2), so don't call it if # our inputs are too large. - logger.debug( - "Not calling difflib.Differ.compare(x, y) with len(x) == %d and len(y) == %d", - len(l0), - len(l1), + yield "C", "Diff chunk too large, falling back to line-by-line diff ({} lines added, {} lines removed)".format( + self.add_cpt, self.del_cpt ) + for line0, line1 in itertools.zip_longest(l0, l1, fillvalue=""): + yield from self.yield_line(line0, line1) return saved_line = None diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/difference.py new/diffoscope-193/diffoscope/difference.py --- old/diffoscope-189/diffoscope/difference.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/difference.py 2021-11-19 16:35:12.000000000 +0100 @@ -76,10 +76,8 @@ self._size_cache = None def __repr__(self): - return "<Difference %s -- %s %s>" % ( - self._source1, - self._source2, - self._details, + return ( + f"<Difference {self._source1} -- {self._source2} {self._details}>" ) def map_lines(self, f_diff, f_comment): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/main.py new/diffoscope-193/diffoscope/main.py --- old/diffoscope-189/diffoscope/main.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/main.py 2021-11-19 16:35:12.000000000 +0100 @@ -32,12 +32,10 @@ from .path import set_path from .tools import ( get_tools, - tool_check_installed, tool_prepend_prefix, python_module_missing, tool_required, OS_NAMES, - get_current_os, ) from .config import Config from .environ 
import normalize_environment diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/presenters/html/html.py new/diffoscope-193/diffoscope/presenters/html/html.py --- old/diffoscope-189/diffoscope/presenters/html/html.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/presenters/html/html.py 2021-11-19 16:35:12.000000000 +0100 @@ -134,9 +134,9 @@ for c in s: # used by diffs if c == DIFFON: - t.write("<%s>" % tag) + t.write(f"<{tag}>") elif c == DIFFOFF: - t.write("</%s>" % tag) + t.write(f"</{tag}>") # special highlighted chars elif c == "\t" and ponct == 1: @@ -150,7 +150,7 @@ t.write('<br/><span class="dp">\\</span>') elif ord(c) < 32: conv = "\\x%x" % ord(c) - t.write("<em>%s</em>" % conv) + t.write(f"<em>{conv}</em>") i += len(conv) else: t.write(html.escape(c)) @@ -305,15 +305,12 @@ def output_header(css_url, our_css_url=False, icon_url=None): if css_url: css_link = ( - ' <link href="%s" type="text/css" rel="stylesheet" />\n' % css_url + f' <link href="{css_url}" type="text/css" rel="stylesheet" />\n' ) else: css_link = "" if our_css_url: - css_style = ( - ' <link href="%s" type="text/css" rel="stylesheet" />\n' - % our_css_url - ) + css_style = f' <link href="{our_css_url}" type="text/css" rel="stylesheet" />\n' else: css_style = "<style>\n{}</style>\n".format(templates.STYLES) if icon_url: @@ -414,7 +411,7 @@ def output_line( self, has_internal_linenos, type_name, s1, line1, s2, line2 ): - self.spl_print_func('<tr class="diff%s">' % type_name) + self.spl_print_func(f'<tr class="diff{type_name}">') try: if s1: if has_internal_linenos: @@ -499,7 +496,7 @@ _, rotation_params = self.spl_print_ctrl ctx, mainname = rotation_params self.spl_current_page += 1 - filename = "%s.html" % (mainname) + filename = f"{mainname}.html" # rotate to the next child page memory = self.write_memory @@ -543,7 +540,7 @@ elif t == "H": self.output_hunk_header(*args) elif t == "C": - 
self.spl_print_func('<td colspan="2">%s</td>\n' % args) + self.spl_print_func(f'<td colspan="2">{args}</td>\n') else: raise AssertionError() self.spl_rows += 1 @@ -634,7 +631,7 @@ if truncated: text += " (truncated)" parent_last_row = templates.UD_TABLE_FOOTER % { - "filename": html.escape("%s.html" % mainname), + "filename": html.escape(f"{mainname}.html"), "text": text, } yield self.bytes_written, parent_last_row @@ -803,7 +800,7 @@ printers[node] = ( (make_printer, ctx.target) if ctx.single_page - else (file_printer, ctx.target, "%s.html" % pagename) + else (file_printer, ctx.target, f"{pagename}.html") ) stored = node @@ -893,7 +890,7 @@ os.makedirs(directory) if not os.path.isdir(directory): - raise ValueError("%s is not a directory" % directory) + raise ValueError(f"{directory} is not a directory") jquery_url = self.ensure_jquery(jquery_url, directory, "jquery.js") with open(os.path.join(directory, "common.css"), "w") as fp: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/diffoscope/presenters/utils.py new/diffoscope-193/diffoscope/presenters/utils.py --- old/diffoscope-189/diffoscope/presenters/utils.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/diffoscope/presenters/utils.py 2021-11-19 16:35:12.000000000 +0100 @@ -40,7 +40,7 @@ num /= 1024.0 else: unit = "Y" - return "%s %s%s" % (round_sigfig(num, sigfig), unit, suffix) + return "{} {}{}".format(round_sigfig(num, sigfig), unit, suffix) class Presenter: @@ -150,13 +150,13 @@ self.ident = str(ident) def __repr__(self): - return "%s(%r)" % (self.__class__.__name__, self.ident) + return f"{self.__class__.__name__}({self.ident!r})" def __format__(self, spec): result = self.ident if spec: result += ":" + spec - return "{" + result + "}" + return f"{{{result}}}" def __getitem__(self, key): return FormatPlaceholder(self.ident + "[" + str(key) + "]") @@ -354,7 +354,7 @@ mapping = {} real_mapping, new_holes = self._pformat(mapping, False) if 
new_holes: - raise ValueError("not all holes filled: %r" % new_holes) + raise ValueError(f"not all holes filled: {new_holes!r}") return self._fmtstr.format(*real_mapping) def formatl(self, *args): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/pytest.ini new/diffoscope-193/pytest.ini --- old/diffoscope-189/pytest.ini 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/pytest.ini 2021-11-19 16:35:12.000000000 +0100 @@ -4,6 +4,6 @@ # this comes from binwalk # https://github.com/ReFirmLabs/binwalk/issues/507 ignore:the imp module is deprecated in favour of importlib.*:DeprecationWarning -# this comes from h5py -# https://bugs.debian.org/994617 - ignore:h5py is running against HDF5 [\d\.]+ when it was built against [\d\.]+, this may cause problems +# this come from coverage, fixed in >= 6.0 +# https://github.com/nedbat/coveragepy/commit/90815d959dfff9c42629e3467d6e1a410cce6d04 + ignore:currentThread\(\) is deprecated, use current_thread\(\) instead diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/setup.py new/diffoscope-193/setup.py --- old/diffoscope-189/setup.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/setup.py 2021-11-19 16:35:12.000000000 +0100 @@ -80,6 +80,8 @@ "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", "Topic :: Utilities", ], # https://packaging.python.org/guides/distributing-packages-using-setuptools/#project-urls diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/comparators/test_debian.py new/diffoscope-193/tests/comparators/test_debian.py --- old/diffoscope-189/tests/comparators/test_debian.py 2021-10-29 10:47:06.000000000 +0200 +++ new/diffoscope-193/tests/comparators/test_debian.py 
2021-11-19 16:35:12.000000000 +0100
@@ -113,7 +113,8 @@
     shutil.copy(TEST_DOT_CHANGES_FILE1_PATH, dot_changes_path)
     # we don't copy the referenced .deb
     identified = specialize(FilesystemFile(dot_changes_path))
-    assert not isinstance(identified, DotChangesFile)
+    # ... but it is identified regardless
+    assert isinstance(identified, DotChangesFile)
 
 
 def test_dot_changes_no_differences(dot_changes1):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/comparators/test_epub.py new/diffoscope-193/tests/comparators/test_epub.py
--- old/diffoscope-189/tests/comparators/test_epub.py 2021-10-29 10:47:06.000000000 +0200
+++ new/diffoscope-193/tests/comparators/test_epub.py 2021-11-19 16:35:12.000000000 +0100
@@ -55,8 +55,18 @@
     assert differences[2].source2 == "toc.ncx"
     assert differences[3].source1 == "ch001.xhtml"
     assert differences[3].source2 == "ch001.xhtml"
-    expected_diff = get_data("epub_expected_diffs")
-    assert expected_diff == "".join(map(lambda x: x.unified_diff, differences))
+
+    # Flatten everything recursively, as XMLFile will contain reformatted data
+    # under Difference.details.
+    def fn(difference):
+        if difference.unified_diff:
+            yield difference.unified_diff
+        for x in difference.details:
+            yield from fn(x)
+
+    val = "\n".join("\n".join(fn(x)) for x in differences)
+
+    assert val == get_data("epub_expected_diffs")
 
 
 @skip_unless_tools_exist("zipinfo")
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/comparators/test_python.py new/diffoscope-193/tests/comparators/test_python.py
--- old/diffoscope-189/tests/comparators/test_python.py 2021-10-29 10:47:06.000000000 +0200
+++ new/diffoscope-193/tests/comparators/test_python.py 2021-11-19 16:35:12.000000000 +0100
@@ -22,13 +22,16 @@
 from diffoscope.comparators.python import PycFile
 
 from ..utils.data import assert_diff_startswith, load_fixture
-from ..utils.tools import skipif
-
+from ..utils.tools import (
+    skipif,
+    skip_unless_file_version_is_at_least,
+)
 
 pyc1 = load_fixture("test1.pyc-renamed")
 pyc2 = load_fixture("test2.pyc-renamed")
 
 
+@skip_unless_file_version_is_at_least("5.39")
 def test_identification(pyc1, pyc2):
     assert isinstance(pyc1, PycFile)
     assert isinstance(pyc2, PycFile)
@@ -47,9 +50,10 @@
     return pyc1.compare(pyc2).details
 
 
+@skip_unless_file_version_is_at_least("5.39")
 @skipif(
-    sys.version_info < (3, 9),
-    reason="pyc_expected_diff generated on Python 3.9",
+    sys.version_info < (3, 8),
+    reason="Python 3.7 cannot de-marshal test1.pyc-renamed",
 )
 def test_diff(differences):
     assert_diff_startswith(
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/data/epub_expected_diffs new/diffoscope-193/tests/data/epub_expected_diffs
--- old/diffoscope-189/tests/data/epub_expected_diffs 2021-10-29 10:47:06.000000000 +0200
+++ new/diffoscope-193/tests/data/epub_expected_diffs 2021-11-19 16:35:12.000000000 +0100
@@ -21,9 +21,10 @@
 +-rw---- 0.0 fat 655 b- defX 15-Oct-27 11:33 nav.xhtml
 +-rw---- 0.0 fat 615 b- defX 15-Oct-27 11:33 ch001.xhtml
 +9 files, 4535 bytes
uncompressed, 2325 bytes compressed: 48.7%
+
 @@ -1,13 +1,13 @@
- <?xml version="1.0" encoding="UTF-8"?>
- <package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="epub-id-1">
+ <?xml version="1.0" encoding="utf-8"?>
+ <package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="epub-id-1">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
 - <dc:identifier id="epub-id-1">urn:uuid:488fe0b5-29d9-4d64-a023-ca08947d78ae</dc:identifier>
 + <dc:identifier id="epub-id-1">urn:uuid:c3082605-195c-4273-acad-528beceba843</dc:identifier>
@@ -33,34 +34,35 @@
  <dc:language>en-US</dc:language>
  </metadata>
  <manifest>
- <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml" />
- <item id="style" href="stylesheet.css" media-type="text/css" />
- <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" />
- <item id="title_page_xhtml" href="title_page.xhtml" media-type="application/xhtml+xml" />
+ <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
+ <item id="style" href="stylesheet.css" media-type="text/css"/>
+ <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml"/>
+ <item id="title_page_xhtml" href="title_page.xhtml" media-type="application/xhtml+xml"/>
+
 @@ -1,11 +1,11 @@
- <?xml version="1.0" encoding="UTF-8"?>
- <ncx version="2005-1" xmlns="http://www.daisy.org/z3986/2005/ncx/">
+ <?xml version="1.0" encoding="utf-8"?>
+ <ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
-- <meta name="dtb:uid" content="urn:uuid:488fe0b5-29d9-4d64-a023-ca08947d78ae" />
-+ <meta name="dtb:uid" content="urn:uuid:c3082605-195c-4273-acad-528beceba843" />
- <meta name="dtb:depth" content="1" />
- <meta name="dtb:totalPageCount" content="0" />
- <meta name="dtb:maxPageNumber" content="0" />
+- <meta name="dtb:uid" content="urn:uuid:488fe0b5-29d9-4d64-a023-ca08947d78ae"/>
++ <meta name="dtb:uid"
content="urn:uuid:c3082605-195c-4273-acad-528beceba843"/>
+ <meta name="dtb:depth" content="1"/>
+ <meta name="dtb:totalPageCount" content="0"/>
+ <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle>
  <text>Test Ebook</text>
  </docTitle>
-@@ -8,12 +8,12 @@
- <title>Test Ebook</title>
- <link rel="stylesheet" type="text/css" href="stylesheet.css" />
- </head>
- <body>
- <div id="test-ebook" class="section level1 unnumbered">
- <h1>Test Ebook</h1>
- <p>Hello World!</p>
--<p>Time: 12:32</p>
-+<p>Time: 12:33</p>
- </div>
- </body>
+
+@@ -10,11 +10,11 @@
+ <title>Test Ebook</title>
+ <link rel="stylesheet" type="text/css" href="stylesheet.css"/>
+ </head>
+ <body>
+ <div id="test-ebook" class="section level1 unnumbered">
+ <h1>Test Ebook</h1>
+ <p>Hello World!</p>
+- <p>Time: 12:32</p>
++ <p>Time: 12:33</p>
+ </div>
+ </body>
  </html>
-
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/utils/data.py new/diffoscope-193/tests/utils/data.py
--- old/diffoscope-189/tests/utils/data.py 2021-10-29 10:47:06.000000000 +0200
+++ new/diffoscope-193/tests/utils/data.py 2021-11-19 16:35:12.000000000 +0100
@@ -67,7 +67,7 @@
 def assert_diff_startswith(difference, filename):
     haystack = difference.unified_diff
     needle = get_data(filename)
-    assert needle.startswith(haystack)
+    assert haystack.startswith(needle)
 
 
 # https://code.activestate.com/recipes/576620-changedirectory-context-manager/#c3
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/diffoscope-189/tests/utils/nonexisting.py new/diffoscope-193/tests/utils/nonexisting.py
--- old/diffoscope-189/tests/utils/nonexisting.py 2021-10-29 10:47:06.000000000 +0200
+++ new/diffoscope-193/tests/utils/nonexisting.py 2021-11-19 16:35:12.000000000 +0100
@@ -32,4 +32,7 @@
 
     assert difference.source2 == "/nonexisting"
     assert not has_details or len(difference.details) > 0
-    assert not has_null_source or difference.details[-1].source2 ==
"/dev/null"
+    assert not has_null_source or (
+        difference.details[-1].source2 == "/dev/null"
+        or difference.source2 == "/nonexisting"
+    )
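
The tests/utils/data.py hunk in this diff swaps the receiver and the argument of str.startswith in assert_diff_startswith. A minimal sketch of why the direction matters, using plain strings in place of diffoscope Difference objects. The helper names old_check and new_check and the sample strings are hypothetical and do not appear in diffoscope:

```python
# Hypothetical stand-ins for the two orientations of the assertion;
# these helpers are illustrative only and are not part of diffoscope.


def old_check(actual_diff: str, expected_prefix: str) -> bool:
    # Old orientation: does the expected data start with the actual diff?
    return expected_prefix.startswith(actual_diff)


def new_check(actual_diff: str, expected_prefix: str) -> bool:
    # New orientation: does the actual diff start with the expected data?
    return actual_diff.startswith(expected_prefix)


# Made-up sample data, not taken from the diffoscope test fixtures.
expected = "@@ -1,2 +1,2 @@\n-old line\n+new line\n"
empty_actual = ""                      # no diff output at all
truncated_actual = "@@ -1,2"           # output cut short
full_actual = expected + " trailing context\n"

for name, actual in [
    ("empty", empty_actual),
    ("truncated", truncated_actual),
    ("full", full_actual),
]:
    print(name, old_check(actual, expected), new_check(actual, expected))
```

With the old orientation, an empty or truncated unified_diff satisfies the check, whereas the new orientation requires the output to begin with the expected data.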