* Gianfranco Costamagna: " Re: Embedded code copy of python-magic" (Tue, 20 Feb
  2018 15:14:46 +0100):

> control: tags -1 wontfix
> control: close -1
> 
> On Wed, 7 Feb 2018 18:42:51 +0100 Mathias Behrle <mbeh...@debian.org> wrote:
> > Package: sqlmap
> > Version: 1.2-1
> > Severity: normal
> > Usertags: embedded-code-copy
> > 
> > Dear maintainers,
> > 
> > your binary package embeds a code copy of the Python magic module. [1]
> > python-magic 2:0.4.15-1 providing a compatibility layer by Adam Hupp [2]
> > has now hit unstable. According to Debian Policy 4.13 you should now use
> > this package and remove the embedded code copy.
> >   
> 
> Hello, I reported this upstream [1], and I got a simple nack.
> Please try to cleanup and have a common implementation, convince upstream to
> use it, and then I'll import on the next release.
> I don't want to break sqlmap with your code version.
> 
> [1] https://github.com/sqlmapproject/sqlmap/pull/2933
> 
> G.
> 

Thanks for at least trying to push the change upstream.

I don't understand the meaning of

"
-> and now he is trying to force his own TRUE version for a simple wrapper.
Case closed
"

as nobody is forcing or pushing anything on anyone.

Note: It is Adam Hupp, the author of the magic bindings that *sqlmap* *uses*,
who thankfully is implementing this change.

Anyway, I think you could still apply your really non-invasive patch in
Debian. If anything *should* break, it can be removed within seconds, and you
would at least have tried to comply a little bit more with policy. FTR, a diff
between the current magic in sqlmap and the current python-magic [1] is attached.
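
To illustrate how low-risk such a change can be, here is a minimal sketch of
an import fallback that prefers the packaged module and only falls back to a
bundled copy when the import fails. The helper name import_first_available
and the module path "bundled_magic" are hypothetical and are not taken from
sqlmap or from the patch. The only thing it relies on is that the packaged
module raises ImportError when libmagic cannot be found, as the attached diff
shows.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that imports cleanly.

    Hypothetical helper for illustration only: it tries each name in
    order and skips those that raise ImportError.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Intended use (names are assumptions, not sqlmap's real layout):
#   "magic"         -> the system-wide python-magic package
#   "bundled_magic" -> the embedded copy kept as a fallback
# magic = import_first_available("magic", "bundled_magic")
```

Dropping the fallback again would then be a one-line change.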

Of course YMMV,
Mathias

[1] https://github.com/ahupp/python-magic/tree/libmagic-compat

-- 

    Mathias Behrle
    PGP/GnuPG key available from any keyserver, ID: 0xD6D09BE48405BBF6
    AC29 7E5C 46B9 D0B6 1C71  7681 D6D0 9BE4 8405 BBF6
--- magic.py	2018-02-20 16:12:16.468274517 +0100
+++ __init__.py	2018-02-20 16:13:23.735786956 +0100
@@ -1,6 +1,8 @@
 """
 magic is a wrapper around the libmagic file identification library.
 
+See README for more information.
+
 Usage:
 
 >>> import magic
@@ -12,195 +14,283 @@
 'PDF document, version 1.2'
 >>>
 
+
 """
 
 import sys
+import glob
 import os.path
+import ctypes
+import ctypes.util
+import threading
+import logging
+
+from ctypes import c_char_p, c_int, c_size_t, c_void_p
+
+# avoid shadowing the real open with the version from compat.py
+_real_open = open
 
 class MagicException(Exception):
-    pass
+    def __init__(self, message):
+        super(MagicException, self).__init__(message)
+        self.message = message
+
 
 class Magic:
     """
     Magic is a wrapper around the libmagic C library.
+
     """
 
-    def __init__(self, mime=False, magic_file=None, mime_encoding=False):
+    def __init__(self, mime=False, magic_file=None, mime_encoding=False,
+                 keep_going=False, uncompress=False):
         """
         Create a new libmagic wrapper.
 
         mime - if True, mimetypes are returned instead of textual descriptions
         mime_encoding - if True, codec is returned
         magic_file - use a mime database other than the system default
+        keep_going - don't stop at the first match, keep going
+        uncompress - Try to look inside compressed files.
         """
-
-        flags = MAGIC_NONE
+        self.flags = MAGIC_NONE
         if mime:
-            flags |= MAGIC_MIME
-        elif mime_encoding:
-            flags |= MAGIC_MIME_ENCODING
+            self.flags |= MAGIC_MIME
+        if mime_encoding:
+            self.flags |= MAGIC_MIME_ENCODING
+        if keep_going:
+            self.flags |= MAGIC_CONTINUE
 
-        self.cookie = magic_open(flags)
+        if uncompress:
+            self.flags |= MAGIC_COMPRESS
 
-        magic_load(self.cookie, magic_file)
+        self.cookie = magic_open(self.flags)
+        self.lock = threading.Lock()
 
+        magic_load(self.cookie, magic_file)
 
     def from_buffer(self, buf):
         """
         Identify the contents of `buf`
         """
+        with self.lock:
+            try:
+                # if we're on python3, convert buf to bytes
+                # otherwise this string is passed as wchar*
+                # which is not what libmagic expects
+                if type(buf) == str and str != bytes:
+                   buf = buf.encode('utf-8', errors='replace')
+                return maybe_decode(magic_buffer(self.cookie, buf))
+            except MagicException as e:
+                return self._handle509Bug(e)
 
-        return magic_buffer(self.cookie, buf)
+    def from_open_file(self, open_file):
+        with self.lock:
+            try:
+                return maybe_decode(magic_descriptor(self.cookie, open_file.fileno()))
+            except MagicException as e:
+                return self._handle509Bug(e)
 
     def from_file(self, filename):
-        """
-        Identify the contents of file `filename`
-        raises IOError if the file does not exist
-        """
-
-        if not os.path.exists(filename):
-            raise IOError("File does not exist: " + filename)
+        # raise FileNotFoundException or IOError if the file does not exist
+        with _real_open(filename):
+            pass
 
-        return magic_file(self.cookie, filename)
+        with self.lock:
+            try:
+                return maybe_decode(magic_file(self.cookie, filename))
+            except MagicException as e:
+                return self._handle509Bug(e)
+
+    def _handle509Bug(self, e):
+        # libmagic 5.09 has a bug where it might fail to identify the
+        # mimetype of a file and returns null from magic_file (and
+        # likely _buffer), but also does not return an error message.
+        if e.message is None and (self.flags & MAGIC_MIME):
+            return "application/octet-stream"
+        else:
+            raise e
 
     def __del__(self):
-        # during shutdown magic_close may have been cleared already
+        # no _thread_check here because there can be no other
+        # references to this object at this point.
+
+        # during shutdown magic_close may have been cleared already so
+        # make sure it exists before using it.
+
+        # the self.cookie check should be unnecessary and was an
+        # incorrect fix for a threading problem, however I'm leaving
+        # it in because it's harmless and I'm slightly afraid to
+        # remove it.
         if self.cookie and magic_close:
             magic_close(self.cookie)
             self.cookie = None
 
-_magic_mime = None
-_magic = None
-
-def _get_magic_mime():
-    global _magic_mime
-    if not _magic_mime:
-        _magic_mime = Magic(mime=True)
-    return _magic_mime
-
-def _get_magic():
-    global _magic
-    if not _magic:
-        _magic = Magic()
-    return _magic
+_instances = {}
 
 def _get_magic_type(mime):
-    if mime:
-        return _get_magic_mime()
-    else:
-        return _get_magic()
+    i = _instances.get(mime)
+    if i is None:
+        i = _instances[mime] = Magic(mime=mime)
+    return i
 
 def from_file(filename, mime=False):
+    """"
+    Accepts a filename and returns the detected filetype.  Return
+    value is the mimetype if mime=True, otherwise a human readable
+    name.
+
+    >>> magic.from_file("testdata/test.pdf", mime=True)
+    'application/pdf'
+    """
     m = _get_magic_type(mime)
     return m.from_file(filename)
 
 def from_buffer(buffer, mime=False):
+    """
+    Accepts a binary string and returns the detected filetype.  Return
+    value is the mimetype if mime=True, otherwise a human readable
+    name.
+
+    >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
+    'PDF document, version 1.2'
+    """
     m = _get_magic_type(mime)
     return m.from_buffer(buffer)
 
-try:
-    libmagic = None
-
-    import ctypes
-    import ctypes.util
 
-    from ctypes import c_char_p, c_int, c_size_t, c_void_p
 
-    # Let's try to find magic or magic1
-    dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1')
 
-    # This is necessary because find_library returns None if it doesn't find the library
-    if dll:
+libmagic = None
+# Let's try to find magic or magic1
+dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1') or ctypes.util.find_library('cygmagic-1')
+
+# This is necessary because find_library returns None if it doesn't find the library
+if dll:
+    libmagic = ctypes.CDLL(dll)
+
+if not libmagic or not libmagic._name:
+    windows_dlls = ['magic1.dll','cygmagic-1.dll']
+    platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
+                                  '/usr/local/lib/libmagic.dylib'] +
+                         # Assumes there will only be one version installed
+                         glob.glob('/usr/local/Cellar/libmagic/*/lib/libmagic.dylib'),
+                       'win32': windows_dlls,
+                       'cygwin': windows_dlls,
+                       'linux': ['libmagic.so.1'],    # fallback for some Linuxes (e.g. Alpine) where library search does not work
+                      }
+    platform = 'linux' if sys.platform.startswith('linux') else sys.platform
+    for dll in platform_to_lib.get(platform, []):
         try:
             libmagic = ctypes.CDLL(dll)
-        except WindowsError:
+            break
+        except OSError:
             pass
 
-    if not libmagic or not libmagic._name:
-        import sys
-        platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
-                                      '/usr/local/lib/libmagic.dylib',
-                                      '/usr/local/Cellar/libmagic/5.10/lib/libmagic.dylib'],
-                           'win32':  ['magic1.dll']}
-        for dll in platform_to_lib.get(sys.platform, []):
-            try:
-                libmagic = ctypes.CDLL(dll)
-            except OSError:
-                pass
-
-    if not libmagic or not libmagic._name:
-        # It is better to raise an ImportError since we are importing magic module
-        raise ImportError('failed to find libmagic.  Check your installation')
+if not libmagic or not libmagic._name:
+    # It is better to raise an ImportError since we are importing magic module
+    raise ImportError('failed to find libmagic.  Check your installation')
 
-    magic_t = ctypes.c_void_p
+magic_t = ctypes.c_void_p
 
-    def errorcheck(result, func, args):
+def errorcheck_null(result, func, args):
+    if result is None:
         err = magic_error(args[0])
-        if err is not None:
-            raise MagicException(err)
-        else:
-            return result
-
-    def coerce_filename(filename):
-        if filename is None:
-            return None
-        return filename.encode(sys.getfilesystemencoding())
-
-    magic_open = libmagic.magic_open
-    magic_open.restype = magic_t
-    magic_open.argtypes = [c_int]
-
-    magic_close = libmagic.magic_close
-    magic_close.restype = None
-    magic_close.argtypes = [magic_t]
-
-    magic_error = libmagic.magic_error
-    magic_error.restype = c_char_p
-    magic_error.argtypes = [magic_t]
-
-    magic_errno = libmagic.magic_errno
-    magic_errno.restype = c_int
-    magic_errno.argtypes = [magic_t]
-
-    _magic_file = libmagic.magic_file
-    _magic_file.restype = c_char_p
-    _magic_file.argtypes = [magic_t, c_char_p]
-    _magic_file.errcheck = errorcheck
-
-    def magic_file(cookie, filename):
-        return _magic_file(cookie, coerce_filename(filename))
-
-    _magic_buffer = libmagic.magic_buffer
-    _magic_buffer.restype = c_char_p
-    _magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
-    _magic_buffer.errcheck = errorcheck
-
+        raise MagicException(err)
+    else:
+        return result
 
-    def magic_buffer(cookie, buf):
-        return _magic_buffer(cookie, buf, len(buf))
+def errorcheck_negative_one(result, func, args):
+    if result is -1:
+        err = magic_error(args[0])
+        raise MagicException(err)
+    else:
+        return result
 
-    _magic_load = libmagic.magic_load
-    _magic_load.restype = c_int
-    _magic_load.argtypes = [magic_t, c_char_p]
-    _magic_load.errcheck = errorcheck
 
-    def magic_load(cookie, filename):
-        return _magic_load(cookie, coerce_filename(filename))
+# return str on python3.  Don't want to unconditionally
+# decode because that results in unicode on python2
+def maybe_decode(s):
+    if str == bytes:
+        return s
+    else:
+        return s.decode('utf-8')
 
-    magic_setflags = libmagic.magic_setflags
-    magic_setflags.restype = c_int
-    magic_setflags.argtypes = [magic_t, c_int]
+def coerce_filename(filename):
+    if filename is None:
+        return None
+
+    # ctypes will implicitly convert unicode strings to bytes with
+    # .encode('ascii').  If you use the filesystem encoding
+    # then you'll get inconsistent behavior (crashes) depending on the user's
+    # LANG environment variable
+    is_unicode = (sys.version_info[0] <= 2 and
+                  isinstance(filename, unicode)) or \
+                  (sys.version_info[0] >= 3 and
+                   isinstance(filename, str))
+    if is_unicode:
+        return filename.encode('utf-8', 'surrogateescape')
+    else:
+        return filename
 
-    magic_check = libmagic.magic_check
-    magic_check.restype = c_int
-    magic_check.argtypes = [magic_t, c_char_p]
+magic_open = libmagic.magic_open
+magic_open.restype = magic_t
+magic_open.argtypes = [c_int]
+
+magic_close = libmagic.magic_close
+magic_close.restype = None
+magic_close.argtypes = [magic_t]
+
+magic_error = libmagic.magic_error
+magic_error.restype = c_char_p
+magic_error.argtypes = [magic_t]
+
+magic_errno = libmagic.magic_errno
+magic_errno.restype = c_int
+magic_errno.argtypes = [magic_t]
+
+_magic_file = libmagic.magic_file
+_magic_file.restype = c_char_p
+_magic_file.argtypes = [magic_t, c_char_p]
+_magic_file.errcheck = errorcheck_null
+
+def magic_file(cookie, filename):
+    return _magic_file(cookie, coerce_filename(filename))
+
+_magic_buffer = libmagic.magic_buffer
+_magic_buffer.restype = c_char_p
+_magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
+_magic_buffer.errcheck = errorcheck_null
+
+def magic_buffer(cookie, buf):
+    return _magic_buffer(cookie, buf, len(buf))
+
+magic_descriptor = libmagic.magic_descriptor
+magic_descriptor.restype = c_char_p
+magic_descriptor.argtypes = [magic_t, c_int]
+magic_descriptor.errcheck = errorcheck_null
+
+_magic_load = libmagic.magic_load
+_magic_load.restype = c_int
+_magic_load.argtypes = [magic_t, c_char_p]
+_magic_load.errcheck = errorcheck_negative_one
+
+def magic_load(cookie, filename):
+    return _magic_load(cookie, coerce_filename(filename))
+
+magic_setflags = libmagic.magic_setflags
+magic_setflags.restype = c_int
+magic_setflags.argtypes = [magic_t, c_int]
+
+magic_check = libmagic.magic_check
+magic_check.restype = c_int
+magic_check.argtypes = [magic_t, c_char_p]
+
+magic_compile = libmagic.magic_compile
+magic_compile.restype = c_int
+magic_compile.argtypes = [magic_t, c_char_p]
 
-    magic_compile = libmagic.magic_compile
-    magic_compile.restype = c_int
-    magic_compile.argtypes = [magic_t, c_char_p]
 
-except (ImportError, OSError):
-    from_file = from_buffer = lambda *args, **kwargs: "unknown"
 
 MAGIC_NONE = 0x000000 # No flags
 MAGIC_DEBUG = 0x000001 # Turn on debugging
@@ -214,6 +304,7 @@
 MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit
 MAGIC_RAW = 0x000100 # Don't translate unprintable chars
 MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real errors
+
 MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files
 MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files
 MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries
@@ -223,3 +314,48 @@
 MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff
 MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran
 MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens
+
+# This package name conflicts with the one provided by upstream
+# libmagic.  This is a common source of confusion for users.  To
+# resolve, We ship a copy of that module, and expose it's functions
+# wrapped in deprecation warnings.
+def add_compat(to_module):
+
+    import warnings, re
+    from magic import compat
+
+    def deprecation_wrapper(compat, fn, alternate):
+        def _(*args, **kwargs):
+            warnings.warn(
+                "Using compatability mode with libmagic's python binding",
+                DeprecationWarning)
+
+            return compat[fn](*args, **kwargs)
+        return _
+
+    fn = [('detect_from_filename', 'magic.from_file'),
+          ('detect_from_content', 'magic.from_buffer'),
+          ('detect_from_fobj', 'magic.Magic.from_open_file'),
+          ('open', 'magic.Magic')]
+    for (fname, alternate) in fn:
+        # for now, disable the deprecation warning until theres clarity on
+        # what the merged module should look like
+        to_module[fname] = compat.__dict__.get(fname)
+        #to_module[fname] = deprecation_wrapper(compat.__dict__, fname, alternate)
+
+    # copy constants over, ensuring there's no conflicts
+    is_const_re = re.compile("^[A-Z_]+$")
+    allowed_inconsistent = set(['MAGIC_MIME'])
+    for name, value in compat.__dict__.items():
+        if is_const_re.match(name):
+            if name in to_module:
+                if name in allowed_inconsistent:
+                    continue
+                if to_module[name] != value:
+                    raise Exception("inconsistent value for " + name)
+                else:
+                    continue
+            else:
+                to_module[name] = value
+
+add_compat(globals())

Attachment: pgpxHoUr75Tub.pgp
Description: OpenPGP digital signature