[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2014-07-30 Thread Mark Lawrence

Mark Lawrence added the comment:

@Christoph please raise a new issue regarding the problem you describe in 
msg219788.

--
nosy: +BreamoreBoy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15207
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2014-07-30 Thread Mark Lawrence

Mark Lawrence added the comment:

@Christoph sorry #21652 has already been raised to address the problem of mixed 
str and unicode objects.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2014-07-30 Thread Brian Curtin

Changes by Brian Curtin br...@python.org:


--
nosy:  -brian.curtin




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2014-06-05 Thread Christoph Zwerschke

Christoph Zwerschke added the comment:

After this patch, some of the values in mimetypes.types_map now appear as 
unicode instead of str in Python 2.7.7 under Windows. For compatibility and 
consistency reasons, I think this should be fixed so that all values are 
returned as str again under Python 2.7.

See https://groups.google.com/forum/#!topic/pylons-devel/bq8XiKlGgv0 for a real 
world issue which I think is caused by this bugfix.
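
A quick way to see the mix described above is to count the values of mimetypes.types_map by type. This is only a sketch: the helper name value_types is made up for illustration, and on Python 3 every value is str, so the interesting output would be on 2.7 under Windows.

```python
import mimetypes
from collections import Counter


def value_types(mapping):
    """Count the values of a mapping by the name of their type."""
    return Counter(type(value).__name__ for value in mapping.values())


mimetypes.init()
# On an affected 2.7 install this would show both 'str' and 'unicode'.
print(value_types(mimetypes.types_map))
```

More than one type name in the output means that callers joining or comparing these values can trip over the difference.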

--
nosy: +cito




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-10-22 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 95b88273683c by Tim Golden in branch '3.3':
Issue #15207: Fix mimetypes to read from correct area in Windows registry 
(Original patch by Dave Chambers)
http://hg.python.org/cpython/rev/95b88273683c

New changeset 12bf7fc1ba76 by Tim Golden in branch 'default':
Issue #15207: Fix mimetypes to read from correct area in Windows registry 
(Original patch by Dave Chambers)
http://hg.python.org/cpython/rev/12bf7fc1ba76

--
nosy: +python-dev




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-10-22 Thread Roundup Robot

Roundup Robot added the comment:

New changeset e8cead08c556 by Tim Golden in branch '2.7':
Issue #15207: Fix mimetypes to read from correct area in Windows registry 
(Original patch by Dave Chambers)
http://hg.python.org/cpython/rev/e8cead08c556

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-10-22 Thread Tim Golden

Tim Golden added the comment:

*cough* Somehow that didn't actually get pushed. Rebased against 2.7, 3.3 & 3.4 
and pushed.

--
assignee:  -> tim.golden
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
versions:  -Python 3.2




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-08-12 Thread Tim Golden

Tim Golden added the comment:

Thanks for the review, Ben. Updated patches attached.

1 & 3) default_encoding -- Your two points appear to contradict each 
other slightly. What's in the updated patches is: 3.x has no encoding 
(because everything's unicode end-to-end); 2.7 attempts to apply the 
default encoding -- which is probably ascii -- to the extension and the 
mimetype and continues on error. I'm not 100% sure about this because it 
seems possible if unlikely to have a non-ascii extension / mimetype, but 
this seems like the best compromise (and is no worse than what was there 
before). Does that seem to fit the bill?

2) subkeyname[0] -- done

4) throws EnvironmentError -- done

5) test for .png -- done
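
The 2.7 behaviour described in point 1 & 3 (encode both values, continue on error) can be sketched in isolation like this. The helper name encode_or_skip and the sample data are made up for illustration; the patch itself does this inline inside read_windows_registry().

```python
def encode_or_skip(pairs, encoding="ascii"):
    """Yield (extension, mimetype) as byte strings, skipping unencodable pairs."""
    for ext, mimetype in pairs:
        try:
            yield ext.encode(encoding), mimetype.encode(encoding)
        except UnicodeEncodeError:
            # Same policy as the patch: ignore the entry and carry on.
            continue


sample = [(".png", "image/png"), (".caf\u00e9", "text/plain")]
print(list(encode_or_skip(sample)))
```

Only the pair that fits the encoding survives, so callers never see a mix of string types.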

--
Added file: http://bugs.python.org/file31259/issue15207.27.2.patch
Added file: http://bugs.python.org/file31260/issue15207.33.2.patch

diff --git a/Doc/library/mimetypes.rst b/Doc/library/mimetypes.rst
--- a/Doc/library/mimetypes.rst
+++ b/Doc/library/mimetypes.rst
@@ -85,6 +85,9 @@
    :const:`knownfiles` takes precedence over those named before it.  Calling
    :func:`init` repeatedly is allowed.
 
+   Specifying an empty list for *files* will prevent the system defaults from
+   being applied: only the well-known values will be present from a built-in list.
+
    .. versionchanged:: 2.7
       Previously, Windows registry settings were ignored.
 
diff --git a/Lib/mimetypes.py b/Lib/mimetypes.py
--- a/Lib/mimetypes.py
+++ b/Lib/mimetypes.py
@@ -254,23 +254,26 @@
                 i += 1
 
         default_encoding = sys.getdefaultencoding()
-        with _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT,
-                             r'MIME\Database\Content Type') as mimedb:
-            for ctype in enum_types(mimedb):
+        with _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT, '') as hkcr:
+            for subkeyname in enum_types(hkcr):
                 try:
-                    with _winreg.OpenKey(mimedb, ctype) as key:
-                        suffix, datatype = _winreg.QueryValueEx(key,
-                                                                'Extension')
+                    with _winreg.OpenKey(hkcr, subkeyname) as subkey:
+                        # Only check file extensions
+                        if not subkeyname.startswith("."):
+                            continue
+                        # raises EnvironmentError if no 'Content Type' value
+                        mimetype, datatype = _winreg.QueryValueEx(
+                            subkey, 'Content Type')
+                        if datatype != _winreg.REG_SZ:
+                            continue
+                        try:
+                            mimetype = mimetype.encode(default_encoding)
+                            subkeyname = subkeyname.encode(default_encoding)
+                        except UnicodeEncodeError:
+                            continue
+                        self.add_type(mimetype, subkeyname, strict)
                 except EnvironmentError:
                     continue
-                if datatype != _winreg.REG_SZ:
-                    continue
-                try:
-                    suffix = suffix.encode(default_encoding) # omit in 3.x!
-                except UnicodeEncodeError:
-                    continue
-                self.add_type(ctype, suffix, strict)
-
 
 def guess_type(url, strict=True):
     """Guess the type of a file based on its URL.
diff --git a/Lib/test/test_mimetypes.py b/Lib/test/test_mimetypes.py
--- a/Lib/test/test_mimetypes.py
+++ b/Lib/test/test_mimetypes.py
@@ -85,6 +85,8 @@
         # Use file types that should *always* exist:
         eq = self.assertEqual
         eq(self.db.guess_type("foo.txt"), ("text/plain", None))
+        eq(self.db.guess_type("image.jpg"), ("image/jpeg", None))
+        eq(self.db.guess_type("image.png"), ("image/png", None))
 
 def test_main():
     test_support.run_unittest(MimeTypesTestCase,
diff --git a/Doc/library/mimetypes.rst b/Doc/library/mimetypes.rst
--- a/Doc/library/mimetypes.rst
+++ b/Doc/library/mimetypes.rst
@@ -85,6 +85,9 @@
    :const:`knownfiles` takes precedence over those named before it.  Calling
    :func:`init` repeatedly is allowed.
 
+   Specifying an empty list for *files* will prevent the system defaults from
+   being applied: only the well-known values will be present from a built-in list.
+
    .. versionchanged:: 3.2
       Previously, Windows registry settings were ignored.
 
diff --git a/Lib/mimetypes.py b/Lib/mimetypes.py
--- a/Lib/mimetypes.py
+++ b/Lib/mimetypes.py
@@ -249,19 +249,21 @@
                     yield ctype
                 i += 1
 
-        with _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT,
-                             r'MIME\Database\Content Type') as mimedb:
-            for ctype in enum_types(mimedb):
+        with 

[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-08-12 Thread Ben Hoyt

Ben Hoyt added the comment:

All looks great. I like what you've done with default_encoding now. Thanks, Tim 
(and Dave for the original report).

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-08-11 Thread Ben Hoyt

Ben Hoyt added the comment:

Thanks, Tim! Works for me! A couple of code review comments:

1) On 2.7, guess_type(s)[0] is a byte string as usual if the type doesn't exist 
in the registry, but it's a unicode string if it came from the registry. Seems 
like it should be a byte string in all cases (the mime type can only be ASCII 
char). I would say .encode('ascii') and if it raises UnicodeError, ignore that 
key.

2) Would 'subkeyname[0] == "."' be better as subkeyname.startswith(".")? More 
idiomatic, and won't bomb out if subkeyname is zero length (though that 
probably can't happen). Relatedly, "not subkeyname.startswith()" with an 
early-continue would avoid an indent and is what the rest of the code does.

3) I believe the default_encoding variable is not needed anymore. That was used 
in the old registry code.

4) Super-minor: "raises EnvironmentError" would be the Pythonic way to say 
"throws EnvironmentError".

5) Would it be worth adding a test for 'foo.png' as well, as that was another 
super-common type that was wrong?
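
To illustrate point 2, here is a small sketch of the startswith() form with an early continue. The function name extension_keys and the sample key names are made up for illustration.

```python
def extension_keys(subkeynames):
    """Return only the names that look like file extensions."""
    result = []
    for name in subkeynames:
        # startswith() is safe on an empty string; name[0] would raise IndexError.
        if not name.startswith("."):
            continue
        result.append(name)
    return result


print(extension_keys([".png", "MIME", "", ".jpg"]))
```

The empty name is simply skipped here, whereas indexing with [0] would have stopped the loop with an exception.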

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-08-10 Thread Tim Golden

Tim Golden added the comment:

I attach a patch against 3.3; this is substantially Dave Chambers' original 
patch with a trivial test added and a doc change. This means that HKCR is 
scanned to determine extensions and these will override anything in the 
mimetypes db. The doc change highlights the possibility of overriding this by 
passing files=[].
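
For reference, the override mentioned above looks roughly like this. It is a minimal sketch that assumes the behaviour described in the doc change for this patch; other versions of mimetypes may treat an empty list differently, so it is worth checking on the version in use.

```python
import mimetypes

# Per the doc text added by this patch, an empty list for *files* means the
# system defaults are not applied. Treat that as an assumption to verify on
# the Python version actually in use.
mimetypes.init(files=[])

jpg_type, jpg_encoding = mimetypes.guess_type("photo.jpg")
print(jpg_type, jpg_encoding)
```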

I can't see an easy solution for this which will suit everyone but I've sat on 
it waay too long already. The module startup time is increased but, for bugfix 
releases, I can't see any other solution which won't break compatibility 
somewhere. 

I'm taking the simplest view which says that: .jpg = image/pjpeg is broken but 
that the winreg code has been in place for too long to simply back it out 
altogether.

I'll commit appropriate versions of this within the next day to 2.7, 3.3 and 
3.4 unless anyone objects. Please understand: this *is* a compromise; but I 
don't think there's a perfect solution for this, short of the rethink which 
mimetypes needs per MAL's suggestion or otherwise.

--
Added file: http://bugs.python.org/file31217/issue15207.33.patch




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Tim Golden

Tim Golden added the comment:

Attached is a q&d script to produce the list of extension -> mimetype maps for 
a version of the mimetypes module.
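
The attached mt.py is not reproduced in this archive. A script along these lines would produce output in the same "ext = type" shape as the attached text files; it is a guess at the idea, not the actual attachment, and the name dump_types is made up.

```python
import mimetypes


def dump_types(mapping):
    """Return 'ext = type' lines, sorted by extension."""
    return ["%s = %s" % (ext, mapping[ext]) for ext in sorted(mapping)]


mimetypes.init()
for line in dump_types(mimetypes.types_map):
    print(line)
```

Running it under different builds of the module and diffing the outputs gives the kind of comparison discussed in this thread.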

--
Added file: http://bugs.python.org/file29900/mt.py




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Tim Golden

Tim Golden added the comment:

Three outputs produced by mt.py: tip as-is; tip without registry; tip
with new approach to registry. The results for 2.7 are near-enough
identical. Likewise the results for an elevated prompt.

--
Added file: http://bugs.python.org/file29901/mt-tip.txt
Added file: http://bugs.python.org/file29902/mt-tip-newregistry.txt
Added file: http://bugs.python.org/file29903/mt-tip-noregistry.txt

.jpg = image/jpg
.mid = audio/midi
.midi = audio/midi
.pct = image/pict
.pic = image/pict
.pict = image/pict
.rtf = application/rtf
.xul = text/xul
.3g2 = video/3gpp2
.3gp = video/3gpp
.AMR = audio/AMR
.a = application/octet-stream
.aac = audio/x-aac
.ac3 = audio/x-ac3
.acrobatsecuritysettings = application/vnd.adobe.acrobat-security-settings
.adts = audio/vnd.dlna.adts
.ai = application/postscript
.aif = audio/x-aiff
.aifc = audio/x-aiff
.aiff = audio/x-aiff
.amc = application/x-mpeg
.application = application/x-ms-application
.asx = video/x-ms-asf-plugin
.au = audio/basic
.avi = video/x-msvideo
.bat = text/plain
.bcpio = application/x-bcpio
.bin = application/octet-stream
.bmp = image/bmp
.c = text/plain
.c2r = text/vnd-ms.click2record+xml
.caf = audio/x-caf
.cat = vnd.ms-pki.seccat
.cdf = application/x-netcdf
.cer = x-x509-ca-cert
.contact = text/x-ms-contact
.cpio = application/x-cpio
.crl = pkix-crl
.csh = application/x-csh
.css = text/css
.dir = application/x-director
.dll = application/octet-stream
.doc = application/msword
.dot = application/msword
.dvi = application/x-dvi
.dwfx = model/vnd.dwfx+xps
.easmx = model/vnd.easmx+xps
.edrwx = model/vnd.edrwx+xps
.eml = message/rfc822
.eprtx = model/vnd.eprtx+xps
.eps = application/postscript
.etx = text/x-setext
.exe = application/octet-stream
.fdf = application/vnd.fdf
.fif = application/fractals
.flc = video/flc
.gif = image/gif
.gsm = audio/x-gsm
.gtar = application/x-gtar
.gz = application/x-gzip
.h = text/plain
.hdf = application/x-hdf
.hqx = application/mac-binhex40
.hta = application/hta
.htc = text/x-component
.htm = text/html
.html = text/html
.ico = image/x-icon
.ics = text/calendar
.ief = image/ief
.iqy = text/x-ms-iqy
.jnlp = application/x-java-jnlp-file
.jp2 = image/x-jpeg2000-image
.jpe = image/jpeg
.jpeg = image/jpeg
.jpg = image/pjpeg
.js = application/javascript
.jtx = application/x-jtx+xps
.ksh = text/plain
.latex = application/x-latex
.m1v = video/mpeg
.m3u = audio/x-mpegurl
.m3u8 = application/vnd.apple.mpegurl
.m4a = audio/x-m4a
.m4b = audio/x-m4b
.m4p = audio/x-m4p
.m4v = video/x-m4v
.man = application/x-troff-man
.mdi = image/vnd.ms-modi
.me = application/x-troff-me
.mht = message/rfc822
.mhtml = message/rfc822
.mid = midi/mid
.mif = application/x-mif
.mov = video/quicktime
.movie = video/x-sgi-movie
.mp2 = audio/mpeg
.mp3 = audio/x-mpg
.mp4 = video/mp4
.mpa = video/mpeg
.mpe = video/mpeg
.mpeg = video/x-mpeg2a
.mpf = application/vnd.ms-mediapackage
.mpg = video/mpeg
.ms = application/x-troff-ms
.nc = application/x-netcdf
.nix = application/x-mix-transfer
.nws = message/rfc822
.o = application/octet-stream
.obj = application/octet-stream
.oda = application/oda
.odc = text/x-ms-odc
.osdx = application/opensearchdescription+xml
.p10 = pkcs10
.p12 = x-pkcs12
.p7b = x-pkcs7-certificates
.p7c = application/pkcs7-mime
.p7m = pkcs7-mime
.p7r = x-pkcs7-certreqresp
.p7s = pkcs7-signature
.pbm = image/x-portable-bitmap
.pdf = application/pdf
.pdfxml = application/vnd.adobe.pdfxml
.pdx = application/vnd.adobe.pdx
.pfx = application/x-pkcs12
.pgm = image/x-portable-graymap
.pict = image/x-pict
.pko = vnd.ms-pki.pko
.pl = text/plain
.pls = audio/x-scpls
.png = image/x-png
.pnm = image/x-portable-anymap
.pntg = image/x-macpaint
.pot = application/vnd.ms-powerpoint
.ppa = application/vnd.ms-powerpoint
.ppm = image/x-portable-pixmap
.pps = application/vnd.ms-powerpoint
.ppt = application/x-mspowerpoint
.ps = application/postscript
.pwz = application/vnd.ms-powerpoint
.py = text/x-python
.pyc = application/x-python-code
.pyo = application/x-python-code
.qcp = audio/vnd.qcelp
.qt = video/quicktime
.qtif = image/x-quicktime
.qtl = application/x-quicktimeplayer
.ra = audio/x-pn-realaudio
.ram = application/x-pn-realaudio
.ras = image/x-cmu-raster
.rdf = application/xml
.rels = application/vnd.ms-package.relationships+xml
.rgb = image/x-rgb
.roff = application/x-troff
.rqy = text/x-ms-rqy
.rtsp = application/x-rtsp
.rtx = text/richtext
.sdp = application/x-sdp
.sdv = video/sd-video
.sgi = image/x-sgi
.sgm = text/x-sgml
.sgml = text/x-sgml
.sh = application/x-sh
.shar = application/x-shar
.sit = application/x-stuffit
.slupkg-ms = application/x-ms-license
.snd = audio/basic
.so = application/octet-stream
.spl = application/futuresplash
.src = application/x-wais-source
.sst = vnd.ms-pki.certstore
.stl = vnd.ms-pki.stl
.sv4cpio = application/x-sv4cpio
.sv4crc = application/x-sv4crc
.svg 

[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Tim Golden

Tim Golden added the comment:

There seems to be a consensus that the current behaviour is undesirable,
indeed broken for any meaningful use. 

The critical argument against the current Registry approach is that it
returns unexpected (or outright incorrect) mimetypes for very standard
extensions.

The arguments against reading the Registry at all are:

* That it requires some extra level of privilege to read the appropriate
keys.

* That there is a startup cost to reading the Registry

* That it can be and is updated by arbitrary programs (typically during
installation) and therefore its values cannot be relied upon.


We have 3.5 proposals on the table:

1) Don't read the registry at all, ie revert issue4969 (this is what Ben
Hoyt is advocating) [noregistry]

2) Read the registry *before* reading the standard types (this is not
strongly advocated by anyone).

3) Read the registry but in a different way, mapping from extension to
mimetype rather than vice versa. (This is Dave Chambers' patch from
issue15207). [newregistry]

3a) Lookup as per (3) but only on demand. This eliminates any startup cost.
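
To make the difference behind proposal (3) concrete, here is a sketch using plain dictionaries in place of the registry. The two sample mappings are simulated for illustration (they are not read from a real machine), and the names old_style and new_style are made up.

```python
# Simulated slice of HKCR\MIME\Database\Content Type:
# content type -> value of its 'Extension' entry.
mime_database = {
    "image/jpeg": ".jpg",
    "image/pjpeg": ".jpg",
    "image/png": ".png",
    "image/x-png": ".png",
}

# Simulated slice of HKCR itself: extension key -> value of 'Content Type'.
hkcr = {".jpg": "image/jpeg", ".png": "image/png", "MIME": None}


def old_style(db):
    """Current approach: walk content types, so later duplicates overwrite."""
    types = {}
    for ctype in sorted(db):
        types[db[ctype]] = ctype
    return types


def new_style(keys):
    """Patched approach: walk extension keys and read their content type."""
    return {name: ctype for name, ctype in keys.items()
            if name.startswith(".") and ctype is not None}


print(old_style(mime_database))
print(new_style(hkcr))
```

With several content types claiming the same extension, the reverse walk ends up with whichever one is enumerated last, which is how .jpg can come out as image/pjpeg.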

I've produced three output files from a simple dump of the mimetypes database. 
For the purposes of taking this forward, we're really comparing the noregistry 
and the newregistry variants.

One key issue is what to do when the same key occurs in both sets but with a 
different value. (Examples include .avi -> video/x-msvideo vs video/avi; and 
.zip -> application/zip vs application/x-zip-compressed).
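
A sketch of how such clashes can be listed from two extension -> mimetype mappings. The function name conflicts is made up, and the sample data only repeats the two examples above, assuming the first value of each pair is the built-in one and the second comes from the registry.

```python
def conflicts(builtin, registry):
    """Map each shared extension to (builtin, registry) where the two differ."""
    shared = sorted(set(builtin) & set(registry))
    return {ext: (builtin[ext], registry[ext])
            for ext in shared if builtin[ext] != registry[ext]}


builtin = {".avi": "video/x-msvideo", ".zip": "application/zip", ".gif": "image/gif"}
registry = {".avi": "video/avi", ".zip": "application/x-zip-compressed", ".gif": "image/gif"}

for ext, (ours, theirs) in conflicts(builtin, registry).items():
    print("%s: %s vs %s" % (ext, ours, theirs))
```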

And the other key issue is whether the overheads (security, speed) of using the 
registry outweigh its usefulness in any case.

Could I ask those who would remove the registry use altogether to comment on 
the newregistry output (generating your own if it helps) to see whether it 
changes your views at all.

Either approach -- no-registry or new-registry -- is feasible and the code churn 
is about equal. I'm unsure about compatibility issues: it's hard to see anyone 
relying on the incorrect mimetypes; but it's certainly possible to see someone 
relying on the additional (correct) mimetypes.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

I think it's important to stick to established standards for
MIME types and to make sure that Python returns the same values
on all platforms using the default settings.

Apache comes with a mime.types file which includes both the
official IANA types and many common, but unregistered types:

http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/conf/mime.types?view=markup

This can be used as reference (much like we use the X.org locale
database as reference for the locale module).
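
As a sketch of what using such a reference could look like, the existing MimeTypes.readfp() method already accepts data in the mime.types format. The two sample lines below are illustrative, not copied from the Apache file; the values are the ones this thread attributes to Apache.

```python
import io
import mimetypes

# Two lines in mime.types format: a type followed by its extensions.
REFERENCE = """\
# type            extensions
image/x-icon      ico
audio/x-mpegurl   m3u
"""

db = mimetypes.MimeTypes()
db.readfp(io.StringIO(REFERENCE))

print(db.guess_type("favicon.ico"))
print(db.guess_type("playlist.m3u"))
```

Entries read this way override whatever the instance held before for those extensions.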

If an application wants to use the Windows registry settings instead,
that's fine, but it should not be the default if there's a difference
in output compared to the hard-coded list in mimetypes.

Note that this would probably require a redesign of the
internals of the mimetypes module. It currently provides only a
small subset as default mapping and then reads the full set from
one of the mime.types files it can find on the system.
Such a redesign would only be possible for Python 3.4, not
Python 2.7.

--
nosy: +lemburg




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Dave Chambers

Dave Chambers added the comment:

Enough with the bikeshedding... it's been 10 months... fix the bug.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Brian Curtin

Brian Curtin added the comment:

Just an FYI, but if it takes 10 more months to get it right, we'll do that.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-04-17 Thread Ben Hoyt

Ben Hoyt added the comment:

Okay, I'm looking at the diff between mt-tip-noregistry.txt and 
mt-tip-newregistry.txt, and I've attached a file showing the lines that are 
*different* between the two, as well as the Apache mime.types value for that 
file extension.

In most cases, noregistry gives the right mime type, and newregistry is wrong. 
In a few cases, though, the registry value is right (i.e., according to 
Apache's mime.types). I think that's a totally separate issue, and we should 
probably open a bug to update a few of the hard-coded mappings in 
mimetypes.py.

The cases where noregistry is right (according to Apache):

* .aif
* .aifc
* .aiff
* .avi
* .sh
* .wav
* .xsl
* .zip

The cases where noregistry is wrong (according to Apache):

* .bmp is hard-coded as image/x-ms-bmp, but it should be image/bmp
* .dll and .exe are hard-coded as application/octet-stream, but should be 
application/x-msdownload
* .ico is hard-coded as image/vnd.microsoft.icon but should be image/x-icon
* .m3u is hard-coded as application/vnd.apple.mpegurl but should be 
audio/x-mpegurl

None of these are standardized IANA mime types, and they're not particularly 
common for web servers to be serving, so it probably doesn't matter too much 
that the current hard-coded values are wrong. Also, I'm guessing web browsers 
will interpret the older type image/x-ms-bmp as image/bmp anyway. So maybe we 
should open another issue to fix the hard-coded types in mimetypes.py shown 
above, but again, that's another issue.

The other thing here is all the *new types* that the registry adds, such as 
.acrobatsecuritysettings. I don't see that these add much value -- just a 
bunch of types that depend on the programs you have installed. And in my mind 
at least, the behaviour of mimetypes.guess_type() should not change based on 
what programs you have installed.

In short, noregistry gives more accurate results in most cases that matter, 
and I still strongly feel we should revert to that. (The only alternative, in 
my opinion, is to switch to Dave Chambers' version of read_windows_registry(), 
but not call it by default.)

--
Added file: http://bugs.python.org/file29913/different.txt




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-01-30 Thread Ben Hoyt

Ben Hoyt added the comment:

Any update on this, Tim or other Windows developers?

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2013-01-30 Thread Brian Curtin

Brian Curtin added the comment:

I can't comment on what the change should be or how it should be done as I 
don't do anything with mimetypes, but nothing about how the patch was written 
jumps out at me for being incorrect (except I would not include ishimoto's name 
changes).

If there's a consensus that this is the appropriate change to be made, the 
patch still needs tests.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread R. David Murray

R. David Murray added the comment:

I'm personally OK with the option of removing the registry support (or making 
it optional-by-default), but I'm not going to make that call; we need a Windows 
dev opinion.

Maintaining the list of windows exceptions shouldn't be much worse than 
maintaining the list of mime types.  I can't imagine that Microsoft changes it 
all that often, given that you say they haven't bothered to update the zip type 
yet.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread Dave Chambers

Dave Chambers added the comment:

(I'm a windows dev type)
I would say that there are 2 issues with relying on the registry:
1) Default values (ie. set by Windows upon OS install) are broken and MS never 
fixes them.
2) The values can be changed at any time, by any app. Thus the values are 
unreliable.

If I were to code it from scratch today, I'd create a three-pronged approach:
a) Hardcode a list of known types (fast & reliable).
b) Have a default case where unknown types are pulled from the registry. 
Whatever value is retrieved is likely better than returning e.g. 
application/octet-stream.
c) When we find it neither in the hardcoded list nor in the registry, return a 
default value (e.g. application/octet-stream)
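
A rough sketch of the three-pronged lookup, with the registry replaced by an ordinary callable so it runs anywhere. The names HARDCODED, DEFAULT_TYPE, guess and fake_registry are made up for illustration.

```python
HARDCODED = {".png": "image/png", ".jpg": "image/jpeg"}
DEFAULT_TYPE = "application/octet-stream"


def guess(ext, registry_lookup):
    """Try the hardcoded list, then the registry, then fall back to a default."""
    ext = ext.lower()
    if ext in HARDCODED:                      # (a) fast and reliable
        return HARDCODED[ext]
    found = registry_lookup(ext)              # (b) only for unknown types
    if found is not None:
        return found
    return DEFAULT_TYPE                       # (c) nothing known


# Stand-in for a real registry query; returns None when there is no entry.
fake_registry = {".png": "image/x-png", ".dwfx": "model/vnd.dwfx+xps"}.get

print(guess(".png", fake_registry))
print(guess(".dwfx", fake_registry))
print(guess(".nope", fake_registry))
```

Because the hardcoded list is consulted first, a wrong registry value for a well-known extension never wins.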

For what it's worth, my workaround will be to have my app delete the 
HKCR\MIME\Database\Content Type\image/x-png regkey, thus forcing the original 
braindead mimetypes.py code to use HKCR\MIME\Database\Content Type\image/png

And, for what it's worth, my patch is actually faster than the current 
mimetypes.py code because I'm not doing reverse lookups. Thus any argument 
about a difference in speed is moot. Arguments about the speed of pulling 
mimetypes from registry are valid.

Another registry based approach would be to build a dictionary of mimetypes on 
demand. In this scenario, at startup, the dictionary would be empty. When 
python needs the mimetype for .png, on the 1st request it would cause a 
slow registry lookup for only that type but on all subsequent requests for 
the type it would use the fast value from the dictionary.
Given that an app will probably use only a handful of mimetypes but will use 
that same handful over and over, such a solution would have the benefits of (a) 
not using hardcoded values (thus no ongoing maintenance), (b) performing slow 
stuff only on demand, (c) optimizing repeat calls, and (d) consuming zero 
startup time.
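
The on-demand dictionary could look something like this. It is a sketch only: LazyTypes is a made-up name, and slow_lookup stands in for the real registry query, which is not shown.

```python
class LazyTypes:
    """Cache of extension -> mimetype, filled one entry at a time on demand."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._cache = {}          # empty at startup, so no startup cost

    def get(self, ext):
        if ext not in self._cache:
            # First request for this extension: do the slow lookup once.
            self._cache[ext] = self._slow_lookup(ext)
        return self._cache[ext]


calls = []


def slow_lookup(ext):
    """Stand-in for a registry read; records each call it receives."""
    calls.append(ext)
    return {".png": "image/png"}.get(ext)


types = LazyTypes(slow_lookup)
print(types.get(".png"), types.get(".png"), len(calls))
```

Misses are cached as well, so an unknown extension also costs only one slow lookup.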

I'll code this up & run some timing tests if anyone thinks it's worthwhile.

BTW, who makes the final determination as to if/when any such changes would be 
incorporated?

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread R. David Murray

R. David Murray added the comment:

I would say Brian Curtin, Tim Golden, and/or Martin von Löwis, as
they are the currently active committers with significant Windows expertise.  
Other committers may have opinions as well.  If you don't get an answer here in 
a reasonable amount of time, please post a discussion of the issue to 
python-dev (it may end up there anyway).

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread Terry J. Reedy

Terry J. Reedy added the comment:

Gabriel and Antoine: As I understand it, the claim in this issue is that the 
patch in #4969 (G. wrote, A. committed) is unsatisfactory. I think it would 
help if either of you commented.

--
nosy: +gagenellina, pitrou




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I'll leave it to a Windows expert.

--
versions:  -Python 2.7, Python 3.3




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage: commit review -> patch review
versions: +Python 2.7, Python 3.3, Python 3.4




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-10 Thread Tim Golden

Tim Golden added the comment:

Sorry; late to the party. I'll try to take a look at the patches. 
Basically I'm sympathetic to the problem (which seems quite 
straightforwardly buggish) but I want to take a look around the issue first.

TJG

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-09 Thread Ben Hoyt

Ben Hoyt added the comment:

Either way -- this needs to be reverted or fixed. It's a nasty gotcha for folks 
writing Python web services at the moment. I'm still for reverting, per my 
reasons above.

Dave Chambers, I'm not for "faster but broken" but for "faster and fixed" -- 
from what I've shown above, it's the Windows registry that's broken, so 
removing read_windows_registry() entirely would fix this (and as a bonus, be 
faster and simplify the code :-).

Per your suggestion http://bugs.python.org/issue15207#msg177092 -- I don't 
understand how mimetypes.py would know the types that aren't hardcoded.

R. David Murray, I don't understand the advantage of trying to maintain a list 
of Windows fixes. What if this list was wrong, or there was a Windows update 
which broke more mime types? Why can't we just avoid the complication and go 
back to the hardcoded types for Windows?

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-09 Thread Dave Chambers

Dave Chambers added the comment:

> removing read_windows_registry()
If you're suggesting hardcoding *ALL* the mimetypes for *ALL* OSes, I think 
that's probably the best overall solution.
No variability, as fast as can be.
The downside is that there would occasionally be an unrecognized type, thus 
there'd need to be diligence to keep the hardcoded list up to date, but overall 
I think Ben Hoyt's suggestion is best.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-09 Thread Ben Hoyt

Ben Hoyt added the comment:

Actually, I was suggesting using the hardcoded types for Windows only (i.e., 
only removing read_windows_registry). Several bugs have been opened on problems 
with the Windows registry mimetypes, but as far as I know this isn't an issue 
on Linux -- in other words, if Linux/other systems ain't broke, no need to fix 
them.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-07 Thread Dave Chambers

Dave Chambers added the comment:

Disappointing that "faster but broken" is preferable to "slower but fixed".

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-07 Thread R. David Murray

R. David Murray added the comment:

I will note that on unix the user is also free to update the machine's mime 
types registry (that's more than half the point of the mimetypes module).  
Usually this is only done by installed software...as I believe is the case on 
Windows as well.

That said, there should be a way to explicitly bypass this loading of local 
data for a program that wishes to use only the Python supplied types.  And 
indeed, this is possible: just pass an empty list of filenames to init.  This 
bypasses the windows registry lookup.  (Note that this could be better 
documented...it is not made explicit that an empty list is different from not 
specifying a list or specifying it as None, but it is).
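
A minimal sketch of the call described above. Whether an empty list really skips the registry (and the system mime.types files) depends on the Python version, so treat that as an assumption to check against the mimetypes.py shipped with your interpreter; the file names are made up for illustration.

```python
import mimetypes

# Explicitly pass an empty list instead of relying on the default (None).
# Assumption: in the versions discussed here, this skips the registry lookup.
mimetypes.init(files=[])

# With only the Python-supplied table in play, results should not vary
# from machine to machine.
print(mimetypes.guess_type("logo.png"))
print(mimetypes.guess_type("archive.zip"))
```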

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-07 Thread R. David Murray

R. David Murray added the comment:

That said, the fact that windows is just *wrong* about some mimetypes is 
definitely an issue.  We could call it a platform bug, but that would be a 
disservice to the user community.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-07 Thread Dave Chambers

Dave Chambers added the comment:

Seems to me that some hybrid would be a good solution: hardcode the known types 
(which solves the "Windows is just wrong" case), then by default look in the 
registry for those that aren't hardcoded.
Therefore the hit of additional time would only be for lesser-known types.
In any case, it's pretty bad that Python allows the wrong mimetype for PNG, 
even if it is a Windows registry issue.
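
The hybrid idea can be sketched roughly as follows. This is only an illustration: `HARDCODED_TYPES`, `guess_hybrid` and the fake registry are hypothetical names and data, not part of mimetypes.py.

```python
# Small illustrative subset of known-good types, not the real types_map.
HARDCODED_TYPES = {
    ".png": "image/png",
    ".zip": "application/zip",
}


def guess_hybrid(extension, registry_lookup):
    """Prefer the hardcoded table; consult the registry only on a miss."""
    extension = extension.lower()
    if extension in HARDCODED_TYPES:
        return HARDCODED_TYPES[extension]
    # Only lesser-known extensions pay the cost of a registry lookup.
    return registry_lookup(extension)


# Fake registry used for demonstration only.
fake_registry = {".png": "image/x-png", ".foo": "application/x-foo"}

print(guess_hybrid(".png", fake_registry.get))
print(guess_hybrid(".foo", fake_registry.get))
```

Passing the lookup in as a callable keeps the sketch testable without touching a real registry.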

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-07 Thread R. David Murray

R. David Murray added the comment:

To be consistent with the overall philosophy of the mimetypes module, it should 
instead be a list of "Windows fixes" which are applied if the broken mimetype 
is found in the Windows registry.
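
A rough sketch of what such a fix list might look like. The two entries are the bad values reported in this thread; `WINDOWS_FIXES` and `apply_windows_fixes` are hypothetical names, not an existing API.

```python
# Known-bad registry values mentioned in this thread, mapped to the
# standard type they should have been.
WINDOWS_FIXES = {
    "image/x-png": "image/png",
    "application/x-zip-compressed": "application/zip",
}


def apply_windows_fixes(registry_types):
    """Return a copy of an extension -> type map with broken values corrected."""
    return {
        ext: WINDOWS_FIXES.get(mime_type, mime_type)
        for ext, mime_type in registry_types.items()
    }


# Made-up registry contents for demonstration.
from_registry = {".png": "image/x-png", ".txt": "text/plain"}
print(apply_windows_fixes(from_registry))
```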

If you want to avoid the overhead, pass an empty list to init.  A note about 
the overhead and fixes should be added to the docs.

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-12-05 Thread Ben Hoyt

Ben Hoyt added the comment:

Ah, thanks for making this an issue of its own! As I commented over at 
Issue10551, it's a serious problem, and makes mimetypes.guess_type() unusable 
out of the box on Windows.

Yes, the fix in Issue4969 uses MIME\Database\Content Type, which is a mime 
type -> file extension mapping, *not the other way around*.

So while this patch is definitely an improvement (for the most part it doesn't 
produce wrong values!), I'm not sure it's the way to go, for a few reasons:

1) Many of the most important keys aren't in the Windows registry (in 
HKEY_CLASSES_ROOT, where this patch looks). This includes .png, .jpg, and .gif. 
All of these important types fall back to the hard-coded types_map in 
mimetypes.py anyway.

2) Some that do exist are wrong in the registry (or at the least, different 
from the built-in types_map). This includes .zip, which is 
application/x-zip-compressed (at least in my registry) but should be 
application/zip.

3) It's slowish, as it has to load about 6000 registry keys (and more as you 
install more stuff on your system), but only about 200 of those have the 
Content Type subkey. On my machine (Windows 7, 64 bit CPython) this adds over 
100ms to the startup time even on subsequent runs when cached -- and I think 
100ms is pretty significant. Issue4969's version takes about 25ms, and 
reverting this completely would of course take 0ms.

4) Users and other programs can (and sometimes do!) change the Content Type 
keys in the registry -- whereas one wants mime type mappings to be consistent 
across systems. This last point is debatable for various reasons, and I think 
the above three points should carry the day, but I include it here for 
completeness. ;-)
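
For anyone wanting to reproduce the startup cost in point 3 on their own machine, a simple timing sketch is below. It measures a single `mimetypes.init()` call with default arguments; the numbers will depend on the platform, the Python version, and what is installed.

```python
import mimetypes
import time

start = time.perf_counter()
mimetypes.init()  # default arguments: load whatever the platform provides
elapsed_ms = (time.perf_counter() - start) * 1000.0

print(f"mimetypes.init() took {elapsed_ms:.1f} ms")
```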

For these reasons, I think we should revert the fix for Issue4969 and leave 
Windows users to get the default types_map as before, which is at least 
consistent -- and for mimetypes.guess_type(), you want consistency.

--
nosy: +benhoyt




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-08-22 Thread R. David Murray

R. David Murray added the comment:

Unfortunately I don't feel qualified to review the patch itself since I'm not a 
windows user and don't currently even have a windows box to test on.  Hopefully 
one of the windows devs will take a look; the patch looks to be fairly 
straightforward to evaluate if one understands _winreg.

--
nosy: +terry.reedy
stage: needs patch -> commit review




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-08-21 Thread Yap Sok Ann

Yap Sok Ann added the comment:

On Python 2.7, I need to add this to the original diff by Dave, in the same 
try-except block:

mimetype = mimetype.encode(default_encoding) # omit in 3.x!

--
nosy: +sayap




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-07-24 Thread Atsuo Ishimoto

Atsuo Ishimoto ishim...@gembook.org added the comment:

This patch looks good to me.

I generated a patch for current trunk, with some cosmetic changes.

--
nosy: +ishimoto
Added file: http://bugs.python.org/file26507/issue15207.patch




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread Dave Chambers

New submission from Dave Chambers dlchamb...@aol.com:

The current mimetypes.read_windows_registry() enums the values under 
HKCR\MIME\Database\Content Type
However, this is the key for mimetype to extension lookups, NOT for extension 
to mimetype lookups.
As a result, when more than one MIME type is mapped to a particular extension, the 
last-found entry is used.
For example, both image/png and image/x-png map to the .png file 
extension.
Unfortunately, what happens is this code finds image/png, then later finds 
image/x-png and this steals the .png extension.
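
The effect can be reproduced without a registry. A minimal sketch, assuming the key behaves like a list of (MIME type, extension) pairs enumerated in order; the pairs and the helper name `invert_last_wins` are illustrative and not taken from mimetypes.py.

```python
# Hypothetical stand-in for HKCR\MIME\Database\Content Type:
# each entry maps a MIME type to the extension registered for it.
content_type_db = [
    ("image/png", ".png"),
    ("image/x-png", ".png"),
]


def invert_last_wins(pairs):
    """Build an extension -> MIME type map by inverting (type, ext) pairs."""
    ext_to_type = {}
    for mime_type, ext in pairs:
        # A later entry silently overwrites an earlier one for the same extension.
        ext_to_type[ext] = mime_type
    return ext_to_type


types_map = invert_last_wins(content_type_db)
print(types_map[".png"])
```

Because the second pair is processed last, `.png` ends up mapped to `image/x-png`.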


The solution is to use the correct regkey, which is the HKCR root.
This is the correct location for extension-to-mimetype lookups.
What we should do is enum the HKCR root, find all subkeys that start with a dot 
(i.e. file extensions), then inspect those for a 'Content Type' value.
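
The approach described above might look roughly like this. It is a sketch, not the attached patch: it uses the Python 3 module name `winreg` (`_winreg` on 2.7), the function name `read_extension_types` is made up, and it simply returns an empty dict on non-Windows platforms.

```python
import sys


def read_extension_types():
    """Return {'.ext': 'type/subtype'} read from HKEY_CLASSES_ROOT.

    Returns an empty dict when not running on Windows.
    """
    if sys.platform != "win32":
        return {}
    import winreg  # named _winreg on Python 2

    mapping = {}
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, "") as hkcr:
        subkey_count = winreg.QueryInfoKey(hkcr)[0]
        for index in range(subkey_count):
            try:
                name = winreg.EnumKey(hkcr, index)
                if not name.startswith("."):
                    continue  # only file-extension keys are of interest
                with winreg.OpenKey(hkcr, name) as ext_key:
                    value, value_type = winreg.QueryValueEx(ext_key, "Content Type")
            except OSError:
                continue  # no 'Content Type' value, or key not readable
            if value_type == winreg.REG_SZ:
                mapping[name.lower()] = value
    return mapping


print(len(read_extension_types()), "extensions with a Content Type value")
```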


The attached ZIP contains:
mimetype_flaw_demo.py  - this demonstrates the error (due to wrong regkey) and 
my fix (uses the correct regkey)
mimetypes_fixed.py   -   My suggested fix to the standard mimetypes.py module.

--
components: Windows
files: mimetype_flaw_demo.zip
messages: 164167
nosy: dlchambers
priority: normal
severity: normal
status: open
title: mimetypes.read_windows_registry() uses the wrong regkey, creates wrong 
mappings
type: behavior
versions: Python 2.7
Added file: http://bugs.python.org/file26180/mimetype_flaw_demo.zip




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Thanks for working on this.  Could you please post the fix as a patch file?  If 
you don't have mercurial, you can generate the diff on windows using the python 
diff module (scripts/diff.py -u yourfile origfile).  Actually, I'm not sure 
exactly where diff is in the windows install, but I know it is there.

Do you know if image/x-png and image/png are included in the registry on all 
windows versions?  If so we could use that key for a unit test.

--
components: +email
nosy: +barry, r.david.murray
stage:  -> needs patch
versions: +Python 3.2, Python 3.3




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread Dave Chambers

Dave Chambers dlchamb...@aol.com added the comment:

My first diff file... I hope I did it right :)

--
keywords: +patch
Added file: http://bugs.python.org/file26181/mimetypes.py.diff




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread Dave Chambers

Dave Chambers dlchamb...@aol.com added the comment:

I added a diff file to the bug.
Dunno if that's the same as a patch file, or how to create a patchfile if it's 
not.

> Do you know if image/x-png and image/png are included in the registry on all 
> windows versions?

I think your question is reversed, in the same way that the code was reversed.
You're not looking for image/png and/or image/x-png. You're looking for .png in 
order to retrieve its mimetype (aka Content Type).
While nothing is 100% certain on Windows :), I'm quite confident that every 
copy will have an HKCR\.png regkey, and that regkey will have a Content Type 
value, and that value's setting will be the appropriate mimetype, which I'd 
expect to be image/png.
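
A single-extension lookup of this kind, which could also serve as the basis for a unit test, might be sketched as below. The helper name `registry_content_type` is hypothetical, and the function returns None when not on Windows or when the value is absent.

```python
import sys


def registry_content_type(extension):
    """Return the 'Content Type' value stored under HKCR for an extension, or None."""
    if sys.platform != "win32":
        return None
    import winreg  # named _winreg on Python 2

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, extension) as key:
            value, _value_type = winreg.QueryValueEx(key, "Content Type")
    except OSError:
        return None  # key or value missing
    return value


print(registry_content_type(".png"))
```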

I was kinda surprised to find this bug as it's so obvious.
I started chasing it because Chrome kept complaining that pngs were being 
served as image/x-png (by CherryPy).
There are other bugs (eg: 15199, 10551) that my patch should fix.

-Dave

--




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

Well, I had no involvement in the windows registry reading stuff, and it is 
relatively new.  And, as issue 10551 indicates, a bit controversial.  (issue 
15199 is a different bug, in python's own internal table).

Can you run that diff again and use the '-u' flag?  The -u (unified) format 
is the one we are used to working with.  The one you posted still lets us read 
the changes, though, which is very helpful.

--
nosy: +brian.curtin, tim.golden




[issue15207] mimetypes.read_windows_registry() uses the wrong regkey, creates wrong mappings

2012-06-27 Thread Dave Chambers

Changes by Dave Chambers dlchamb...@aol.com:


Added file: http://bugs.python.org/file26185/mimetypes.py.diff.u
