[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-12 Thread Dong-hee Na


Dong-hee Na  added the comment:

@ned.deily @maxking
I am closing this issue since all PRs have been merged.
Thanks, everyone, for your work on this issue :)

Have a warm and happy holiday and a hopeful new year.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-01 Thread miss-islington


miss-islington  added the comment:


New changeset 4f1eaf028058cc357030dfaa5e611c90662539f0 by Miss Islington (bot) 
in branch '3.8':
bpo-38449: Add URL delimiters test cases (GH-16729)
https://github.com/python/cpython/commit/4f1eaf028058cc357030dfaa5e611c90662539f0


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-01 Thread miss-islington


miss-islington  added the comment:


New changeset 926eabb6b46106e677d5e1ea25b7bab918da4110 by Miss Islington (bot) 
in branch '3.7':
bpo-38449: Add URL delimiters test cases (GH-16729)
https://github.com/python/cpython/commit/926eabb6b46106e677d5e1ea25b7bab918da4110


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-01 Thread miss-islington


Change by miss-islington :


--
pull_requests: +16910
pull_request: https://github.com/python/cpython/pull/17431




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-01 Thread miss-islington


Change by miss-islington :


--
pull_requests: +16911
pull_request: https://github.com/python/cpython/pull/17432




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-12-01 Thread Abhilash Raj


Abhilash Raj  added the comment:


New changeset 2fe4c48917c2d1b40cf063c6ed22ae2e71f4cb62 by Abhilash Raj 
(Dong-hee Na) in branch 'master':
bpo-38449: Add URL delimiters test cases (#16729)
https://github.com/python/cpython/commit/2fe4c48917c2d1b40cf063c6ed22ae2e71f4cb62


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-15 Thread Ned Deily


Ned Deily  added the comment:

(fix also released in 3.7.5)

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-15 Thread Ned Deily


Ned Deily  added the comment:


New changeset 2a405598bbccbc42710dc5ecf3d44c8de4c16582 by Ned Deily (Abhilash 
Raj) in branch '3.7':
[3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper 
parsing of URLs (GH-15685)" (GH-16724) (GH-16727)
https://github.com/python/cpython/commit/2a405598bbccbc42710dc5ecf3d44c8de4c16582


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-14 Thread Łukasz Langa

Łukasz Langa  added the comment:

(3.8.0 is released with this fix)

--
priority: release blocker -> normal




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-14 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Ned Deily


Ned Deily  added the comment:

(On second thought, I'll leave this open as a release blocker until we've 
cherry-picked the fixes for 3.8.0 final and 3.7.5 final.)

--
priority: normal -> release blocker




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Ned Deily


Ned Deily  added the comment:

Thanks everyone for the quick action on this!

--
priority: release blocker -> normal




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread miss-islington


miss-islington  added the comment:


New changeset 164bee296ab1f87cc05566b39ee8fb9fb64b3e5a by Miss Islington (bot) 
(Abhilash Raj) in branch '3.7':
[3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper 
parsing of URLs (GH-15685)" (GH-16724) (GH-16727)
https://github.com/python/cpython/commit/164bee296ab1f87cc05566b39ee8fb9fb64b3e5a


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Dong-hee Na


Change by Dong-hee Na :


--
pull_requests: +16311
pull_request: https://github.com/python/cpython/pull/16729




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Abhilash Raj


Abhilash Raj  added the comment:


New changeset 5a638a805503131f4a9cc2bbc5944611295c1500 by Abhilash Raj in 
branch '3.8':
[3.8] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper 
parsing of URLs" (GH-16724) (GH-16728)
https://github.com/python/cpython/commit/5a638a805503131f4a9cc2bbc5944611295c1500


--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Dong-hee Na


Dong-hee Na  added the comment:

> Yes, we should add a test case definitely, do you want to work on a PR?

Sure, I want to finalize this issue :)

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Abhilash Raj


Abhilash Raj  added the comment:

corona10: That's okay, it happens. I missed it too. There was really no way to 
foresee all the use cases, which is why we have beta and rc periods to catch 
bugs.

Yes, we should add a test case definitely, do you want to work on a PR?

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Abhilash Raj


Change by Abhilash Raj :


--
pull_requests: +16309
pull_request: https://github.com/python/cpython/pull/16728




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Abhilash Raj


Change by Abhilash Raj :


--
pull_requests: +16308
pull_request: https://github.com/python/cpython/pull/16727




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Dong-hee Na


Dong-hee Na  added the comment:

And I apologize for my patch, which caused this regression.

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-12 Thread Dong-hee Na


Dong-hee Na  added the comment:

I'd like to suggest adding a unit test for the reported case,
so that we can detect future regressions :)
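
For illustration, a regression test along these lines might look like the sketch below. This is not the test that was eventually merged in GH-16729; the class and method names are hypothetical, and the expected values are taken from the behaviour described in the original report.

```python
import mimetypes
import unittest


class SemicolonFilenameTest(unittest.TestCase):
    """Hypothetical regression test for filenames containing ';'."""

    def setUp(self):
        # Same construction as in the original report.
        self.db = mimetypes.MimeTypes(strict=False)

    def test_plain_filename(self):
        # Baseline: a name without URL delimiters.
        self.assertEqual(self.db.guess_type("1.tar.gz"),
                         ("application/x-tar", "gzip"))

    def test_leading_semicolon(self):
        # The reported case: ';' must not hide the extension.
        self.assertEqual(self.db.guess_type(";1.tar.gz"),
                         ("application/x-tar", "gzip"))
```

Saved to a file, it can be run with "python -m unittest <file>".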

--
nosy: +corona10




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread miss-islington


Change by miss-islington :


--
pull_requests: +16304
pull_request: https://github.com/python/cpython/pull/16725




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread miss-islington


miss-islington  added the comment:


New changeset 19a3d873005e5730eeabdc394c961e93f2ec02f0 by Miss Islington (bot) 
(Abhilash Raj) in branch 'master':
bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow proper parsing 
of URLs (GH-15522)" (GH-16724)
https://github.com/python/cpython/commit/19a3d873005e5730eeabdc394c961e93f2ec02f0


--
nosy: +miss-islington




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Abhilash Raj


Change by Abhilash Raj :


--
keywords: +patch
pull_requests: +16302
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/16724




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Abhilash Raj


Abhilash Raj  added the comment:

Yeah, I agree. I'll submit a PR for reverting the commits.

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Ned Deily


Ned Deily  added the comment:

Thanks for looking into this, @maxking. With both 3.8.0 final and 3.7.5 final 
scheduled for just a few days away, I wonder if the best thing to do at this 
point is to revert them and work on a more robust fix targeted for the next 
maintenance releases since the original issue was not identified as being a 
security issue or otherwise critical.

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Abhilash Raj


Abhilash Raj  added the comment:

The bug is interesting due to some of the implementation details of 
"guess_type". The documentation says that it can parse either a URL or a 
filename.

Switching from urllib.parse._splittype to urllib.parse.urlparse changed what a 
valid "path" is. _splittype doesn't care about the rest of the URL except the 
scheme, but urlparse does. Previously, we used to split things like:

   >>> print(urllib.parse._splittype(';1.tar.gz'))
   (None, ';1.tar.gz')

Then, we'd just treat the 2nd part as a filesystem path, which would rightfully 
guess the extension as .tar.gz

However, after switching to parsing via urllib.parse.urlparse, we get:

>>> print(urllib.parse.urlparse(';1.tar.gz'))
ParseResult(scheme='', netloc='', path='', params='1.tar.gz', query='', fragment='')

We then take the ".path" attribute for further processing; since it is empty, 
guess_type returns (None, None).

The format of all these parts is:

scheme://netloc/path;parameters?query#fragment

A simple fix would be to just merge path, parameters, query and fragment 
together (with appropriate delimiters) and then proceed with further processing. 
That would fix parsing of filesystem paths but would break (again) parsing of 
URLs like:

>>> mimetypes.guess_type('http://example.com/index.html;1.tar.gz')
('application/x-tar', 'gzip')

It should return 'text/html' as the type, since this is a URL and everything 
after the ';' should not be used to determine the mimetype. But, if there is no 
scheme provided, we should treat it as a filesystem path and in that case 
'application/x-tar' is the right type.

I hope I am not confusing everyone here. 

The right fix IMO would be to make "guess_type" not treat URLs and filesystem 
paths alike.
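
As a rough sketch of that idea (not the fix that landed in CPython), a wrapper could strip URL delimiters only when the input looks like a URL. The helper name guess_type_scheme_aware is hypothetical, and the rule "has both a scheme and a netloc" is a deliberately simplistic assumption that ignores cases such as file: or data: URLs.

```python
import mimetypes
from urllib.parse import urlparse


def guess_type_scheme_aware(name):
    """Hypothetical helper: only apply URL parsing when a scheme is present."""
    parts = urlparse(name)
    if parts.scheme and parts.netloc:
        # Looks like a URL: urlparse has already split off ';parameters',
        # '?query' and '#fragment', so only the path decides the type.
        return mimetypes.guess_type(parts.path)
    # No scheme: treat the whole string as a filesystem path, so characters
    # such as ';' remain part of the filename.
    return mimetypes.guess_type(name)


print(guess_type_scheme_aware(";1.tar.gz"))
print(guess_type_scheme_aware("http://example.com/index.html;1.tar.gz"))
```

With this split, the filename case keeps its '.tar.gz' extension, while the URL case is judged by 'index.html' alone.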

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Abhilash Raj


Abhilash Raj  added the comment:

I am looking into the issue.

--
nosy: +maxking




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Ned Deily


Ned Deily  added the comment:

Marking as regression release blocker for 3.7.5 final and 3.8.0 final.

--
keywords: +3.7regression
nosy: +lukasz.langa, martin.panter, ned.deily, vstinner
priority: normal -> release blocker




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Kyle Meyer


Kyle Meyer  added the comment:

I've bisected the issue with the following script:

#!/bin/sh
make -j3 || exit 125
./python <<\EOF || exit 1
import sys
import mimetypes
res = mimetypes.MimeTypes(strict=False).guess_type(";1.tar.gz")
if res[0] is None:
    sys.exit(1)
EOF

That points to 87bd2071c7 (bpo-22347: Update mimetypes.guess_type to allow 
proper parsing of URLs (GH-15522), 2019-09-05).  That commit was included in 
3.7.5rc1 when it was cherry picked by 8873bff287.

--
nosy: +kyleam




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Yaroslav Halchenko


Yaroslav Halchenko  added the comment:

FWIW, our more complete test filename is 

# python3 -c 'import patoolib.util as ut; print(ut.guess_mime(r" \"\`;b 
|.tar.gz"))'
(None, None)

which works fine with older versions

--




[issue38449] regression - mimetypes guess_type is confused by ; in the filename

2019-10-11 Thread Yaroslav Halchenko


New submission from Yaroslav Halchenko :

Our tests in DataLad started to fail while building on Debian with Python 
3.7.5rc1, whereas they passed just fine previously with 3.7.3rc1. Analysis 
narrowed it down to mimetypes:

$> ./python3.9 -c 'import mimetypes; mimedb = mimetypes.MimeTypes(strict=False); print(mimedb.guess_type(";1.tar.gz"))'
(None, None)

$> ./python3.9 -c 'import mimetypes; mimedb = mimetypes.MimeTypes(strict=False); print(mimedb.guess_type("1.tar.gz"))'
('application/x-tar', 'gzip')

$> git describe
v3.8.0b1-1174-g2b7dc40b2af


Ref: 

- original issue in DataLad: https://github.com/datalad/datalad/issues/3769

--
components: Library (Lib)
messages: 354455
nosy: Yaroslav.Halchenko
priority: normal
severity: normal
status: open
title: regression - mimetypes guess_type is confused by ; in the filename
type: behavior
versions: Python 3.7, Python 3.8, Python 3.9
