[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2017-03-31 Thread Donald Stufft

Changes by Donald Stufft :


--
pull_requests: +

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread STINNER Victor

STINNER Victor added the comment:

> path_converter.patch LGTM for 3.6 (and 3.7, 3.8),

Ok, I pushed my simple fix.

Feel free to modify the code if you see a better way to encode paths on Windows 
;-) But it's a much larger project, and I'm not really interested to jump again 
in this silly deprecated APIs :-) It seems like the status quo is that 
Py_UNICODE is going to stay longer than expected and since it "just works", 
nobody really cares :-)

I close this issue since the initial bug (memory leak) is now fixed.

--
resolution:  -> fixed
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 6232e610e310 by Victor Stinner in branch '3.6':
Fix memory leak in path_converter()
https://hg.python.org/cpython/rev/6232e610e310

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy Storchaka added the comment:
> Could you please add a comment that PyUnicode_AsUnicodeAndSize is deprecated, 
> but used because it is the simplest and the most efficient way for now?

See old issues #22271 and #22324.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

path_converter.patch LGTM for 3.6 (and 3.7, 3.8), but we should find better 
solution for future versions. Could you please add a comment that 
PyUnicode_AsUnicodeAndSize is deprecated, but used because it is the simplest 
and the most efficient way for now?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy Storchaka added the comment:
> The :c:type:`Py_UNICODE` has been deprecated by :pep:`393` and will be
> removed in Python 4. All functions using this type are deprecated:

Right... I tried to deprecate and remove all functions using
Py_UNICODE but it's hard to change all this code. I gave up :-)

>   :c:func:`PyUnicode_AsUnicodeAndSize`: use 
> :c:func:`PyUnicode_AsWideCharString`

Sadly, it's not exactly the same: PyUnicode_AsWideCharString returns a
new fresh buffer at each call. I'm not sure that it caches the result
neither.

Victor

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-19 Thread STINNER Victor

STINNER Victor added the comment:

Steve Dower added the comment:
> It's not clear to me that Py_UNICODE is guaranteed to be wchar_t for all 
> time, that's all. If it is, go ahead.

Right now, it's the case and path_converter() leaks memory, so I
proposed a simple and obvious fix :-)

On Windows, it makes sense to continue to use the UTF-16 encoded
string cached in Unicode objects, because this conversion is common,
and it's convenient to not have to release the memory explicitly. The
side effect is that we may waste memory in some cases.

> Otherwise the path_t object has the ability to clean up after itself, so 
> perhaps it should be used here?

Maybe, but I'm not interested to write a more complex patch for
Windows :-) Try to just call PyMem_Free(path->wide) (do nothing it's
NULL).

The advantage of the old code (and my patch) is to only convert a
filename once to UTF-16 and then use the cached result.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Deprecated functions and types of the C API
---

The :c:type:`Py_UNICODE` has been deprecated by :pep:`393` and will be
removed in Python 4. All functions using this type are deprecated:

Unicode functions and methods using :c:type:`Py_UNICODE` and
:c:type:`Py_UNICODE*` types:

* :c:macro:`PyUnicode_AS_UNICODE`, :c:func:`PyUnicode_AsUnicode`,
  :c:func:`PyUnicode_AsUnicodeAndSize`: use :c:func:`PyUnicode_AsWideCharString`

(From Doc/whatsnew/3.3.rst)

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-18 Thread Steve Dower

Steve Dower added the comment:

It's not clear to me that Py_UNICODE is guaranteed to be wchar_t for all time, 
that's all. If it is, go ahead. Otherwise the path_t object has the ability to 
clean up after itself, so perhaps it should be used here?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-18 Thread STINNER Victor

STINNER Victor added the comment:

I didn't push the fix myself, because Steve Dower maybe made the change for a 
specific reason?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28200] Windows: path_converter() leaks memory for Unicode filenames

2016-09-18 Thread STINNER Victor

New submission from STINNER Victor:

Memory leak spotted by the issue #28195: path_converter() calls 
PyUnicode_AsWideCharString() which allocates a new buffer at each call, but 
this buffer is never released.

On Python 3.5, PyUnicode_AsWideCharString() was used which handles internally 
the memory buffer and so release the memory later.

Attached patch fixes the regression introduced in Python 3.6 beta 1 by the 
change e20c7d8a8187 ("Issue #27781: Change file system encoding on Windows to 
UTF-8 (PEP 529)").

--
files: path_converter.patch
keywords: patch
messages: 276924
nosy: haypo, steve.dower
priority: normal
severity: normal
status: open
title: Windows: path_converter() leaks memory for Unicode filenames
versions: Python 3.6, Python 3.7
Added file: http://bugs.python.org/file44744/path_converter.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com