Re: [Python-Dev] Bug bz2.BZ2File(...).seek(0,2) + patch

2005-12-08 Thread Victor Stinner
On Friday 25 November 2005 15:54, Aahz wrote: On Fri, Nov 25, 2005, Victor STINNER wrote: I found a bug in the bz2 Python module. Example: Details and *patch* at: http://sourceforge.net/tracker/index.php?func=detail&aid=1366000&group_id=5470&atid=105470 Thanks! Particularly

Re: [Python-Dev] Fixing incorrect indentations in C files (Decoder functions accept str in py3k)

2009-01-08 Thread Victor Stinner
patches :-/ So if you choose to change the indentation, it would be nice to also run sed s/[ \t]\+$//g -i/ ;-) -- Victor Stinner aka haypo http://www.haypocalc.com/blog/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman
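For readers who want to apply the same cleanup without sed, here is a minimal Python sketch of the idea behind the expression quoted above (strip spaces and tabs at the end of every line). The function name and the sample input are illustrative assumptions, not part of the original message.

```python
import re

# Same pattern as the sed expression: one or more spaces or tabs
# immediately before the end of a line.
TRAILING_WS = re.compile(r"[ \t]+$", re.MULTILINE)


def strip_trailing_whitespace(text):
    """Return text with trailing spaces and tabs removed from every line."""
    return TRAILING_WS.sub("", text)


sample = "int x;  \n\tint y;\t\n"
cleaned = strip_trailing_whitespace(sample)
print(repr(cleaned))
```

Leading indentation is left alone, so the sketch can run before or after a re-indentation pass.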

Re: [Python-Dev] Add Py_off_t and related APIs?

2009-01-13 Thread Victor Stinner
, int prot, int flags, int fd, off_t offset); mmapmodule.c uses Py_ssize_t type and _GetMapSize() private function to convert the long integer to the Py_ssize_t type. -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Add Py_off_t and related APIs?

2009-01-13 Thread Victor Stinner
On Tuesday 13 January 2009 22:47:52, Victor Stinner wrote: On Tuesday 13 January 2009 21:33:28, Martin v. Löwis wrote: I would do this through a converter function (O), but yes, making it private to the io library sounds about right. Who else would benefit from

Re: [Python-Dev] socket.create_connection slow

2009-01-14 Thread Victor Stinner
-loopback. You should check why the connect() to IPv6 takes so long to raise an error. About the test: since SocketServer address family is constant (IPv4), you can force IPv4 for the client. -- Victor Stinner aka haypo http://www.haypocalc.com/blog

Re: [Python-Dev] Problems with unicode_literals

2009-01-17 Thread Victor Stinner
help strings -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Victor Stinner
Benjamin Peterson wrote: There are also several IO bugs that should be fixed before it becomes official like #5006. I looked at this one, but I discovered another bug with f.tell(): it's now issue #5008. That issue is now closed, so I will look at #5006 again. See also #5016

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Victor Stinner
On Wednesday 28 January 2009 11:55:16, Antoine Pitrou wrote: 2.x has no encoding costs, which explains why it's so much faster. Why not test io.open() or codecs.open(), which create unicode strings? -- Victor Stinner aka haypo http://www.haypocalc.com/blog

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Victor Stinner
from io.open() and codecs.open() in 2.x either. I use codecs.open() in my programs and so I'm interested in the benchmark on this function ;-) But if I understand correctly, Python (3.1 ?) will be faster (or much faster) to read/write files in unicode, and that's great news ;-) -- Victor

Re: [Python-Dev] Tracker archeology

2009-02-12 Thread Victor Stinner
On Thursday 12 February 2009 14:10:32, you wrote: Victor Stinner wrote: I like everything related to Unicode and the separation of byte and character strings in Python3 :-) That's a big one. But Ezio Melotti already asked for Unicode, so I have some 75 issues selected and ready

Re: [Python-Dev] To 3.0.2 or not to 3.0.2?

2009-02-17 Thread Victor Stinner
'cmp' is not defined -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Reviving restricted mode?

2009-02-23 Thread Victor Stinner
on Python and fix all bugs :-) I wrote a short document in Python's wiki on the different security projects: http://wiki.python.org/moin/Security -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Challenge: Please break this! (was: Reviving restricted mode)

2009-02-23 Thread Victor Stinner
0wn3d w00t Dinner and drinks on me for an evening -- when you are next in London or I am in your town -- to the first person who manages to break safelite.py and write to the filesystem. Cool. It's a good reason to go to Pycon UK this year ;-) -- Victor Stinner aka haypo http://www.haypocalc.com

Re: [Python-Dev] Challenge: Please break this! (was: Reviving restricted mode)

2009-02-23 Thread Victor Stinner
On Monday 23 February 2009 22:36:47, you wrote: reload(__builtins__) (...) Tav should have made another stipulation: the attack must not be trivial to fix. Why not? Any hole is enough to break a jail. The cracker doesn't care if it's trivial to fix or not :-p -- Victor Stinner aka

Re: [Python-Dev] Challenge: Please break this! (was: Reviving restricted mode)

2009-02-23 Thread Victor Stinner
On Tuesday 24 February 2009 00:51:25, Farshid Lashkari wrote: It seems like some code in safelite passes a file object to isinstance. By overriding the builtin isinstance function I can get access to the original file object and create a new one. Wow, excellent idea! -- Victor

Re: [Python-Dev] Challenge: Please break this! [Now with blog post]

2009-02-23 Thread Victor Stinner
) changed one more time! The check on mode is now: if type(mode) is not type(''): raise TypeError("mode has to be a string.") Could you keep all versions of safelite.py? (e.g. rename each new version as safelite2.py, safelite3.py, etc.) -- Victor Stinner aka haypo http

Re: [Python-Dev] Challenge: Please break this! [Now with blog post]

2009-02-23 Thread Victor Stinner
.__getattribute__('func_closure')[0]) fileobj.write('twice!\n') f.close() -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Challenge: Please break this! [Now with blog post]

2009-02-23 Thread Victor Stinner
victor f.tell.__getattribute__('func_closure') tak But, have you actually run that code? Ooops, I modified my local copy of safelite.py to disable func_xxx protections :-p With the latest version of safelite.py, my exploit doesn't work anymore. Sorry. -- Victor Stinner aka haypo http

[Python-Dev] Python jail: whitelist vs blacklist

2009-02-24 Thread Victor Stinner
;-) and make sure that a proxy can not be modified by itself or read private attributes My approach is maybe naive and impossible to implement :-) -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Challenge: Please break this! [Now with blog post]

2009-02-24 Thread Victor Stinner
environment with __subclasses__, f_code, etc. ... -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] What type of object mmap.read_byte should return on py3k?

2009-02-28 Thread Victor Stinner
was created with ACCESS_READ, then writing to it will throw a TypeError exception. -- Victor Stinner aka haypo http://www.haypocalc.com/blog/
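The behaviour described in this snippet can be checked with a short sketch. It assumes a current CPython 3, where `mmap` objects support the `with` statement; the helper name is hypothetical and a temporary file stands in for whatever file the original discussion mapped.

```python
import mmap
import tempfile


def write_to_readonly_map():
    """Try to write through a read-only mapping; return the exception name."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello")
        f.flush()
        # ACCESS_READ requests a read-only mapping of the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            try:
                m.write(b"x")
            except TypeError as exc:
                return type(exc).__name__
    return None


result = write_to_readonly_map()
print(result)
```

Reading from the same mapping (for example `m[:5]`) still works; only mutation is rejected.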

Re: [Python-Dev] What type of object mmap.read_byte should return on py3k?

2009-02-28 Thread Victor Stinner
... with Py_BuildValue and PyArg_Parse... because a function may have other arguments or specify the function name with ...:name: http://bugs.python.org/issue5391 It looks like msvcrt.putch(char) and msvcrt.ungetch(char) use the wrong types. -- Victor Stinner aka haypo http://www.haypocalc.com/blog

Re: [Python-Dev] 3.1 performance

2009-03-08 Thread Victor Stinner
3.1alpha1 because 2.6.1 only includes pybench 2.0 -- Victor Stinner aka haypo http://www.haypocalc.com/blog/ Computer: * Ubuntu 7.10 * Pentium(R) 4 CPU 3.00GHz (32 bits) * 2 GB of RAM

Re: [Python-Dev] 3.1 performance

2009-03-08 Thread Victor Stinner
On Sunday 08 March 2009 13:20:34, Antoine Pitrou wrote: Hi, Victor Stinner victor.stinner at haypocalc.com writes: Summary (minimum total) on 32 bits CPU: * Python 2.6.1: 8762 ms * Python 3.0.1: 8977 ms * Python 3.1a1: 9228 ms (slower than 3.0) Have you compiled
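To put the three totals quoted above in proportion, the following sketch computes the relative slowdown against the 2.6.1 baseline. Only the three timings come from the message; the helper name and the choice of baseline are assumptions made for illustration.

```python
# Minimum total pybench times quoted in the message, in milliseconds.
timings_ms = {
    "2.6.1": 8762,
    "3.0.1": 8977,
    "3.1a1": 9228,
}


def slowdown_percent(baseline_ms, other_ms):
    """Relative slowdown of other_ms compared with baseline_ms, in percent."""
    return (other_ms - baseline_ms) / baseline_ms * 100.0


baseline = timings_ms["2.6.1"]
for version in ("3.0.1", "3.1a1"):
    pct = slowdown_percent(baseline, timings_ms[version])
    print("Python %s is %.1f%% slower than 2.6.1" % (version, pct))
```

On these numbers the differences stay within a few percent, which is worth keeping in mind when reading the rest of the thread.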

[Python-Dev] py3k: accept unicode for 'c' and byte for 'C' in getarg?

2009-03-17 Thread Victor Stinner
character (code in [0; INTMAX]) Note: Why not using Py_UNICODE instead of int? Usage of C format: datetime.datetime.isoformat(sep) array.array(type, data): type Usage of c format: msvcrt.putch(char) msvcrt.ungetch(char) mmap object.write_byte(char) -- Victor Stinner aka haypo http

Re: [Python-Dev] py3k: accept unicode for 'c' and byte for 'C' in getarg?

2009-03-17 Thread Victor Stinner
On Tuesday 17 March 2009 13:52:16, Victor Stinner wrote: I realised with the issue #3446 that getarg('c') (get a byte) accepts not only a byte string of 1 byte, but also a unicode string of 1 character (if the character code is in [0; 255]). I don't think that it's a good idea

Re: [Python-Dev] py3k: accept unicode for 'c' and byte for 'C' in getarg?

2009-03-17 Thread Victor Stinner
for putch() and ungetch(). See also http://bugs.python.org/issue5410 -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] SoC: Optimize Python3

2009-03-18 Thread Victor Stinner
;-) There are other important features to optimize like: - unicode string (str in python3) - I/O: io-c in py3k branch is already much better, but I'm sure that we can do better ;-) - etc. -- Victor Stinner aka haypo http://www.haypocalc.com/blog

Re: [Python-Dev] SoC: security

2009-03-18 Thread Victor Stinner
-- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Packaging Survey first results + Summit schedule

2009-03-26 Thread Victor Stinner
-- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Modules of plat-* directories

2011-10-24 Thread Victor Stinner
There are open issues related to plat-XXX. On Monday 24 October 2011 00:03:42, Martin v. Löwis wrote: no, we make no changes to them unless a user actually requests a change Matthias Klose asked for socket SIO* constants in September 2006 (5 years ago). http://bugs.python.org/issue1565071 I

[Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-24 Thread Victor Stinner
Hi, I propose to raise Unicode errors if a filename cannot be decoded on Windows, instead of creating bogus filenames with question marks. Because this change is incompatible with Python 3.2, even if such filenames are unusable and I consider the problem as a (Python?) bug, I would like
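The difference between a lossy conversion that produces question marks and a strict one that raises can be illustrated with Python's own codecs. This is only a sketch: `cp1252` is used as a stand-in for a Windows ANSI code page, the filename is made up, and the real proposal concerns the Windows API calls rather than `str.encode`.

```python
# Hypothetical filename: U+4E2D has no mapping in cp1252.
name = "h\u00e9llo\u4e2d.txt"

# Lossy conversion: the unencodable character silently becomes "?".
lossy = name.encode("cp1252", "replace")


def strict_encode(text, encoding="cp1252"):
    """Encode with the default 'strict' handler, returning the error if any."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


error = strict_encode(name)
print(lossy)
print(type(error).__name__, error.start, error.end)
```

With the lossy result, the program ends up holding a name that does not exist on disk; with the strict one, the failure is reported at the point where information would be lost.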

Re: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 13:20:12, you wrote: Victor Stinner writes: I propose to raise Unicode errors if a filename cannot be decoded on Windows, instead of creating bogus filenames with question marks. By bogus you mean sometimes (?) invalid and the OS will refuse to use

Re: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 09:09:56, you wrote: I propose to raise Unicode errors if a filename cannot be decoded on Windows, instead of creating bogus filenames with question marks. Can you please elaborate what APIs you are talking about exactly? Basically, all functions

Re: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 09:09:56, you wrote: If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on this proposal. People that explicitly use bytes for file names deserve to get whatever exact platform semantics the platform has to offer. This is true on Unix, and it is

Re: [Python-Dev] memcmp performance

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 10:44:16, Stefan Behnel wrote: Richard Saunders, 25.10.2011 01:17: -On [20111024 09:22], Stefan Behnel wrote: I agree. Given that the analysis shows that the libc memcmp() is particularly fast on many Linux systems, it should be up to the Python package

Re: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 00:57:42, Victor Stinner wrote: I propose to raise Unicode errors if a filename cannot be decoded on Windows, instead of creating bogus filenames with question marks. Because this change is incompatible with Python 3.2, even if such filenames are unusable and I

Re: [Python-Dev] [Python-checkins] cpython: Issue #13226: Add RTLD_xxx constants to the os module. These constants can by

2011-10-25 Thread Victor Stinner
On Tuesday 25 October 2011 14:50:44, Petri Lehtinen wrote: Hi, victor.stinner wrote: http://hg.python.org/cpython/rev/c75427c0da06 changeset: 73127:c75427c0da06 user:Victor Stinner victor.stin...@haypocalc.com date:Tue Oct 25 13:34:04 2011 +0200 summary

Re: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

2011-10-26 Thread Victor Stinner
On Tuesday 25 October 2011 10:31:56, Victor Stinner wrote: Basically, all functions processing filenames, so most functions of posixmodule.c. Some examples: - os.listdir(): FindFirstFileA, FindNextFileA, FindCloseA - os.lstat(): CreateFileA - os.getcwdb(): getcwd() - os.mkdir

[Python-Dev] Emit a BytesWarning on bytes filenames on Windows

2011-10-28 Thread Victor Stinner
Hi, I am no longer convinced that raising a UnicodeEncodeError on unencodable characters is the right fix for the issue #13247. The problem with this solution is that you have to wait until a user gets a UnicodeEncodeError. I have yet another proposal: emit a warning when a bytes filename is

Re: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows

2011-10-30 Thread Victor Stinner
On 30/10/2011 09:00, Martin v. Löwis wrote: As quoted above, deprecation of the bytes version of the API sounds fine to me, but isn't this going to run into the usual objections from the "we need bytes for efficiency" crowd? It's OK with me *wink* to say in this restricted area you must convert

Re: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows

2011-10-30 Thread Victor Stinner
On 29/10/2011 07:47, Mark Hammond wrote: When previously discussing this issue, I was under the impression that the problem was unencodable bytes passed from the Python code to Windows - but the reverse is true - only the data coming back from Windows isn't encodable. The undecodable

Re: [Python-Dev] ints not overflowing into longs?

2011-11-03 Thread Victor Stinner
On Wednesday 2 November 2011 19:32:38, Derek Shockey wrote: I just found an unexpected behavior and I'm wondering if it is a bug. In my 2.7.2 interpreter on OS X, built and installed via MacPorts, it appears that integers are not correctly overflowing into longs and instead are yielding

Re: [Python-Dev] Unicode exception indexing

2011-11-03 Thread Victor Stinner
On Thursday 3 November 2011 18:14:42, mar...@v.loewis.de wrote: There is a backwards compatibility issue with PEP 393 and Unicode exceptions: the start and end indices: are they Py_UNICODE indices, or code point indices? Oh oh. That's exactly why I didn't want to start to work on this issue.

Re: [Python-Dev] [Python-checkins] cpython: Port code page codec to Unicode API.

2011-11-04 Thread Victor Stinner
On Friday 4 November 2011 18:23:26, martin.v.loewis wrote: http://hg.python.org/cpython/rev/9191f804d376 changeset: 73353:9191f804d376 parent: 73351:2bec7c452b39 user:Martin v. Löwis mar...@v.loewis.de date:Fri Nov 04 18:23:06 2011 +0100 summary: Port code page

[Python-Dev] PyDict_Get/SetItem and dict subclasses

2011-11-05 Thread Victor Stinner
Hi, PyDict_GetItem() and PyDict_SetItem() don't call __getitem__ and __setitem__ for dict subclasses. Is there a reason for that? I found this surprising behaviour when I replaced a dict by a custom dict checking the key type on set. But my __setitem__ was not called because the function
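The behaviour described here can be observed from Python by calling the C function through `ctypes`. This is a sketch under stated assumptions: it only works on CPython (where `ctypes.pythonapi` exposes the C API), and `StrKeyDict` is a hypothetical subclass written for the demonstration, not the custom dict from the original message.

```python
import ctypes


class StrKeyDict(dict):
    """Hypothetical dict subclass that rejects non-string keys on set."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str")
        super().__setitem__(key, value)


# Bind the C API function of the running interpreter (CPython only).
PyDict_SetItem = ctypes.pythonapi.PyDict_SetItem
PyDict_SetItem.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
PyDict_SetItem.restype = ctypes.c_int

d = StrKeyDict()

# The Python-level path goes through the overridden __setitem__.
try:
    d[42] = "rejected"
    python_level_rejected = False
except TypeError:
    python_level_rejected = True

# The C-level path writes straight into the underlying dict storage.
status = PyDict_SetItem(d, 42, "stored anyway")
print(python_level_rejected, status, dict(d))
```

C code that needs the subclass hook to run can use the generic `PyObject_SetItem()` instead, at the cost of a slower call.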

Re: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows

2011-11-08 Thread Victor Stinner
On Saturday 29 October 2011 07:47:01, you wrote: Therefore, as you imply, I think the solution to this issue is to start the process of deprecating the bytes version of the api in py3k with a view to removing it completely - possibly with a less aggressive timeline than normal. In Python

Re: [Python-Dev] [Python-checkins] cpython: Change decoders to use Unicode API instead of Py_UNICODE.

2011-11-09 Thread Victor Stinner
First of all, thanks for having upgraded this huge part (codecs) to the new Unicode API! +static int +unicode_widen(PyObject **p_unicode, int maxchar) +{ +PyObject *result; +assert(PyUnicode_IS_READY(*p_unicode)); +if (maxchar = PyUnicode_MAX_CHAR_VALUE(*p_unicode)) +

[Python-Dev] unicode_internal codec and the PEP 393

2011-11-09 Thread Victor Stinner
Hi, The unicode_internal decoder doesn't decode surrogate pairs and so test_unicode.UnicodeTest.test_codecs() is failing on Windows (16-bit wchar_t). I don't know if this codec is still relevant with PEP 393 because the internal representation is now depending on the maximum character
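Decoding a surrogate pair means combining two 16-bit units into one code point above U+FFFF. The arithmetic is fixed by UTF-16; the function below is a small sketch of it, with a name chosen for illustration rather than taken from CPython.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the value above 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xD83D, 0xDE00)
print(hex(code_point))
```

A decoder working on 16-bit `wchar_t` data has to apply this step whenever it meets a high surrogate followed by a low surrogate; otherwise it produces two lone surrogates instead of one character.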

Re: [Python-Dev] PEP 405 (proposed): Python 2.8 Release Schedule

2011-11-09 Thread Victor Stinner
On Wednesday 9 November 2011 17:18:45, Amaury Forgeot d'Arc wrote: Hi, 2011/11/9 Barry Warsaw ba...@python.org I think we should have an official pronouncement about Python 2.8, and PEPs are as official as it gets 'round here. Do we need to designate a release manager?

Re: [Python-Dev] unicode_internal codec and the PEP 393

2011-11-09 Thread Victor Stinner
On Wednesday 9 November 2011 22:03:52, you wrote: Should we: * Drop this codec (public and documented, but I don't know if it is used) * Use wchar_t* (Py_UNICODE*) to provide a result similar to Python 3.2, and so fix the decoder to handle surrogate pairs * Use the

Re: [Python-Dev] unicode_internal codec and the PEP 393

2011-11-11 Thread Victor Stinner
On 09/11/2011 23:45, Martin v. Löwis wrote: After a quick search on Google codesearch (before it disappears!), I don't think that encoding a Unicode string to its internal PEP-393 representation would satisfy any program. It looks like wchar_t* is a better candidate. Ok. Making it

Re: [Python-Dev] peps: And now for something completely different.

2011-11-14 Thread Victor Stinner
If PEP 404 lists important changes between Python 2 and Python 3, the removal of old-style classes should also be mentioned because it is a change in the core language. Victor

Re: [Python-Dev] Is Python insider blog dead?

2011-11-16 Thread Victor Stinner
On Wednesday 16 November 2011 07:23:03, Brian Curtin wrote: Not dead, there was just a period where I got a little too busy with real life, plus development seemed to slow down for a while. I have a few drafts working (like a post on all of the recent PEP activity) and a few more in my head,

Re: [Python-Dev] Committing PEP 3155

2011-11-18 Thread Victor Stinner
I haven't seen any strong objections, so I would like to go ahead and commit PEP 3155 (*) soon. Is anyone against it? I'm not against it, but I have some questions. Do you have a working implementation? Do you have a patch for issue #9276 using __qualname__? Maybe not a fully working patch, but a

[Python-Dev] Chose a name for a get unicode as wide character, borrowed reference function

2011-11-21 Thread Victor Stinner
Hi, With PEP 393, Py_UNICODE is now deprecated and scheduled for removal in Python 4. PyUnicode_AsUnicode() and PyUnicode_AsUnicodeAndSize() functions are still commonly used on Windows to get the string as wchar_t* without having to care about freeing the memory: it's a borrowed

Re: [Python-Dev] Chose a name for a get unicode as wide character, borrowed reference function

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 16:04:06, Antoine Pitrou wrote: On Mon, 21 Nov 2011 12:53:17 +0100 Victor Stinner victor.stin...@haypocalc.com wrote: I would like to add a new PyUnicode_AsWideChar() function which would return the borrowed reference, exactly as PyUnicode_AsUnicode

Re: [Python-Dev] Chose a name for a get unicode as wide character, borrowed reference function

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 16:55:05, Antoine Pitrou wrote: I want to rename PyUnicode_AsUnicode() and change its result type (Py_UNICODE* = wchar_t*). The result will be a borrowed reference, i.e. you don't have to free the memory, it will be done when the Unicode string is destroyed

[Python-Dev] PyUnicode_EncodeDecimal

2011-11-21 Thread Victor Stinner
Hi, I'm trying to rewrite PyUnicode_EncodeDecimal() to upgrade it to the new Unicode API. The problem is that the function is not accessible in Python nor tested. Should we document and test it, leave it unchanged and deprecate it, or simply remove it? -- Python has a

Re: [Python-Dev] PyUnicode_EncodeDecimal

2011-11-21 Thread Victor Stinner
On Monday 21 November 2011 21:39:53, Victor Stinner wrote: I'm trying to rewrite PyUnicode_EncodeDecimal() to upgrade it to the new Unicode API. The problem is that the function is not accessible in Python nor tested. I added tests for this function in Python 2.7, 3.2 and 3.3

[Python-Dev] PyUnicode_Resize

2011-11-21 Thread Victor Stinner
Hi, In Python 3.2, PyUnicode_Resize() expects a number of Py_UNICODE units, whereas Python 3.3 expects a number of characters. It is tricky to convert a number of Py_UNICODE units to a number of characters, so it is difficult to provide a backward compatible PyUnicode_Resize() function
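The gap between the two ways of counting is easy to see from Python 3.3 or later. The sketch below assumes a 16-bit Py_UNICODE (as on Windows or a narrow build), so that the number of units equals the number of UTF-16 code units; the helper name is illustrative.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


# One character outside the Basic Multilingual Plane between two ASCII letters.
s = "a\U0001F600b"

chars = len(s)          # code points, as counted by Python 3.3+
units = utf16_units(s)  # 16-bit units, as a narrow 3.2 build would count
print(chars, units)
```

Because the difference depends on the content of the string, a size expressed in units cannot be converted to a size in characters without inspecting the data, which is the difficulty the message points at.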

Re: [Python-Dev] PyUnicode_EncodeDecimal

2011-11-22 Thread Victor Stinner
On Tuesday 22 November 2011 02:02:05, Victor Stinner wrote: This function is broken by design if an error handler is specified: the caller cannot know the size of the output buffer, whereas the caller has to allocate this buffer. I propose to raise an error if an error handler (different

Re: [Python-Dev] cpython: fix compiler warning by implementing this more cleverly

2011-11-23 Thread Victor Stinner
On Wednesday 23 November 2011 01:49:28, Terry Reedy wrote: The one-liner could be followed by assert(kind==1 || kind==2 || kind==4) which would also serve to remind the reader of the possibilities. For a ready string, kind must be 1, 2 or 4. We might rename kind to charsize because its

[Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-07 Thread Victor Stinner
Hi, I would like to deny the creation of a Unicode string containing characters outside the range [U+0000; U+10FFFF]. The check is already present in some places (e.g. the builtin chr() function), but not everywhere. The last important function is PyUnicode_FromWideChar, function used to
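The check that the message says already exists in `chr()` can be observed directly on Python 3.3 or later. The two helper functions below are illustrative names, not CPython functions.

```python
def is_valid_code_point(value):
    """True if value lies in the Unicode range U+0000..U+10FFFF."""
    return 0 <= value <= 0x10FFFF


def try_chr(value):
    """Return chr(value), or None when the builtin rejects the value."""
    try:
        return chr(value)
    except ValueError:
        return None


print(try_chr(0x10FFFF) is not None)  # last valid code point
print(try_chr(0x110000))              # one past the end of the range
```

Functions that build strings from `wchar_t` data need an equivalent range test on every element, since a 32-bit `wchar_t` can hold values far above U+10FFFF.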

Re: [Python-Dev] Reject characters bigger than U+10FFFF and Solaris issues

2011-12-08 Thread Victor Stinner
On 08/12/2011 10:17, Stefan Krah wrote: I think that b'\xA0' is a valid thousands separator. I agree, but it's not the point: the problem is that b'\xA0' is decoded to a strange U+3020 character by mbstowcs(). Currently I have this horrible function to deal with the problem: ...

Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

2011-12-09 Thread Victor Stinner
On 09/12/2011 01:35, Antoine Pitrou wrote: On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinnerpython-check...@python.org wrote: +.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode) + + Get a new copy of a Unicode object. + + .. versionadded:: 3.3 I'm not sure I understand. Why

Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

2011-12-11 Thread Victor Stinner
On Friday 9 December 2011 20:32:16, Antoine Pitrou wrote: ... it's a bit obscure why the function exists. Yeah ok, I marked the function as private: renamed to _PyUnicode_Copy() and I removed its documentation. Victor

Re: [Python-Dev] IEEE/ISO draft on Python vulnerabilities

2011-12-12 Thread Victor Stinner
IEEE/ISO are working on a draft document about Python vulnerabilities: http://grouper.ieee.org/groups/plv/DocLog/300-399/360-thru-379/22-WG23-N-0372/n0372.pdf (in the context of a larger effort to classify vulnerabilities in all languages: ISO/IEC TR 24772:2010, available from ISO at no cost

Re: [Python-Dev] PyUnicodeObject / PyASCIIObject questions

2011-12-14 Thread Victor Stinner
On Tuesday 13 December 2011 02:09:02, Jim Jewett wrote: (3) I would feel much less nervous if the remaining 4 values of PyUnicode_Kind were explicitly reserved, and the macros raised an error when they showed up. (Better still would be to allow other values, and to have the macros delegate

Re: [Python-Dev] Compiling the source without stat

2011-12-15 Thread Victor Stinner
On Thursday 15 December 2011 15:29:23, you wrote: If faking a stat struct and a function to fill it solves the problem, and checking for existing files and folders is the only thing that python needs to be compiled (i'm talking about 2.7) then it's possible to fail-check it by just trying

[Python-Dev] French sprint this week-end

2011-12-15 Thread Victor Stinner
Hi, I am organizing an online sprint on CPython this weekend with French developers. At least six developers will participate; some of them don't know C, most know Python. Do you know of a simple task to start contributing to Python? Something useful and not boring if possible :-) There is the easy

Re: [Python-Dev] [Python-checkins] cpython: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum

2011-12-18 Thread Victor Stinner
On 18/12/2011 20:34, Martin v. Löwis wrote: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum What's the rationale for that change? It's a valid kind value, after all, and the C convention is that an enumeration lists all valid values (else there wouldn't be a need for an enumeration in

Re: [Python-Dev] [Python-checkins] cpython: Move PyUnicode_WCHAR_KIND outside PyUnicode_Kind enum

2011-12-18 Thread Victor Stinner
On 18/12/2011 21:04, Martin v. Löwis wrote: PyUnicode_KIND() only returns PyUnicode_1BYTE_KIND, PyUnicode_2BYTE_KIND or PyUnicode_4BYTE_KIND. Outside unicodeobject.c, you are not supposed to see PyUnicode_WCHAR_KIND. Why do you say that? It can very well happen, assuming you call

Re: [Python-Dev] Fwd: Anyone still using Python 2.5?

2011-12-21 Thread Victor Stinner
What's the general consensus on supporting Python 2.5 nowadays? There is no such consensus :-) Do people still have to use this in commercial environments or is everyone on 2.6+ nowadays? At work, we are still using Python 2.5. Six months ago, we started a project to upgrade to 2.7, but we

Re: [Python-Dev] Fwd: Anyone still using Python 2.5?

2011-12-21 Thread Victor Stinner
On 21/12/2011 15:26, anatoly techtonik wrote: I believe most AppEngine applications in Python are still using 2.5 run-time. So are development boxes for these applications. It may take another year or two for the transition. App engine 1.6 improved support of Python 2.7, so I hope that

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
On 29/12/2011 02:28, Michael Foord wrote: A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included:

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
On 29/12/2011 14:19, Christian Heimes wrote: Perhaps the dict code is a better place for randomization. The problem is the creation of a dict with keys all having the same hash value. The current implementation of dict uses a linked-list. Adding a new item requires comparing the new key
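The cost described in this snippet can be measured without an attack payload. The sketch below uses a hypothetical key class whose hash is constant, so every key collides, and counts how many equality comparisons the dict performs; it says nothing about how the table resolves collisions internally.

```python
class Colliding:
    """Hypothetical key type whose instances all share one hash value."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0  # every key lands in the same probe sequence

    def __eq__(self, other):
        Colliding.comparisons += 1
        return isinstance(other, Colliding) and self.value == other.value


def count_comparisons(n):
    """Insert n distinct colliding keys and return the number of __eq__ calls."""
    Colliding.comparisons = 0
    d = {}
    for i in range(n):
        d[Colliding(i)] = i
    return Colliding.comparisons


small = count_comparisons(100)
large = count_comparisons(200)
print(small, large)
```

Doubling the number of keys roughly quadruples the work, because each new key has to be compared with every key already stored. That quadratic growth is what makes a request full of colliding keys expensive to parse.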

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-30 Thread Victor Stinner
In case the watchdog is not a viable solution as I had assumed it was, I think it's more reasonable to indeed consider adding a flag to Python that allows randomization of hashes optionally before startup. A flag will only be needed if the overhead of the fix is too high. However as it was

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Victor Stinner
On 01/01/2012 04:29, Paul McMillan wrote: This is incorrect. Once an attacker has guessed the random seed, any operation which reveals the ordering of hashed objects can be used to verify the answer. JSON responses would be ideal. In fact, an attacker can do a brute-force attack of the random

Re: [Python-Dev] RNG in the core

2012-01-03 Thread Victor Stinner
A randomized hash doesn't need cryptographic RNG (which are slow and need a lot of new code), and the new hash function should maybe not be cryptographic. We need to make the DoS more expensive for the attacker, but we don't need to add too much security for that. Mersenne Twister is useless

Re: [Python-Dev] RNG in the core

2012-01-04 Thread Victor Stinner
(or is /dev/urandom still available in a chroot?) Last time that I played with chroot, I bind-mounted /dev and /proc. Many programs rely on specific devices like /dev/null. Python should not refuse to start if /dev/urandom (or CryptoGen) is missing or cannot be used, but should use a weak fallback.
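The "use the OS source, fall back to something weak instead of failing" idea can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the 8-byte secret size and the particular fallback (clock mixed with the process ID) are not what CPython does at startup.

```python
import os
import time


def weak_secret(nbytes=8):
    """Low-quality fallback: mix the clock and the process ID (predictable)."""
    mask = (1 << (8 * nbytes)) - 1
    seed = (int(time.time() * 1e6) ^ (os.getpid() << 16)) & mask
    return seed.to_bytes(nbytes, "little")


def hash_secret(nbytes=8):
    """Prefer the OS random source; fall back rather than refuse to run."""
    try:
        return os.urandom(nbytes)
    except (NotImplementedError, OSError):
        return weak_secret(nbytes)


secret = hash_secret()
print(len(secret))
```

The fallback only raises the cost of guessing the secret slightly, so a real implementation would probably want to make the degraded mode visible to the user.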

Re: [Python-Dev] cpython: Add a new PyUnicode_Fill() function

2012-01-04 Thread Victor Stinner
Oops, it's a typo in the doc (copy/paste failure). It's now fixed, thanks. Victor 2012/1/4 Antoine Pitrou solip...@pitrou.net: +.. c:function:: int PyUnicode_Fill(PyObject *unicode, Py_ssize_t start, \ +                        Py_ssize_t length, Py_UCS4 fill_char) + +   Fill a string with a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Victor Stinner
2012/1/6 Barry Warsaw ba...@python.org: Settings for PYRANDOMHASH: PYRANDOMHASH=1 enable randomized hashing function PYRANDOMHASH=/path/to/seed enable randomized hashing function and read seed from 'seed' PYRANDOMHASH=0 disable randomized hashing function Why not PYTHONHASHSEED
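For context, CPython later shipped an environment variable named PYTHONHASHSEED, with semantics that differ from the PYRANDOMHASH settings quoted above: it accepts `random` or an integer seed, and `0` disables randomization. The sketch below shows the effect of a fixed seed by hashing the same string in two child interpreters; the helper name and the sample string are illustrative.

```python
import os
import subprocess
import sys


def hash_in_child(seed, text="python-dev"):
    """Return hash(text) as computed by a fresh interpreter with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%r))" % text],
        env=env,
    )
    return int(out)


# With the same seed, separate processes agree on the hash value.
reproducible = hash_in_child(123) == hash_in_child(123)
print(reproducible)
```

A fixed seed is mainly useful for reproducing test failures that depend on dict or set ordering; leaving the seed random is what provides the protection discussed in this thread.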

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Victor Stinner
Using my patch (random-2.patch), the overhead is 0%. I cannot see a difference with and without my patch. Numbers: --- unpatched: == 3 characters == 1 loops, best of 3: 459 usec per loop == 10 characters == 1 loops, best of 3: 575 usec per loop == 500 characters == 1 loops, best of 3: 1.36 msec

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Victor Stinner
That said, I don't think smallest-format is actually enforced with anything stronger than comments (such as in unicodeobject.h struct PyASCIIObject) and asserts (mostly calling _PyUnicode_CheckConsistency).  I don't have any insight on how prevalent non-conforming strings will be in practice,

Re: [Python-Dev] Compiling 2.7.2 on OS/2

2012-01-09 Thread Victor Stinner
- if os.name in ('nt', 'os2'): + if os.name in ('nt'): This change is wrong: it should be os.name == 'nt'. Victor
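The reason the patched line is wrong is that `('nt')` is not a one-element tuple: the parentheses are just grouping, so `in` performs a substring test on the string `'nt'`. A short sketch with hypothetical function names makes the difference visible.

```python
def buggy_check(name):
    # ('nt') is the plain string 'nt', so this is a substring test.
    return name in ('nt')


def tuple_check(name):
    # The trailing comma makes a real one-element tuple.
    return name in ('nt',)


def fixed_check(name):
    # The form recommended in the message.
    return name == 'nt'


for candidate in ('nt', 'n', 't', '', 'os2'):
    print(repr(candidate), buggy_check(candidate), fixed_check(candidate))
```

Real values of `os.name` such as `'posix'` happen not to be substrings of `'nt'`, so the bug is silent in practice, but the equality test states the intent without relying on that.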

Re: [Python-Dev] [Python-checkins] cpython (2.7): Fix stock symbol for Microsoft

2012-01-10 Thread Victor Stinner
You may port the fix to 3.2 and 3.3. Victor 2012/1/10 raymond.hettinger python-check...@python.org: http://hg.python.org/cpython/rev/068ce5d7f7e7 changeset:   74320:068ce5d7f7e7 branch:      2.7 user:        Raymond Hettinger pyt...@rcn.com date:        Tue Jan 10 09:51:51 2012 +

[Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-12 Thread Victor Stinner
Many people proposed their own idea to fix the vulnerability, but only 3 wrote a patch: - Glenn Linderman proposes to fix the vulnerability by adding a new safe dict type (only accepting string keys). His proof-of-concept (SafeDict.py) uses a secret of 64 random bits and uses it to compute the

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Victor Stinner
Unfortunately it requires only a few seconds to compute enough 32bit collisions on one core with no precomputed data. Are you running the hash function backward to generate strings with the same value, or are you trying something more like brute forcing? And how do you get the hash secret? You

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Victor Stinner
- Glenn Linderman proposes to fix the vulnerability by adding a new safe dict type (only accepting string keys). His proof-of-concept (SafeDict.py) uses a secret of 64 random bits and uses it to compute the hash of a key. We could mix Marc's collision counter with SafeDict idea (being able to

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-15 Thread Victor Stinner
I don't think that it would be hard to patch this library to use another hash function. It can implement its own hash function, use MD5, SHA1, or anything else. hash() is not stable across Python versions and 32/64 bit systems. Victor 2012/1/15 Hynek Schlawack h...@ox.cx: Am Sonntag, 15.
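As a rough illustration of the suggestion above, a library that needs a hash value that is identical on every Python version and platform could derive one from a hashlib digest instead of hash(). The function name stable_hash and the choice of truncating SHA-1 to 64 bits are assumptions made for this sketch, not something proposed in the thread.

```python
import hashlib


def stable_hash(text):
    """Return a 64-bit integer derived from the SHA-1 digest of text.

    Unlike the built-in hash(), the result depends only on the input
    bytes, not on the interpreter version or on 32/64-bit builds.
    """
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    # Keep the first 8 bytes (hypothetical choice; any fixed slice works).
    return int.from_bytes(digest[:8], "big")


print(hex(stable_hash("abc")))
```

Such a digest is meant for stable identifiers, for example on-disk cache keys. It does not by itself protect a hash table against deliberately chosen colliding inputs.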

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-16 Thread Victor Stinner
2012/1/17 Tim Delaney timothy.c.dela...@gmail.com: What if in a pathological collision (e.g. 1000 collisions), we increased the size of a dict by a small but random amount? It doesn't change anything, you will still get collisions. Victor

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
I thought that the original problem was that with N insertions in the dictionary, by repeatedly inserting different keys generating the same hash value an attacker could arrange that the cost of finding an open slot is O(N), and thus the cost of N insertions is O(N^2). If so, frequent
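The quadratic cost described above can be observed without crafting colliding strings. The sketch below uses a hypothetical key class, Colliding, whose __hash__ is constant, and counts how many key comparisons the dict performs during N insertions. It only illustrates the behaviour; it is not the attack discussed in the thread.

```python
class Colliding:
    """Key whose hash is constant, so all keys share one probe chain."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42

    def __eq__(self, other):
        Colliding.eq_calls += 1
        return self.value == other.value


def count_comparisons(n):
    """Insert n distinct colliding keys and return the number of comparisons."""
    Colliding.eq_calls = 0
    d = {}
    for i in range(n):
        d[Colliding(i)] = None
    return Colliding.eq_calls


for n in (100, 200, 400):
    print(n, count_comparisons(n))
```

On CPython, doubling n should roughly quadruple the comparison count, because each new key has to be compared with the keys already sitting in the same chain.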

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
I finished my patch transforming hash(str) to a randomized hash function, see random-8.patch attached to the issue: http://bugs.python.org/issue13703 The remaining question is which random number generator should be used on Windows to initialize the hash secret (CryptoGen adds an overhead of 10%,

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
2012/1/17 Martin v. Löwis mar...@v.loewis.de: I'd like to propose a different approach to seeding the string hashes: only do so for dictionaries involving only strings, and leave the tp_hash slot of strings unchanged. The real problem is in dict (or any structure using a hash table), so if it

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
There is a simpler solution: bucket_index = (hash(str) ^ secret) & DICT_MASK. Oops, hash^secret doesn't add any security. Victor
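A short sketch of why XOR-ing with a secret does not help: XOR with a fixed value is a bijection, so two hashes that agree on the masked bits still agree after the XOR, whatever the secret is. The mask value and the sample hash values below are hypothetical stand-ins chosen for the illustration.

```python
import random

DICT_MASK = 0xFF  # hypothetical table with 256 buckets


def bucket_index(h, secret):
    """Bucket computed as in the formula above."""
    return (h ^ secret) & DICT_MASK


# Two stand-in hash values that collide in the low 8 bits.
h1 = 0x1200
h2 = 0x3400

for _ in range(5):
    secret = random.getrandbits(64)
    # The bucket changes with the secret, but both keys move together.
    print(bucket_index(h1, secret), bucket_index(h2, secret))
```

Every printed pair contains two equal numbers, so an attacker who precomputed colliding keys does not need to know the secret.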

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
Each string would get two hashes: the public hash, which is constant across runs and bugfix releases, and the dict-hash, which is only used by the dictionary implementation, and only if all keys to the dict are strings. The distinction between secret (private, secure) and public hash

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-17 Thread Victor Stinner
I plan to commit my fix to Python 3.3 if it is accepted. Then write a simplified version for Python 3.2 and backport it to 3.1. I'm opposed to any change to the hash values of strings in maintenance releases, so I guess I'm opposed to your patch in principle. If randomized hash cannot be

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-18 Thread Victor Stinner
2012/1/18 Martin v. Löwis mar...@v.loewis.de: For 3.3 onwards, I'm skeptical whether all this configuration support is really necessary. I think a much smaller patch which leaves no choice would be more appropriate. The configuration helps unit testing: see changes on Lib/test/*.py in my last

Re: [Python-Dev] Writable __doc__

2012-01-19 Thread Victor Stinner
http://bugs.python.org/issue12773  :) The bug is marked as closed, whereas the bug still exists in Python 3.2 and has not been fixed there. The fix must be backported. Victor
