Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Serhiy Storchaka

On 12.09.12 00:47, Victor Stinner wrote:

set([x for ...]) => {x for ...}
dict([(k, v) for ...]) => {k: v for ...}
dict((k, v) for ...) => {k: v for ...}
''.join([s for ...]) => ''.join(s for ...)
a.extend([s for ...]) => a.extend(s for ...)


These optimizations look correct.


Actually, a generator can be slower than a list comprehension, 
especially on Python 2. I think this is an opportunity to optimize the 
work with generators.
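
A quick way to check this on a given interpreter is a timeit 
comparison; a minimal sketch (the size and the str(x) workload are 
arbitrary):

import timeit

setup = "a = list(range(1000))"
# list comprehension feeding join vs. generator expression feeding join
print(timeit.timeit("''.join([str(x) for x in a])", setup, number=1000))
print(timeit.timeit("''.join(str(x) for x in a)", setup, number=1000))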



(f(x) for x in a) => map(f, a)
(x.y for x in a) => map(operator.attrgetter('y'), a)
(x[0] for x in a) => map(operator.itemgetter(0), a)
(2 * x for x in a) => map((2).__mul__, a)
(x in b for x in a) => map(b.__contains__, a)
map(lambda x: x.strip(), a) => (x.strip() for x in a)


Is it faster? :-)


Yes, significantly for large sequences. But this transformation is not 
safe in the general case. For short sequences a regression is possible 
(the cost of the "map" name lookup and the function call).



x in ['i', 'em', 'cite'] => x in {'i', 'em', 'cite'}


A list can contain non-hashable objects, whereas a set cannot.


Agreed, it is applicable if x is proven to be a str. At the least, the 
list can be replaced by a tuple.
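
The hashability problem is easy to demonstrate; a small sketch:

x = ['not', 'hashable']            # lists are unhashable
print(x in ('i', 'em', 'cite'))    # False -- tuples compare element by element

try:
    x in {'i', 'em', 'cite'}       # a set lookup hashes x first
except TypeError as err:
    print(err)                     # unhashable type: 'list'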



x == 'i' or x == 'em' or x == 'cite' => x in {'i', 'em', 'cite'}


You need to know the type of x. Depending on the type, x.__eq__ and
x.__contains__ may be completely different.


Then => x in ('i', 'em', 'cite'), and go further only if x is obviously 
of the appropriate type.



for ...: f.write(...) => __fwrite = f.write; for ...: __fwrite(...)


f.write lookup cannot be optimized.


Yes, it is a dangerous transformation and it is difficult to prove its 
safety. But name lookup is one of the main brakes on Python's 
performance.
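
For reference, the hand-written form of this transformation looks like 
the sketch below; it is only valid if f.write cannot be rebound while 
the loop runs (io.StringIO stands in for an arbitrary file-like f):

import io

f = io.StringIO()
lines = ['a', 'b', 'c'] * 1000

# Original: f.write is looked up on every iteration.
for s in lines:
    f.write(s)

# Transformed: one attribute lookup, then cheap local-name calls.
__fwrite = f.write
for s in lines:
    __fwrite(s)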



x = x + 1 => x += 1
x = x + ' ' => x += ' '


I don't know if these optimizations are safe.


It is safe if x is proven to be a number or string, e.g. if x is a 
local variable initialized with a number/string and modified only with 
numbers/strings. Counters and string accumulators are commonly used.
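
The type proof matters because for mutable objects the two spellings 
are observably different; a small demonstration:

a = [1, 2]
b = a
b = b + [3]    # builds a new list and rebinds b; a is untouched
print(a)       # [1, 2]

a = [1, 2]
b = a
b += [3]       # list.__iadd__ mutates in place; the alias a sees it
print(a)       # [1, 2, 3]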



'x=%s' % repr(x) => 'x=%a' % (x,)


I don't understand this one.


Sorry, it should be => 'x=%r' % (x,). And for more arguments: 'x[' + 
repr(k) + ']=' + repr(v) + ';' => 'x[%r]=%r;' % (k, v). Same for str and 
ascii.


It is not safe (repr can be shadowed).
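
A contrived sketch of the shadowing hazard; %r bypasses the name 
lookup entirely:

def repr(x):                  # shadows the builtin in this module
    return 'shadowed'

x = 42
print('x=%s' % repr(x))       # x=shadowed -- the name lookup finds ours
print('x=%r' % (x,))          # x=42 -- %r calls the real repr directly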


'x=%s' % x + s => 'x=%s%s' % (x, s)
x = x + ', [%s]' % y => x = '%s, [%s]' % (x, y)


Doesn't work if the type of s is not str.


Yes, this is only partially applicable. In many cases, s is a literal 
or a newly formatted string.
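
Concretely, the two spellings diverge as soon as s is not a str; a 
small sketch:

x, s = 1, 2
print('x=%s%s' % (x, s))      # x=12 -- %s formats any object

try:
    print('x=%s' % x + s)     # str + int
except TypeError as err:
    print(err)                # Python 3: can only concatenate str (not "int") to str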



range(0, x) => range(x)


Is it faster?


Slightly.


while True: s = f.readline(); if not s: break; ... => for s in f: ...


Too many assumptions about the type of f.


I personally would prefer a 2to3-like "modernizer" (as a separate 
utility and as plugins for the IDEs), which would have found some 
templates and offered replacing by a more modern, readable (and possibly 
effective) variant. The decision on the applicability of the 
transformation in the particular case remains for the human. For the 
automatic optimizer remain only simple transformations which deteriorate 
readability, and optimizations which cannot be expressed in the source code.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] packaging location ?

2012-09-12 Thread Tarek Ziadé

Hello

I was wondering if anyone knows whether the removed Lib/packaging 
directory landed in some other place.


We have http://hg.python.org/distutils2, but the 'packaging' version is 
a full py3-renamed version we need to keep mirrored.


Cheers
Tarek
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Serhiy Storchaka

On 12.09.12 00:47, Victor Stinner wrote:

x = x + [y] => x.append(y)
x = x + [y, z] => x.extend([y, z])


It behaves differently if x is not a list but, for example, a str.


Actually, it is even worse. The transformations are applicable only if 
x has no aliases. Pseudocode:


  if type(x) is list and refcount(x) == 1:
      list.append(x, y)
  else:
      x = x + [y]

This optimization can be done only in the interpreter; otherwise the 
overhead costs would be too high.
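
CPython does expose the reference count to Python code, which shows 
what such an interpreter-level check would look at; note that 
sys.getrefcount() always reports one extra reference for its own 
argument:

import sys

x = [1]
print(sys.getrefcount(x))     # 2: the name x plus getrefcount's argument

y = x                         # create an alias
print(sys.getrefcount(x))     # 3: x, y, and getrefcount's argument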


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Victor Stinner
2012/9/12 Serhiy Storchaka :
>>> set([x for ...]) => {x for ...}
>>> dict([(k, v) for ...]) => {k: v for ...}
>>> dict((k, v) for ...) => {k: v for ...}
>>> ''.join([s for ...]) => ''.join(s for ...)
>>> a.extend([s for ...]) => a.extend(s for ...)
>>
>> These optimizations look correct.
>
> Actually, a generator can be slower than a list comprehension, especially
> on Python 2. I think this is an opportunity to optimize the work with
> generators.

I checked with timeit, and yes: generators are slower :-/ I will
revert this "optimization".

>>> (f(x) for x in a) => map(f, a)
>>> (x.y for x in a) => map(operator.attrgetter('y'), a)
>>> (x[0] for x in a) => map(operator.itemgetter(0), a)
>>> (2 * x for x in a) => map((2).__mul__, a)
>>> (x in b for x in a) => map(b.__contains__, a)
>>> map(lambda x: x.strip(), a) => (x.strip() for x in a)
>>
>> Is it faster? :-)

Benchmark using iterable=tuple(range(n)) and f=str (see attached script).

Python version: 2.7.3 (default, Aug 1 2012, 05:16:07) [GCC 4.6.3]
CPU model: Intel(R) Core(TM) i5 CPU 661 @ 3.33GHz
Platform: Linux-3.2.0-30-generic-pae-i686-with-Ubuntu-12.04-precise
Bits: int=32, long=32, long long=64, pointer=32

[ 3 items ]
679 ns: list comprehension
1.08 us (+59%): itertools.imap
1.42 us (+109%): generator

[ 10 items ]
1.6 us: itertools.imap
1.64 us: list comprehension
2.26 us (+41%): generator

[ 1000 items ]
112 us: itertools.imap
144 us (+29%): list comprehension
156 us (+40%): generator

[ 100 items ]
142 ms: itertools.imap
183 ms (+29%): generator
186 ms (+31%): list comprehension

---

Python version: 3.2.3 (default, May 3 2012, 15:54:42) [GCC 4.6.3]
CPU model: Intel(R) Core(TM) i5 CPU 661 @ 3.33GHz
Platform: Linux-3.2.0-30-generic-pae-i686-with-Ubuntu-12.04-precise
Bits: int=32, long=32, long long=64, pointer=32

[ 3 items ]
1.04 us: list comprehension
1.21 us (+17%): map
1.51 us (+45%): generator


[ 10 items ]
2.02 us: map
2.29 us (+13%): list comprehension
2.68 us (+33%): generator


[ 1000 items ]
132 us: map
166 us (+25%): list comprehension
183 us (+38%): generator


[ 100 items ]
182 ms: map
229 ms (+26%): generator
251 ms (+38%): list comprehension

--

> Yes, significantly for large sequences. But this transformation is not safe
> in the general case. For short sequences a regression is possible (the cost
> of the "map" name lookup and the function call).

So except for very small datasets, map (itertools.imap) is always
faster. A list comprehension cannot be replaced with map() because map()
doesn't set the iterator variable to the last item of the iterable.
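
For reference, the Python 2.7 behavior being described (in Python 3
comprehensions have their own scope, so neither form leaks the
variable):

>>> items = ['a', 'b', 'c']
>>> [s.upper() for s in items]
['A', 'B', 'C']
>>> s                        # the comprehension leaked its loop variable
'c'
>>> map(str.upper, items)    # same result, but never binds s
['A', 'B', 'C']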

>>> x in ['i', 'em', 'cite'] => x in {'i', 'em', 'cite'}
> Agreed, it is applicable if x is proven to be a str. At the least, the list
> can be replaced by a tuple.
>>> (...)
>>> x == 'i' or x == 'em' or x == 'cite' => x in {'i', 'em', 'cite'}
>>> (...)
>>> x = x + 1 => x += 1
>>> x = x + ' ' => x += ' '

Well, type inference would permit more optimizations. It is not implemented yet.

>>> for ...: f.write(...) => __fwrite = f.write; for ...: __fwrite(...)
>>
>> f.write lookup cannot be optimized.
>
> Yes, it is a dangerous transformation and it is difficult to prove its
> safety. But name lookup is one of the main brakes on Python's performance.

Oh sorry, I didn't read the example correctly. Yeah, such an
optimization is common and it would help to have an option to enable
it. Using type inference, it may be possible to optimize some cases
safely (e.g. if you know that f is a file).

> Sorry, it should be 'x=%s' % repr(x) => 'x=%r' % (x,)

Ah ok, why not.

>>> range(0, x) => range(x)
>>
>> Is it faster?
>
> Slightly.

timeit gives me exactly the same timing.
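
For anyone who wants to reproduce the comparison, a minimal sketch:

import timeit

print(timeit.timeit("range(0, 100)", number=100000))
print(timeit.timeit("range(100)", number=100000))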

> I personally would prefer a 2to3-like "modernizer" (as a separate utility
> and as plugins for the IDEs), which would have found some templates and
> offered replacing by a more modern, readable (and possibly effective)
> variant. The decision on the applicability of the transformation in the
> particular case remains for the human. For the automatic optimizer remain
> only simple transformations which deteriorate readability, and optimizations
> which cannot be expressed in the source code.

If the optimizer sees an interesting optimization but cannot decide if
one option is better than another, it may write an advice for the
developer. The developer can review the advice and decide which option
is the best.

Some patterns are faster or slower depending on the Python versions
:-/ The optimizer may be able to decide which pattern is the best for
the running Python version.

Victor


bench_gen.py
Description: Binary data
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Victor Stinner
>> Projects using the same code base for Python 2 and Python 3 contain a
>> lot of inefficient code. For example, using the six library, a simple
>> Unicode literal string becomes a function call: u('unicode').
>
> But are you able to do enough static analysis to feel comfortable that
> this is the right u() function? IIRC you said earlier that you're not
> even capable of recognizing "len = ord; print(len('a'))" -- if that is
> really true, I'm very worried about your optimizer's capacity for
> breaking code. I'm not talking about "behind-their-back" changes to
> __builtins__ or patching of the module globals. I'm talking about
> detecting straightforward definitions that override the identifiers
> you are replacing.

astoptimizer is still experimental, but I prefer to release early and
release often, because I already get interesting feedback.

"from math import pow as len" is already supported in the version 0.3,
and "len=ord" was fixed in the version 0.3.1.

astoptimizer is supposed to handle any instruction setting variables
(if I forgot to handle a specific instruction, it's a bug). My initial
goal was to optimize "x=1; return x", but then I realized that I must
take care of all variables, because they may shadow builtin
functions or constants.
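
Concretely, the two shadowing cases mentioned above (folding len("abc")
to 3 after either rebinding would be wrong):

len = ord
print(len('a'))               # 97 -- the builtin is shadowed
del len                       # unshadow: lookup falls back to builtins

from math import pow as len
print(len(2, 3))              # 8.0 -- 'len' is now math.pow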

The following AST statements creates a "new" namespace (inherit some
properties from the parent):
 - Module
 - FunctionDef
 - ClassDef

The following AST statements set variables or have an impact on scope:
 - Assign, AugAssign, Del
 - Global, Nonlocal
 - Import, ImportFrom
 - With
 - arguments (of FunctionDef)
 - comprehension (of ListComp or GeneratorExp)

There is experimental support for assignments. If an unsupported
assignment is found (ex: ((x, y), z) = x_y_z), all optimizations on
names (ex: len("abc") or math.e) are disabled. For example, "from re
import *" disables optimizations (but "from math import *" is
supported).

>> I expect that astoptimizer will be able to remove (or at least
>> reduce!) the overhead of the six library and all checks on the Python
>> version ("if PYTHON3: ... else: ...").
>
> Hm. Wouldn't it be just as easy to run a source-to-source translator
> to remove six artefacts instead of an ast optimizer?

You mean something like 2to3? I understood that the six module is
written for developers who prefer to use the same code base for Python
2 and Python 3.

With Python 3.3, if astoptimizer hooks the compile() builtin, the
optimization is enabled transparently when importing a module (during
.py => .pyc compiling). There is no need to explicitly "compile" an
application.

> Surely some convention could be adopted that is easy to use,
> and the tool to do the translation could be a lot simpler
> than an ast optimizer.

I like AST because I don't need to write my own parser.

> Sorry for being skeptical, but I'm not excited about advertising this
> as a general optimization tool unless you can make it a lot safer.

It is safer with every release. I will use the version number 1.0 when
the optimizer is completely safe :-) (So it's only 31% safe yet!)

Victor
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] issues found by Coverity

2012-09-12 Thread Raymond Hettinger

On Sep 11, 2012, at 6:32 AM, Christian Heimes  wrote:
> 
> maybe you have noticed a bunch of commits I made the last couple of
> days. 

I noticed!  Thank you for all the work to improve quality.


Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Maciej Fijalkowski
On Tue, Sep 11, 2012 at 2:57 PM, Nick Coghlan  wrote:
> On Tue, Sep 11, 2012 at 8:41 PM, Victor Stinner
>  wrote:
>> * Call builtin functions if arguments are constants. Examples:
>>
>>   - len("abc") => 3
>>   - ord("A") => 65
>
> This is fine in an external project, but should never be added to the
> standard library. The barrier to semantic changes that break
> monkeypatching should be high.
>
> Yes, this is frustrating as it eliminates a great many interesting
> static optimisations that are *probably* OK. That's one of the reasons
> why PyPy uses tracing - it can perform these optimisations *and* still
> include the appropriate dynamic checks.
>
> However, the double barrier of third party module + off by default is
> a suitable activation barrier for ensuring people know that what
> they're doing is producing bytecode that doesn't behave like standard
> Python any more (e.g. tests won't be able to shadow builtins or
> optimised module references). Optimisations that break the language
> semantics are heading towards the same territory as the byteplay and
> withhacks modules (albeit not as evil internally).

The third (and the most important) barrier is that constant folding
len("abc") is essentially useless. You can do some optimizations that
are sound, probably at a great deal of complexity, if you track all
the places where constants were folded and change them if you shadow a
builtin. This approach is done in PyPy, for example, in a more
systematic way (by invalidating the assembler).

Anyway, since this is, in its current shape, clearly not designed to
preserve the semantics of Python at all, is the discussion of this
package on-topic for python-dev? More so than, say, discussing Cython or
Numba or any other kind-of-python-but-not-quite project?

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): - Issue #15906: Fix a regression in argparse caused by the preceding change,

2012-09-12 Thread Barry Warsaw
On Sep 11, 2012, at 09:30 PM, Chris Jerdonek wrote:

>I didn't have time to respond to Barry's e-mail from four hours ago
>before this was committed.  I think this change may be problematic.
>At the least, I think people should have an opportunity to air their
>specific concerns and talk through the implications.
>
>Also, from the discussion it seemed like the sentiment was leaning
>towards a different approach for the fix.
>
>I made a comment on the issue with some more extended remarks:
>
>http://bugs.python.org/msg170351

The alternative suggested fix breaks the test suite (yes I tried it).  So
maybe we have to go back and decide whether the original fix for #12776 and
#11839 is correct.

More detail in the tracker issue.

-Barry
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.

2012-09-12 Thread Stefan Krah
christian.heimes  wrote:
>   Fix out of bounds read in long_new() for empty bytes with an explicit base. 
> int(b'', somebase) calls PyLong_FromString() with char* of length 1 but the 
> function accesses the first argument at offset 1. CID 715359
> 
> files:
>   Objects/longobject.c |  4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/Objects/longobject.c b/Objects/longobject.c
> --- a/Objects/longobject.c
> +++ b/Objects/longobject.c
> @@ -4285,8 +4285,8 @@
>  string = PyByteArray_AS_STRING(x);
>  else
>  string = PyBytes_AS_STRING(x);
> -if (strlen(string) != (size_t)size) {
> -/* We only see this if there's a null byte in x,
> +if (strlen(string) != (size_t)size || !size) {
> +/* We only see this if there's a null byte in x or x is empty,
> x is a bytes or buffer, *and* a base is given. */
>  PyErr_Format(PyExc_ValueError,
>   "invalid literal for int() with base %d: %R",


This is a false positive:

Assumption: string == ""

Call: PyLong_FromString("", NULL, (int)base);

Now: str == ""

Coverity claims an invalid access at str[1]:

if (str[0] == '0' &&
    ((base == 16 && (str[1] == 'x' || str[1] == 'X')) ||
     (base == 8  && (str[1] == 'o' || str[1] == 'O')) ||
     (base == 2  && (str[1] == 'b' || str[1] == 'B'))))

But str[1] is never accessed due to short-circuit evaluation.

Coverity appears to have serious problems with short-circuit evaluation
in many places.


Stefan Krah


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython: Make sure that *really* no more than sizeof(ifr.ifr_name) chars are strcpy-ed

2012-09-12 Thread Stefan Krah
christian.heimes  wrote:
>   Make sure that *really* no more than sizeof(ifr.ifr_name) chars are 
> strcpy-ed to ifr.ifr_name and that the string is *always* NUL terminated. New 
> code shouldn't use strcpy(), too. CID 719692
> 
> files:
>   Modules/socketmodule.c |  3 ++-
>   1 files changed, 2 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/Modules/socketmodule.c b/Modules/socketmodule.c
> --- a/Modules/socketmodule.c
> +++ b/Modules/socketmodule.c
> @@ -1674,7 +1674,8 @@
>  if (len == 0) {
>  ifr.ifr_ifindex = 0;
>  } else if (len < sizeof(ifr.ifr_name)) {
> -strcpy(ifr.ifr_name, PyBytes_AS_STRING(interfaceName));
> +strncpy(ifr.ifr_name, PyBytes_AS_STRING(interfaceName), 
> sizeof(ifr.ifr_name));
> +ifr.ifr_name[(sizeof(ifr.ifr_name))-1] = '\0';
>  if (ioctl(s->sock_fd, SIOCGIFINDEX, &ifr) < 0) {
>  s->errorhandler();
>  Py_DECREF(interfaceName);


I have trouble finding the overrun in the existing code. Previously:

We have:

1) len = PyBytes_GET_SIZE(interfaceName); (without NUL terminator)


At the point of strcpy() we have:

2) len < sizeof(ifr.ifr_name)

PyBytes_AS_STRING(interfaceName) always adds a NUL terminator, so:

3) len+1 <= sizeof(ifr.ifr_name)

4) strcpy(ifr.ifr_name, PyBytes_AS_STRING(interfaceName)) will not overrun
   ifr.ifr_name and ifr.ifr_name is always NUL terminated.


So IMO the strcpy() was safe and the report is a false positive.


Stefan Krah



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.

2012-09-12 Thread Christian Heimes
Am 12.09.2012 16:22, schrieb Stefan Krah:
> This is a false positive:
>
> Assumption: string == ""
>
> Call: PyLong_FromString("", NULL, (int)base);
>
> Now: str == ""
>
> Coverity claims an invalid access at str[1]:
>
> if (str[0] == '0' &&
>     ((base == 16 && (str[1] == 'x' || str[1] == 'X')) ||
>      (base == 8  && (str[1] == 'o' || str[1] == 'O')) ||
>      (base == 2  && (str[1] == 'b' || str[1] == 'B'))))
>
> But str[1] is never accessed due to short-circuit evaluation.
>
> Coverity appears to have serious problems with short-circuit evaluation
> in many places.

You might be right. But did you notice that there is much more code
beyond the large comment block in PyLong_FromString()? There might be
other code paths that push str beyond its limit.

My change adds an early opt-out in an error case and doesn't cause a
performance degradation. I'd have no hard feelings if you'd prefer a
revert, but I'd keep the modification as it causes no harm.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Closed reference leak of variable 'k' in function ste_new which wasn't decrefed

2012-09-12 Thread Stefan Krah
christian.heimes  wrote:
> summary:
>   Closed reference leak of variable 'k' in function ste_new which wasn't 
> decrefed in error cases
> 
> files:
>   Python/symtable.c |  3 ++-
>   1 files changed, 2 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/Python/symtable.c b/Python/symtable.c
> --- a/Python/symtable.c
> +++ b/Python/symtable.c
> @@ -24,7 +24,7 @@
>  void *key, int lineno, int col_offset)
>  {
>  PySTEntryObject *ste = NULL;
> -PyObject *k;
> +PyObject *k = NULL;
>  
>  k = PyLong_FromVoidPtr(key);
>  if (k == NULL)
> @@ -79,6 +79,7 @@
>  
>  return ste;
>   fail:
> +Py_XDECREF(k);
>  Py_XDECREF(ste);

I think 'k' is owned by the PySTEntryObject after it is assigned here:

ste->ste_id = k;


So ste_dealloc() will call Py_XDECREF(k) a second time.


Stefan Krah


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Closed reference leak of variable 'k' in function ste_new which wasn't decrefed

2012-09-12 Thread Christian Heimes
Am 12.09.2012 17:42, schrieb Stefan Krah:
> I think 'k' is owned by the PySTEntryObject after it is assigned here:
> 
> ste->ste_id = k;
> 
> 
> So ste_dealloc() will call Py_XDECREF(k) a second time.

You are right. I missed that ste steals the reference to k and does its
own cleanup. I've fixed the issue and moved the Py_DECREF(k) into the
ste == NULL block.

http://hg.python.org/cpython/rev/2888356cdd4e
http://hg.python.org/cpython/rev/99ab7006e466

Thanks!
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.

2012-09-12 Thread Terry Reedy

On 9/12/2012 10:22 AM, Stefan Krah wrote:

> christian.heimes wrote:
>> Fix out of bounds read in long_new() for empty bytes with an explicit base.
>> int(b'', somebase) calls PyLong_FromString() with char* of length 1

I don't know what happens internally, but such calls raise
ValueError: invalid literal for int() with base 16: ''
Of course, even if int() traps such calls before calling
PyLong_FromString, an extension writer could goof.

Does the length 1 come from the added \0?

By the way, excessively long lines in checkin messages are a nuisance 
for reading and responding ;-).



--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.

2012-09-12 Thread Christian Heimes
Am 12.09.2012 18:14, schrieb Terry Reedy:
> On 9/12/2012 10:22 AM, Stefan Krah wrote:
>> christian.heimes  wrote:
>>> Fix out of bounds read in long_new() for empty bytes with an explicit
>>> base.
>>> int(b'', somebase) calls PyLong_FromString() with char* of length 1
> 
> I don't know what happens internally, but such calls raise
> ValueError: invalid literal for int() with base 16: ''
> Of course, even if int() traps such calls before calling
> PyLong_FromString, an extension writer could goof.
> 
> Does the length 1 come from added \0?

Coverity (a static code analysis tool) claims that some code paths
may read beyond the end of the data when an empty byte string and any
base are given. Internally, b'' is converted to a null-terminated char
array (PyBytes_AS_STRING() returns a null-terminated char*).

My change shortcuts the execution path and immediately raises an
exception for the combination of b'' and some base.

> By the way, excessively long lines in checkin messages are a nuisance
> for reading and responding ;-).

Sorry! In the future I'll add more line breaks. :)

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Release of astoptimizer 0.3

2012-09-12 Thread Terry Reedy

On 9/12/2012 3:36 AM, Serhiy Storchaka wrote:


I personally would prefer a 2to3-like "modernizer" (as a separate
utility and as plugins for the IDEs), which would have found some
templates and offered replacing by a more modern, readable (and possibly
effective) variant. The decision on the applicability of the
transformation in the particular case remains for the human.


IDLE has a plug-in mechanism, though I am not familiar with it yet. It 
also has a built-in parser of some sort. It is used, for instance, to 
determine the function expression that precedes '(' in order to get the 
function object for a tool tip.


Only simple transformations which would deteriorate readability, and
optimizations which cannot be expressed in the source code, remain for
the automatic optimizer.


I had just made the same observation: some of the proposed 
optimizations are really source transformations, while others are only 
AST or even lower-level changes. We also need to differentiate changes 
which are pretty much guaranteed to be faster (at least with current 
releases) from those that might be faster only with particular 
hardware, OS, and Python versions.


--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.

2012-09-12 Thread Stefan Krah
Christian Heimes  wrote:
> Am 12.09.2012 16:22, schrieb Stefan Krah:
> > This is a false positive:
> 
> You might be right. But did you notice that there is much more code
> beyond the large comment block in PyLong_FromString()? There might be
> other code paths that push str beyond its limit.

Yes, I understand. My reasoning was different: The str[1] location Coverity
pointed out is a false positive. I checked other locations and they seem to
be okay, too.

Now, because there's so much code my first instinct would be not to touch
it unless there's a proven invalid access. This is to avoid subtle behavior
changes.


> My change adds an early opt-out in an error case and doesn't cause a
> performance degradation. I'd have no hard feelings if you'd prefer a
> revert, but I'd keep the modification as it causes no harm.

As far as I can see, only the error message is affected. Previously:

>>> int(b'', 0)
Traceback (most recent call last):
  File "", line 1, in 
ValueError: invalid literal for int() with base 10: ''


Now the fact that base=0 is converted to base=10 is lost:

>>> int(b'', 0)
Traceback (most recent call last):
  File "", line 1, in 
ValueError: invalid literal for int() with base 0: b''


No big deal of course, but still a change.



Stefan Krah


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Éric Araujo
Hi,

Lib/packaging is in the repository history, and in my backup clones, but
it’s not visible in any branch head as we have no branch for 3.4 yet.  I
can bring the directory back with a simple Mercurial command.

However, it’s not clear to me that we want to do that.  At the inception
of the project, we wanted a new distutils with support for the latest
PEPs and improved extensibility.  Then we found a number of problems in
the PEPs; the last time I pointed the problems out I got no reply but
“find a PEP dictator and propose changes”.  And when I started the
thread about removing packaging in 3.3, hundreds of replies discussed
changing the whole distutils architecture, splitting the project,
exploring new systems, etc., which is why I’m not sure that we can just
bring back packaging in 3.4 as it was and continue with our previous
roadmap.

Cheers
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Antoine Pitrou
On Wed, 12 Sep 2012 15:02:42 -0400
Éric Araujo  wrote:
> Hi,
> 
> Lib/packaging is in the repository history, and in my backup clones, but
> it’s not visible in any branch head as we have no branch for 3.4 yet.  I
> can bring the directory back with a simple Mercurial command.
> 
> However, it’s not clear to me that we want to do that.  At the inception
> of the project, we wanted a new distutils with support for the latest
> PEPs and improved extensibility.  Then we found a number of problems in
> the PEPs; the last time I pointed the problems out I got no reply but
> “find a PEP dictator and propose changes”.  And when I started the
> thread about removing packaging in 3.3, hundreds of replies discussed
> changing the whole distutils architecture, splitting the project,
> exploring new systems, etc., which is why I’m not sure that we can just
> bring back packaging in 3.4 as it was and continue with our previous
> roadmap.

People who want a whole new distutils architecture can start distutils3
(or repackaging) if they want. If I have to give my advice, I would
favour re-integrating packaging in the stdlib or, better, integrating
all changes, one by one, into distutils itself.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Robert Collins
On Thu, Sep 13, 2012 at 7:02 AM, Éric Araujo  wrote:
> Hi,
>
> Lib/packaging is in the repository history, and in my backup clones, but
> it’s not visible in any branch head as we have no branch for 3.4 yet.  I
> can bring the directory back with a simple Mercurial command.
>
> However, it’s not clear to me that we want to do that.  At the inception
> of the project, we wanted a new distutils with support for the latest
> PEPs and improved extensibility.  Then we found a number of problems in
> the PEPs; the last time I pointed the problems out I got no reply but
> “find a PEP dictator and propose changes”.  And when I started the
> thread about removing packaging in 3.3, hundreds of replies discussed
> changing the whole distutils architecture, splitting the project,
> exploring new systems, etc., which is why I’m not sure that we can just
> bring back packaging in 3.4 as it was and continue with our previous
> roadmap.
>
> Cheers

+1 - FWIW, I'd like to see the previous project drive for
consolidation done without landing in the stdlib, /until/ you've got
something that folk are willingly migrating to en masse - at that
point we'll know that the bulk of use cases are well satisfied, and
that we won't need to be fighting an uphill battle for adoption.

If folk are saying 'I would adopt but it's not in the stdlib', well - I
think that's ignorable TBH: the market of adopters that matter are
those using setuptools/distribute/$other_thing today.

I rather suspect we'll face things like 'I still support Python 2.7'
more than 'it's not in the stdlib' for now.

-Rob (who was just looking into what the state of the art to choose
was yesterday)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Lennart Regebro
On Wed, Sep 12, 2012 at 9:02 PM, Éric Araujo  wrote:
> “find a PEP dictator and propose changes”.  And when I started the
> thread about removing packaging in 3.3, hundreds of replies discussed
> changing the whole distutils architecture, splitting the project,
> exploring new systems, etc.,

Yes, yes, but that's just the same old drama that pops up every time
this is discussed with the same old arguments all over again. We'll
never get anywhere if we care about *that*.

The way to go forward is via PEPs, fix them if needed, implement in a
separate package, stick into stdlib once it works.

//Lennart
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Daniel Holth
On Wed, Sep 12, 2012 at 3:28 PM, Lennart Regebro  wrote:
> On Wed, Sep 12, 2012 at 9:02 PM, Éric Araujo  wrote:
>> “find a PEP dictator and propose changes”.  And when I started the
>> thread about removing packaging in 3.3, hundreds of replies discussed
>> changing the whole distutils architecture, splitting the project,
>> exploring new systems, etc.,
>
> Yes, yes, but that's just the same old drama that pops up every time
> this is discussed with the same old arguments all over again. We'll
> never get anywhere if we care about *that*.
>
> The way to go forward is via PEPs, fix them if needed, implement in a
> separate package, stick into stdlib once it works.
>
> //Lennart

I'm happy to note that as of version 0.6.28 distutils (the setuptools
fork) can now consume PEP 345 / 376 "Database of Installed Python
Distributions" installations. Entry points could probably go in as an
extension to the metadata, but at the moment they work as
entry_points.txt with no changes and would be harmlessly ignored by
"import packaging".
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Éric Araujo
> I'm happy to note that as of version 0.6.28 distutils (the setuptools
> fork)

You certainly mean distribute. :-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Brett Cannon
On Wed, Sep 12, 2012 at 3:28 PM, Lennart Regebro  wrote:

> On Wed, Sep 12, 2012 at 9:02 PM, Éric Araujo  wrote:
> > “find a PEP dictator and propose changes”.  And when I started the
> > thread about removing packaging in 3.3, hundreds of replies discussed
> > changing the whole distutils architecture, splitting the project,
> > exploring new systems, etc.,
>
> Yes, yes, but that's just the same old drama that pops up every time
> this is discussed with the same old arguments all over again. We'll
> never get anywhere if we care about *that*.
>
> The way to go forward is via PEPs, fix them if needed, implement in a
> separate package, stick into stdlib once it works.
>

I agree with Lennart's and Antoine's advice of just move forward with what
we have. If some PEPs need fixing then let's fix them, but we don't need to
rock the horse even more by going overboard. Getting the sane, core bits
into the stdlib as packaging is meant to is plenty to take on. If people
want to reinvent stuff they can do it elsewhere. I personally don't care if
it is done inside or outside the stdlib initially or if it stays in
packaging or goes directly into distutils, but forward movement with what
we have is the most important thing.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread R. David Murray
On Wed, 12 Sep 2012 18:07:42 -0400, Brett Cannon  wrote:
> On Wed, Sep 12, 2012 at 3:28 PM, Lennart Regebro  wrote:
> 
> > On Wed, Sep 12, 2012 at 9:02 PM, Éric Araujo  wrote:
> > > “find a PEP dictator and propose changes”.  And when I started the
> > > thread about removing packaging in 3.3, hundreds of replies discussed
> > > changing the whole distutils architecture, splitting the project,
> > > exploring new systems, etc.,
> >
> > Yes, yes, but that's just the same old drama that pops up every time
> > this is discussed with the same old arguments all over again. We'll
> > never get anywhere if we care about *that*.
> >
> > The way to go forward is via PEPs, fix them if needed, implement in a
> > separate package, stick into stdlib once it works.
> >
> 
> I agree with Lennart's and Antoine's advice of just move forward with what
> we have. If some PEPs need fixing then let's fix them, but we don't need to
> rock the horse even more by going overboard. Getting the sane, core bits
> into the stdlib as packaging is meant to is plenty to take on. If people
> want to reinvent stuff they can do it elsewhere. I personally don't care if
> it is done inside or outside the stdlib initially or if it stays in
> packaging or goes directly into distutils, but forward movement with what
> we have is the most important thing.

When the removal was being pondered, the possibility of keeping certain
bits that were more ready than others was discussed.  Perhaps the best
way forward is to put it back in bits, with the most finished (and PEP
relevant) stuff going in first.  That might also give non-packaging
people bite-sized-enough chunks to actually digest and help with.

--David
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] packaging location ?

2012-09-12 Thread Nick Coghlan
On Thu, Sep 13, 2012 at 8:43 AM, R. David Murray  wrote:
> When the removal was being pondered, the possibility of keeping certain
> bits that were more ready than others was discussed.  Perhaps the best
> way forward is to put it back in bits, with the most finished (and PEP
> relevant) stuff going in first.  That might also give non-packaging
> people bite-sized-enough chunks to actually digest and help with.

This is the plan I'm going to propose. The previous approach was to
just throw the entirety of distutils2 in there, but there are some
hard questions that doesn't address, and some use cases it doesn't
handle. So, rather than importing it wholesale and making the stdlib
the upstream for distutils2, I believe it makes more sense for
distutils2 to remain an independent project, and we cherry pick bits
and pieces for the standard library's new packaging module as they
stabilise.

In particular, Tarek was focused on being able to create *binary* RPMs
automatically. That isn't enough for my purposes, I need to be able to
create *source* RPMs, which can then be fed to the koji build service
for conversion to binary RPMs in accordance with the (ideally)
autogenerated spec file. A binary RPM that isn't built from a source
RPM is no good to me, and the distutils2 support for this approach is
awful, because setup.cfg inherits all the command model cruft from
distutils which is stupidly hard to integrate with other build
systems. I also want to be able to automate most dependency
management, so people can write "Requires: python(pypi-dist-name)" in
their RPM spec files and have it work, just as they can already write
things like "Requires: perl(File::Rules)"

I'm currently working on creating a coherent map of the status quo,
that describes the overall process of software distribution across
various phases (from development -> source archive -> building ->
binary archive -> installation -> import) and looks at the tools and
formats which exist at each step, both legacy
(distutils/setuptools/distribute) and proposed (e.g. distutils2,
bento, wheel), and the kinds of tasks which need to be automated.

Part of the problem with distutils is that the phases of software
distribution are not clearly documented, instead being implicit in the
behaviour of setuptools. The distutils2 project, to date, has not
remedied that deficiency, instead retaining the implicit overall
workflow and just hacking on various pieces in order to "fix Python
packaging". If we're going to get support from the scientific
community (which has some of the more exotic build requirements going
around) and the existing community that wants the full setuptools
feature set rather than the subset currently standardised (primarily
non-Django web developers in my experience), then we need to address
*their* concerns as well, not just the concerns of those of us that
don't like the opaque nature of setuptools and its preference for
guessing in the presence of ambiguity.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com