Re: [Python-Dev] CPython build options for out-of-the box performance

2016-02-14 Thread Patrascu, Alecsandru
I've added the patches here[1], to make the workflow and the small 
modifications to the CPython build system clearer.

[1] http://bugs.python.org/issue26359

Thank you,
Alecsandru

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-
> [email protected]] On Behalf Of Patrascu,
> Alecsandru
> Sent: Tuesday, February 9, 2016 1:45 PM
> To: [email protected]
> Subject: [Python-Dev] CPython build options for out-of-the box performance
> 
> Hi all,
> 
> This is Alecsandru from the Dynamic Scripting Languages Optimization Team
> at Intel Corporation. I want to open a discussion regarding the way
> CPython is built, mainly the options that are available to the
> programmers. Looking at the CPython ecosystem, we can see that many users
> just download the sources, run the commands "./configure", "make" and
> "make install" once, and then use the result for their Python scripts.
> One problem with this workflow is that these users do not benefit from
> all the optimization features that exist in the build system, such as
> PGO and LTO.
> 
> Therefore, I propose a workflow like the following. Assuming some work
> has to be done on the CPython interpreter, a developer can follow these
> steps:
> A. Implementation and debugging phase.
> 1. The command "./configure PYDIST=debug" is run once. It will enable
> the Py_DEBUG, -O0 and -g flags
> 2. The command "make" is run once or multiple times
> 
> B. Testing the implementation from step A, in a pre-release environment
> 1. The command "./configure PYDIST=devel" is run once. It will disable
> the Py_DEBUG flag and enable the -O3 and -g flags, which is just like
> the current implementation in CPython
> 2. The command "make" is run once or multiple times
> 
> C. For any other CPython usage, for example distributing the interpreter,
> installing it inside an operating system, or just the majority of users
> who are not CPython developers and only want to compile it once and use it
> as-is:
> 1. The command "./configure" is run once. Alternatively, the command
> "./configure PYDIST=release" can be used. It will disable all debugging
> functionality, enable the -O3 flag and enable PGO and LTO.
> 2. The command "make" is run once
> 
> If you think this benefits CPython, I can create an issue and post the
> patches that enable all of the above.
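
The three proposed modes can be summarized as a small lookup table. This is
only a sketch in Python: PYDIST is the variable proposed in this message, the
flag sets are copied from the description above, and the names PYDIST_MODES,
configure_command and settings_for are made up for illustration. None of this
is part of an existing configure script.

```python
# Hypothetical summary of the proposed PYDIST modes. The flag sets follow the
# description in the message; they are not read from a real build system.
PYDIST_MODES = {
    "debug": {"py_debug": True, "cflags": ["-O0", "-g"], "pgo": False, "lto": False},
    "devel": {"py_debug": False, "cflags": ["-O3", "-g"], "pgo": False, "lto": False},
    "release": {"py_debug": False, "cflags": ["-O3"], "pgo": True, "lto": True},
}


def configure_command(mode=None):
    """Return the proposed ./configure invocation; no mode means release."""
    if mode is None:
        return "./configure"
    if mode not in PYDIST_MODES:
        raise ValueError(f"unknown PYDIST mode: {mode!r}")
    return f"./configure PYDIST={mode}"


def settings_for(mode=None):
    """Return the settings for a mode, defaulting to release as proposed."""
    return PYDIST_MODES[mode or "release"]


for name in ("debug", "devel", None):
    print(configure_command(name), settings_for(name))
```

The point of the default is visible in the last iteration: a plain
"./configure" would get the release settings, PGO and LTO included.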
> 
> Thank you,
> Alecsandru
> 
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-
> dev/alecsandru.patrascu%40intel.com
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Regular expression bytecode

2016-02-14 Thread Jonathan Goble
I'm new to Python's mailing lists, so please forgive me if I'm sending
this to the wrong list. :)

I filed http://bugs.python.org/issue26336 a few days ago, but now I
think this list might be a better place to get discussion going.
Basically, I'd like to see the bytecode of a compiled regex object
exposed as a public (probably read-only) attribute of the object.

Currently, although compiled in pure Python through modules
sre_compile and sre_parse, the list of opcodes is then passed into C
and copied into an array in a C struct, without being publicly exposed
in any way. The only way for a user to get an internal representation
of the regex is the re.DEBUG flag, which only produces an intermediate
representation rather than the actual bytecode and only goes to
stdout, which makes it useless for someone who wants to examine it
programmatically.
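
To illustrate the limitation, here is a minimal sketch of the workaround that
is available today: redirecting stdout while compiling with re.DEBUG. The
helper name capture_debug_dump is made up for this example, and the exact text
of the dump differs between Python versions, so it is only treated as opaque
lines here.

```python
import contextlib
import io
import re


def capture_debug_dump(pattern):
    """Compile `pattern` with re.DEBUG and return what it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, re.DEBUG)
    return buffer.getvalue()


dump = capture_debug_dump(r"ab+")
for line in dump.splitlines():
    print(repr(line))
```

Even with this trick the result is text that has to be parsed again, which is
the inconvenience a public attribute would avoid.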

I'm sure others can think of other potential use cases for this, but
one in particular would be that someone could write a debugger that
can allow a user to step through a regex one opcode at a time to see
exactly where it is failing. It would also perhaps be nice to have a
public constructor for the regex object type, which would enable users
to modify the bytecode and directly create a new regex object from it,
similar to what is currently possible through the types.FunctionType
and types.CodeType constructors.
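
For comparison, this is roughly what the existing facility looks like for
Python functions. It is a sketch only: it uses code.replace() (available since
Python 3.8) rather than calling the types.CodeType constructor directly,
because the constructor's argument list changes between versions. The function
names greet and farewell are just examples.

```python
import types


def greet():
    return "hello"


code = greet.__code__

# Swap one constant in the code object and leave everything else untouched.
new_consts = tuple("goodbye" if const == "hello" else const
                   for const in code.co_consts)
new_code = code.replace(co_consts=new_consts)

# Build a brand-new function object around the modified code object.
farewell = types.FunctionType(new_code, greet.__globals__, "farewell")

print(greet(), farewell())
```

A public constructor for the regex type would allow the analogous round trip:
read the bytecode, change it, and build a new pattern object from it.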

In addition to exposing the code in a public attribute, a helper
module written in Python similar to the dis module (which is for
Python's own bytecode) would be very helpful, allowing the code to be
easily disassembled and examined at a higher level.
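
As a reference point, this is the kind of view the dis module already gives
for Python's own bytecode. The opcode names that appear depend on the
interpreter version, so the sketch prints them without assuming a particular
listing.

```python
import dis


def add(a, b):
    return a + b


# Each Instruction carries the offset, a symbolic opcode name and a readable
# rendering of the argument, so callers never touch the raw bytes.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

A regex counterpart would presumably yield similar records for the opcodes of
a compiled pattern.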

Is this a good idea, or am I barking up the wrong tree? I think it's a
great idea, but I'm open to being told this is a horrible idea. :) I
welcome any and all comments both here and on the bug tracker.

Jonathan Goble


Re: [Python-Dev] Regular expression bytecode

2016-02-14 Thread Franklin? Lee
I think it would be nice for manipulating (e.g. optimizing, possibly with
JIT-like analysis) and comparing regexes. It can also be useful as a
teaching tool, e.g. exercises in optimizing and comparing regexes.

I think the discussion should be on python-ideas, though.
On Feb 14, 2016 2:01 PM, "Jonathan Goble"  wrote:

> I'm new to Python's mailing lists, so please forgive me if I'm sending
> this to the wrong list. :)
>
> I filed http://bugs.python.org/issue26336 a few days ago, but now I
> think this list might be a better place to get discussion going.
> Basically, I'd like to see the bytecode of a compiled regex object
> exposed as a public (probably read-only) attribute of the object.
>
> Currently, although compiled in pure Python through modules
> sre_compile and sre_parse, the list of opcodes is then passed into C
> and copied into an array in a C struct, without being publicly exposed
> in any way. The only way for a user to get an internal representation
> of the regex is the re.DEBUG flag, which only produces an intermediate
> representation rather than the actual bytecode and only goes to
> stdout, which makes it useless for someone who wants to examine it
> programmatically.
>
> I'm sure others can think of other potential use cases for this, but
> one in particular would be that someone could write a debugger that
> can allow a user to step through a regex one opcode at a time to see
> exactly where it is failing. It would also perhaps be nice to have a
> public constructor for the regex object type, which would enable users
> to modify the bytecode and directly create a new regex object from it,
> similar to what is currently possible through the types.FunctionType
> and types.CodeType constructors.
>
> In addition to exposing the code in a public attribute, a helper
> module written in Python similar to the dis module (which is for
> Python's own bytecode) would be very helpful, allowing the code to be
> easily disassembled and examined at a higher level.
>
> Is this a good idea, or am I barking up the wrong tree? I think it's a
> great idea, but I'm open to being told this is a horrible idea. :) I
> welcome any and all comments both here and on the bug tracker.
>
> Jonathan Goble


[Python-Dev] Wordcode v2

2016-02-14 Thread Demur Rumed
Saw recent discussion:
https://mail.python.org/pipermail/python-dev/2016-February/143013.html

I remember trying WPython; it was fast. Unfortunately it feels like it came
at the wrong time, when development was invested in getting py3k out the door.
It also had a lot of other ideas, like *_INT instructions, which allowed
the oparg to be a constant int rather than needing to LOAD_CONST one.
Anyway, I'll stop reminiscing.

abarnert has started an experiment with wordcode:
https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md

I've personally benchmarked this fork with positive results. This
experiment seeks to be conservative: it doesn't introduce new opcodes, and
it doesn't combine all the BINARY_* ops into a single op whose arg
(currently unused in wordcode) would state the kind of binary op (à la
COMPARE_OP). I've submitted a pull request that works on fixing tests and
updating peephole.c.

Bringing this up on the list to figure out if there's interest in a basic
wordcode change. It feels like the format itself has no downsides: faster
code, smaller bytecode, simpler interpretation of bytecode (the Nth
instruction starts at byte 2N if you count EXTENDED_ARG as an instruction).
The only downside is the transitional cost.
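
The fixed-width property can be sketched in a few lines. This assumes the
layout described above, one opcode byte followed by one argument byte; the
byte values in sample are made up and are not real opcode numbers, and
iter_wordcode is a name invented for this example.

```python
def iter_wordcode(code):
    """Yield (index, opcode, arg) for fixed-width two-byte instructions."""
    if len(code) % 2:
        raise ValueError("wordcode length must be even")
    for n in range(len(code) // 2):
        # Instruction N always lives at bytes 2N and 2N + 1.
        yield n, code[2 * n], code[2 * n + 1]


# Made-up opcode/argument pairs, only to exercise the decoder.
sample = bytes([100, 0, 100, 1, 23, 0, 83, 0])
decoded = list(iter_wordcode(sample))

for index, opcode, arg in decoded:
    print(index, opcode, arg)
```

With variable-length bytecode the same loop needs to inspect each opcode to
know how far to advance, which is the extra work the fetch code avoids here.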

What'd be necessary for this to be pulled upstream?


Re: [Python-Dev] Regular expression bytecode

2016-02-14 Thread Jonathan Goble
On Sun, Feb 14, 2016 at 2:41 PM, Franklin? Lee
 wrote:
> I think it would be nice for manipulating (e.g. optimizing, possibly with
> JIT-like analysis) and comparing regexes. It can also be useful as a
> teaching tool, e.g. exercises in optimizing and comparing regexes.

Both great points in favor of this.

> I think the discussion should be on python-ideas, though.

Thanks for being gentle with the correction. :) I'll resend it over
there later tonight when I have some more time on my hands.


Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Guido van Rossum
I think it's probably too soon to discuss on python-dev, but I do
think that something like this could be attempted in 3.6 or (more
likely) 3.7, if it really is faster.

An unfortunate issue however is that many projects seem to make a
hobby of hacking bytecode. All those projects would have to be totally
rewritten in order to support the new wordcode format (as opposed to
just having to be slightly adjusted to support the occasional new
bytecode opcode). Those projects of course don't work with Pypy or
Jython either, but they do work for mainstream CPython, and it's
unacceptable to just leave them all behind.

As an example, AFAIK coverage.py interprets bytecode. This is an
important piece of infrastructure that we wouldn't want to leave
behind. I think py.test's assert-rewrite code also generates or looks
at bytecode. Also important.

All of which means that it's more likely to make it into 3.7. See you
on python-ideas!

--Guido

On Sun, Feb 14, 2016 at 4:20 PM, Demur Rumed  wrote:
> Saw recent discussion:
> https://mail.python.org/pipermail/python-dev/2016-February/143013.html
>
> I remember trying WPython; it was fast. Unfortunately it feels it came at
> the wrong time when development was invested in getting py3k out the door.
> It also had a lot of other ideas like *_INT instructions which allowed
> having oparg to be a constant int rather than needing to LOAD_CONST one.
> Anyways I'll stop reminiscing
>
> abarnert has started an experiment with wordcode:
> https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md
>
> I've personally benchmarked this fork with positive results. This experiment
> seeks to be conservative-- it doesn't seek to introduce new opcodes or
> combine BINARY_OP's all into a single op where the currently
> unused-in-wordcode arg then states the kind of binary op (à la COMPARE_OP).
> I've submitted a pull request which is working on fixing tests & updating
> peephole.c
>
> Bringing this up on the list to figure out if there's interest in a basic
> wordcode change. It feels like there's no downsides: faster code, smaller
> bytecode, simpler interpretation of bytecode (The Nth instruction starts at
> the 2Nth byte if you count EXTENDED_ARG as an instruction). The only
> downside is the transitional cost
>
> What'd be necessary for this to be pulled upstream?
>
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Maciej Fijalkowski
On Mon, Feb 15, 2016 at 4:05 AM, Guido van Rossum  wrote:
> I think it's probably too soon to discuss on python-dev, but I do
> think that something like this could be attempted in 3.6 or (more
> likely) 3.7, if it really is faster.
>
> An unfortunate issue however is that many projects seem to make a
> hobby of hacking bytecode. All those projects would have to be totally
> rewritten in order to support the new wordcode format (as opposed to
> just having to be slightly adjusted to support the occasional new
> bytecode opcode). Those projects of course don't work with Pypy or
> Jython either, but they do work for mainstream CPython, and it's
> unacceptable to just leave them all behind.

They mostly work with PyPy (which has 2 or 3 additional bytecodes, but
nothing too dramatic).

>
> As an example, AFAIK coverage.py interprets bytecode. This is an
> important piece of infrastructure that we wouldn't want to leave
> behind. I think py.test's assert-rewrite code also generates or looks
> at bytecode. Also important.
>
> All of which means that it's more likely to make it into 3.7. See you
> on python-ideas!
>
> --Guido
>
> On Sun, Feb 14, 2016 at 4:20 PM, Demur Rumed  wrote:
>> Saw recent discussion:
>> https://mail.python.org/pipermail/python-dev/2016-February/143013.html
>>
>> I remember trying WPython; it was fast. Unfortunately it feels it came at
>> the wrong time when development was invested in getting py3k out the door.
>> It also had a lot of other ideas like *_INT instructions which allowed
>> having oparg to be a constant int rather than needing to LOAD_CONST one.
>> Anyways I'll stop reminiscing
>>
>> abarnert has started an experiment with wordcode:
>> https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md
>>
>> I've personally benchmarked this fork with positive results. This experiment
>> seeks to be conservative-- it doesn't seek to introduce new opcodes or
>> combine BINARY_OP's all into a single op where the currently
>> unused-in-wordcode arg then states the kind of binary op (à la COMPARE_OP).
>> I've submitted a pull request which is working on fixing tests & updating
>> peephole.c
>>
>> Bringing this up on the list to figure out if there's interest in a basic
>> wordcode change. It feels like there's no downsides: faster code, smaller
>> bytecode, simpler interpretation of bytecode (The Nth instruction starts at
>> the 2Nth byte if you count EXTENDED_ARG as an instruction). The only
>> downside is the transitional cost
>>
>> What'd be necessary for this to be pulled upstream?
>>
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Andrew Barnert via Python-Dev
On Feb 14, 2016, at 19:05, Guido van Rossum  wrote:
> 
> I think it's probably too soon to discuss on python-dev, but I do
> think that something like this could be attempted in 3.6 or (more
> likely) 3.7, if it really is faster.
> 
> An unfortunate issue however is that many projects seem to make a
> hobby of hacking bytecode.
> All those projects would have to be totally
> rewritten in order to support the new wordcode format (as opposed to
> just having to be slightly adjusted to support the occasional new
> bytecode opcode).

This is part of why I suggested, on -ideas, that we should add a 
mutating/assembling API to the dis module. People argued that such an API would 
make the bytecode format more fragile, but the exact opposite is true.

At the dis level, everything is unchanged by wordcode, or by Serhiy's 
args-packed-in-opcode. So suppose the dis module could do everything that, 
say, the third-party byteplay module does (which wouldn't take much). Then 
things like coverage.py, or the various special-case optimizer decorators on 
PyPI and ActiveState, could all be written to deal with the dis module 
format rather than raw bytecode, and we could make changes like this without 
risking nearly as much breakage.
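
As a rough illustration of the read-only half of that idea, here is a small
analysis written purely against dis.get_instructions. It never looks at
co_code, so it does not care how instructions are encoded. The helper
referenced_globals and the names sample and DATA are invented for this
sketch; the mutating/assembling API discussed above does not exist in dis
and is not shown.

```python
import dis

DATA = [1, 2, 3]


def sample():
    return len(DATA)


def referenced_globals(func):
    """Collect names loaded as globals, using only symbolic instructions."""
    return {
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL"
    }


print(sorted(referenced_globals(sample)))
```

A tool built this way depends on opcode names and argument values, both of
which are the same whether the underlying format is bytecode or wordcode.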

Anyway, this obviously wouldn't help the transition for 3.6. But improving dis 
in 3.6, with a warning that raw bytecode might start changing more frequently 
and/or radically in the future now that there's less reason to depend on it, 
might help if wordcode were to go into 3.7.

> All of which means that it's more likely to make it into 3.7. See you
> on python-ideas!
> 
> --Guido
> 
>> On Sun, Feb 14, 2016 at 4:20 PM, Demur Rumed  wrote:
>> Saw recent discussion:
>> https://mail.python.org/pipermail/python-dev/2016-February/143013.html
>> 
>> I remember trying WPython; it was fast. Unfortunately it feels it came at
>> the wrong time when development was invested in getting py3k out the door.
>> It also had a lot of other ideas like *_INT instructions which allowed
>> having oparg to be a constant int rather than needing to LOAD_CONST one.
>> Anyways I'll stop reminiscing

Despite the name (and inspiration), my fork has very little to do with WPython. 
I'm just focused on simpler (hopefully = faster) fetch code; he started with 
that, but ended up going the exact opposite direction, accepting more 
complicated (and much slower) fetch code as a reasonable cost for drastically 
reducing the number of instructions. (If you double the 30% fetch-and-parse 
overhead per instruction, but cut the number of instructions to 40%, the net is 
a huge win.)
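
The parenthetical can be worked through with a toy cost model. The 30% and
40% figures come from the sentence above; everything else is an assumption.
In particular, the message does not say whether the non-fetch work shrinks
along with the instruction count, so the sketch computes both readings as
rough bounds rather than as measurements.

```python
FETCH_SHARE = 0.30               # fetch-and-parse share of one baseline instruction
WORK_SHARE = 1.0 - FETCH_SHARE   # remaining per-instruction cost in the baseline
INSTRUCTION_RATIO = 0.40         # instructions executed, relative to the baseline


def cost_if_work_shrinks_too():
    """Every remaining instruction costs doubled fetch plus unchanged work."""
    per_instruction = 2 * FETCH_SHARE + WORK_SHARE
    return INSTRUCTION_RATIO * per_instruction


def cost_if_work_is_constant():
    """Only fetch overhead scales with the instruction count."""
    return INSTRUCTION_RATIO * 2 * FETCH_SHARE + WORK_SHARE


print(f"optimistic reading:  {cost_if_work_shrinks_too():.2f} of baseline")
print(f"pessimistic reading: {cost_if_work_is_constant():.2f} of baseline")
```

Under the first reading the total drops to about half of the baseline; under
the second it still drops, but only slightly. Either way, doubling the fetch
cost is more than paid for by executing fewer instructions.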



>> 
>> abarnert has started an experiment with wordcode:
>> https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md
>> 
>> I've personally benchmarked this fork with positive results. This experiment
>> seeks to be conservative-- it doesn't seek to introduce new opcodes or
>> combine BINARY_OP's all into a single op where the currently
>> unused-in-wordcode arg then states the kind of binary op (à la COMPARE_OP).
>> I've submitted a pull request which is working on fixing tests & updating
>> peephole.c
>> 
>> Bringing this up on the list to figure out if there's interest in a basic
>> wordcode change. It feels like there's no downsides: faster code, smaller
>> bytecode, simpler interpretation of bytecode (The Nth instruction starts at
>> the 2Nth byte if you count EXTENDED_ARG as an instruction). The only
>> downside is the transitional cost
>> 
>> What'd be necessary for this to be pulled upstream?
>> 
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Greg Ewing

Guido van Rossum wrote:
> An unfortunate issue however is that many projects seem to make a
> hobby of hacking bytecode. All those projects would have to be totally
> rewritten in order to support the new wordcode format

Maybe this argues for having an assembly-language-like
intermediate form between the AST and the actual code
used by the interpreter? Done properly it could make
things easier for bytecode-hacking projects as well as
providing some insulation from implementation details.
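
A very small sketch of what such an intermediate form could look like:
tools work with symbolic (mnemonic, argument) pairs, and only assemble and
disassemble know the byte layout. The opcode numbers, the two-byte encoding
and all names here are hypothetical; this is not CPython's real instruction
set or any existing API.

```python
# Hypothetical opcode numbers, chosen only for this sketch.
OPCODES = {"LOAD_CONST": 1, "BINARY_ADD": 2, "RETURN_VALUE": 3}
MNEMONICS = {number: name for name, number in OPCODES.items()}


def assemble(program):
    """Encode (mnemonic, arg) pairs as two bytes per instruction."""
    out = bytearray()
    for mnemonic, arg in program:
        if not 0 <= arg <= 255:
            raise ValueError(f"argument {arg} does not fit in one byte")
        out += bytes([OPCODES[mnemonic], arg])
    return bytes(out)


def disassemble(code):
    """Decode two-byte instructions back into (mnemonic, arg) pairs."""
    return [(MNEMONICS[code[i]], code[i + 1]) for i in range(0, len(code), 2)]


program = [
    ("LOAD_CONST", 0),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", 0),
    ("RETURN_VALUE", 0),
]
encoded = assemble(program)
print(encoded.hex(), disassemble(encoded))
```

If the interpreter's encoding changed, only these two functions would need to
follow it; code that manipulates the symbolic list would stay the same.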

--
Greg