Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Maciej Fijalkowski
Hi Victor

Even though I have not yet found time to run your stuff, thanks for
all the awesome work!

On Thu, Oct 20, 2016 at 12:56 PM, Victor Stinner
 wrote:
> Hi,
>
> In recent months, I have worked a lot on benchmarks: I ran benchmarks,
> analyzed the results in depth (down to the hardware and kernel drivers!),
> and wrote new tools and enhanced existing ones.
>
> * I wrote a new perf module which runs benchmarks in a reliable way
> and contains a LOT of features: metadata collection, a JSON file format,
> commands to compare results, histogram rendering, etc.
>
> * I rewrote the Python benchmark suite: the old benchmarks Mercurial
> repository moved to a new performance GitHub project which uses my
> perf module and contains more benchmarks.
>
> * I also made minor enhancements to timeit in Python 3.7 -- some
> developers don't want major changes, so as not to break backward
> compatibility.
>
> For timeit, I suggest using my perf tool, which includes a reliable
> timeit command and has many more features, like --duplicate (repeat the
> statements to reduce the cost of the outer loop) and --compare-to
> (compare two versions of Python), as well as all the builtin perf features
> (JSON output, statistics, histogram, etc.).
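As an inline illustration of what --compare-to automates: with only the
stdlib, a rough comparison can be sketched with timeit.repeat (the helper
name below is made up for this sketch; perf does all of this, plus
statistics and JSON output, for you):

```python
import timeit

def rough_compare(stmt_a, stmt_b, setup="pass", repeat=5, number=20_000):
    """Time two statements and return the best (minimum) of several
    repeats for each, as the stdlib timeit does by default."""
    best_a = min(timeit.repeat(stmt_a, setup=setup, repeat=repeat, number=number))
    best_b = min(timeit.repeat(stmt_b, setup=setup, repeat=repeat, number=number))
    return best_a, best_b

# Compare a builtin against a hand-written loop over the same data.
a, b = rough_compare("sum(seq)",
                     "s = 0\nfor x in seq: s += x",
                     setup="seq = list(range(100))")
print(f"sum(): {a:.4f}s  manual loop: {b:.4f}s")
```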
>
> I added benchmarks from PyPy and Pyston benchmark suites to
> performance: performance 0.3.1 contains 51 benchmark scripts which run
> a total of 121 benchmarks. Example of tested Python modules:
>
> * SQLAlchemy
> * Dulwich (full Git implementation in Python)
> * Mercurial (currently only the startup time)
> * html5lib
> * pyaes (AES crypto cipher in pure Python)
> * sympy
> * Tornado (HTTP client and server)
> * Django (sadly, only the template engine right now, Pyston contains
> HTTP benchmarks)
> * pathlib
> * spambayes
>
> More benchmarks will be added later. It would be nice to add
> benchmarks for numpy, for example; numpy is important to a large part
> of our community.
>
> All these (new or updated) tools can now be used to make smarter
> decisions about optimizations. Please don't push any optimization anymore
> without providing reliable benchmark results!
>
>
> My first major action was to close the latest attempt to
> micro-optimize int+int in Python/ceval.c,
> http://bugs.python.org/issue21955 : I closed the issue as rejected,
> because there is no significant speedup on benchmarks other than two
> (tiny) microbenchmarks. To make sure that no one loses their time
> trying to micro-optimize int+int, I even added a comment to
> Python/ceval.c :-)
>
>https://hg.python.org/cpython/rev/61fcb12a9873
>"Please don't try to micro-optimize int+int"
>
>
> The perf and performance projects are now well tested: Travis CI runs
> tests on new commits and pull requests, and the "tox" command can be used
> locally to test different Python versions, pep8, doc, ... in a single
> command.
>
>
> Next steps:
>
> * Run performance 0.3.1 on speed.python.org: the benchmark runner is
> currently stopped (and still uses the old benchmarks project). The
> website part may be updated to allow downloading full JSON files which
> include *all* information (all timings, metadata, and more).
>
> * I plan to run performance on CPython 2.7, CPython 3.7, PyPy and PyPy
> 3. Maybe also CPython 3.5 and CPython 3.6 if they don't take too much
> resources.
>
> * Later, we can consider adding more implementations of Python:
> Jython, IronPython, MicroPython, Pyston, Pyjion, etc. All benchmarks
> should be run on the same hardware to be comparable.
>
> * Later, we might also allow other projects to upload their own
> benchmark results, but we should find a solution to group benchmark
> results per benchmark runner (e.g. at least by hostname; the perf JSON
> contains the hostname) so as not to compare results from two different
> machines.
>
> * We should continue to add more benchmarks to the performance
> benchmark suite, especially benchmarks more representative of real
> applications (we have enough microbenchmarks!)
>
>
> Links:
>
> * perf: http://perf.readthedocs.io/
> * performance: https://github.com/python/performance
> * Python Speed mailing list: https://mail.python.org/mailman/listinfo/speed
> * https://speed.python.org/ (currently outdated, and doesn't use
> performance yet)
>
> See https://pypi.python.org/pypi/performance which contains even more
> links to Python benchmarks (PyPy, Pyston, Numba, Pythran, etc.)
>
> Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-20 Thread Maciej Fijalkowski
On Thu, Sep 15, 2016 at 1:27 PM, Paul Moore  wrote:
> On 15 September 2016 at 10:43, Raymond Hettinger
>  wrote:
>> Something like this will reveal the true and massive improvement in 
>> iteration speed:
>>
>>  $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" 
>> "list(d)"
>
>>py -3.5 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 10 loops, best of 3: 66.2 msec per loop
>>py -3.6 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 10 loops, best of 3: 27.8 msec per loop
>
> And for Victor:
>
>>py -3.5 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 
> Median +- std dev: 65.7 ms +- 3.8 ms
>>py -3.6 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 
> Median +- std dev: 27.9 ms +- 1.2 ms
>
> Just as a side point, perf provided essentially identical results but
> took 2 minutes as opposed to 8 seconds for timeit to do so. I
> understand why perf is better, and I appreciate all the work Victor
> did to create it, and analyze the results, but for getting a quick
> impression of how a microbenchmark performs, I don't see timeit as
> being *quite* as bad as Victor is claiming.
>
> I will tend to use perf now that I have it installed, and now that I
> know how to run a published timeit invocation using perf. It's a
> really cool tool. But I certainly won't object to seeing people
> publish timeit results (any more than I'd object to *any*
> microbenchmark).
>
> Paul

How about we just make timeit show the average and not disable the GC,
then (two of the complaints that can be fixed without changing the
execution time)?
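For context: the stdlib timeit turns the garbage collector off while
timing, and its CLI reports only the best run. Both can already be worked
around by hand (an illustrative sketch, not a proposed patch):

```python
import timeit

stmt = "[str(i) for i in range(1000)]"

# timeit disables the GC during timing by default; re-enable it via the
# setup string to measure something closer to real application behaviour.
no_gc = timeit.timeit(stmt, number=1000)
with_gc = timeit.timeit(stmt, setup="import gc; gc.enable()", number=1000)

# Report an average over repeats instead of only the minimum.
timings = timeit.repeat(stmt, setup="import gc; gc.enable()",
                        repeat=5, number=1000)
average = sum(timings) / len(timings)
print(f"GC off: {no_gc:.3f}s  GC on: {with_gc:.3f}s  average: {average:.3f}s")
```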


Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-09 Thread Maciej Fijalkowski
On Fri, Sep 9, 2016 at 10:55 AM, Antoine Pitrou  wrote:
> On Thu, 8 Sep 2016 14:20:53 -0700
> Victor Stinner  wrote:
>> 2016-09-08 13:36 GMT-07:00 Guido van Rossum :
>> > IIUC there's one small thing we might still want to change somewhere
>> > after 3.6b1 but before 3.6rc1: the order is not preserved when you
>> > delete some keys and then add some other keys. Apparently PyPy has
>> > come up with a clever solution for this, and we should probably adopt
>> > it, but it's probably best not to hurry that for 3.6b1.
>>
>> Very good news: I was wrong, Raymond Hettinger confirmed that the
>> Python 3.6 dict *already* preserves the items order in all cases. In
>> short, Python 3.6 dict = Python 3.5 OrderedDict (in fact, OrderedDict
>> has a few more methods).
>
> Is it an official feature of the language or an implementation detail?
>
> Regards
>
> Antoine.

I think an implementation detail (although I'm not opposed to having
it mentioned in the spec), but using the same/similar approach for
sets should be relatively simple, no?

PyPy has a pure-Python OrderedDict which is a wrapper around dict. For
3.6 it needs an adjustment, since new methods have shown up.
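A quick check of the behaviour discussed above; at the time of this
thread it was a CPython 3.6 implementation detail (and PyPy had behaved
this way for a while):

```python
from collections import OrderedDict

# Insertion order survives deleting a key and then inserting new ones.
d = {'a': 1, 'b': 2, 'c': 3}
del d['b']
d['d'] = 4
assert list(d) == ['a', 'c', 'd']

# OrderedDict still carries a few extra methods on top of dict,
# such as move_to_end().
od = OrderedDict(d)
od.move_to_end('a')
assert list(od) == ['c', 'd', 'a']
```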


Re: [Python-Dev] What do we do about bad slicing and possible crashes (issue 27867)

2016-08-30 Thread Maciej Fijalkowski
On Tue, Aug 30, 2016 at 2:31 PM, Dima Tisnek  wrote:
> On 30 August 2016 at 14:13, Serhiy Storchaka  wrote:
>>> 1. Detect length change and raise.
>>
>>
>> It would be a simpler solution, but I am afraid that this can break third-party
>> code that "just works" now. For example, slicing a list "just works" if the step
>> is 1. It can return something other than what the author expected if the list
>> grows, but it never crashes, and existing code can depend on the current
>> behavior. This solution is not applicable to maintained versions.
>
> Serhiy,
>
> If a dictionary is iterated in thread1 while thread2 changes the
> dictionary, thread1 currently raises RuntimeError.
>
> Would cloning current dict behaviour to slice with overridden
> __index__ make sense?
>
>
> I'd argue that 3rd-party code depending on slicing not raising an exception
> is the same as 3rd-party code depending on dict iteration not raising an
> exception; if the same container may be concurrently used in another
> thread, then the 3rd-party code is actually buggy. It's OK to break such
> code.
>
>
> Just my 2c.

I'm with Dima here.

It's more complicated than that: if third-party code relies on one
thread slicing while another thread modifies, that imposes implicit
atomicity requirements. Those specific requirements are very hard to
maintain across Python versions and Python implementations, and
replicating the exact CPython behavior (for each CPython version, too!)
is a major nightmare for such specific scenarios.

I propose the following:

* we raise an error if detected

-or-

* we define the exact behavior of modifying the collection in one
thread while another is slicing it (what do you get? what are the
guarantees? does it also apply if the list is resized?)
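For reference, this is the existing dict behaviour Dima mentions, next
to list slicing, which never raises (an illustrative sketch; the second
thread is simulated by mutating inline):

```python
# dict: mutating while iterating raises RuntimeError on the next step.
d = {'a': 1, 'b': 2}
raised = False
try:
    for key in d:
        d['c'] = 3          # simulate another thread adding a key
except RuntimeError:
    raised = True
assert raised

# list: slicing never raises; a concurrent resize just changes what
# you get back, which is the silent behaviour discussed above.
lst = list(range(5))
snapshot = lst[:3]          # always succeeds
lst.append(99)              # a resize "racing" with the slice
assert snapshot == [0, 1, 2]
```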


Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects

2016-08-30 Thread Maciej Fijalkowski
On Tue, Aug 30, 2016 at 3:00 AM, Brett Cannon  wrote:
>
>
> On Mon, Aug 29, 2016, 17:06 Terry Reedy  wrote:
>>
>> On 8/29/2016 5:38 PM, Brett Cannon wrote:
>>
>> > who objected to the new field did either for memory ("it adds another
>> > pointer to the struct that won't be typically used"), or for conceptual
>> > reasons ("the code object is immutable and you're proposing a mutable
>> > field"). The latter is addressed by not exposing the field in Python and
>>
>> Am I correct in thinking that you will also not add the new field as an
>> argument to PyCode_New?
>
>
> Correct.
>
>>
>>  > clearly stating that code should never expect the field to be filled.
>>
>> I interpret this as "The only code that should access the field should
>> be code that put something there."
>
>
> Yep, seems like a reasonable rule to follow.
>
> -brett

How do we make sure that multiple tools don't stomp on each other?
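For context, the C API that eventually landed with PEP 523 answers this
by handing each tool its own slot index (_PyEval_RequestCodeExtraIndex).
A pure-Python analogue of that idea, purely illustrative (none of these
names are real APIs):

```python
_next_slot = 0

def request_extra_index():
    """Each tool (JIT, profiler, debugger) asks for its own slot once,
    so no two tools ever write to the same place."""
    global _next_slot
    slot = _next_slot
    _next_slot += 1
    return slot

def set_extra(co_extra, index, value):
    """Grow the per-code-object scratch list on demand and store value."""
    while len(co_extra) <= index:
        co_extra.append(None)
    co_extra[index] = value

jit_slot = request_extra_index()
profiler_slot = request_extra_index()

co_extra = []                          # one scratch list per code object
set_extra(co_extra, jit_slot, b"native code")
set_extra(co_extra, profiler_slot, {"calls": 0})
assert co_extra[jit_slot] == b"native code"
assert co_extra[profiler_slot] == {"calls": 0}
```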


Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects

2016-08-29 Thread Maciej Fijalkowski
Hi Brett

For what it's worth, vmprof and similar tools would love such a field
(there is an open question of how you can use vmprof *and* another tool
at the same time, but that can come later).

On Mon, Aug 29, 2016 at 11:38 PM, Brett Cannon  wrote:
> For quick background for those that don't remember, part of PEP 523 proposed
> adding a co_extra field to code objects along with making the frame
> evaluation function pluggable
> (https://www.python.org/dev/peps/pep-0523/#expanding-pycodeobject). The idea
> was that things like JITs and debuggers could use the field as a scratch
> space of sorts to store data related to the code object. People who objected
> to the new field did either for memory ("it adds another pointer to the
> struct that won't be typically used"), or for conceptual reasons ("the code
> object is immutable and you're proposing a mutable field"). The latter is
> addressed by not exposing the field in Python and clearly stating that code
> should never expect the field to be filled.
>
> For the former issue of whether the memory was worth it, Dino has been
> testing whether the field is necessary for performance from a JIT
> perspective. Well, Dino found the time to test Pyjion without the co_extra
> field and it isn't pretty. With the field, Pyjion is faster than stock
> Python in 15 benchmarks
> (https://github.com/Microsoft/Pyjion/tree/master/Perf). Removing the
> co_extra field and using an unordered_map from the C++ STL drops that number
> to 2. Performance is even worse if we try and use a Python dictionary
> instead.
>
> That means we still want to find a solution to attach arbitrary data to code
> objects without sacrificing performance. One proposal is what's in PEP 523
> for the extra field. Another option is to make the memory allocator for code
> objects pluggable and introduce a new flag that signals that the object was
> created using a non-default allocator. Obviously we prefer the former
> solution due to its simplicity. :)
>
> Anyway, if we could get this settled this week, so that I can land
> whatever solution we agree on (if any) next week in time for the Python
> 3.6b1 feature freeze, that would be greatly appreciated.
>


Re: [Python-Dev] Review request: issue 27350, compact ordered dict

2016-08-15 Thread Maciej Fijalkowski
On Mon, Aug 15, 2016 at 6:02 AM, Xavier Combelle
<xavier.combe...@gmail.com> wrote:
>
>
> On 10/08/2016 17:06, Maciej Fijalkowski wrote:
>> * there are nice speedups
>>
> In this blog post,
> https://morepypy.blogspot.fr/2015/01/faster-more-memory-efficient-and-more.html
> only a big speedup on a microbenchmark and small speedups on the PyPy
> benchmarks are mentioned. Is that what you call nice speedups, or is
> there something else?

Yes, making dictionaries a bit faster does not give you a huge speedup
anywhere; it gives you a small, measurable speedup almost everywhere.
That is a performance win, and a better deal than a lot of the things
CPython does.

Note that there are two PEPs (preserving order in kwargs and in class
namespaces) which would be superseded by just reviewing this patch and
merging it.

Best regards,
Maciej Fijalkowski


Re: [Python-Dev] Review request: issue 27350, compact ordered dict

2016-08-10 Thread Maciej Fijalkowski
Hello everyone.

I only took a cursory look at that one, but I would like to reiterate
that this gives huge benefits in general, and we measured nice speedups
on PyPy (where all dicts have been ordered forever):

* you can essentially kill OrderedDict, or make it almost OrderedDict =
dict (in PyPy it's a simple dict subclass that adds the one or two extra
things OrderedDict has in its API)
* there are nice speedups
* the C version of OrderedDict can be killed
* it saves memory, quite a bit on 64-bit (not everyone stores more
than 4 billion items in a dictionary)
* it solves the problem of tests relying on order in dictionaries

In short, it has no downsides.
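A sketch of the "simple dict subclass" idea mentioned above, in the
spirit of PyPy's pure-Python OrderedDict (this is an illustration that
assumes an ordered underlying dict, not PyPy's actual code):

```python
class MiniOrderedDict(dict):
    """OrderedDict-alike on top of an already-ordered dict; only a
    couple of extra API methods need to be added."""

    def move_to_end(self, key, last=True):
        value = self.pop(key)
        if last:
            self[key] = value          # re-insert at the end
        else:
            rest = list(self.items())  # rebuild with key at the front
            self.clear()
            self[key] = value
            self.update(rest)

    def popitem(self, last=True):
        key = list(self)[-1] if last else next(iter(self))
        return key, self.pop(key)

d = MiniOrderedDict(a=1, b=2, c=3)
d.move_to_end('a')
assert list(d) == ['b', 'c', 'a']
assert d.popitem(last=False) == ('b', 2)
```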

On Tue, Aug 9, 2016 at 3:12 PM, INADA Naoki  wrote:
> Hi, devs.
>
> I've implemented compact and ordered dictionary [1], which PyPy
> implemented in 2015 [2].
>
> Since it is my first large patch, I would like to have enough time for
> review cycle by Python 3.6 beta1.
>
> Could someone review it?
>
> [1] http://bugs.python.org/issue27350
> [2] 
> https://morepypy.blogspot.jp/2015/01/faster-more-memory-efficient-and-more.html
>
> --
> INADA Naoki  


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-12 Thread Maciej Fijalkowski
On Tue, Apr 12, 2016 at 1:14 PM, Jon Ribbens
 wrote:
> On Tue, Apr 12, 2016 at 06:21:04AM -0400, Isaac Morland wrote:
>> On Tue, 12 Apr 2016, Jon Ribbens wrote:
>> >>This is still a massive game of whack-a-mole.
>> >
>> >No, it still isn't. If the names blacklist had to keep being extended
>> >then you would be right, but that hasn't happened so far. Whitelists
>> >by definition contain only a small, limited number of potential moles.
>> >
>> >The only thing you found above that even remotely approaches an
>> >exploit is the decimal.getcontext() thing, and even that I don't
>> >think you could use to do any code execution.
>>
>> "I don't think"?
>>
>> Where's the formal proof?
>
> I disallowed the module completely, that's the proof.
>
>> Without a proof, this is indeed just a game of whack-a-mole.
>
> Almost no computer programs are ever "formally proved" to be secure.
> None of those that run the global Internet are. I don't see why it
> makes any sense to demand that my experiment be held to a massively
> higher standard than the rest of the code everyone relies on every day.

Jon, let me reiterate. You asked people to break it (that's the title
of the thread) and they did so almost immediately. Then you patched the
thing and asked them to break it again, and they did. The faulty
assumption here is that this procedure, repeated enough times, will
produce a secure environment. That is not how security works: you need
to be secure against people who will spend more than 5 minutes on it,
and who are not on this list or reading this incredibly long email
chain. You can't get there just by asking on the mailing list and
whacking all the examples. As others have pointed out, this particular
approach (with varying details) has been tried again and again, and the
result has always been the same: you end up either with a completely
unusable Python (the Python that can't run anything is trivially
secure) or with something insecure. I suggest you look instead at
something like the PyPy sandbox, which systematically replaces all
external calls with calls to a proxy. Because PyPy is written in
RPython, you can do that: the amount of code that needs reviewing is
relatively small, a couple of pages. The amount of code you would need
to review here to be even remotely secure is much larger: it's all the
C code you can call from your Python, with or without knowing that the
call can happen.
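One classic illustration of why the blacklist game is unwinnable: even
with builtins emptied, plain attribute access walks from a bare literal
back to every class loaded in the interpreter (shown purely for
illustration; real escapes then pick a dangerous class from that list):

```python
# Sandboxed code is given no builtins at all...
env = {'__builtins__': {}}

# ...yet a bare tuple still leads to object, and from there to every
# subclass in the process.
exec("found = ().__class__.__base__.__subclasses__()", env)

print(f"{len(env['found'])} classes reachable despite empty builtins")
```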

Cheers,
fijal


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-09 Thread Maciej Fijalkowski
I'm with Victor here. In fact, I tried (and failed) to convince Victor
that the approach is entirely unworkable when he was starting out;
don't be the next one :-)
On Sat, Apr 9, 2016 at 3:43 PM, Victor Stinner  wrote:
> Please don't lose time trying yet another sandbox inside CPython. It's just
> a waste of time. It's broken by design.
>
> Please read my email about my attempt (pysandbox):
> https://lwn.net/Articles/574323/
>
> And the LWN article:
> https://lwn.net/Articles/574215/
>
> There are a lot of safe ways to run CPython inside a sandbox (and not the
> opposite).
>
> I started out like you, adding more and more things to a blacklist, but it
> doesn't work.
>
> See the pysandbox test suite for a lot of ways to escape a sandbox. CPython
> has a list of known code that crashes CPython (I don't recall the directory
> in the sources), even with the latest version of CPython.
>
> Victor
>
>


Re: [Python-Dev] Hash randomization for which types?

2016-02-16 Thread Maciej Fijalkowski
Note that hashing in Python 2.7, and in Python 3 prior to 3.4, is
simply broken: the randomization does not do nearly enough. See
https://bugs.python.org/issue14621
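The scope of the randomization is easy to check from the outside: run
the same hash() in subprocesses with fixed PYTHONHASHSEED values (a
small sketch assuming Python 3; str hashes move with the seed, small-int
hashes never do):

```python
import os
import subprocess
import sys

def hash_in_subprocess(expr, seed):
    """Evaluate hash(expr) in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    out = subprocess.check_output(
        [sys.executable, "-c", f"print(hash({expr}))"], env=env)
    return int(out)

# Small-int hashes are the value itself, regardless of the seed.
assert hash_in_subprocess("42", "1") == hash_in_subprocess("42", "2") == 42

# str hashes are keyed by the seed, so different seeds give different hashes.
h1 = hash_in_subprocess("'abc'", "1")
h2 = hash_in_subprocess("'abc'", "2")
print(h1, h2)
```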

On Wed, Feb 17, 2016 at 4:45 AM, Shell Xu  wrote:
> I think you are right. Here is the source code from Python 2.7.11:
>
> long
> PyObject_Hash(PyObject *v)
> {
> PyTypeObject *tp = v->ob_type;
> if (tp->tp_hash != NULL)
> return (*tp->tp_hash)(v);
> /* To keep to the general practice that inheriting
>  * solely from object in C code should work without
>  * an explicit call to PyType_Ready, we implicitly call
>  * PyType_Ready here and then check the tp_hash slot again
>  */
> if (tp->tp_dict == NULL) {
> if (PyType_Ready(tp) < 0)
> return -1;
> if (tp->tp_hash != NULL)
> return (*tp->tp_hash)(v);
> }
> if (tp->tp_compare == NULL && RICHCOMPARE(tp) == NULL) {
> return _Py_HashPointer(v); /* Use address as hash value */
> }
> /* If there's a cmp but no hash defined, the object can't be hashed */
> return PyObject_HashNotImplemented(v);
> }
>
> If the object has a hash function, it is used. If not, _Py_HashPointer
> is used, in which case _Py_HashSecret is not used.
> I also checked the references to _Py_HashSecret: only bufferobject,
> unicodeobject and stringobject use _Py_HashSecret.
>
> On Wed, Feb 17, 2016 at 9:54 AM, Steven D'Aprano 
> wrote:
>>
>> On Tue, Feb 16, 2016 at 11:56:55AM -0800, Glenn Linderman wrote:
>> > On 2/16/2016 1:48 AM, Christoph Groth wrote:
>> > >Hello,
>> > >
>> > >Recent Python versions randomize the hashes of str, bytes and datetime
>> > >objects.  I suppose that the choice of these three types is the result
>> > >of a compromise.  Has this been discussed somewhere publicly?
>> >
>> > Search archives of this list... it was discussed at length.
>>
>> There's a lot of discussion on the mailing list. I think that this is
>> the very start of it, in Dec 2011:
>>
>> https://mail.python.org/pipermail/python-dev/2011-December/115116.html
>>
>> and continuing into 2012, for example:
>>
>> https://mail.python.org/pipermail/python-dev/2012-January/115577.html
>> https://mail.python.org/pipermail/python-dev/2012-January/115690.html
>>
>> and a LOT more, spread over many different threads and subject lines.
>>
>> You should also read the issue on the bug tracker:
>>
>> http://bugs.python.org/issue13703
>>
>>
>> My recollection is that it was decided that only strings and bytes need
>> to have their hashes randomized, because only strings and bytes can be
>> used directly from user-input without first having a conversion step
>> with likely input range validation. In addition, changing the hash for
>> ints would break too much code for too little benefit: unlike strings,
>> where hash collision attacks on web apps are proven and easy, hash
>> collision attacks based on ints are more difficult and rare.
>>
>> See also the comment here:
>>
>> http://bugs.python.org/issue13703#msg151847
>>
>>
>>
>> > >I'm not a web programmer, but don't web applications also use
>> > >dictionaries that are indexed by, say, tuples of integers?
>> >
>> > Sure, and that is the biggest part of the reason they were randomized.
>>
>> But they aren't, as far as I can see:
>>
>> [steve@ando 3.6]$ ./python -c "print(hash((23, 42, 99, 100)))"
>> 1071302475
>> [steve@ando 3.6]$ ./python -c "print(hash((23, 42, 99, 100)))"
>> 1071302475
>>
>> Web apps can use dicts indexed by anything that they like, but unless
>> there is an actual attack, what does it matter? Guido makes a good point
>> about security here:
>>
>> https://mail.python.org/pipermail/python-dev/2013-October/129181.html
>>
>>
>>
>> > I think hashes of all types have been randomized, not _just_ the list
>> > you mentioned.
>>
>> I'm pretty sure that's not actually the case. Using 3.6 from the repo
>> (admittedly not fully up to date though), I can see hash randomization
>> working for strings:
>>
>> [steve@ando 3.6]$ ./python -c "print(hash('abc'))"
>> 11601873
>> [steve@ando 3.6]$ ./python -c "print(hash('abc'))"
>> -2009889747
>>
>> but not for ints:
>>
>> [steve@ando 3.6]$ ./python -c "print(hash(42))"
>> 42
>> [steve@ando 3.6]$ ./python -c "print(hash(42))"
>> 42
>>
>>
>> which agrees with my recollection that only strings and bytes would be
>> randomized.
>>
>>
>>
>> --
>> Steve
>
>
>
>
> --
> 彼節者有間,而刀刃者無厚;以無厚入有間,恢恢乎其於游刃必有餘地矣。
> blog: http://shell909090.org/blog/
> twitter: @shell909090
> about.me: http://about.me/shell909090
>

Re: [Python-Dev] Wordcode v2

2016-02-14 Thread Maciej Fijalkowski
On Mon, Feb 15, 2016 at 4:05 AM, Guido van Rossum  wrote:
> I think it's probably too soon to discuss on python-dev, but I do
> think that something like this could be attempted in 3.6 or (more
> likely) 3.7, if it really is faster.
>
> An unfortunate issue however is that many projects seem to make a
> hobby of hacking bytecode. All those projects would have to be totally
> rewritten in order to support the new wordcode format (as opposed to
> just having to be slightly adjusted to support the occasional new
> bytecode opcode). Those projects of course don't work with Pypy or
> Jython either, but they do work for mainstream CPython, and it's
> unacceptable to just leave them all behind.

They mostly work with PyPy (which has 2 or 3 additional bytecodes, but
nothing too dramatic).
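Wordcode did end up landing in CPython 3.6, so the invariant Demur
describes below ("the Nth instruction starts at the 2Nth byte") can be
checked directly with dis on any modern CPython:

```python
import dis

def f(x):
    return x * 2 + 1

# Every instruction, EXTENDED_ARG included, occupies exactly two bytes.
assert len(f.__code__.co_code) % 2 == 0

# So every instruction starts at an even offset.
offsets = [ins.offset for ins in dis.get_instructions(f)]
assert all(off % 2 == 0 for off in offsets)

print(f"{len(f.__code__.co_code) // 2} wordcode units in f")
```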

>
> As an example, AFAIK coverage.py interprets bytecode. This is an
> important piece of infrastructure that we wouldn't want to leave
> behind. I think py.test's assert-rewrite code also generates or looks
> at bytecode. Also important.
>
> All of which means that it's more likely to make it into 3.7. See you
> on python-ideas!
>
> --Guido
>
> On Sun, Feb 14, 2016 at 4:20 PM, Demur Rumed  wrote:
>> Saw recent discussion:
>> https://mail.python.org/pipermail/python-dev/2016-February/143013.html
>>
>> I remember trying WPython; it was fast. Unfortunately it feels it came at
>> the wrong time when development was invested in getting py3k out the door.
>> It also had a lot of other ideas like *_INT instructions which allowed
>> having oparg to be a constant int rather than needing to LOAD_CONST one.
>> Anyways I'll stop reminiscing
>>
>> abarnert has started an experiment with wordcode:
>> https://github.com/abarnert/cpython/blob/c095a32f2a68ac708466b9c64906cc4d0f5de1ee/Python/wordcode.md
>>
>> I've personally benchmarked this fork with positive results. This experiment
>> seeks to be conservative-- it doesn't seek to introduce new opcodes or
>> combine BINARY_OP's all into a single op where the currently
>> unused-in-wordcode arg then states the kind of binary op (à la COMPARE_OP).
>> I've submitted a pull request which is working on fixing tests & updating
>> peephole.c
>>
>> Bringing this up on the list to figure out if there's interest in a basic
>> wordcode change. It feels like there's no downsides: faster code, smaller
>> bytecode, simpler interpretation of bytecode (The Nth instruction starts at
>> the 2Nth byte if you count EXTENDED_ARG as an instruction). The only
>> downside is the transitional cost
>>
>> What'd be necessary for this to be pulled upstream?
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Maciej Fijalkowski
The easiest version is to have global numbering (as opposed to local).

Anyway, I would strongly suggest getting some benchmarks done and
showing performance benefits first, because you don't want PEPs to be
final when you don't exactly know the details.
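The kind of guard PEP 509 enables can be sketched in pure Python; this
toy model (not the actual CPython implementation, where the version
lives in the C struct) bumps a version tag on mutation so a cached
lookup can revalidate cheaply:

```python
class VersionedDict(dict):
    """Toy model of PEP 509: bump a version tag on every mutation so a
    guard can skip the real lookup while the dict is unchanged."""
    version = 0                        # shadowed per instance on first bump

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1

class CachedLookup:
    """Cache one global lookup, guarded by (dict identity, version)."""
    def __init__(self, key):
        self.key = key
        self._guard = None             # (id(d), version) of the cached hit
        self._value = None

    def __call__(self, d):
        guard = (id(d), d.version)
        if guard != self._guard:       # miss: do the real dict lookup
            self._value = d[self.key]
            self._guard = guard
        return self._value             # hit: no dict lookup at all

g = VersionedDict(x=1)
lookup = CachedLookup('x')
assert lookup(g) == 1                  # first call fills the cache
assert lookup(g) == 1                  # version unchanged: cache hit
g['x'] = 2                             # mutation bumps the version
assert lookup(g) == 2                  # guard fails, cache is refilled
```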

On Wed, Jan 20, 2016 at 7:02 PM, Yury Selivanov  wrote:
> On 2016-01-18 5:43 PM, Victor Stinner wrote:
>>
>> Is someone opposed to this PEP 509?
>>
>> The main complain was the change on the public Python API, but the PEP
>> doesn't change the Python API anymore.
>>
>> I'm not aware of any remaining issue on this PEP.
>
>
> Victor,
>
> I've been experimenting with the PEP to implement a per-opcode
> cache in ceval loop (I'll share my progress on that in a few
> days).  This allows to significantly speedup LOAD_GLOBAL and
> LOAD_METHOD opcodes, to the point, where they don't require
> any dict lookups at all.  Some macro-benchmarks (such as
> chameleon_v2) demonstrate impressive ~10% performance boost.
>
> I rely on your dict->ma_version to implement cache invalidation.
>
> However, besides guarding against version change, I also want
> to guard against the dict being swapped for another dict, to
> avoid situations like this:
>
>
> def foo():
> print(bar)
>
> exec(foo.__code__, {'bar': 1}, {})
> exec(foo.__code__, {'bar': 2}, {})
>
>
> What I propose is to add a pointer "ma_extra" (same 64bits),
> which will be set to NULL for most dict instances (instead of
> ma_version).  "ma_extra" can then point to a struct that has a
> globally unique dict ID (uint64), and a version tag (uint64).
> A macro like PyDict_GET_ID and PyDict_GET_VERSION could then
> efficiently fetch the version/unique ID of the dict for guards.
>
> "ma_extra" would also make it easier for us to extend dicts
> in the future.
>
> Yury
>


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Maciej Fijalkowski
On Wed, Jan 20, 2016 at 7:22 PM, Brett Cannon  wrote:
>
>
> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov  wrote:
>>
>> On 2016-01-18 5:43 PM, Victor Stinner wrote:
>> > Is someone opposed to this PEP 509?
>> >
>> > The main complaint was the change to the public Python API, but the PEP
>> > doesn't change the Python API anymore.
>> >
>> > I'm not aware of any remaining issue on this PEP.
>>
>> Victor,
>>
>> I've been experimenting with the PEP to implement a per-opcode
>> cache in ceval loop (I'll share my progress on that in a few
>> days).  This makes it possible to significantly speed up the LOAD_GLOBAL
>> and LOAD_METHOD opcodes, to the point where they don't require
>> any dict lookups at all.  Some macro-benchmarks (such as
>> chameleon_v2) demonstrate impressive ~10% performance boost.
>
>
> Ooh, now my brain is trying to figure out the design of the cache. :)
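[Editor's note: the per-opcode cache Yury describes can be sketched in pure Python. All names here are hypothetical; the real cache lives in the ceval loop and is keyed off the opcode's offset in the code object. The idea is to remember the resolved value together with the dict version, and re-resolve only when the version moves.]

```python
class VersionedDict(dict):
    """Toy dict exposing a version counter like the PEP 509 ma_version field."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.ma_version = 0

    def __setitem__(self, key, value):
        self.ma_version += 1
        super().__setitem__(key, value)

class LoadGlobalCache:
    """Inline cache for one LOAD_GLOBAL site: while the dict version holds,
    no dict lookup is performed at all."""
    def __init__(self):
        self.version = None
        self.value = None
        self.lookups = 0          # count of real dict lookups, for illustration

    def load(self, globals_dict, name):
        if self.version == globals_dict.ma_version:
            return self.value     # fast path: version unchanged, skip lookup
        self.lookups += 1
        self.value = globals_dict[name]
        self.version = globals_dict.ma_version
        return self.value

g = VersionedDict(x=42)
cache = LoadGlobalCache()
assert cache.load(g, 'x') == 42   # first call: real dict lookup
assert cache.load(g, 'x') == 42   # second call: served from the cache
assert cache.lookups == 1
g['x'] = 99
assert cache.load(g, 'x') == 99   # version moved: one more real lookup
assert cache.lookups == 2
```

Note that mutating *any* key invalidates the cache for this site; the next call simply re-resolves and re-arms, which is why the common unchanged-namespace case stays O(1).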
>
>>
>>
>> I rely on your dict->ma_version to implement cache invalidation.
>>
>> However, besides guarding against version change, I also want
>> to guard against the dict being swapped for another dict, to
>> avoid situations like this:
>>
>>
>>  def foo():
>>  print(bar)
>>
>>  exec(foo.__code__, {'bar': 1}, {})
>>  exec(foo.__code__, {'bar': 2}, {})
>>
>>
>> What I propose is to add a pointer "ma_extra" (same 64bits),
>> which will be set to NULL for most dict instances (instead of
>> ma_version).  "ma_extra" can then point to a struct that has a
>> globally unique dict ID (uint64), and a version tag (uint64).
>> A macro like PyDict_GET_ID and PyDict_GET_VERSION could then
>> efficiently fetch the version/unique ID of the dict for guards.
>>
>> "ma_extra" would also make it easier for us to extend dicts
>> in the future.
>
>
> Why can't you simply use the id of the dict object as the globally unique
> dict ID? It's already globally unique amongst all Python objects which makes
> it inherently unique amongst dicts.
>
>

Brett, you need two things - the ID of the dict and the version tag.
What we do in pypy is we have a small object (called, surprisingly,
VersionTag()) and we use the ID of that. That way you can change the
version id of an existing dict and have only one field.
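[Editor's note: PyPy's single-field scheme, as described, can be sketched like this (hypothetical names). The dict holds one field pointing at a small tag object; a guard stores that object and compares by identity, and "bumping the version" just means installing a fresh tag.]

```python
class VersionTag:
    """Small marker object; only its identity matters."""

class PyPyStyleDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version_tag = VersionTag()   # the single extra field

    def __setitem__(self, key, value):
        self.version_tag = VersionTag()   # any mutation installs a new tag
        super().__setitem__(key, value)

d = PyPyStyleDict(spam=1)
guard_tag = d.version_tag          # the guard stores the object, not id(object)
assert d.version_tag is guard_tag        # unchanged: guard passes
d['spam'] = 2
assert d.version_tag is not guard_tag    # mutated: identity check fails
```

One `is` comparison covers both "same dict" and "same version", which is why only one field is needed.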


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Maciej Fijalkowski
On Wed, Jan 20, 2016 at 8:00 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
>
> On 2016-01-20 1:36 PM, Maciej Fijalkowski wrote:
>>
>> On Wed, Jan 20, 2016 at 7:22 PM, Brett Cannon <br...@python.org> wrote:
>>>
>>>
>>> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov...@gmail.com>
>>> wrote:
>
> [..]
>>>>
>>>> "ma_extra" would also make it easier for us to extend dicts
>>>> in the future.
>>>
>>>
>>> Why can't you simply use the id of the dict object as the globally unique
>>> dict ID? It's already globally unique amongst all Python objects which
>>> makes
>>> it inherently unique amongst dicts.
>>>
>>>
>> Brett, you need two things - the ID of the dict and the version tag.
>> What we do in pypy is we have a small object (called, surprisingly,
>> VersionTag()) and we use the ID of that. That way you can change the
>> version id of an existing dict and have only one field.
>
>
>
> Yeah, that's essentially what I propose with ma_extra.
>
> Yury

The trick is we use only one field :-)

You're proposing to have both fields - version tag and dict id. Why
not just use the id of the object (without any extra fields)?


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Maciej Fijalkowski
There is also the problem that you don't want it on all dicts. So
having two extra words is more to pay than having extra objects (also,
comparison is cheaper for guards).

On Wed, Jan 20, 2016 at 8:23 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
>
> On 2016-01-20 2:09 PM, Maciej Fijalkowski wrote:
>>>
>>> >
>>
>> You don't free a version tag that's stored in the guard. You store the
>> object and not id
>
>
> Ah, got it.  Although for my current cache design it would be
> more memory efficient to use the dict itself to store its own
> unique id and tag, hence my "ma_extra" proposal.  In any case,
> the current "ma_version" proposal is flawed :(
>
> Yury


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-20 Thread Maciej Fijalkowski
On Wed, Jan 20, 2016 at 8:08 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
> On 2016-01-20 2:02 PM, Maciej Fijalkowski wrote:
>>
>> On Wed, Jan 20, 2016 at 8:00 PM, Yury Selivanov <yselivanov...@gmail.com>
>> wrote:
>>
> [..]
>>>>
>>>> Brett, you need two things - the ID of the dict and the version tag.
>>>> What we do in pypy is we have a small object (called, surprisingly,
>>>> VersionTag()) and we use the ID of that. That way you can change the
>>>> version id of an existing dict and have only one field.
>>>
>>> Yeah, that's essentially what I propose with ma_extra.
>>>
>>> Yury
>>
>> The trick is we use only one field :-)
>>
>> you're proposing to have both fields - version tag and dict id. Why
>> not just use the id of the object (without any fields)?
>
>
> What if your old dict is GCed, its "VersionTag()" (1) object is
> freed, and you have a new dict, for which a new "VersionTag()" (2)
> object happens to be allocated at the same address as (1)?
>
> Yury
>

You don't free a version tag that's stored in the guard. You store the
object and not id
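[Editor's note: Maciej's point can be illustrated with a short sketch (hypothetical names). Because the guard stores the tag *object* rather than its id, the tag stays alive for as long as the guard does, so a freed-and-reallocated address can never make the identity check pass spuriously.]

```python
import gc
import weakref

class VersionTag:
    pass

class Guard:
    """Holds a strong reference to the tag it saw, so the tag can never be
    freed (and its address never reused) while the guard is alive."""
    def __init__(self, tag):
        self.tag = tag                # store the object itself, not id(tag)

    def check(self, current_tag):
        return current_tag is self.tag

old_tag = VersionTag()
guard = Guard(old_tag)
ref = weakref.ref(old_tag)
del old_tag                           # drop every reference except the guard's
gc.collect()
assert ref() is not None              # still alive: the guard keeps it pinned
assert guard.check(ref())             # identity check remains valid
new_tag = VersionTag()
assert not guard.check(new_tag)       # a genuinely new tag never matches
```

Had the guard stored `id(old_tag)` instead, nothing would keep the tag alive, and a later allocation could reuse the same address and defeat the check.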


Re: [Python-Dev] _PyThreadState_Current

2016-01-18 Thread Maciej Fijalkowski
Seems to work, thanks.

That said, I would love to have PyThreadState_Get equivalent that
would let me handle the NULL.

On Mon, Jan 18, 2016 at 9:31 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
> Good point
>
> On Mon, Jan 18, 2016 at 9:25 PM, Victor Stinner
> <victor.stin...@gmail.com> wrote:
>> Hum, you can try to lie and define Py_BUILD_CORE?
>>
>> Victor
>>
>> 2016-01-18 21:18 GMT+01:00 Maciej Fijalkowski <fij...@gmail.com>:
>>> Hi
>>>
>>> The change between 3.5.0 and 3.5.1 (hiding _PyThreadState_Current and
>>> pyatomic.h) broke vmprof. The problem is that, as a profiler, vmprof can
>>> genuinely encounter _PyThreadState_Current being NULL, and crashing the
>>> interpreter in that case is not ideal.
>>>
>>> Any chance, a) _PyThreadState_Current can be restored in visibility?
>>> b) can I get a better API to get it in case it can be NULL, but also
>>> in 3.5 (since it works in 3.5.0 and breaks in 3.5.1)
>>>
>>> Cheers,
>>> fijal


Re: [Python-Dev] _PyThreadState_Current

2016-01-18 Thread Maciej Fijalkowski
Good point

On Mon, Jan 18, 2016 at 9:25 PM, Victor Stinner
<victor.stin...@gmail.com> wrote:
> Hum, you can try to lie and define Py_BUILD_CORE?
>
> Victor
>
> 2016-01-18 21:18 GMT+01:00 Maciej Fijalkowski <fij...@gmail.com>:
>> Hi
>>
>> The change between 3.5.0 and 3.5.1 (hiding _PyThreadState_Current and
>> pyatomic.h) broke vmprof. The problem is that, as a profiler, vmprof can
>> genuinely encounter _PyThreadState_Current being NULL, and crashing the
>> interpreter in that case is not ideal.
>>
>> Any chance, a) _PyThreadState_Current can be restored in visibility?
>> b) can I get a better API to get it in case it can be NULL, but also
>> in 3.5 (since it works in 3.5.0 and breaks in 3.5.1)
>>
>> Cheers,
>> fijal


Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-11 Thread Maciej Fijalkowski
Hi Victor.

You know that PyPy does this without changing or exposing Python
semantics, right? We have a versioned dict that does not leak
abstractions to the user.

In general, a public API that leaks the details of certain
optimizations makes it harder and harder for optimizing compilers to
do their job properly if they want to do something slightly
different.

Can we make this happen (as you noted in the prior art) WITHOUT
changing ANY of the things exposed to the user?

On Mon, Jan 11, 2016 at 6:49 PM, Victor Stinner
 wrote:
> Hi,
>
> After a first round on python-ideas, here is the second version of my
> PEP. The main changes since the first version are that the dictionary
> version is no more exposed at the Python level and the field type now
> also has a size of 64-bit on 32-bit platforms.
>
> The PEP is part of a series of 3 PEPs adding an API to implement a
> static Python optimizer specializing functions with guards. The second
> PEP is currently discussed on python-ideas and I'm still working on
> the third PEP.
>
> Thanks to Red Hat for giving me time to experiment on this.
>
>
> HTML version:
> https://www.python.org/dev/peps/pep-0509/
>
>
> PEP: 509
> Title: Add a private version to dict
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
>
>
> Abstract
> 
>
> Add a new private version to the builtin ``dict`` type, incremented at each
> change, to implement fast guards on namespaces.
>
>
> Rationale
> =
>
> In Python, the builtin ``dict`` type is used by many instructions. For
> example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
> global namespace, or in the builtins namespace (two dict lookups).
> Python uses ``dict`` for the builtins namespace, globals namespace, type
> namespaces, instance namespaces, etc. The local namespace (namespace of
> a function) is usually optimized to an array, but it can be a dict too.
>
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime. Implementing optimizations respecting the Python
> semantics requires detecting when "something changes": we will call
> these checks "guards".
>
> The speedup of optimizations depends on the speed of guard checks. This
> PEP proposes to add a version to dictionaries to implement fast guards
> on namespaces.
>
> Dictionary lookups can be skipped if the version does not change, which
> is the common case for most namespaces. If the dictionary version does
> not change, the performance of a guard does not depend on the number of
> watched dictionary entries: the complexity is O(1).
>
> Example of optimization: copy the value of a global variable to function
> constants.  This optimization requires a guard on the global variable to
> check if it was modified. If the variable is modified, the variable must
> be loaded at runtime when the function is called, instead of using the
> constant.
>
> See the `PEP 510 -- Specialized functions with guards
> `_ for the concrete usage of
> guards to specialize functions and for the rationale on Python static
> optimizers.
>
>
> Guard example
> =
>
> Pseudo-code of a fast guard to check if a dictionary entry was modified
> (created, updated or deleted) using a hypothetical
> ``dict_get_version(dict)`` function::
>
> UNSET = object()
>
> class GuardDictKey:
> def __init__(self, dict, key):
> self.dict = dict
> self.key = key
> self.value = dict.get(key, UNSET)
> self.version = dict_get_version(dict)
>
> def check(self):
> """Return True if the dictionary entry did not changed."""
>
> # read the version field of the dict structure
> version = dict_get_version(self.dict)
> if version == self.version:
> # Fast-path: dictionary lookup avoided
> return True
>
> # lookup in the dictionary
> value = self.dict.get(self.key, UNSET)
> if value is self.value:
> # another key was modified:
> # cache the new dictionary version
> self.version = version
> return True
>
> # the key was modified
> return False
>
>
> Usage of the dict version
> =
>
> Specialized functions using guards
> --
>
> The `PEP 510 -- Specialized functions with guards
> `_ proposes an API to support
> specialized functions with guards. It makes it possible to implement static
> optimizers for Python without breaking the Python semantics.
>
> Example of a static 

Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-11 Thread Maciej Fijalkowski
On Mon, Jan 11, 2016 at 9:56 PM, Victor Stinner
<victor.stin...@gmail.com> wrote:
> Le 11 janv. 2016 8:09 PM, "Maciej Fijalkowski" <fij...@gmail.com> a écrit :
>> Hi Victor.
>>
>> You know that pypy does this stuff without changing and exposing
>> python semantics right? We have a version dict that does not leak
>> abstractions to the user.
>
> The PEP adds a field to the C structure PyDictObject. Are you asking me to
> hide it from the C structure?
>
> The first version of my PEP added a public read-only property at Python
> level, but I changed the PEP. See the alternatives section for more detail.
>
> Victor

I asked you to hide it from Python; I read the wrong version :-)

Cool!


Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Maciej Fijalkowski
You can very easily implement this with version tags on the globals
dictionaries: the dictionaries carry versions, and the guard that
checks whether everything is still valid just compares the version tag
on globals.

Generally speaking, such optimizations have been done in the past
(in places like PyPy, but also in the literature), and as soon as you
have dynamic compilation (and FAT is a form of it), you can do such
tricks.

On Thu, Dec 17, 2015 at 3:48 PM, Steven D'Aprano  wrote:
> On Thu, Dec 17, 2015 at 12:53:13PM +0100, Victor Stinner quoted:
>> 2015-12-17 11:54 GMT+01:00 Franklin? Lee :
>
>> > Each function keeps an indirect, automagically updated
>> > reference to the current value of the names they use,
>
> Isn't that a description of globals()? If you want to look up a name
> "spam", you grab an indirect reference to it:
>
> globals()["spam"]
>
> which returns the current value of the name "spam".
>
>
>> > and will never need to look things up again.[*]
>
> How will this work?
>
> Naively, it sounds to me like Franklin is suggesting that on every
> global assignment, the interpreter will have to touch every single
> function in the module to update that name. Something like this:
>
> # on a global assignment
> spam = 23
>
> # the interpreter must do something like this:
> for function in module.list_of_functions:
> if "spam" in function.__code__.__global_names__:
> function.__code__.__global_names__["spam"] = spam
>
> As I said, that's a very naive way to implement this. Unless you have
> something much cleverer, I think this will be horribly slow.
>
> And besides, you *still* need to deal with the case that the name isn't
> a global at all, but in the built-ins namespace.
>
>
> --
> Steve


Re: [Python-Dev] Avoiding CPython performance regressions

2015-12-01 Thread Maciej Fijalkowski
Hi David.

Any reason you run a tiny tiny subset of benchmarks?

On Tue, Dec 1, 2015 at 5:26 PM, Stewart, David C
 wrote:
>
>
> From: Fabio Zadrozny
> Date: Tuesday, December 1, 2015 at 1:36 AM
> To: David Stewart
> Cc: "R. David Murray", "python-dev@python.org"
> Subject: Re: [Python-Dev] Avoiding CPython performance regressions
>
>
> On Mon, Nov 30, 2015 at 3:33 PM, Stewart, David C wrote:
>
> On 11/30/15, 5:52 AM, "Python-Dev on behalf of R. David Murray"
> <rdmur...@bitdance.com> wrote:
>
>>
>>There's also an Intel project posted about here recently that checks
>>individual benchmarks for performance regressions and posts the results
>>to python-checkins.
>
> The description of the project is at https://01.org/lp - Python results are 
> indeed sent daily to python-checkins. (No results for Nov 30 and Dec 1 due to 
> Romania National Day holiday!)
>
> There is also a graphic dashboard at http://languagesperformance.intel.com/
>
> Hi Dave,
>
> Interesting, but I'm curious on which benchmark set are you running? From the 
> graphs it seems it has a really high standard deviation, so, I'm curious to 
> know if that's really due to changes in the CPython codebase / issues in the 
> benchmark set or in how the benchmarks are run... (it doesn't seem to be the 
> benchmarks from https://hg.python.org/benchmarks/ right?).
>
> Fabio – my advice to you is to check out the daily emails sent to 
> python-checkins. An example is 
> https://mail.python.org/pipermail/python-checkins/2015-November/140185.html. 
> If you still have questions, Stefan can answer (he is copied).
>
> The graphs are really just a manager-level indicator of trends, which I find 
> very useful (I have it running continuously on one of the monitors in my 
> office) but core developers might want to see day-to-day the effect of their 
> changes. (Particular if they thought one was going to improve performance. 
> It's nice to see if you get community confirmation).
>
> We do run nightly a subset of https://hg.python.org/benchmarks/ and run the 
> full set when we are evaluating our performance patches.
>
> Some of the "benchmarks" really do have a high standard deviation, which 
> makes them hardly very useful for measuring incremental performance 
> improvements, IMHO. I like to see it spelled out so I can tell whether I 
> should be worried or not about a particular delta.
>
> Dave


Re: [Python-Dev] Avoiding CPython performance regressions

2015-12-01 Thread Maciej Fijalkowski
On Tue, Dec 1, 2015 at 9:04 PM, Stewart, David C
<david.c.stew...@intel.com> wrote:
> On 12/1/15, 10:56 AM, "Maciej Fijalkowski" <fij...@gmail.com> wrote:
>
>
>
>>Hi David.
>>
>>Any reason you run a tiny tiny subset of benchmarks?
>
> We could always run more. There are so many in the full set in 
> https://hg.python.org/benchmarks/ with such divergent results that it seems 
> hard to see the forest because there are so many trees. I'm more interested 
> in gradually adding to the set rather than the huge blast of all of them in 
> daily email. Would you disagree?
>
> Part of the reason that I monitor ssbench so closely on Python 2 is that 
> Swift is a major element in cloud computing (and OpenStack in particular) and 
> has ~70% of its cycles in Python.

Last time I checked, Swift was quite a bit faster under pypy :-)


>
> We are really interested in workloads which are representative of the way 
> Python is used by a lot of people and which produce repeatable results. (and 
> which are open source). Do you have a suggestions?

You know our benchmark suite (https://bitbucket.org/pypy/benchmarks);
we're gradually incorporating what people report. That means it will
typically be open-source library benchmarks, if their authors get to
the point of writing some. For example, I have a Django ORM benchmark
coming; I can show it to you if you want. I don't think there is a
"representative benchmark", or maybe even a "representative set", also
because the open-source code I've seen tends to be higher quality and
less spaghetti-like than closed-source code, but we keep adding more.

Cheers,
fijal


Re: [Python-Dev] Avoiding CPython performance regressions

2015-12-01 Thread Maciej Fijalkowski
Hi

Thanks for doing the work! I'm one of the PyPy devs and I'm very
interested in seeing this go somewhere. I must say I struggle to
read the graph - is red good or is red bad, for example?

I'm keen to help you getting anything you want to run it repeatedly.

PS. The intel stuff runs one benchmark in a very questionable manner,
so let's maybe not rely on it too much.

On Mon, Nov 30, 2015 at 3:52 PM, R. David Murray  wrote:
> On Mon, 30 Nov 2015 09:02:12 -0200, Fabio Zadrozny  wrote:
>> Note that uploading the data to SpeedTin should be pretty straightforward
>> (by using https://github.com/fabioz/pyspeedtin, so, the main issue would be
>> setting up o machine to run the benchmarks).
>
> Thanks, but Zach almost has this working using codespeed (he's still
> waiting on a review from infrastructure, I think).  The server was not in
> fact running; a large part of what Zach did was to get that server set up.
> I don't know what it would take to export the data to another consumer,
> but if you want to work on that I'm guessing there would be no objection.
> And I'm sure there would be no objection if you want to get involved
> in maintaining the benchmark server!
>
> There's also an Intel project posted about here recently that checks
> individual benchmarks for performance regressions and posts the results
> to python-checkins.
>
> --David


Re: [Python-Dev] Avoiding CPython performance regressions

2015-12-01 Thread Maciej Fijalkowski
On Tue, Dec 1, 2015 at 11:49 AM, Fabio Zadrozny <fabi...@gmail.com> wrote:
>
> On Tue, Dec 1, 2015 at 6:36 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Hi
>>
>> Thanks for doing the work! I'm one of the PyPy devs and I'm very
>> interested in seeing this go somewhere. I must say I struggle to
>> read the graph - is red good or is red bad, for example?
>>
>> I'm keen to help you getting anything you want to run it repeatedly.
>>
>> PS. The intel stuff runs one benchmark in a very questionable manner,
>> so let's maybe not rely on it too much.
>
>
> Hi Maciej,
>
> Great, it'd be awesome having data on multiple Python VMs (my latest target
> is really having a way to compare across multiple VMs/versions easily and
> help each implementation keep a focus on performance). Ideally, a single,
> dedicated machine could be used just to run the benchmarks from multiple VMs
> (one less variable to take into account for comparisons later on, as I'm not
> sure it'd be reliable to normalize benchmark data from different machines --
> it seems Zach was the one to contact from that, but if there's such a
> machine already being used to run PyPy, maybe it could be extended to run
> other VMs too?).
>
> As for the graph, it should be easy to customize (and I'm open to
> suggestions). In the case, as it is, red is slower and blue is faster (so,
> for instance in
> https://www.speedtin.com/reports/1_CPython27x_Performance_Over_Time,  the
> fastest CPython version overall was 2.7.3 -- and 2.7.1 was the baseline).
> I've updated the comments to make it clearer (and changed the second graph
> to compare the latest against the fastest version (2.7.rc11 vs 2.7.3) for
> the individual benchmarks.
>
> Best Regards,
>
> Fabio

There is definitely a machine available. I suggest you ask
python-infra list for access. It definitely can be used to run more
than just pypy stuff. As for normalizing across multiple machines -
don't even bother. Different architectures make A LOT of difference,
especially with cache sizes and whatnot, that seems to have different
impact on different loads.

As for graph - I like the split on the benchmarks and a better
description (higher is better) would be good.

I have a lot of ideas about visualizations, pop in on IRC, I'm happy
to discuss :-)

Cheers,
fijal


Re: [Python-Dev] Benchmark results across all major Python implementations

2015-11-16 Thread Maciej Fijalkowski
Hi Brett

Any thoughts on improving the benchmark set? (I think all of
{cpython,pypy,pyston} introduced new benchmarks to the set.)
"speed.python.org" becoming a thing is generally blocked on "no one
cares enough to set it up".

Cheers,
fijal


On Mon, Nov 16, 2015 at 9:18 PM, Brett Cannon  wrote:
> I gave the opening keynote at PyCon CA and then gave the same talk at PyData
> NYC on the various interpreters of Python (Jupyter notebook of my
> presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I
> figured people here might find the benchmark numbers interesting so I'm
> sharing the link here.
>
> I'm still hoping someday speed.python.org becomes a thing so I never have to
> spend so much time benchmarking so many Python implementations ever again and
> this sort of thing is just part of what we do to keep the implementation
> ecosystem healthy.
>
>


Re: [Python-Dev] Second milestone of FAT Python

2015-11-04 Thread Maciej Fijalkowski
How do you check that someone did not e.g. bind something different to "len"?

On Wed, Nov 4, 2015 at 8:50 AM, Victor Stinner  wrote:
> Hi,
>
> I'm writing a new "FAT Python" project to try to implement optimizations in
> CPython (inlining, constant folding, move invariants out of loops, etc.)
> using a "static" optimizer (not a JIT). For the background, see the thread
> on python-ideas:
> https://mail.python.org/pipermail/python-ideas/2015-October/036908.html
>
> See also the documentation:
> https://hg.python.org/sandbox/fatpython/file/tip/FATPYTHON.rst
> https://hg.python.org/sandbox/fatpython/file/tip/ASTOPTIMIZER.rst
>
> I implemented the most basic optimization to test my code: replace calls to
> builtin functions (with constant arguments) with the result. For example,
> len("abc") is replaced with 3. I reached the second milestone: it's now
> possible to run the full Python test suite with these optimizations enabled.
> It confirms that the optimizations don't break the Python semantics.
>
> Example:
> ---
> >>> def func():
> ... return len("abc")
> ...
> >>> import dis
> >>> dis.dis(func)
>   2   0 LOAD_GLOBAL  0 (len)
>   3 LOAD_CONST   1 ('abc')
>   6 CALL_FUNCTION1 (1 positional, 0 keyword pair)
>   9 RETURN_VALUE
>
> >>> len(func.get_specialized())
> 1
> >>> specialized=func.get_specialized()[0]
> >>> dis.dis(specialized['code'])
>   2   0 LOAD_CONST   1 (3)
>   3 RETURN_VALUE
> >>> len(specialized['guards'])
> 2
>
> >>> func()
> 3
>
> >>> len=lambda obj: "mock"
> >>> func()
> 'mock'
> >>> func.get_specialized()
> []
> ---
>
> The function func() has specialized bytecode which returns directly 3
> instead of calling len("abc"). The specialized bytecode has two guards
> dictionary keys: builtins.__dict__['len'] and globals()['len']. If one of
> these keys is modified, the specialized bytecode is simply removed (when the
> function is called) and the original bytecode is executed.
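[Editor's note: the deoptimization dance Victor describes can be mimicked in pure Python (all names here are hypothetical; the real implementation works at the bytecode level). A wrapper runs the specialized version only while its guarded dictionary keys are unchanged, and falls back permanently once a guard fails.]

```python
class GuardedFunction:
    """Run `specialized` while every guarded (dict, key) pair is unchanged;
    on the first guard failure, discard it and run `original` from then on."""
    def __init__(self, original, specialized, guards):
        self.original = original
        self.specialized = specialized            # set to None once deoptimized
        self.guards = [(d, key, d.get(key)) for d, key in guards]

    def __call__(self):
        if self.specialized is not None:
            for d, key, expected in self.guards:
                if d.get(key) is not expected:
                    self.specialized = None       # deoptimize: drop the bytecode
                    break
            else:
                return self.specialized()
        return self.original()

builtins_ns = {'len': len}
globals_ns = {}

func = GuardedFunction(
    original=lambda: globals_ns.get('len', builtins_ns['len'])("abc"),
    specialized=lambda: 3,                        # len("abc") folded to 3
    guards=[(builtins_ns, 'len'), (globals_ns, 'len')],
)
assert func() == 3                                # guards hold: fast path
globals_ns['len'] = lambda obj: "mock"            # shadow the builtin
assert func() == "mock"                           # guard failed: original runs
assert func.specialized is None                   # specialization was removed
```

This mirrors the two guards in the example transcript: one on builtins.__dict__['len'] and one on globals()['len'].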
>
>
> You cannot expect any speedup at this milestone, it's just to validate the
> implementation. You can only get speedup if you implement *manually*
> optimizations. See for example posixpath.isabs() which inlines manually the
> call to the _get_sep() function. More optimizations will be implemented in
> the third milestone. I don't know yet if I will be able to implement
> constant folding, function inlining and/or moving invariants out of loops.
>
>
> Download, compile and test FAT Python with:
>
> hg clone http://hg.python.org/sandbox/fatpython
> ./configure && make && ./python -m test test_astoptimizer test_fat
>
>
> Currently, only 24 functions are specialized in the standard library.
> Calling a builtin function with constant arguments in not common (it was
> expected, it's only the first step for my optimizer). But 161 functions are
> specialized in tests.
>
>
> To be honest, I had to modify some tests to make them pass in FAT mode. But
> most changes are related to the .pyc filename, or to the exact size in bytes
> of dictionary objects.
>
> FAT Python is still experimental. Currently, the main bug is that the AST
> optimizer can optimize a call to a function which is not the expected
> builtin function. I already started to implement code to understand
> namespaces (detect global and local variables), but it's not enough yet to
> detect when a builtin is overridden. See TODO.rst for known bugs and
> limitations.
>
> Victor
>
>


Re: [Python-Dev] Second milestone of FAT Python

2015-11-04 Thread Maciej Fijalkowski
Uh, sorry, misread your full mail, scratch that

On Wed, Nov 4, 2015 at 9:07 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
> How do you check that someone did not e.g. bind something different to "len"?
>
> On Wed, Nov 4, 2015 at 8:50 AM, Victor Stinner <victor.stin...@gmail.com> 
> wrote:
>> Hi,
>>
>> I'm writing a new "FAT Python" project to try to implement optimizations in
>> CPython (inlining, constant folding, move invariants out of loops, etc.)
>> using a "static" optimizer (not a JIT). For the background, see the thread
>> on python-ideas:
>> https://mail.python.org/pipermail/python-ideas/2015-October/036908.html
>>
>> See also the documentation:
>> https://hg.python.org/sandbox/fatpython/file/tip/FATPYTHON.rst
>> https://hg.python.org/sandbox/fatpython/file/tip/ASTOPTIMIZER.rst
>>
>> I implemented the most basic optimization to test my code: replace calls to
>> builtin functions (with constant arguments) with the result. For example,
>> len("abc") is replaced with 3. I reached the second milestone: it's now
>> possible to run the full Python test suite with these optimizations enabled.
>> It confirms that the optimizations don't break the Python semantics.
>>
>> Example:
>> ---
>>>>> def func():
>> ... return len("abc")
>> ...
>>>>> import dis
>>>>> dis.dis(func)
>>   2   0 LOAD_GLOBAL  0 (len)
>>   3 LOAD_CONST   1 ('abc')
>>   6 CALL_FUNCTION1 (1 positional, 0 keyword pair)
>>   9 RETURN_VALUE
>>
>>>>> len(func.get_specialized())
>> 1
>>>>> specialized=func.get_specialized()[0]
>>>>> dis.dis(specialized['code'])
>>   2   0 LOAD_CONST   1 (3)
>>   3 RETURN_VALUE
>>>>> len(specialized['guards'])
>> 2
>>
>>>>> func()
>> 3
>>
>>>>> len=lambda obj: "mock"
>>>>> func()
>> 'mock'
>>>>> func.get_specialized()
>> []
>> ---
>>
>> The function func() has specialized bytecode which directly returns 3
>> instead of calling len("abc"). The specialized bytecode has guards on two
>> dictionary keys: builtins.__dict__['len'] and globals()['len']. If one of
>> these keys is modified, the specialized bytecode is simply removed (the next
>> time the function is called) and the original bytecode is executed.
>>
>>
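A minimal pure-Python sketch of the guard mechanism described above. The `specialize` helper and its API are illustrative only (FAT Python's real interface differs); it caches a fast path and permanently falls back once a guarded dictionary key changes:

```python
import builtins

_MISSING = object()

def specialize(generic, fast, guards):
    """Use `fast` while every (namespace, key) guard still holds the
    value seen at specialization time; fall back to `generic` forever
    once any guard fails (mimicking FAT Python's de-specialization)."""
    expected = [(ns, key, ns.get(key, _MISSING)) for ns, key in guards]
    valid = [True]
    def wrapper():
        if valid[0]:
            if all(ns.get(key, _MISSING) is val for ns, key, val in expected):
                return fast()
            valid[0] = False  # a guard failed: drop the specialization
        return generic()
    return wrapper

def func():
    return len("abc")

# Guard on both builtins.__dict__['len'] and globals()['len'],
# the two keys mentioned in the mail.
func = specialize(func, lambda: 3,
                  [(vars(builtins), 'len'), (globals(), 'len')])

r1 = func()               # guards hold: the fast path returns 3
len = lambda obj: "mock"  # rebinding len in globals() breaks a guard
r2 = func()               # falls back to the generic code: "mock"
```

The same check-then-deoptimize pattern is what the real implementation does at the bytecode level.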
>> You cannot expect any speedup at this milestone; it's just to validate the
>> implementation. You only get a speedup if you implement optimizations
>> *manually*. See for example posixpath.isabs(), which manually inlines the
>> call to the _get_sep() function. More optimizations will be implemented in
>> the third milestone. I don't know yet if I will be able to implement
>> constant folding, function inlining and/or moving invariants out of loops.
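Roughly what that manual inlining looks like (simplified from posixpath, not the exact stdlib code):

```python
def _get_sep(path):
    # Helper, as in posixpath: pick the separator for str vs. bytes paths.
    return b'/' if isinstance(path, bytes) else '/'

def isabs_generic(s):
    """Generic form: pays for an extra function call per invocation."""
    return s.startswith(_get_sep(s))

def isabs_inlined(s):
    """Hand-optimized form: the helper's body is inlined."""
    sep = b'/' if isinstance(s, bytes) else '/'
    return s.startswith(sep)
```

Both return the same results; the second simply avoids one Python-level call per invocation.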
>>
>>
>> Download, compile and test FAT Python with:
>>
>> hg clone http://hg.python.org/sandbox/fatpython
>> ./configure && make && ./python -m test test_astoptimizer test_fat
>>
>>
>> Currently, only 24 functions are specialized in the standard library.
>> Calling a builtin function with constant arguments is not common (it was
>> expected, it's only the first step for my optimizer). But 161 functions are
>> specialized in tests.
>>
>>
>> To be honest, I had to modify some tests to make them pass in FAT mode. But
>> most changes are related to the .pyc filename, or to the exact size in bytes
>> of dictionary objects.
>>
>> FAT Python is still experimental. Currently, the main bug is that the AST
>> optimizer can optimize a call to a function which is not the expected
>> builtin function. I already started to implement code to understand
>> namespaces (detect global and local variables), but it's not enough yet to
>> detect when a builtin is overridden. See TODO.rst for known bugs and
>> limitations.
>>
>> Victor
>>


Re: [Python-Dev] compatibility for C-accelerated types

2015-10-20 Thread Maciej Fijalkowski
For what it's worth, that level of difference already exists on PyPy,
and it's really hard to get the *exact* same semantics if things are
implemented in Python vs. C or the other way around.

Example list of differences (which I think OrderedDict already breaks
if moved to C):

* do methods like items() call special methods like __getitem__ (I think
it's undecided anyway)

* what happens if you take a method and rebind it on another subclass:
does it automatically become a method (there are differences between
built-in and pure-Python types)

* atomicity of operations: some operations that used to be non-atomic in
pure Python become atomic in C.

I personally think those (and the __class__ issue) are unavoidable.

On Mon, Oct 19, 2015 at 11:47 PM, Serhiy Storchaka  wrote:
> On 20.10.15 00:00, Guido van Rossum wrote:
>>
>> Apart from Serhiy's detraction of the 3.5 bug report there wasn't any
>> discussion in this thread. I also don't really see any specific
>> questions, so maybe you don't have any. Are you just asking whether it's
>> okay to merge your code? Or are you asking for more code review?
>
>
> I think Eric asks whether it's okay to have some incompatibility between
> Python and C implementations.
>
> 1. Is it okay to have a difference in the effect of __class__ assignment? Pure
> Python and extension classes have different restrictions. For example
> (a tested example this time), the following code works with the Python
> implementation in 3.4, but fails with the C implementation in 3.5:
>
> from collections import OrderedDict
> od = OrderedDict()
> class D(dict): pass
>
> od.__class__ = D
>
> 2. Is it okay to use obj.__class__ in Python implementation and type(obj) in
> C implementation for the sake of code simplification? Can we ignore subtle
> differences?
>
> 3. In general, is it okay to have some incompatibility between Python and C
> implementations for the sake of code simplification, and where the border
> line lies?
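A tiny illustration of the subtle difference question 2 is about: obj.__class__ and type(obj) can legitimately disagree (a contrived proxy for demonstration, not stdlib code):

```python
class Proxy:
    # A class may lie about __class__ (mock/proxy libraries do this).
    @property
    def __class__(self):
        return int

p = Proxy()
a = isinstance(p, int)  # True: isinstance() also consults __class__
b = type(p) is Proxy    # True: type() always reports the real type
```

So an implementation that checks obj.__class__ and one that checks type(obj) can disagree exactly on objects like this.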
>
>


[Python-Dev] An example of Python 3 promotion attitude

2015-10-06 Thread Maciej Fijalkowski
There was a discussion a while ago about python 3 and the attitude on
social media, and there was a lack of examples. Here is one:

https://www.reddit.com/r/Python/comments/3nl5ut/ninite_the_popular_website_to_install_essential/

According to some people, it is everybody's job to promote python 3 and
force people to upgrade. This is really not something I enjoy (people
telling me pypy should promote python 3 - it's not really our job).

Now I sometimes feel that there is not enough sentiment in python-dev
to distance itself from such ideas. It *is* python-dev's job to promote
python 3, but it's also python-dev's job sometimes to point out that
whatever helps promote the python ecosystem (e.g., in the case of pypy,
speed) is a good enough reason to do those things.

I wonder what other people's ideas about that are.

Cheers,
fijal


Re: [Python-Dev] Issue #25256: Add sys.debug_build?

2015-10-02 Thread Maciej Fijalkowski
Speaking of other python implementations - why would you even care?
(the pypy debug build has very different properties and does very
different stuff, for example). I would be very happy to have this
clearly marked as implementation-dependent, and that's why it would be
cool for it not to be in sys (there are already 5 symbols there for this
reason, so hasattr(sys, 'gettotalrefcount') is cool enough)
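A hedged sketch of that check (CPython-specific: sys.gettotalrefcount() exists only in pydebug builds; other implementations may provide neither signal):

```python
import sys
import sysconfig

def is_debug_build():
    # On CPython, gettotalrefcount() is compiled in only with
    # --with-pydebug, and the check also works on Windows, where
    # sysconfig may not expose Py_DEBUG.
    if hasattr(sys, 'gettotalrefcount'):
        return True
    # Fall back to build-time configuration where it is available.
    return bool(sysconfig.get_config_var('Py_DEBUG'))
```

On a normal release build this returns False; on a pydebug CPython it returns True.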

On Fri, Oct 2, 2015 at 2:19 PM, Victor Stinner  wrote:
> 2015-10-02 13:16 GMT+02:00 Nir Soffer :
>> What's wrong with:
>>
>>>>> sysconfig.get_config_var('Py_DEBUG')
>> 0
>
> Again, refer to my first message "On the Internet, I found various
> recipes to check if Python is compiled is debug mode. Sadly, some of
> them are not portable."
>
> I don't think that sysconfig.get_config_var('Py_DEBUG') will work on
> other Python implementations.
>
> On Windows, there is no such file like "Makefile" used to fill
> syscofig.get_config_vars() :-( sysconfig._init_non_posix() only fills
> a few variables like BINDIR or INCLUDEPY, but not Py_DEBUG.
>
> Victor


Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-14 Thread Maciej Fijalkowski
Hey Raymond

I'm sorry you got insulted; that was not my intention. I suppose
something like "itertools objects are implemented as classes
internally, which means they're subclassable like other builtin types"
would be an improvement to the documentation.

On Mon, Sep 14, 2015 at 12:17 AM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 13, 2015, at 3:09 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Well, fair enough, but the semantics of "whatever happens to happen
>> because we decided subclassing is a cool idea" is possibly the worst
>> answer to those questions.
>
> It's hard to read this in any way that isn't insulting.
>
> It was subclassable because 1) it was a class, 2) type/class unification was
> pushing us in the direction of making builtin types more like regular classes
> (which are subclassable), and 3) because it seemed potentially useful
> to users (and apparently it has been because users are subclassing it).
>
> FWIW, the code was modeled on what was done for enumerate() and
> reversed() where I got a lot of coaching and review from Tim Peters,
> Alex Martelli, Fredrik Lundh, and other python luminaries of the day.
>
>
>> Ideally, make it non-subclassable. If you
>> want to have it subclassable, then please have defined semantics as
>> opposed to undefined.
>
> No, I'm not going to change a 13 year-old API and break existing user code
> just because you've gotten worked-up about it.
>
> FWIW, the semantics wouldn't even be defined in the itertools docs.
> It is properly in some section that describes what happens to any C type
> that sets the Py_TPFLAGS_BASETYPE flag.   In general, all of
> the exposed dunder methods are overridable or extendable by subclassers.
>
>
> Raymond
>
>
> P.S.  Threads like this are why I've developed an aversion to python-dev.
> I've answered your questions with respect and candor. I've been sympathetic
> to your unique needs as someone building an implementation of a language
> that doesn't have a spec.  I was apologetic that the docs which have been
> helpful to users weren't precise enough for your needs.
>
> In return, you've suggested that my first contributions to Python were
> irresponsible and based on doing whatever seemed cool.
>
> In fact, the opposite is the case.  I spent a full summer researching how 
> similar
> tools were used in other languages and fitting them into Python in a way that
> supported known use cases.  I raised the standard of the Python docs by
> including rough python equivalent code, showing sample inputs and outputs,
> building a quick navigation and summary section as the top of the docs,
> adding a recipes section, making thorough unittests, and getting input from 
> Alex,
> Tim, and Fredrik (Guido also gave high level advice on the module design).
>
> I'm not inclined to go on with this thread. Your questions have been answered
> to the extent that I remember the answers.  If you have a doc patch you want
> to submit, please assign it to me on the tracker.  I would be happy to review 
> it.
>


Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Maciej Fijalkowski
On Fri, Sep 11, 2015 at 1:48 AM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 10, 2015, at 3:23 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> I would like to know what are the semantics if you subclass something
>> from itertools (e.g. islice).
>>
>> Right now it's allowed and people do it, which is why the
>> documentation is incorrect. It states "equivalent to: a function or a
>> generator", but you can't subclass whatever it is equivalent to, which
>> is why in PyPy we're unable to make it work in pure python.
>>
>> I would like some clarification on that.
>
> The docs should say "roughly equivalent to" not "exactly equivalent to".
> The intended purpose of the examples in the itertools docs is to use
> pure python code to help people better understand each tool.  It is not
> is intended to dictate that tool x is a generator or is a function.
>
> The intended semantics are that the itertools are classes (not functions
> and not generators).  They are intended to be sub-classable (that is
> why they have Py_TPFLAGS_BASETYPE defined).

Ok, so what's completely missing from the documentation is: what *are*
the semantics of subclasses of those classes? Can you override any
magic methods? Can you override next (which is or isn't a magic method,
depending on how you look at it)? Etc.

The documentation on this is completely missing, and one is left guessing
with "whatever cpython happens to be doing".
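Concretely, this is the kind of override whose semantics the thread asks to have specified; the sketch below reflects behavior observed on CPython 3.x (hedged: another version or implementation may reject the subclass outright, which is precisely the under-specification at issue):

```python
import itertools

try:
    class Doubling(itertools.islice):
        # Override __next__ on a C-implemented iterator type.
        def __next__(self):
            return super().__next__() * 2

    result = list(Doubling(iter([1, 2, 3, 4]), 3))  # islice(iterable, stop)
except TypeError:
    # An implementation that forbids subclassing itertools types lands here.
    result = None
```

On CPython 3.x as discussed in the thread, `result` is `[2, 4, 6]`: the subclass's `__next__` is honored by iteration.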


Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Maciej Fijalkowski
On Sun, Sep 13, 2015 at 5:46 PM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 13, 2015, at 3:49 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>>> The intended semantics are that the itertools are classes (not functions
>>> and not generators).  They are intended to be sub-classable (that is
>>> why they have Py_TPFLAGS_BASETYPE defined).
>>
>> Ok, so what's completely missing from the documentation is what *are*
>> the semantics of subclasses of those classes? Can you override any
>> magic methods? Can you override next (which is or isn't a magic method
>> depending how you look)? Etc.
>>
>> The documentation on this is completely missing, and one is left guessing
>> with "whatever cpython happens to be doing".
>
> The reason it is underspecified is that this avenue of development was
> never explored (not thought about, planned, used, tested, or documented).
> IIRC, the entire decision process for having Py_TPFLAGS_BASETYPE
> boiled down to a single question:  Was there any reason to close this
> door and make the itertools not subclassable?
>
> For something like NoneType, there was a reason to be unsubclassable;
> otherwise, the default choice was to give users maximum flexibility
> (the itertools were intended to be a generic set of building blocks,
> forming what Guido termed an "iterator algebra").
>
> As an implementor of another version of Python, you are reasonably
> asking the question, what is the specification for subclassing semantics?
> The answer is somewhat unsatisfying -- I don't know because I've
> never thought about it.  As far as I can tell, this question has never
> come up in the 13 years of itertools existence and you may be the
> first person to have ever cared about this.
>
>
> Raymond

Well, fair enough, but the semantics of "whatever happens to happen
because we decided subclassing is a cool idea" are possibly the worst
answer to those questions. Ideally, make it non-subclassable. If you
want to have it subclassable, then please have defined semantics as
opposed to undefined.


[Python-Dev] semantics of subclassing things from itertools

2015-09-10 Thread Maciej Fijalkowski
Hi

I would like to know what are the semantics if you subclass something
from itertools (e.g. islice).

Right now it's allowed and people do it, which is why the
documentation is incorrect. It states "equivalent to: a function or a
generator", but you can't subclass whatever it is equivalent to, which
is why in PyPy we're unable to make it work in pure python.

I would like some clarification on that.

Cheers,
fijal


Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-10 Thread Maciej Fijalkowski
On Thu, Sep 10, 2015 at 10:26 AM, Serhiy Storchaka <storch...@gmail.com> wrote:
> On 10.09.15 10:23, Maciej Fijalkowski wrote:
>>
>> I would like to know what are the semantics if you subclass something
>> from itertools (e.g. islice).
>>
>> Right now it's allowed and people do it, which is why the
>> documentation is incorrect. It states "equivalent to: a function or a
>> generator", but you can't subclass whatever it is equivalent to, which
>> is why in PyPy we're unable to make it work in pure python.
>>
>> I would like some clarification on that.
>
>
> There is another reason why itertools iterators can't be implemented as
> simple generator functions. All iterators are pickleable in 3.x.

maybe the documentation should reflect that? (note that generators are
pickleable on pypy anyway)
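The pickling point, shown with a plain built-in iterator (a list iterator here; itertools iterators behaved the same way in 3.x at the time of this thread):

```python
import pickle

it = iter([1, 2, 3])
next(it)  # advance past the first item

# The pickle round-trip preserves the iterator's position,
# something a plain generator function cannot offer on CPython.
clone = pickle.loads(pickle.dumps(it))
rest = list(clone)
```

This is why "equivalent to the following generator" undersells the C implementation: the class-based iterator supports `__reduce__`/`__setstate__`.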


Re: [Python-Dev] tp_finalize vs tp_del sematics

2015-09-03 Thread Maciej Fijalkowski
On Thu, Sep 3, 2015 at 9:23 AM, Valentine Sinitsyn
 wrote:
> Hi Armin,
>
> On 25.08.2015 13:00, Armin Rigo wrote:
>>
>> Hi Valentine,
>>
>> On 25 August 2015 at 09:56, Valentine Sinitsyn
>>  wrote:

 Yes, I think so.  There is a *highly obscure* corner case: __del__
 will still be called several times if you declare your class with
 "__slots__=()".
>>>
>>>
>>> Even on "post-PEP-0442" Python 3.4+? Could you share a link please?
>>
>>
>> class X(object):
>>     __slots__ = ()  # <= try with and without this
>>     def __del__(self):
>>         global revive
>>         revive = self
>>         print("hi")
>>
>> X()
>> revive = None
>> revive = None
>> revive = None
>
> By accident, I found a solution to this puzzle:
>
> class X(object):
> __slots__ = ()
>
> class Y(object):
> pass
>
> import gc
> gc.is_tracked(X())  # False
> gc.is_tracked(Y())  # True
>
> An object with _empty_ slots is naturally untracked, as it can't create back
> references.
>
> Valentine
>

That does not make it OK to have __del__ called several times, does it?
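For reference, the baseline PEP 442 behavior being discussed: on CPython 3.4+ a finalizer normally runs at most once, even across resurrection (the empty-__slots__ case above is the corner where this reportedly breaks):

```python
import gc

calls = []

class Once:
    def __del__(self):
        calls.append(1)
        global keep
        keep = self       # resurrect the dying object

Once()        # refcount hits zero: __del__ runs once and resurrects
keep = None   # drop it again: the finalizer is NOT run a second time
gc.collect()  # make sure nothing is pending
n_calls = len(calls)
```

On CPython 3.4+, `n_calls` is 1; the debate above is about untracked objects escaping this guarantee.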


Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-25 Thread Maciej Fijalkowski

 Interesting.  So pypy (with its profiling JIT) would be in a similar boat,
 potentially.


PGO and what PyPy does have pretty much nothing to do with each other.
I'm not sure what you mean by "similar boat".


Re: [Python-Dev] Branch Prediction And The Performance Of Interpreters - Don't Trust Folklore

2015-08-10 Thread Maciej Fijalkowski
On Mon, Aug 10, 2015 at 4:44 PM, Larry Hastings la...@hastings.org wrote:


 This just went by this morning on reddit's /r/programming.  It's a paper
 that analyzed Python--among a handful of other languages--to answer the
 question "are branch predictors still that bad at the big switch statement
 approach to interpreters?"  Their conclusion: no.

 Our simulations [...] show that, as long as the payload in the bytecode
 remains limited and do not feature significant amount of extra indirect
 branches, then the misprediction rate on the interpreter can be even become
 insignificant (less than 0.5 MPKI).

 (MPKI = missed predictions per thousand instructions)

 Their best results were on simulated hardware with state-of-the-art
 prediction algorithms (TAGE and ITTAGE), but they also demonstrate that
 branch predictors in real hardware are getting better quickly.  When running
 the Unladen Swallow test suite on Python 3.3.2, compiled with
 USE_COMPUTED_GOTOS turned off, Intel's Nehalem experienced an average of
 12.8 MPKI--but Sandy Bridge drops that to 3.5 MPKI, and Haswell reduces it
 further to a mere *1.4* MPKI.  (AFAICT they didn't compare against Python
 3.3.2 using computed gotos, either in terms of MPKI or in overall
 performance.)

 The paper is here:

 https://hal.inria.fr/hal-01100647/document


 I suppose I wouldn't propose removing the labels-as-values opcode dispatch
 code yet.  But perhaps that day is in sight!


 /arry



Hi Larry

Please also note that, as far as I can tell, this mostly applies to x86.
ARM branch prediction is significantly dumber these days, and as long as
Python performance on such platforms is a concern, such tricks do make
the situation better. We found this out doing a CPython/PyPy comparison,
where the PyPy vs. CPython difference was bigger on ARM and smaller on
x86, despite the ARM assembler we produce being less well optimized.

Cheers,
fijal


Re: [Python-Dev] Python automatic optimization

2015-07-23 Thread Maciej Fijalkowski
As far as I can tell, feedback-directed optimizations don't give much
speedup on Python. There is a variety of tools that can help if you care
about the performance of mathematical operations: Cython, Numba, PyPy,
NumPy, etc.

On Thu, Jul 23, 2015 at 9:04 PM, Andrew Steinberg via Python-Dev
python-dev@python.org wrote:
 Hello everybody,

 I am using Python 2.7 as a backbone for some mathematical simulations. I
 recently discovered a tool called AutoFDO and I tried compiling my own
 Python version, but I did not manage to get it working. My question is, will
 sometime in the future Python include this tool?

 Thank you,
 Andrew



Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski
I must say I completely fail to understand the procedures under which
Python is developed. If a change (unreviewed, just applied at random)
causes crashes, then surely it should be reverted first and the
discussion continued on the bug tracker, instead of the change lingering
(and the complaint sitting on the bug tracker)?

On Tue, Jul 7, 2015 at 10:10 AM, Serhiy Storchaka storch...@gmail.com wrote:
 On 07.07.15 10:42, Serhiy Storchaka wrote:

 On 07.07.15 05:03, raymond.hettinger wrote:

 https://hg.python.org/cpython/rev/c9782a9ac031
 changeset:   96865:c9782a9ac031
 user:Raymond Hettinger pyt...@rcn.com
 date:Mon Jul 06 19:03:01 2015 -0700
 summary:
Tighten-up code in the set iterator to use an entry pointer rather
 than indexing.


 What if so->table was reallocated during the iteration, but so->used is
 left the same? This change looks unsafe to me.


 There is a crash reproducer.

 http://bugs.python.org/issue24581




Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski
On Tue, Jul 7, 2015 at 2:14 PM, Guido van Rossum gu...@python.org wrote:
 FYI, do we have any indication that Raymond even read the comment? IIRC he
 doesn't regularly read python-dev. I also don't think code review comments
 ought to go to python-dev; the commiters list would seem more appropriate?
 (Though it looks like python-checkins is configured to direct replies to
 python-dev. Maybe we need to revisit that?)

I kind of thought that Python does pre-commit reviews (at least that
seems to apply to most people), so in case someone is completely exempt
from that, maybe he should read python-dev or wherever the reply-to is
set. That also does not explain why a crashing commit has not been
reverted.


Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski
On Tue, Jul 7, 2015 at 3:08 PM, Serhiy Storchaka storch...@gmail.com wrote:
 On 07.07.15 15:32, Maciej Fijalkowski wrote:

 I kind of thought that python does pre-commit reviews (at least seems
 to apply to most people), so in case someone is completely exempt from
 that, maybe he should read python-dev or wherever the reply is set to?
 That also does not explain why a crashing commit has not been
 reverted.


 There is no haste. Only the development branch is affected and we have enough
 time to fix it. No buildbots are broken. Just rolling back this changeset may
 be impossible because Raymond committed other changes after it. I'm not sure
 that this changeset is the culprit; it could be the previous one. Raymond is
 the most experienced person in this file, and writing a good fix that
 conforms to Raymond's view by another person could take more time than
 Raymond needs to wake up and read this topic.

Then maybe a good option would be to add the crasher to the test
suite, so the buildbots *are* actually broken, showing that the problem
exists?


Re: [Python-Dev] speed.python.org (was: 2.7 is here until 2020, please don't call it a waste.)

2015-06-04 Thread Maciej Fijalkowski
On Thu, Jun 4, 2015 at 4:32 PM, R. David Murray rdmur...@bitdance.com wrote:
 On Thu, 04 Jun 2015 12:55:55 +0200, M.-A. Lemburg m...@egenix.com wrote:
 On 04.06.2015 04:08, Tetsuya Morimoto wrote:
  If someone were to volunteer to set up and run speed.python.org, I think
  we could add some additional focus on performance regressions. Right now,
  we don't have any way of reliably and reproducibly testing Python
  performance.
 
  I'm very interested in speed.python.org and feel regret that the project is
  standing still. I have a mind to contribute something ...

  On 03.06.2015 18:59, Maciej Fijalkowski wrote:
  On Wed, Jun 3, 2015 at 3:49 PM, R. David Murray wrote:
  I think we should look into getting speed.python.org up and
  running for both Python 2 and 3 branches:
 
   https://speed.python.org/
 
  What would it take to make that happen ?
 
  I guess ideal would be some cooperation from some of the cpython devs,
  so say someone can setup cpython buildbot
 
   What does "set up cpython buildbot" mean in this context?
 
  The way it works is dual - there is a program running the benchmarks
  (the runner) which is in the pypy case run by the pypy buildbot and
  the web side that reports stuff. So someone who has access to cpython
  buildbot would be useful.

 (I don't seem to have gotten a copy of Maciej's message, at least not
 yet.)

 OK, so what you are saying is that speed.python.org will run a buildbot
 slave so that when a change is committed to cPython, a speed run will be
 triggered?  Is the runner a normal buildbot slave, or something
 custom?  In the normal case the master controls what the slave
 runs...but regardless, you'll need to let us know how the slave
 invocation needs to be configured on the master.

Ideally nightly (benchmarks take a while). The setup for pypy looks like this:


https://bitbucket.org/pypy/buildbot/src/5fa1f1a4990f842dfbee416c4c2e2f6f75d451c4/bot2/pypybuildbot/builds.py?at=default#cl-734

so fairly easy. This already generates a JSON file that you can plot.
We can set up automatic uploads too.
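As an illustration of what such a JSON file enables (the layout below is invented for the example; the actual runner's format differs):

```python
import json
import statistics

def summarize(results):
    """Collapse per-benchmark timing lists into (mean, stdev) pairs,
    ready for plotting or regression detection."""
    return {name: (statistics.mean(times), statistics.stdev(times))
            for name, times in results.items()}

# Hypothetical payload: benchmark name -> list of run times in seconds.
raw = json.loads('{"richards": [0.30, 0.32, 0.31], "nbody": [1.10, 1.08, 1.12]}')
summary = summarize(raw)
```

Comparing two such summaries across nightly builds is the core of what a speed.python.org frontend would display.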



 Ok, so there's interest and we have at least a few people who are
 willing to help.

 Now we need someone to take the lead on this and form a small
 project group to get everything implemented. Who would be up
 to such a task ?

 The speed project already has a mailing list, so you could use
 that for organizing the details.

 If it's a low volume list I'm willing to sign up, but regardless I'm
 willing to help with the buildbot setup on the CPython side.  (As soon
 as my credential-update request gets through infrastructure, at least :)

 --David


Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-03 Thread Maciej Fijalkowski
On Wed, Jun 3, 2015 at 11:38 AM, M.-A. Lemburg m...@egenix.com wrote:
 On 02.06.2015 21:07, Maciej Fijalkowski wrote:
 Hi

 There was a PSF-sponsored effort to improve the situation with the
 https://bitbucket.org/pypy/codespeed2/src being written (thank you
 PSF). It's not as much better than codespeed as I would like, but it
 gives some opportunities.

 That said, we have a benchmark machine for benchmarking cpython and I
 never deployed nightly benchmarks of cpython for a variety of reasons.

 * would be cool to get a small VM to set up the web frontend

 * people told me that only py3k is interesting, but I did not set it up
 for py3k because benchmarks are mostly missing

 I'm willing to set up a nightly speed.python.org using nightly build
 on python 2 and possibly python 3 if there is an interest. I need
 support from someone maintaining python buildbot to setup builds and a
 VM to set up stuff, otherwise I'm good to go

 DISCLAIMER: I did facilitate in codespeed rewrite that was not as
 successful as I would have hoped. I did not receive any money from the
 PSF on that though.

 I think we should look into getting speed.python.org up and
 running for both Python 2 and 3 branches:

  https://speed.python.org/

 What would it take to make that happen ?

I guess the ideal would be some cooperation from some of the cpython
devs, so that, say, someone can set up the cpython buildbot.


Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-03 Thread Maciej Fijalkowski
On Wed, Jun 3, 2015 at 3:49 PM, R. David Murray rdmur...@bitdance.com wrote:
 On Wed, 03 Jun 2015 12:04:10 +0200, Maciej Fijalkowski fij...@gmail.com 
 wrote:
 On Wed, Jun 3, 2015 at 11:38 AM, M.-A. Lemburg m...@egenix.com wrote:
  On 02.06.2015 21:07, Maciej Fijalkowski wrote:
  Hi
 
  There was a PSF-sponsored effort to improve the situation with the
  https://bitbucket.org/pypy/codespeed2/src being written (thank you
  PSF). It's not better enough than codespeed that I would like, but
  gives some opportunities.
 
  That said, we have a benchmark machine for benchmarking cpython and I
  never deployed nightly benchmarks of cpython for a variety of reasons.
 
  * would be cool to get a small VM to set up the web front
 
  * people told me that py3k is only interesting, so I did not set it up
  for py3k because benchmarks are mostly missing
 
  I'm willing to set up a nightly speed.python.org using nightly build
  on python 2 and possible python 3 if there is an interest. I need
  support from someone maintaining python buildbot to setup builds and a
  VM to set up stuff, otherwise I'm good to go
 
  DISCLAIMER: I did facilitate in codespeed rewrite that was not as
  successful as I would have hoped. I did not receive any money from the
  PSF on that though.
 
  I think we should look into getting speed.python.org up and
  running for both Python 2 and 3 branches:
 
   https://speed.python.org/
 
  What would it take to make that happen ?

 I guess ideal would be some cooperation from some of the cpython devs,
 so say someone can setup cpython buildbot

 What does set up cpython buildbot mean in this context?

The setup has two parts: a program that runs the benchmarks (the
runner), which in the PyPy case is driven by the PyPy buildbot, and the
web side that reports the results. So someone who has access to the
CPython buildbot would be useful.
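To make the runner/web split concrete, here is a minimal, purely illustrative sketch of the runner end: time a statement and emit a small JSON record for a codespeed-style frontend to ingest. The field names (`benchmark`, `result_value`, `executable`) are assumptions for illustration, not codespeed's actual schema.

```python
import json
import timeit

def run_benchmark(name, stmt, repeat=5, number=100000):
    # Time `stmt` several times; taking the minimum damps scheduler noise.
    timings = timeit.repeat(stmt, repeat=repeat, number=number)
    return {
        "benchmark": name,
        "result_value": min(timings) / number,  # seconds per iteration
        "executable": "cpython-nightly",        # which build produced it
    }

payload = run_benchmark("str_join", "'-'.join(['a', 'b', 'c'])")
print(json.dumps(payload)[:40])  # the web side would receive this JSON
```

A real runner would POST one such record per benchmark per nightly build; the web side only has to store and plot them.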


Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-02 Thread Maciej Fijalkowski
Hi

There was a PSF-sponsored effort to improve the situation, with
https://bitbucket.org/pypy/codespeed2/src being written (thank you,
PSF). It's not as much of an improvement over codespeed as I would
like, but it opens some opportunities.

That said, we have a benchmark machine for benchmarking cpython and I
never deployed nightly benchmarks of cpython for a variety of reasons.

* would be cool to get a small VM to set up the web front

* people told me that only py3k is interesting, but I did not set it up
for py3k because the benchmarks are mostly missing

I'm willing to set up a nightly speed.python.org using nightly builds
of Python 2, and possibly Python 3, if there is interest. I need
support from someone maintaining the Python buildbot to set up the
builds, and a VM to set things up on; otherwise I'm good to go.

DISCLAIMER: I did facilitate the codespeed rewrite that was not as
successful as I would have hoped. I did not receive any money from the
PSF for that, though.
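Since the point of such a service is catching regressions, the timing itself has to be low-noise. A minimal sketch of the usual min-of-repeats approach, using only stdlib `timeit` (illustrative only, not the actual benchmark runner):

```python
import timeit

def bench(stmt, repeat=7, number=20000):
    # Min-of-repeats gives a low-noise per-iteration estimate: the minimum
    # is the run least disturbed by the scheduler and other processes.
    return min(timeit.repeat(stmt, repeat=repeat, number=number)) / number

fast = bench("sum(range(10))")
slow = bench("sum(range(10000))", number=1000)
print("%.1fx" % (slow / fast))  # ratio between the two workloads
```

Comparing two interpreter versions then amounts to running the same `bench` under each binary and reporting the ratio, which is how "1.19 times slower" style numbers are produced.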

Cheers,
fijal


On Mon, Jun 1, 2015 at 1:14 PM, M.-A. Lemburg m...@egenix.com wrote:
 On 01.06.2015 12:44, Armin Rigo wrote:
 Hi Larry,

 On 31 May 2015 at 01:20, Larry Hastings la...@hastings.org wrote:
 p.s. Supporting this patch also helps cut into PyPy's reported performance
 lead--that is, if they ever upgrade speed.pypy.org from comparing against
 Python *2.7.2*.

 Right, we should do this upgrade when 2.7.11 is out.

 There is some irony in your comment which seems to imply PyPy is
 cheating by comparing with an old Python 2.7.2: it is inside a thread
 which started because we didn't backport performance improvements to
 2.7.x so far.

 Just to convince myself, I just ran a performance comparison.  I ran
 the same benchmark suite as speed.pypy.org, with 2.7.2 against 2.7.10,
 both freshly compiled with no configure options at all.  The
 differences are usually in the noise, but range from +5% to... -60%.
 If anything, this seems to show that CPython should take more care
 about performance regressions.  If someone is interested:

 * raytrace-simple is 1.19 times slower
 * bm_mako is 1.29 times slower
 * spitfire_cstringio is 1.60 times slower
 * a number of other benchmarks are around 1.08.

 The 7.0x faster number on speed.pypy.org would be significantly
 *higher* if we upgraded the baseline to 2.7.10 now.

 If someone were to volunteer to set up and run speed.python.org,
 I think we could add some additional focus on performance
 regressions. Right now, we don't have any way of reliably
 and reproducibly testing Python performance.

 Hint: The PSF would most likely fund such adventures :-)

 --
 Marc-Andre Lemburg
 eGenix.com

 Professional Python Services directly from the Source  (#1, Jun 01 2015)
 Python Projects, Coaching and Consulting ...  http://www.egenix.com/
 mxODBC Plone/Zope Database Adapter ...   http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/
 

 : Try our mxODBC.Connect Python Database Interface for free ! ::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/


Re: [Python-Dev] Computed Goto dispatch for Python 2

2015-05-28 Thread Maciej Fijalkowski
 I'm -1 on the idea because:

 * Performance improvements are not bug fixes
 * The patch doesn't make the migration process from Python 2 to Python 3 
 easier

And this is why people have been porting Python applications to Go.
Maybe addressing Python performance and making Python (2 or 3) a
better language/platform would mitigate that.

Cheers,
fijal


Re: [Python-Dev] ctypes module

2015-04-08 Thread Maciej Fijalkowski
I presume the reason was that no one wants to maintain code for a
platform where there are no buildbots available and no development time
available. You are free to put the files back in and see if they work
(they might not), but such things are usually removed if they're a
maintenance burden. I would be happy to assist you with finding someone
willing to do commercial maintenance of ctypes for Itanium, but asking
python-dev to do it for free is a bit too much.

Cheers,
fijal

On Tue, Apr 7, 2015 at 9:58 PM, Cristi Fati cristifa...@gmail.com wrote:
 Hi all,

 Not sure whether you got this question, or this is the right distribution
 list:

 Intel has deprecated Itanium architecture, and Windows also deprecated its
 versions(currently 2003 and 2008) that run on IA64.

 However Python (2.7.3) is compilable on Windows IA64, but ctypes module
 (1.1.0) which is now part of Python is not (the source files have been
 removed). What was the reason for its disablement?

 I am asking because an older version of ctypes (1.0.2) which came as a
 separate extension module (i used to compile it with Python 2.4.5) was
 available for WinIA64; i found (and fixed) a nasty buffer overrun in it.

 Regards,
 Cristi Fati.




Re: [Python-Dev] ctypes module

2015-04-08 Thread Maciej Fijalkowski
For the record, libffi officially supports Itanium (but, as usual, I'm
very skeptical about how well it works on less-used platforms):
https://sourceware.org/libffi/

On Wed, Apr 8, 2015 at 1:32 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On 8 April 2015 at 20:36, Maciej Fijalkowski fij...@gmail.com wrote:
 I presume the reason was that noone wants to maintain code for the
 case where there are no buildbots available and there is no
 development time available. You are free to put back in the files and
 see if they work (they might not), but such things are usually removed
 if they're a maintenance burden. I would be happy to assist you with
 finding someone willing to do commercial maintenance of ctypes for
 itanium, but asking python devs to do it for free is a bit too much.

 As a point of reference, even Red Hat dropped Itanium support for
 RHEL6+ - you have to go all the way back to RHEL5 to find a version we
 still support running on Itanium.

 For most of CPython, keeping it running on arbitrary architectures
 often isn't too difficult, as libc abstracts away a lot of the
 hardware details. libffi (and hence ctypes) are notable exceptions to
 that :)

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] libffi embedded in CPython

2015-03-25 Thread Maciej Fijalkowski
On Tue, Mar 24, 2015 at 11:31 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 12 March 2015 at 17:44, Paul Moore p.f.mo...@gmail.com wrote:
 On 12 March 2015 at 17:26, Brett Cannon br...@python.org wrote:
 I'm all for ditching our 'libffi_msvc' in favor of adding libffi as
 another 'external' for the Windows build.  I have managed to get
 _ctypes to build on Windows using vanilla libffi sources, prepared
 using their configure script from within Git Bash and built with our
 usual Windows build system (properly patched).  Unfortunately, making
 things usable will take some work on ctypes itself, which I'm not
 qualified to do. I'm happy to pass on my procedure and patches for
 getting to the point of successful compilation to anyone who feels up
 to fixing the things that are broken.


 So it seems possible to use upstream libffi but will require some work.

 I'd be willing to contemplate helping out on the Windows side of
 things, if nobody else steps up (with the proviso that I have little
 free time, and I'm saying this without much idea of what's involved
 :-)) If Zachary can give a bit more detail on what the work on ctypes
 is, and/or put what he has somewhere that I could have a look at, that
 might help.

 One thing that seems to be an issue. On Windows, ctypes detects if the
 FFI call used the wrong number of arguments off the stack, and raises
 a ValueError if it does. The tests rely on that behaviour. But it's
 based on ffi_call() returning a value, which upstream libffi doesn't
 do. As far as I can tell (not that the libffi docs are exactly
 comprehensive...) there's no way of getting that information from
 upstream libffi.

 What does Unix ctypes do when faced with a call being made with the
 wrong number of arguments? On Windows, using upstream libffi and
 omitting the existing check, it seems to crash the Python process,
 which obviously isn't good. But the test that fails is
 Windows-specific, and short of going through all the tests looking for
 one that checks passing the wrong number of arguments and isn't
 platform-specific, I don't know how Unix handles this.

 Can anyone on Unix tell me if a ctypes call with the wrong number of
 arguments returns ValueError on Unix? Something like strcmp() (with no
 args) should do as a test, I guess...

 If there's a way Unix handles this, I can see about replicating it on
 Windows. But if there isn't, I fear we could always need a patched
 libffi to maintain the interface we currently have...

 Thanks,
 Paul

Linux crashes. The mechanism for detecting the number of arguments is
only available on Windows (note that this is a band-aid anyway: if your
arguments are of the wrong kind, you segfault regardless). We do have
two copies of libffi, one for Windows and one for Unix, anyway, don't
we?
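The band-aid aside, ctypes can catch a wrong-arity call in Python, before libffi ever runs, if the signature is declared via `argtypes`. A small sketch (the `libm` lookup is a POSIX/glibc assumption):

```python
import ctypes
import ctypes.util

# Load the C math library; the fallback name is a glibc assumption.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.sqrt.argtypes = [ctypes.c_double]  # declare the signature...
libm.sqrt.restype = ctypes.c_double

print(round(libm.sqrt(2.0), 6))

try:
    libm.sqrt()  # wrong number of arguments
except TypeError:
    print("caught in Python")  # ...so the bad call never reaches libffi
```

This doesn't help callers who skip `argtypes`, which is why the wrong-number-of-arguments case can still crash on Unix as described above.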


Re: [Python-Dev] libffi embedded in CPython

2015-03-12 Thread Maciej Fijalkowski
On Thu, Mar 12, 2015 at 8:35 PM, Ned Deily n...@acm.org wrote:
 In article
 CAP1=2w7cx5jpqv_pr61rqs1ubusjf5f6kg0cd-qcwr2+9ij...@mail.gmail.com,
 For UNIX OSs we could probably rely on the system libffi then. What's the
 situation on OS X? Anyone know if it has libffi, or would be need to be
 pulled in to be used like on Windows?

 Ronald (in http://bugs.python.org/issue23534):
 On OSX the internal copy of libffi that's used is based on the one in
 PyObjC, which in turn is based on the version of libffi on
 opensource.apple.com (IIRC with some small patches that fix minor issues
 found by the PyObjC testsuite).

 --
  Ned Deily,
  n...@acm.org

From PyPy's experience, the libffi installed on OS X tends to just work
(we never had any issues with it).


Re: [Python-Dev] (ctypes example) libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Wed, Mar 11, 2015 at 8:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 19:05:57 +0100
 Antoine Pitrou solip...@pitrou.net wrote:
  
   But they are not ctypes. For example, cffi wouldn't be obvious to use
   for interfacing with non-C code, since it requires you to write C-like
   declarations.
 
  You mean like Fortran? Or what precisely?

 Any toolchain that can generate native code. It can be Fortran, but it
 can also be code generated at runtime without there being any external
 declaration. Having to generate C declarations for such code would be
 a distraction.

 For instance, you can look at the compiler example that Eli wrote using
 llvmlite. It implements a JIT compiler for a toy language. The
 JIT-compiled function is then declared and called using a simple ctypes
 declaration:

 https://github.com/eliben/pykaleidoscope/blob/master/chapter7.py#L937

 Regards

 Antoine.

It might be a matter of taste, but I don't find declaring C functions
any more awkward than using the strange interface that ctypes comes
with. The equivalent in cffi would be ffi.cast("double (*)()", x).
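For comparison, the ctypes spelling of casting a raw address to a `double (*)()` function pointer, which is the "strange interface" being contrasted with cffi's one-line cast. This sketch round-trips through an integer address using a Python callback so it is self-contained:

```python
import ctypes

DoubleFn = ctypes.CFUNCTYPE(ctypes.c_double)  # the type "double (*)(void)"

@DoubleFn
def pi():
    return 3.14

# Pull out the bare integer address, then cast it back to a callable;
# this is roughly what ffi.cast("double (*)()", x) does in cffi.
addr = ctypes.cast(pi, ctypes.c_void_p).value
fn = DoubleFn(addr)  # note: `pi` must stay alive while `fn` is used
print(fn())
```

In cffi the same cast is a single string-typed expression; in ctypes it needs a `CFUNCTYPE` factory plus a `cast`, which is the taste difference being discussed.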


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Wed, Mar 11, 2015 at 11:34 PM, Victor Stinner
victor.stin...@gmail.com wrote:

 Le 11 mars 2015 18:29, Brett Cannon br...@python.org a écrit :
 I'm going to propose a somewhat controversial idea: let's deprecate the
 ctypes module.

 In the past I tried to deprecate many functions or modules because they are
 rarely or never used. Many developers prefered to keep them. By the way, I
 still want to remove plat-xxx modules like IN or CDROM :-)

 Getopt was deprecated when optparse was added to the stdlib. Then optparse
 was deprecated when argparse was added to the stdlib.

 Cython and cffi are not part of the stdlib and can be hard to install on
 some platforms. Ctypes is cool because it doesn't require C headers nor a C
 compiler.

 Is it possible to use cffi without a C compiler/headers as easily than
 ctypes?

Yes, it has two modes: one that does exactly that, and another that
adds extra safety at the cost of requiring a C compiler.
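The compiler-less mode ("ABI mode" in cffi's terms) parses a C-like declaration at runtime and `dlopen`s the library, much as ctypes does. A hedged sketch, guarded because cffi is a third-party package that may not be installed:

```python
try:
    import cffi
    import ctypes.util

    ffi = cffi.FFI()
    ffi.cdef("double sqrt(double);")  # declaration parsed at runtime, no compiler
    libm = ffi.dlopen(ctypes.util.find_library("m"))
    result = libm.sqrt(2.0)
except (ImportError, OSError):
    # cffi (or libm) unavailable; fall back so the sketch still runs
    result = 2.0 ** 0.5
print(round(result, 6))
```

The second mode (API mode, `ffi.set_source` plus `ffi.compile`) trades this convenience for compile-time checking of the declarations against real C headers.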


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Wed, Mar 11, 2015 at 8:31 PM, Wes Turner wes.tur...@gmail.com wrote:

 On Mar 11, 2015 12:55 PM, Maciej Fijalkowski fij...@gmail.com wrote:

 On Wed, Mar 11, 2015 at 7:50 PM, Antoine Pitrou solip...@pitrou.net
 wrote:
  On Wed, 11 Mar 2015 17:27:58 +
  Brett Cannon br...@python.org wrote:
 
  Did anyone ever step forward to do this? I'm a bit worried about the
  long-term viability of ctypes if we don't have a maintainer or at least
  someone making sure we are staying up-to-date with upstream libffi. The
  ctypes module is a dangerous thing, so having a chunk of C code that
  isn't
  being properly maintained seems to me to make it even more dangerous.
 
  Depends what you call dangerous. C code doesn't rot quicker than pure
  Python code :-) Also, libffi really offers a wrapper around platform
  ABIs, which rarely change.

 And yet, lesser known ABIs in libffi contain bugs (as we discovered
 trying to work there with anything else than x86 really). Also there
 *are* ABI differencies that change slowly over time (e.g. requiring
 stack to be 16 byte aligned)

 Are there tests for this?


What do you mean? The usual failure mode is "will segfault every now
and again if the moon is in the right position" (e.g. the
stack-alignment issue only appears if the underlying function uses
certain SSE instructions that compilers emit these days in certain
circumstances).


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Wed, Mar 11, 2015 at 8:05 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 19:54:58 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:
 
  Depends what you call dangerous. C code doesn't rot quicker than pure
  Python code :-) Also, libffi really offers a wrapper around platform
  ABIs, which rarely change.

 And yet, lesser known ABIs in libffi contain bugs (as we discovered
 trying to work there with anything else than x86 really). Also there
 *are* ABI differencies that change slowly over time (e.g. requiring
 stack to be 16 byte aligned)

 Well, sure. The point is, such bugs are unlikely to appear at a fast
 rate... Also, I don't understand why libffi issues would affect cffi
 any less than it affects ctypes, at least in the compiler-less mode of
 operation.

My point here was only about shipping our own libffi vs. using the
system one (and it affects cffi equally, with or without a compiler).


  We now have things like cffi and Cython for people who need
  to interface with C code. Both of those projects are maintained. And they
  are not overly difficult to work with.
 
  But they are not ctypes. For example, cffi wouldn't be obvious to use
  for interfacing with non-C code, since it requires you to write C-like
  declarations.

 You mean like Fortran? Or what precisely?

 Any toolchain that can generate native code. It can be Fortran, but it
 can also be code generated at runtime without there being any external
 declaration. Having to generate C declarations for such code would be
 a distraction.

 Of course, if cffi gains the same ability as ctypes (namely to lookup
 a function and declare its signature without going through the
 FFI.cdef() interface), that issue disappears.

 As a side note, ctypes has a large number of users, so even if it were
 deprecated that wouldn't be a good reason to stop maintaining it.

 And calling cffi simple while it relies on a parser of the C language
 (which would then have to be bundled with Python) is a bit misleading
 IMO.

 Regards

 Antoine.


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Wed, Mar 11, 2015 at 7:50 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 17:27:58 +
 Brett Cannon br...@python.org wrote:

 Did anyone ever step forward to do this? I'm a bit worried about the
 long-term viability of ctypes if we don't have a maintainer or at least
 someone making sure we are staying up-to-date with upstream libffi. The
 ctypes module is a dangerous thing, so having a chunk of C code that isn't
 being properly maintained seems to me to make it even more dangerous.

 Depends what you call dangerous. C code doesn't rot quicker than pure
 Python code :-) Also, libffi really offers a wrapper around platform
 ABIs, which rarely change.

And yet, lesser known ABIs in libffi contain bugs (as we discovered
trying to work there with anything else than x86 really). Also there
*are* ABI differencies that change slowly over time (e.g. requiring
stack to be 16 byte aligned)


 I'm going to propose a somewhat controversial idea: let's deprecate the
 ctypes module.

 This is gratuitous.

I'm +1 on deprecating ctypes


 We now have things like cffi and Cython for people who need
 to interface with C code. Both of those projects are maintained. And they
 are not overly difficult to work with.

 But they are not ctypes. For example, cffi wouldn't be obvious to use
 for interfacing with non-C code, since it requires you to write C-like
 declarations.

You mean like Fortran? Or what precisely?

 I don't understand why cffi would be safer than ctypes. At least not in
 the operation mode where it doesn't need to invoke a C compiler.
 Cython is a completely different beast, it requires a separate
 compilation pass which makes it useless in some situations.


Our main motivation for calling it "safer" is that it comes with less
magic and fewer gotchas, which also means it does less. It's also
smaller.


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Thu, Mar 12, 2015 at 12:20 AM, Brett Cannon br...@python.org wrote:


 On Wed, Mar 11, 2015 at 6:03 PM Paul Moore p.f.mo...@gmail.com wrote:

 On 11 March 2015 at 21:45, Maciej Fijalkowski fij...@gmail.com wrote:
  Is it possible to use cffi without a C compiler/headers as easily than
  ctypes?
 
  yes, it has two modes, one that does that and the other that does
  extra safety at the cost of a C compiler

 So if someone were to propose a practical approach to including cffi
 into the stdlib, *and* assisting the many Windows projects using
 ctypes for access to the Windows API [1], then there may be a
 reasonable argument for deprecating ctypes. But nobody seems to be
 doing that, rather the suggestion appears to be just to deprecate a
 widely used part of the stdlib offering no migration path :-(


 You're ignoring that it's not maintained, which is the entire reason I
 brought this up. No one seems to want to touch the code. Who knows what
 improvements, bugfixes, etc. exist upstream in libffi that we lack because
 no one wants to go through and figure it out. If someone would come forward
 and help maintain it then I have no issue with it sticking around.

It's a bit worse than that. Each time someone wants to touch the code
(e.g. push the upstream libffi fixes back in), the feedback is "we need
to review it, but there is no one to do it, no one knows how it works,
don't touch it", which discourages potential maintainers. I would
likely be willing to rip the bundled libffi out of CPython as it is,
for example (and just use the upstream one).


Re: [Python-Dev] libffi embedded in CPython

2015-03-11 Thread Maciej Fijalkowski
On Thu, Mar 12, 2015 at 12:31 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 23:10:14 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:
 
  Well, sure. The point is, such bugs are unlikely to appear at a fast
  rate... Also, I don't understand why libffi issues would affect cffi
  any less than it affects ctypes, at least in the compiler-less mode of
  operation.

 My point here was only about shipping own libffi vs using the system
 one (and it does affect cffi equally with or without compiler)

 So what? If ctypes used the system libffi as cffi does, it would by
 construction be at least portable as cffi is.  The only reason the
 bundled libffi was patched at some point was to be *more* portable than
 vanilla libffi is.

 So, really, I don't see how switching from ctypes to cffi solves any of
 this.

You're missing my point. Ripping the bundled libffi out of CPython is a
good idea to start with. Maybe deprecating ctypes is *also* a good
idea, but that's a separate discussion. It certainly does not solve the
libffi problem.


Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Maciej Fijalkowski
Not all your examples are good ones.

* float(x) calls __float__ (not __int__)

* re.group requires __eq__ (and __hash__)

* I'm unsure about the OSError one

* the %-formatting one at the very least works on PyPy
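A quick check of the first point, as a sketch: `float()` consults `__float__`, and `__int__` alone is not enough. (On Python 3.8+, `float()` also falls back to `__index__`; the classes here deliberately define neither `__float__` nor `__index__` in the failing case, so the behavior is the same across versions.)

```python
class HasFloat:
    def __float__(self):
        return 42.0

class HasIntOnly:
    def __int__(self):
        return 42

print(float(HasFloat()))  # converted via __float__

try:
    float(HasIntOnly())   # __int__ is not consulted by float()
except TypeError:
    print("TypeError")
```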

On Mon, Mar 9, 2015 at 8:07 AM, Serhiy Storchaka storch...@gmail.com wrote:
 On 09.03.15 06:33, Ethan Furman wrote:

 I guess it could boil down to:  if IntEnum was not based on 'int', but
 instead had the __int__ and __index__ methods
 (plus all the other __xxx__ methods that int has), would it still be a
 drop-in replacement for actual ints?  Even when
 being used to talk to non-Python libs?


 If you don't call isinstance(x, int) (PyLong_Check* in C).

 Most conversions from Python to C implicitly call __index__ or __int__, but
 unfortunately not all.

 >>> float(Thin(42))
 42.0
 >>> float(Wrap(42))
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: float() argument must be a string or a number, not 'Wrap'

 >>> '%*s' % (Thin(5), 'x')
 'x'
 >>> '%*s' % (Wrap(5), 'x')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: * wants int

 >>> OSError(Thin(2), 'No such file or directory')
 FileNotFoundError(2, 'No such file or directory')
 >>> OSError(Wrap(2), 'No such file or directory')
 OSError(<__main__.Wrap object at 0xb6fe81ac>, 'No such file or directory')

 >>> re.match('(x)', 'x').group(Thin(1))
 'x'
 >>> re.match('(x)', 'x').group(Wrap(1))
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IndexError: no such group

 And to be ideal drop-in replacement IntEnum should override such methods as
 __eq__ and __hash__ (so it could be used as mapping key). If all methods
 should be overridden to quack as int, why not take an int?
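The cases that *do* work above all go through the `__index__` hook; what never works is C code that type-checks with `PyLong_Check`. A small sketch of that boundary:

```python
import operator

class Wrap:
    def __init__(self, v):
        self.v = v
    def __index__(self):
        return self.v

w = Wrap(5)
print(operator.index(w))      # the explicit __index__ conversion
print(hex(w))                 # hex() converts via __index__
print([10, 20, 30][Wrap(1)])  # sequence indexing does too
print(isinstance(w, int))     # False: PyLong_Check-style code rejects it
```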





Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-08 Thread Maciej Fijalkowski
I'm working on vmprof (github.com/vmprof/vmprof-python), which works
for both CPython and PyPy (PyPy has special support; CPython is patched
on the fly).
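The core idea of such a sampling profiler can be sketched in pure Python on POSIX: a CPU-time timer delivers a signal periodically, and the handler records which frame was executing. This is a toy illustration of the vmprof approach, not its implementation (vmprof samples the C stack), and `ITIMER_PROF` is a POSIX assumption:

```python
import collections
import signal
import time

samples = collections.Counter()

def sample(signum, frame):
    # Record which Python function was executing when the timer fired.
    samples[frame.f_code.co_name] += 1

signal.signal(signal.SIGPROF, sample)             # install handler first
signal.setitimer(signal.ITIMER_PROF, 0.01, 0.01)  # fire every 10ms of CPU time

def busy():
    deadline = time.process_time() + 0.3
    while time.process_time() < deadline:
        sum(range(1000))

busy()
signal.setitimer(signal.ITIMER_PROF, 0)           # stop sampling
print(samples.most_common(1)[0][0])
```

The Python-level handler sidesteps the signal-safety issues discussed in this thread because CPython only runs it between bytecodes; a C-level handler reading frames directly has none of that protection.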

On Sun, Feb 8, 2015 at 6:39 AM, Gregory P. Smith g...@krypto.org wrote:
 To get at the Python thread state from a signal handler (using 2.7 as a
 reference here; but i don't believe 3.4 has changed this part much) you need
 to modify the interpreter to expose pystate.c's autoTLSkey and thread.c's
 struct key as well as keyhead and keymutex.

 From there, in your signal handler you must try to acquire the newly exposed
 keymutex and do nothing if you were unable to get it.  If you did acquire it
 (rare not to), you can walk the keyhead list looking for autoTLSkey to find
 the current valid thread state.

 I had an intern (hi Elena!) write a signal sampling based low overhead
 Python CPU profiler based on that last summer. I believe there are still
 bugs to shaken out (if they are even possible to fix... Armin's comments are
 true: signal handler code is super limited). I am stating this here because
 I want someone to pester me at PyCon if I haven't released our work as a
 proof of concept by then. The important take away: From what I could figure
 out, you need to modify the CPython interpreter to be more amenable to such
 introspection.

 A downside of a signal based profiler: *ALL* of the EINTR mishandling bugs
 within the Python interpreter, stdlib, and your own code will show up in
 your application. So until those are fixed (hooray for Antoine's PEP!), it
 may not be practical for use on production processes which is sort of the
 entire point of a low overhead sampling profiler...

 I'd like to get a buildbot setup that runs the testsuite while a continual
 barrage of signals are being generated. We really don't stress test that
 stuff (as evidence by the EINTR mishandling issues that are rampant) as
 non-fatal signals are so rare for most things... until they aren't.

 As a side note and encouragement: I wonder what PyPy could do for
 dynamically enabled and disabled low overhead CPU profiling. (take that as a
 hint that I want someone else to get extremely creative!)

 -gps

 On Sat Feb 07 2015 at 1:34:26 PM Greg Ewing greg.ew...@canterbury.ac.nz
 wrote:

 Maciej Fijalkowski wrote:
  However, you can't access thread
  locals from signal handlers (since in some cases it mallocs, thread
  locals are built lazily if you're inside the .so, e.g. if python is
  built with --shared)

 You might be able to use Py_AddPendingCall to schedule
 what you want done outside the context of the signal
 handler.

 The call will be made by the main thread, though,
 so if you need to access the frame of whatever thread
 was running when the signal occured, you will have
 to track down its PyThreadState somehow and get the
 frame from there. Not sure what would be involved
 in doing that.

 --
 Greg


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-08 Thread Maciej Fijalkowski
Hi Francis

Feel free to steal most of the vmprof code; it should generally work
without requiring patches to CPython (Python 3 patches appreciated :-).
As far as the timer goes, it seems not to be going anywhere; I would
rather use a background thread or something.

On Sun, Feb 8, 2015 at 10:03 PM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 2015-02-08 4:01 GMT-05:00 Maciej Fijalkowski fij...@gmail.com:

 I'm working on vmprof (github.com/vmprof/vmprof-python) which works
 for both cpython and pypy (pypy has special support, cpython is
 patched on-the fly)


 This looks interesting. I'm working on a profiler that is similar, but not
 based on a timer. Instead, the signal is generated when a hardware
 performance counter overflows. It requires a special Linux kernel module,
 and the tracepoint is recorded using LTTng-UST.

 https://github.com/giraldeau/perfuser
 https://github.com/giraldeau/perfuser-modules
 https://github.com/giraldeau/python-profile-ust

 This is of course very experimental, requires a special setup, and I don't
 even know if it's going to produce good results. I'll report the results in
 the coming weeks.

 Cheers,

 Francis Giraldeau


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-07 Thread Maciej Fijalkowski
On Sat, Feb 7, 2015 at 12:48 AM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 2015-02-06 6:04 GMT-05:00 Armin Rigo ar...@tunes.org:

 Hi,

 On 6 February 2015 at 08:24, Maciej Fijalkowski fij...@gmail.com wrote:
  I don't think it's safe to assume f_code is properly filled by the
  time you might read it, depending a bit where you find the frame
  object. Are you sure it's not full of garbage?


 Yes, before discussing how to do the utf8 decoding, we should realize
 that it is really unsafe code starting from the line before.  From a
 signal handler you're only supposed to read data that was written to
 volatile fields.  So even PyEval_GetFrame(), which is done by
 reading the thread state's frame field, is not safe: this is not a
 volatile.  This means that the compiler is free to do crazy things
 like *first* write into this field and *then* initialize the actual
 content of the frame.  The uninitialized content may be garbage, not
 just NULLs.


 Thanks for these comments. Of course accessing frames within a signal
 handler is racy. I confirm that code encoded in non-ASCII is not accessible
 from the UTF-8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes
 the data and caches it in the unicode object. Later access returns the byte
 buffer without memory allocation and re-encoding.

 I think it is possible to solve both safety problems by registering a
 handler with PyPyEval_SetProfile(). On function entry, the handler will call
 PyUnicode_AsUTF8() on the required frame members to make sure the utf8
 encoded string is available. Then, we increment the refcount of the frame
 and assign it to a thread local pointer. On function return, the refcount is
 decremented. These operations occur in the normal context and they are not
 racy. The signal handler will use the thread local frame pointer instead of
 calling PyEval_GetFrame(). Does that sound good?
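The Python-level counterpart of that PyEval_SetProfile registration is sys.setprofile, which can maintain a per-thread "current frame" pointer in normal (non-signal) execution context. A rough sketch of the idea; the names below are illustrative, and no refcounting is needed at the Python level:

```python
import sys
import threading

_state = threading.local()

def _profile(frame, event, arg):
    # Update the thread-local frame pointer on entry/exit, in normal
    # execution context -- never from a signal handler.
    if event == "call":
        _state.frame = frame
    elif event == "return":
        _state.frame = frame.f_back

def traced():
    # A sampler would read _state.frame instead of PyEval_GetFrame().
    return _state.frame.f_code.co_name

sys.setprofile(_profile)
current = traced()
sys.setprofile(None)
print(current)  # 'traced'
```

The C-level version would additionally incref the frame before storing it, exactly as proposed above.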

 Thanks again for your feedback!

 Francis

You still didn't explain what you are trying to achieve, nor addressed
Armin's questions about volatile. However, you can't access thread
locals from signal handlers (since in some cases it mallocs; thread
locals are built lazily if you're inside the .so, e.g. if Python is
built with --shared).


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-05 Thread Maciej Fijalkowski
Hi Francis

I don't think it's safe to assume f_code is properly filled by the
time you might read it, depending a bit where you find the frame
object. Are you sure it's not full of garbage?

Besides, are you writing a profiler, or what exactly are you doing?

On Fri, Feb 6, 2015 at 1:27 AM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 I need to access frame members from within a signal handler for tracing
 purposes. My first attempt to access co_filename was like this (omitting
 error checking):

 PyFrameObject *frame = PyEval_GetFrame();
 PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
 char *str = PyBytes_AsString(ob);

 However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
 which is not reentrant. If the signal handler nests over PyObject_Malloc(),
 it causes a segfault, and it could also deadlock.

 Instead, I access members directly:
 char *str = PyUnicode_DATA(frame->f_code->co_filename);
 size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);

 Is it safe to assume that unicode objects co_filename and co_name are always
 UTF-8 data for loaded code? I looked at the PyTokenizer_FromString() and it
 seems to convert everything to UTF-8 upfront, and I would like to make sure
 this assumption is valid.

 Thanks!

 Francis



Re: [Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski
Hi Guido.

I *think* part of the reason why our implementation works is that
machines are significantly different now than they were in Knuth's time.
Avoiding cache misses is a very effective way to improve performance
these days.

Cheers,
fijal

On Sat, Jan 24, 2015 at 7:39 PM, Guido van Rossum gu...@python.org wrote:
 Wow, very cool. When I implemented the very first Python dict (cribbing from
 an algorithm in Knuth) I had no idea that 25 years later there would still
 be ways to improve upon it! I've got a feeling Knuth probably didn't expect
 this either...

 On Sat, Jan 24, 2015 at 2:51 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Sat, Jan 24, 2015 at 12:50 PM, Maciej Fijalkowski fij...@gmail.com
 wrote:
  Hi
 
   I would like to point out that we implemented rhettinger's idea in PyPy
  that makes all the dicts ordered by default and we don't have any
  adverse performance effects (in fact, there is quite significant
   memory saving coming from it). The measurements on CPython could be
  different, but in principle OrderedDict can be implemented as
  efficiently as normal dict.
 
  Writeup:
  http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html
 
  Previous discussion:
  https://mail.python.org/pipermail/python-dev/2012-December/123028.html
 
  Cheers,
  fijal

 also as a sidenote: PEP should maybe mention that PyPy is already
 supporting it, a bit by chance




 --
 --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski
On Sat, Jan 24, 2015 at 12:50 PM, Maciej Fijalkowski fij...@gmail.com wrote:
 Hi

 I would like to point out that we implemented rhettinger's idea in PyPy
 that makes all the dicts ordered by default and we don't have any
 adverse performance effects (in fact, there is quite significant
 memory saving coming from it). The measurements on CPython could be
 different, but in principle OrderedDict can be implemented as
 efficiently as normal dict.

 Writeup: 
 http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html

 Previous discussion:
 https://mail.python.org/pipermail/python-dev/2012-December/123028.html

 Cheers,
 fijal

also as a sidenote: PEP should maybe mention that PyPy is already
supporting it, a bit by chance


[Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski
Hi

I would like to point out that we implemented rhettinger's idea in PyPy
that makes all the dicts ordered by default and we don't have any
adverse performance effects (in fact, there is quite significant
memory saving coming from it). The measurements on CPython could be
different, but in principle OrderedDict can be implemented as
efficiently as normal dict.

Writeup: 
http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html

Previous discussion:
https://mail.python.org/pipermail/python-dev/2012-December/123028.html

Cheers,
fijal


Re: [Python-Dev] More compact dictionaries with faster iteration

2015-01-01 Thread Maciej Fijalkowski
On Wed, Dec 31, 2014 at 3:12 PM, Serhiy Storchaka storch...@gmail.com wrote:
 On 10.12.12 03:44, Raymond Hettinger wrote:

 The current memory layout for dictionaries is
 unnecessarily inefficient.  It has a sparse table of
 24-byte entries containing the hash value, key pointer,
 and value pointer.

 Instead, the 24-byte entries should be stored in a
 dense table referenced by a sparse table of indices.


 FYI PHP 7 will use this technique [1]. In conjunction with other
 optimizations this will decrease memory consumption of PHP hashtables up to
 4 times.

"up to 4 times" is a bit of a stretch, given that most of their
savings come from:

* saving on the keeping of ordering
* other optimizations in zval

None of it applies to python

PHP does not implement differing sizes of ints in the key dict, which
makes the memory saving PHP-only (if we did the same thing as PHP, we
would save more or less nothing, depending on how greedy you are with
the list overallocation).

We implemented the same strategy in PyPy as of last year, and are
testing it to become the default dict and OrderedDict for PyPy in the
next release.

Cheers,
fijal

PS. I wonder who came up with the idea first, PHP or rhettinger, and
who implemented it first (I'm pretty sure it was used in Hippy before
it was used in Zend PHP).
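The layout Raymond describes can be sketched in a few lines: a sparse table holding only small indices, plus a dense, insertion-ordered entry list. This toy version (hypothetical names; it skips resizing and deletion, so it holds at most a handful of keys) shows why ordering falls out for free:

```python
class CompactDict:
    FREE = -1

    def __init__(self):
        self.indices = [self.FREE] * 8   # sparse table: just small ints
        self.entries = []                # dense (hash, key, value) triples

    def _slot(self, key):
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while True:
            idx = self.indices[i]
            if idx == self.FREE or self.entries[idx][1] == key:
                return i
            i = (i + 1) & mask           # linear probing, for simplicity

    def __setitem__(self, key, value):
        i = self._slot(key)
        idx = self.indices[i]
        if idx == self.FREE:
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:                            # update in place, position kept
            h, k, _ = self.entries[idx]
            self.entries[idx] = (h, k, value)

    def __getitem__(self, key):
        idx = self.indices[self._slot(key)]
        if idx == self.FREE:
            raise KeyError(key)
        return self.entries[idx][2]

    def keys(self):
        return [k for _, k, _ in self.entries]

d = CompactDict()
d["b"] = 1
d["a"] = 2
d["b"] = 3
print(d.keys())  # ['b', 'a'] -- iteration follows insertion order
```

The memory win comes from the sparse table storing one small int per slot instead of a 24-byte entry; the dense entry list also iterates cache-friendly.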


Re: [Python-Dev] libffi embedded in CPython

2014-12-19 Thread Maciej Fijalkowski
On Thu, Dec 18, 2014 at 10:36 PM, Jim J. Jewett jimjjew...@gmail.com wrote:


 On Thu, Dec 18, 2014, at 14:13, Maciej Fijalkowski wrote:
 ... http://bugs.python.org/issue23085 ...
 is there any reason any more for libffi being included in CPython?

 [And why a fork, instead of just treating it as an external dependency]

 Benjamin Peterson responded:

 It has some sort of Windows related patches. No one seems to know
 whether they're still needed for newer libffi. Unfortunately, ctypes
 doesn't currently have a maintainer.

 Are any of the following false?

 (1)  Ideally, we would treat it as an external dependency.

 (2)  At one point, it was intentionally forked to get in needed
 patches, including at least some for 64 bit windows with MSVC.

 (3)  Upstream libffi maintenance has picked back up.

 (4)  Alas, that means the switch merge would not be trivial.

 (5)  In theory, we could now switch to the external version.
 [In particular, does libffi have a release policy such that we
 could assume the newest released version is safe, so long as
 our integration doesn't break?]

 (6)  By its very nature, libffi changes are risky and undertested.
 At the moment, that is also true of its primary user, ctypes.

 (7)  So a switch is OK in theory, but someone has to do the
 non-trivial testing and merging, and agree to support both libffi
 and and ctypes in the future.  Otherwise, stable wins.

 (8)  The need for future support makes this a bad candidate for
 patches wanted/bug bounty/GSoC.

 -jJ

I would like to add that not doing anything is not a good strategy
either, because you accumulate bugs that get fixed upstream (I'm
pretty sure all the problems from cpython got fixed in upstream
libffi, but not all libffi fixes made it to cpython).


[Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski
After reading this http://bugs.python.org/issue23085 and remembering
struggling having our own patches into cpython's libffi (but not into
libffi itself), I wonder, is there any reason any more for libffi
being included in CPython?

Cheers,
fijal


Re: [Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski
On Thu, Dec 18, 2014 at 9:17 PM, Steve Dower steve.do...@microsoft.com wrote:
 Maciej Fijalkowski wrote:
 After reading this http://bugs.python.org/issue23085 and remembering 
 struggling
 having our own patches into cpython's libffi (but not into libffi itself), I
 wonder, is there any reason any more for libffi being included in CPython?

 We use it for ctypes, so there's certainly still a need. Are you asking 
 whether we need a fork of it as opposed to treating it like an external (like 
 OpenSSL)?

yes (why is there a copy of libffi in the CPython source tree). And I'm
asking not why it landed there, but why it is still there.


Re: [Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski
well, the problem is essentially that libffi gets patched (e.g. for
ARM) and it does not make its way to CPython quickly. This is
unlikely to be a security issue (for a variety of reasons, including
ctypes), but it's still an issue I think. Segfaults related to e.g.
stack alignment are hard to debug.

On Thu, Dec 18, 2014 at 9:30 PM, Benjamin Peterson benja...@python.org wrote:


 On Thu, Dec 18, 2014, at 14:13, Maciej Fijalkowski wrote:
 After reading this http://bugs.python.org/issue23085 and remembering
 struggling having our own patches into cpython's libffi (but not into
 libffi itself), I wonder, is there any reason any more for libffi
 being included in CPython?

 It has some sort of Windows related patches. No one seems to know
 whether they're still needed for newer libffi. Unfortunately, ctypes
 doesn't currently have a maintainer.


Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-02 Thread Maciej Fijalkowski
On Mon, Jun 2, 2014 at 10:43 AM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-06-01 10:11 GMT+02:00 Steven D'Aprano st...@pearwood.info:
 My feeling is that the CPython standard library should be written for
 CPython,

 Right. PyPy, Jython and IronPython already have their own standard
 library when they need a different implement.

 PyPy: lib_pypy directory (lib-python is the CPython stdlib):
 https://bitbucket.org/pypy/pypy/src/ac52eb7059d0b8d001a2103774917cf7396f/lib_pypy/?at=default

it's for stuff that's in CPython implemented in C, not a
reimplementation of python stuff. we patched the most obvious
CPython-specific hacks, but it's a losing battle, you guys will go
way out of your way to squeeze an extra 2% by doing very obscure hacks.


 Jython: Lib directory (lib-python is the CPython stdlib):
 https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/?at=default

 IronPython: IronPython.Modules directory:
 http://ironpython.codeplex.com/SourceControl/latest#IronPython_Main/Languages/IronPython/IronPython.Modules/

 See for example the _fsum.py module of Jython:
 https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/_fsum.py?at=default

 Victor


Re: [Python-Dev] Language Summit notes

2014-04-11 Thread Maciej Fijalkowski
On Fri, Apr 11, 2014 at 2:22 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 11 April 2014 10:36, Armin Rigo ar...@tunes.org wrote:
 This would be superficial, but change the perception of CFFI to be a
 preprocessor that produces C extension modules.

 Thanks, that clarification helps a lot. Does this mean that API-mode
 CFFI is competing with things like swig (which is not used much these
 days, as far as I know) and Cython (which is used a lot in the numeric
 community)? (ABI-mode CFFI is obviously directly competing with
 ctypes).

Yes.


Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Maciej Fijalkowski
On Thu, Mar 27, 2014 at 10:11 AM, Stephen J. Turnbull
step...@xemacs.org wrote:
 Nick Coghlan writes:

On 27 Mar 2014 07:02, Guido van Rossum gu...@python.org wrote:
   Actually, the first step is publish it on PyPI, the second is to
   get a fair number of happy users there. The bar for getting something
   included into the stdlib is pretty high

   The why not a third party module? bar also got a fair bit higher
   with Python 3.4 - by bundling pip, we have deliberately made third
   party modules easier to consume, thus weakening the convenience
   argument that applies to stdlib inclusion.

 Maybe.  That depends on if you care about the convenience of folks who
 have to get new modules past Corporate Security, but it's easier to
 get an upgrade of the whole shebang.  I don't think it's ever really
 been resolved whether they're a typical case that won't go away or a
 special group whose special needs should be considered.

 Steve

And random pieces of C included in the standard library can be
shuffled under the carpet under the disguise of an "upgrade", or what are
you suggesting?


Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Maciej Fijalkowski
On Thu, Mar 27, 2014 at 11:07 AM, Paul Moore p.f.mo...@gmail.com wrote:
 On 27 March 2014 08:16, Maciej Fijalkowski fij...@gmail.com wrote:
 And random pieces of C included in the standard library can be
 shuffled under the carpet under the disguise of upgrade or what are
 you suggesting?

 The sort of thing that happens is that the relevant approvers will
 accept python-dev as a trusted supplier and then Python upgrades are
 acceptable subject to review of the changes, etc. For a new module,
 there is a whole other level of questions around how do we trust the
 person who developed the code, do we need to do a full code review,
 etc?

 It's a bit unfair to describe the process as random pieces of C
 being shuffled under the carpet. (Although there probably are
 environments where that is uncomfortably close to the truth :-()

 Paul

I just find "my company is stupid, so let's work around it by putting
stuff into the Python standard library" an unacceptable argument for
python-dev and the whole Python community.


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-21 Thread Maciej Fijalkowski
On Wed, Mar 19, 2014 at 11:43 PM, Nick Coghlan ncogh...@gmail.com wrote:

 On 20 Mar 2014 07:38, Nick Coghlan ncogh...@gmail.com wrote:

 Correct, but I think this discussion has established that how many times
 dict lookup calls __eq__ on the key is one such thing. In CPython, it
 already varies based on:

 - dict contents (due to the identity check and the distribution of entries
 across hash buckets)
 - pointer size (due to the hash bucket distribution differing between 32
 bit and 64 bit builds)
 - dict tuning parameters (there are some settings in the dict
 implementation that affect when dicts resize up and down, etc, which can
 mean the hash bucket distribution may already change without much notice in
 feature releases)

 I just realised that hash randomisation also comes into play here - the
 distribution of entries across hash buckets is inherently variable between
 runs for any key types that rely directly or indirectly on a randomised
 hash.

 Cheers,
 Nick.




at the end of the day we settled for dicts with str, int, or identity
keys, so we're perfectly safe


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Maciej Fijalkowski
On Tue, Mar 18, 2014 at 11:19 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 18 March 2014 19:46, Maciej Fijalkowski fij...@gmail.com wrote:
 A question: how far away will this optimization apply?

 if x in d:
 do_this()
 do_that()
 do_something_else()
 spam = d[x]

 it depends what those functions do. The JIT will inline them and if
 they're small, it should work (although a modification of a different
 dict is illegal, since aliasing is not proven), but at some point
 it'll give up (note that it'll also give up on a call to C code releasing
 the GIL, since some other thread can modify the dict).

 Surely in the presence of threads the optimisation is invalid anyway
 as other threads can run in between each opcode (I don't know how
 you'd phrase that in a way that wasn't language dependent other than
 everywhere :-)) so

 if x in d:
 # HERE
 spam = d[x]

 d can be modified at HERE. (If d is a local variable, obviously the
 chance that another thread has access to d is a lot lower, but do you
 really do that level of alias tracking?)

 Paul

not in the case of a JIT that *knows* where the GIL can be released. We
deliberately make it not possible to release it every few bytecodes, to
avoid such situations.


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Maciej Fijalkowski
On Wed, Mar 19, 2014 at 2:42 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 18 Mar 2014 09:52:05 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 We're thinking about doing an optimization where say:

 if x in d:
return d[x]

 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 I don't think it's ok. If the author of the code had wanted only one
 lookup, they would have written:

   try:
   return d[x]
   except KeyError:
   pass

 I agree that an __eq__ method with side effects is rather bad, of
 course.
 What you could do is instruct people that the latter idiom (EAFP)
 performs better on PyPy.

 Regards

 Antoine.

I would like to point out that instructing people does not really
work. Besides, other examples like this:

if d[x] >= 3:
    d[x] += 1

don't really work.


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Maciej Fijalkowski
On Wed, Mar 19, 2014 at 3:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 19 Mar 2014 15:09:04 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 I would like to point out that instructing people does not really
 work. Besides, other examples like this:

  if d[x] >= 3:
d[x] += 1 don't really work.

 That's a good point. But then, perhaps PyPy should analyze the __eq__
 method and decide whether it's likely to have side effects or not (the
 answer can be hard-coded for built-in types such as str).

 Regards

 Antoine.

Ok. But then how is it valid to have the "is" fast-path?


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Maciej Fijalkowski
On Wed, Mar 19, 2014 at 3:26 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 19 Mar 2014 15:21:16 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 On Wed, Mar 19, 2014 at 3:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
  On Wed, 19 Mar 2014 15:09:04 +0200
  Maciej Fijalkowski fij...@gmail.com wrote:
 
  I would like to point out that instructing people does not really
  work. Besides, other examples like this:
 
   if d[x] >= 3:
 d[x] += 1 don't really work.
 
  That's a good point. But then, perhaps PyPy should analyze the __eq__
  method and decide whether it's likely to have side effects or not (the
  answer can be hard-coded for built-in types such as str).
 
  Regards
 
  Antoine.

 Ok. But then how is it valid to have is fast-path?

 What do you mean?


I mean that dict lookup starts with an "is" check before calling __eq__,
so the number of calls to __eq__ can just as well be zero.
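Concretely, the identity fast path means a lookup of the very same key object can succeed with zero __eq__ calls. An illustrative sketch:

```python
class Loud:
    # A key whose __eq__ would blow up if it were ever consulted.
    def __hash__(self):
        return 0

    def __eq__(self, other):
        raise AssertionError("__eq__ was called")

k = Loud()
d = {k: "v"}
found = k in d   # True: the identity check short-circuits __eq__
print(found)     # True
```

So even counting "one __eq__ per lookup" is not something the language guarantees today.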


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-19 Thread Maciej Fijalkowski
On Wed, Mar 19, 2014 at 8:38 AM, Kevin Modzelewski k...@dropbox.com wrote:
 Sorry, I definitely didn't mean to imply that this kind of optimization is
 valid on arbitrary subscript expressions; I thought we had restricted
 ourselves to talking about builtin dicts.  If we do, I think this becomes a
 discussion about what subset of the semantics of CPython's builtins are
 language-specified vs implementation-dependent; my argument is that just
 because something results in an observable behavioral difference doesn't
 necessarily mean that it's a change in language semantics, if it's just a
 change in the implementation-dependent behavior.


 On Tue, Mar 18, 2014 at 9:54 PM, Stephen J. Turnbull step...@xemacs.org
 wrote:

 Kevin Modzelewski writes:

   I think in this case, though, if we say for the sake of argument
   that the guaranteed semantics of a dictionary lookup are zero or

 I don't understand the point of that argument.  It's simply false that
 semantics are guaranteed, and all of the dunders might be user
 functions.

   more calls to __hash__ plus zero or more calls to __eq__, then two
   back-to-back dictionary lookups wouldn't have any observable
   differences from doing only one, unless you start to make
   assumptions about the behavior of the implementation.

 That's false.  The inverse is true: you should allow the possibility of
 observable differences, unless you make assumptions about the behavior
 (implying there are none).

   To me there seems to be a bit of a gap between seeing a dictionary
   lookup and knowing the exact sequence of user-functions that get
    called, far more than for example something like a < b.

 The point here is that we *know* that there may be a user function
 (the dunder that implements []) being called, and it is very hard to
 determine that that function is pure.

 Your example of a caching hash is exactly the kind of impure function
 that one would expect, but who knows what might be called -- there
 could be a reference to a database on Mars involved (do we have a
 vehicle on Mars at the moment? anyway...), which calls a pile of
 Twisted code, and has latencies of many seconds.

 So Steven is precisely right -- in order to allow this optimization,
 it would have to be explicitly allowed.

 Like Steven, I have no strong feeling against it, but then, I don't
 have a program talking to a deep space vehicle in my near future.
 Darn it! :-(





we're discussing builtin dicts


[Python-Dev] Intricacies of calling __eq__

2014-03-18 Thread Maciej Fijalkowski
Hi

I have a question about calling __eq__ in some cases.

We're thinking about doing an optimization where say:

if x in d:
   return d[x]

where d is a dict would result in only one dict lookup (the second one
being constant folded away). The question is whether it's ok to do it,
despite the fact that it changes the semantics on how many times
__eq__ is called on x.

Cheers,
fijal
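The semantic difference in question can be made visible with a key type that counts comparisons. On current CPython, the LBYL form below drives two separate probes into the table; the exact count is an implementation detail, which is precisely the issue (the class here is illustrative):

```python
class Key:
    eq_calls = 0                 # class-wide counter across all instances

    def __init__(self, v):
        self.v = v

    def __hash__(self):
        return hash(self.v)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.v == other.v

k1, k2 = Key(1), Key(1)          # equal, but distinct objects
d = {k1: "value"}
if k2 in d:                      # first probe: identity fails, __eq__ runs
    result = d[k2]               # second probe: __eq__ runs again
print(Key.eq_calls)              # 2 on current CPython
```

Folding the two lookups into one would halve that count, which is observable whenever __eq__ has side effects.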


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-18 Thread Maciej Fijalkowski
On Tue, Mar 18, 2014 at 11:35 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On 18 March 2014 17:52, Maciej Fijalkowski fij...@gmail.com wrote:
 Hi

 I have a question about calling __eq__ in some cases.

 We're thinking about doing an optimization where say:

 if x in d:
return d[x]

 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 I'll assume the following hold:

 - we're only talking about true builtin dicts (the similarity between
 __contains__ and __getitem__ can't be assumed otherwise)

yes

 - guards will trigger if d is mutated (e.g. by another thread) between
 the containment check and the item retrieval

yes


 Semantically, what you propose is roughly equivalent to reinterpreting
 the look-before-you-leap version as the exception-handling based
 fallback:

 try:
 return d[x]
 except KeyError:
 pass

 For a builtin dict and any *reasonable* x, those two operations will
 behave the same way. Differences arise only if x.__hash__ or x.__eq__
 is defined in a way that most people would consider unreasonable.

 For an optimisation that actually changes the language semantics like
 that, though, I would expect it to be buying a significant payoff in
 speed, especially given that most cases where the key lookup is known
 to be a bottleneck can already be optimised by hand.

 Cheers,
 Nick.

the payoff is significant. Note that __eq__ might not be called at all
(since dicts check identity first). It turns out not all people write
reasonable code and we can't expect them to micro-optimize by hand. It
also covers cases that are hard to optimize, like:

if d[x] < 3:
    d[x] += 1

etc.
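The semantic question is easy to make concrete with a key type that counts how often __eq__ runs. This is an illustrative sketch (the exact counts are implementation-specific and depend on hash collisions; on CPython, each lookup here performs one comparison):

```python
class Key:
    """Hashable key that counts __eq__ invocations across the class."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value

d = {Key(1): "one"}
k = Key(1)            # equal to the stored key, but a distinct object

if k in d:            # the identity check fails, so __eq__ runs here...
    result = d[k]     # ...and again for the second lookup

print(Key.eq_calls)   # 2 on CPython (no collisions in this tiny dict)
```

Folding the two lookups into one would leave the counter at 1, which is exactly the observable semantic change being proposed.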


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-18 Thread Maciej Fijalkowski
On Tue, Mar 18, 2014 at 1:18 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Tue, Mar 18, 2014 at 05:05:56AM -0400, Terry Reedy wrote:
 On 3/18/2014 3:52 AM, Maciej Fijalkowski wrote:
 Hi
 
 I have a question about calling __eq__ in some cases.
 
 We're thinking about doing an optimization where say:
 
 if x in d:
 return d[x]

 if d.__contains__(x): return d.__getitem__(x)

 [Aside: to be pedantic, Python only calls dunder methods on the class,
 not the instance, in response to operators and other special calls. That
 is, type(d).__contains__ rather than d.__contains__, etc. And to be even
 more pedantic, that's only true for new-style classes.]


 I do not see any requirement to call x.__eq__ any particular number of
 times. The implementation of d might always call somekey.__eq__(x). The
 concept of sets (and dicts) requires coherent equality comparisons.

 To what extent does Python the language specify that user-defined
 classes must be coherent? How much latitude to shoot oneself in the foot
 should the language allow?

 What counts as coherent can depend on the types involved. For instance,
 I consider IEEE-754 Not-A-Numbers to be coherent, albeit weird. Python
 goes only so far to accommodate NANs: while it allows a NAN to test
 unequal even to itself (`NAN == NAN` returns False), containers are
 allowed to assume that instances are equal to themselves (`NAN in {NAN}`
 returns True). This was discussed in great detail a few years ago, and
 if I recall correctly, the conclusion was that containers can assume
 that their elements are reflexive (they equal themselves), but equality
 == cannot make the same assumption and bypass calling __eq__.
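The NaN behaviour described here can be checked directly; a quick sketch:

```python
nan = float("nan")

# Equality genuinely consults __eq__, so reflexivity is not assumed:
print(nan == nan)             # False

# Containers are allowed to short-circuit on identity, so the *same*
# NaN object is considered contained:
print(nan in [nan])           # True
print(nan in {nan})           # True

# A *different* NaN object falls through to __eq__ and is not found:
print(float("nan") in [nan])  # False
```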


 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 A __eq__ that has side-effects violates the intended and expected
 semantics of __eq__.

 Nevertheless, an __eq__ with side-effects is legal Python and may in
 fact be useful.

 It's a tricky one... I don't know how I feel about short-cutting normal
 Python semantics for speed. On the one hand, faster is good. But on the
 other hand, it makes it harder to reason about code when things go
 wrong. Why is my __eq__ method not being called?


 --
 Steven

note that this is specifically about dicts, where __eq__ will be
called an unspecified number of times anyway (depending on collisions
in the hash buckets, which is implementation-specific to start with)


Re: [Python-Dev] Intricacies of calling __eq__

2014-03-18 Thread Maciej Fijalkowski
On Tue, Mar 18, 2014 at 4:21 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Tue, Mar 18, 2014 at 01:21:05PM +0200, Maciej Fijalkowski wrote:

 note that this is specifically about dicts, where __eq__ will be
 called an unspecified number of times anyway (depending on collisions
 in the hash buckets, which is implementation-specific to start with)

 Exactly. Using a __eq__ method with side-effects is a good way to find
 out how many collisions your dict has :-)

 But specifically with your example,

 if x in d:
 return d[x]

 my sense of this is that it falls into the same conceptual area as the
 identity optimization for checking list or set containment: slightly
 unclean, but justified. Provided d is an actual built-in dict, and it
 hasn't been modified between one call and the next, I think it would be
 okay to optimize the second lookup d[x].

 A question: how far away will this optimization apply?

 if x in d:
 do_this()
 do_that()
 do_something_else()
 spam = d[x]

it depends on what those functions do. The JIT will inline them and,
if they're small, it should work (although modifying a different dict
is illegal, since aliasing is not proven), but at some point it'll
give up. (Note that it'll also give up on a call to C code that
releases the GIL, since some other thread can modify the dict.)
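For comparison, the hand-optimized forms that already avoid the second lookup in plain Python (stdlib only, independent of any JIT optimization):

```python
d = {"spam": 1}
x = "spam"

# EAFP: a single lookup, paying for an exception only on a miss.
try:
    value = d[x]
except KeyError:
    value = None

# Or a sentinel with dict.get: also a single lookup, and it stays
# correct even when None is a legitimate stored value.
_missing = object()
found = d.get(x, _missing)
if found is not _missing:
    value = found

print(value)  # 1
```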


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-10 Thread Maciej Fijalkowski
On Mon, Mar 10, 2014 at 12:10 PM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-03-08 16:30 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
 How about fixing cyclic gc to deal with __del__ instead? That sounds
 like an awful change to the semantics.

 Hum? That's the purpose of the PEP 442 which is implemented in Python 3.4.

 As I wrote, it's not enough to fix all issues.

 Usually, I see an explicit call to gc.collect() as a workaround to a
 deeper issue. I prefer to modify my program to run smoothly without
 explict garbage collection.

 That's why I would prefer to avoid creating reference cycles from the 
 beginning.

 Victor

It was agreed a long time ago that immediate finalization is an
implementation-specific detail and is not guaranteed. You should not
rely on __del__s being called in a timely fashion one way or another.
Why would you require this for the program to work correctly in the
particular example of __traceback__?

Cheers,
fijal


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-10 Thread Maciej Fijalkowski
On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
 It was agreed long time ago that the immediate finalization is an
 implementation specific detail and it's not guaranteed. You should not
 rely on __del__s being called timely one way or another. Why would you
 require this for the program to work correctly in the particular
 example of __traceback__?

 For asyncio, it's very useful to see unhandled exceptions as early as
 possible. Otherwise, your program is blocked and you don't know why.

 Guido van Rossum suggests to run gc.collect() regulary:
 http://code.google.com/p/tulip/issues/detail?id=42

 Victor

twisted goes around it by attaching errback by hand. Would that work for tulip?


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-10 Thread Maciej Fijalkowski
On Mon, Mar 10, 2014 at 7:35 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 10, 2014 at 10:30 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
 victor.stin...@gmail.com wrote:
  2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
  It was agreed long time ago that the immediate finalization is an
  implementation specific detail and it's not guaranteed. You should not
  rely on __del__s being called timely one way or another. Why would you
  require this for the program to work correctly in the particular
  example of __traceback__?
 
  For asyncio, it's very useful to see unhandled exceptions as early as
  possible. Otherwise, your program is blocked and you don't know why.
 
  Guido van Rossum suggests to run gc.collect() regulary:
  http://code.google.com/p/tulip/issues/detail?id=42
 
  Victor

 twisted goes around it by attaching errback by hand. Would that work for
 tulip?


 Can you describe that idea in more detail?

Essentially, instead of relying on deferred to be garbage collected,
you attach an errback like this:

deferred.addErrback(callback_that_writes_to_log)

so in case of a failure, you get a traceback directly in the callback
immediately, without relying on garbage collection.
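The asyncio analogue of Twisted's addErrback is a done callback that retrieves the task's exception explicitly instead of waiting for the garbage collector to notice an unconsumed failure. A minimal sketch using the modern asyncio API (not the 2014 Tulip spelling):

```python
import asyncio

errors = []

async def failing():
    raise ValueError("boom")

def log_failure(task):
    # Consuming the exception here plays the role of an errback:
    # the failure is reported immediately, not at GC time.
    exc = task.exception()
    if exc is not None:
        errors.append(exc)

async def main():
    task = asyncio.create_task(failing())
    task.add_done_callback(log_failure)
    await asyncio.sleep(0)  # let the task run and fail
    await asyncio.sleep(0)  # let the done callback fire

asyncio.run(main())
print(errors)  # [ValueError('boom')]
```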

I'm sorry if I'm using twisted nomenclature here (it's also awfully
off-topic for python-dev), but making programs rely on refcounting
sounds like a bad idea for us (PyPy).

Cheers,
fijal


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-10 Thread Maciej Fijalkowski
On Mon, Mar 10, 2014 at 7:49 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 10, 2014 at 10:39 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Mon, Mar 10, 2014 at 7:35 PM, Guido van Rossum gu...@python.org
 wrote:
  On Mon, Mar 10, 2014 at 10:30 AM, Maciej Fijalkowski fij...@gmail.com
  wrote:
 
  On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
  victor.stin...@gmail.com wrote:
   2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
   It was agreed long time ago that the immediate finalization is an
   implementation specific detail and it's not guaranteed. You should
   not
   rely on __del__s being called timely one way or another. Why would
   you
   require this for the program to work correctly in the particular
   example of __traceback__?
  
   For asyncio, it's very useful to see unhandled exceptions as early as
   possible. Otherwise, your program is blocked and you don't know why.
  
   Guido van Rossum suggests to run gc.collect() regulary:
   http://code.google.com/p/tulip/issues/detail?id=42
  
   Victor
 
  twisted goes around it by attaching errback by hand. Would that work
  for
  tulip?
 
 
  Can you describe that idea in more detail?

 Essentially, instead of relying on deferred to be garbage collected,
 you attach an errback like this:

 deferred.addErrback(callback_that_writes_to_log)

 so in case of a failure, you get a traceback directly in the callback
 immediately, without relying on garbage collection.

 I'm sorry if I'm using twisted nomenclature here (it's also awfully
 off-topic for python-dev), but making programs rely on refcounting
 sounds like a bad idea for us (PyPy).


 IIUC the problem that Victor is trying to solve is what to do if nobody
 thought to attach an errback. Tulip makes a best effort to still log a
 traceback. We've found this very helpful (just as it is helpful that Python
 prints a traceback when synchronous code raises an exception and no except
 clause caught it).

 The best effort relies on GC. I am guessing that refcounting makes the
 traceback appear sooner, but there would be other ways to force it, like
 occasionally calling gc.collect() during idle times (presumably during busy
 times it will be called soon enough. :-)

 --
 --Guido van Rossum (python.org/~guido)

I agree this sounds like a solution. However I'm very skeptical about
changing details of __traceback__ and frames, just in order to make
refcounting work (since it would create something that would not work
on pypy for example).

Cheers,
fijal


Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.__traceback__]

2014-03-08 Thread Maciej Fijalkowski
On Sat, Mar 8, 2014 at 5:14 PM, Victor Stinner victor.stin...@gmail.com wrote:
 2014-03-08 14:33 GMT+01:00 Antoine Pitrou solip...@pitrou.net:
 Ok, it's actually quite trivial. The whole chain is kept alive by the
 fut global variable. If you arrange for it to be disposed of:

   fut = asyncio.Future()
   asyncio.Task(func(fut))
   del fut
   [etc.]

 then the problem disappears: as soon as gc.collect() happens, the
 MyObject instance is destroyed, the future is collected, and the
 future's traceback is printed out.

 Well, the problem is more general than this specific example. I would
 like to implement a general solution which would not hold references
 to local variables, to destroy objects when Python exits the except
 block.

 It looks like an exception summary containing only data to format the
 traceback would fit asyncio needs. If you don't want it in the
 traceback module, I will try to implement it in asyncio.

 It would be nice to provide an exception summary in the traceback
 module, because it looks like reference cycles related to exception
 and/or traceback is a common issue (see the list of links I gave in a
 previous email).

 Victor

How about fixing cyclic gc to deal with __del__ instead? That sounds
like an awful change to the semantics.


Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

2014-02-25 Thread Maciej Fijalkowski
On Tue, Feb 25, 2014 at 11:13 AM, Victor Stinner
victor.stin...@gmail.com wrote:
 Hi,

 2014-02-25 8:53 GMT+01:00 Nick Coghlan ncogh...@gmail.com:
 I've checked these, and noted the relevant hg.python.org links on the
 tracker issue at http://bugs.python.org/issue20246

 Would it be possible to have a table with all known Python security
 vulnerabilities and the Python versions which are fixed? Bonus point
 if we provide a link to the changeset fixing it for each branch. Maybe
 put this table on http://www.python.org/security/ ?

 Last issues:
 - hash DoS

is this fixed?


Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

2014-02-25 Thread Maciej Fijalkowski
On Tue, Feb 25, 2014 at 3:01 PM, Donald Stufft don...@stufft.io wrote:

 On Feb 25, 2014, at 7:59 AM, Maciej Fijalkowski fij...@gmail.com wrote:

 On Tue, Feb 25, 2014 at 11:13 AM, Victor Stinner
 victor.stin...@gmail.com wrote:
 Hi,

 2014-02-25 8:53 GMT+01:00 Nick Coghlan ncogh...@gmail.com:
 I've checked these, and noted the relevant hg.python.org links on the
 tracker issue at http://bugs.python.org/issue20246

 Would it be possible to have a table with all known Python security
 vulnerabilities and the Python versions which are fixed? Bonus point
 if we provide a link to the changeset fixing it for each branch. Maybe
 put this table on http://www.python.org/security/ ?

 Last issues:
 - hash DoS

 is this fixed?

 It is in 3.4.

Oh, I thought security fixes go to all python releases.


Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

2014-02-25 Thread Maciej Fijalkowski
On Tue, Feb 25, 2014 at 3:06 PM, Chris Angelico ros...@gmail.com wrote:
 On Tue, Feb 25, 2014 at 11:59 PM, Maciej Fijalkowski fij...@gmail.com wrote:
 Last issues:
 - hash DoS

 is this fixed?

 Yes, hash randomization was added as an option in 2.7.3 or 2.7.4 or
 thereabouts, and is on by default in 3.3+. You do have to set an
 environment variable for 2.7 (and I think 2.6 got that too (??)), as
 it can break code.

No, the hash randomization is broken; it does not provide enough
randomness (without changing the hash function, which only happened in
3.4+).
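Whatever the quality of the randomization, its observable effect is that string hashes vary between interpreter runs according to PYTHONHASHSEED. A sketch that probes this from the outside with subprocesses:

```python
import os
import subprocess
import sys

def string_hash(seed):
    """Report hash('abc') from a fresh interpreter under the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('abc'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout)

h0 = string_hash(0)          # seed 0 disables randomization entirely
h1 = string_hash(1)
print(h0 == string_hash(0))  # True: a fixed seed is reproducible
print(h0 == h1)              # False: different seeds, different hashes
```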


Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

2014-02-25 Thread Maciej Fijalkowski
On Tue, Feb 25, 2014 at 5:22 PM, Barry Warsaw ba...@python.org wrote:
 On Feb 25, 2014, at 03:03 PM, Maciej Fijalkowski wrote:

Oh, I thought security fixes go to all python releases.

 Well, not the EOL'd ones of course.

yes of course sorry.


 Where's the analysis on backporting SIPHash to older Python versions?  Would
 such a backport break backward compatibility?  What other impacts would
 backporting have?  Would it break pickles, marshals, or other serialization
 protocols?  Are there performance penalties?

 While security should be a top priority, it isn't the only consideration in
 such cases.  A *lot* of discussion went into how to effect the hash
 randomization in Python 2.7, because of questions like these.  The same
 analysis would have to be done for backporting this change to active older
 Python versions.

My impression is that a lot of discussion went into hash
randomization because it was a high-profile issue. It got fixed,
then later someone discovered that the fix is completely broken, and
it was left at that, without much discussion, because the issue is no
longer highly visible. I would really *like* to be able to see this
process as one where a lot of discussion goes into the ramifications
of changes.

Cheers,
fijal


Re: [Python-Dev] cffi in stdlib

2013-12-19 Thread Maciej Fijalkowski
On Thu, Dec 19, 2013 at 3:17 AM, Gregory P. Smith g...@krypto.org wrote:



 On Tue, Dec 17, 2013 at 8:43 AM, Stefan Krah ste...@bytereef.org wrote:

 Maciej Fijalkowski fij...@gmail.com wrote:
  I would like to discuss on the language summit a potential inclusion
  of cffi[1] into stdlib. This is a project Armin Rigo has been working
  on for a while, with some input from other developers.

 I've tried cffi (admittedly only in a toy script) and find it very nice
 to use.

 Here's a comparison (pi benchmark) between wrapping libmpdec using a
 C-extension (_decimal), cffi and ctypes:


 +---+--+--+-+
 |   | _decimal |  ctypes  |   cffi  |
 +===+==+==+=+
 | cpython-tip (with-system-ffi) |   0.19s  |   5.40s  |  5.14s  |
 +---+--+--+-+
 | cpython-2.7 (with-system-ffi) |n/a   |   4.46s  |  5.18s  |
 +---+--+--+-+
 |  Ubuntu-cpython-2.7   |n/a   |   3.63s  |-|
 +---+--+--+-+
 |  pypy-2.2.1-linux64   |n/a   |  125.9s  |  0.94s  |
 +---+--+--+-+
 | pypy3-2.1-beta1-linux64   |n/a   |  264.9s  |  2.93s  |
 +---+--+--+-+


 I guess the key points are that C-extensions are hard to beat and that
 cffi performance on pypy-2 is outstanding. Additionally it's worth noting
 that Ubuntu does something in their Python build that we should do, too.


 Ubuntu compiles their Python with FDO (feedback directed optimization /
 profile guided optimization) enabled. All distros should do this if they
 don't already. It's generally 20% interpreter speedup. Our makefile already
 supports it but it isn't the default build as it takes a long time given
 that it needs to compile everything twice and do a profiled benchmark run
 between compilations.

 -gps

Hey Greg.

We found out that this only speeds up the benchmarks you exercised
during profiling, and not others, so we disabled it for the default
PyPy build. Can you point me to a more detailed study of how it
speeds up interpreters in general, and CPython in particular?

Cheers,
fijal


Re: [Python-Dev] cffi in stdlib

2013-12-17 Thread Maciej Fijalkowski
On Tue, Dec 17, 2013 at 7:21 PM, Brett Cannon br...@python.org wrote:
 Maybe someone from PyPy should bring this up as an official topic at the
 language summit to figure out the blockers (again). Or it can join regex on
 the list of module discussed for addition at the language summit but never
 quite pushed to commitment. =)

we're still working on resolving discussed issues before officially
proposing it for inclusion.



 On Tue, Dec 17, 2013 at 11:43 AM, Stefan Krah ste...@bytereef.org wrote:

 Maciej Fijalkowski fij...@gmail.com wrote:
  I would like to discuss on the language summit a potential inclusion
  of cffi[1] into stdlib. This is a project Armin Rigo has been working
  on for a while, with some input from other developers.

 I've tried cffi (admittedly only in a toy script) and find it very nice
 to use.

 Here's a comparison (pi benchmark) between wrapping libmpdec using a
 C-extension (_decimal), cffi and ctypes:


 +---+--+--+-+
 |   | _decimal |  ctypes  |   cffi  |
 +===+==+==+=+
 | cpython-tip (with-system-ffi) |   0.19s  |   5.40s  |  5.14s  |
 +---+--+--+-+
 | cpython-2.7 (with-system-ffi) |n/a   |   4.46s  |  5.18s  |
 +---+--+--+-+
 |  Ubuntu-cpython-2.7   |n/a   |   3.63s  |-|
 +---+--+--+-+
 |  pypy-2.2.1-linux64   |n/a   |  125.9s  |  0.94s  |
 +---+--+--+-+
 | pypy3-2.1-beta1-linux64   |n/a   |  264.9s  |  2.93s  |
 +---+--+--+-+


 I guess the key points are that C-extensions are hard to beat and that
 cffi performance on pypy-2 is outstanding. Additionally it's worth noting
 that Ubuntu does something in their Python build that we should do, too.


 +1 for cffi in the stdlib.



 Stefan Krah









Re: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)

2013-11-17 Thread Maciej Fijalkowski
On Sun, Nov 17, 2013 at 9:02 PM, Barry Warsaw ba...@python.org wrote:
 On Nov 17, 2013, at 05:14 PM, Victor Stinner wrote:

2013/11/16 Maciej Fijalkowski fij...@gmail.com:
 Can I see some writeup how -OO benefit embedded devices?

You get smaller .pyc files. In an embedded device, the whole OS may
have to fit in a small amount of memory, something like 64 MB or
smaller. Removing docstrings helps it fit in 64 MB.

 I'm in support of separate flags for stripping docstrings and asserts.  I'd
 even be fine with eliminating a flag to strip docstrings if we had a
 post-processing tool that you could apply to pyc files to strip out the
 docstrings.  Another problem that I had while addressing these options in
 Debian was the use of .pyo for both -O and -OO level.

 -Barry

My problem with -O and -OO is that the arguments for them are very
circular. I do understand why, in certain limited cases, you would
want to remove both docstrings and asserts, so some options for doing
so are fine. But a lot of the arguments I see are along the lines of
"don't use asserts, because -O removes them". If the option were
named --remove-asserts, no one would care; but people do care, since
-O is documented as "do optimizations", and people *assume* that is
what it does (makes code faster) and, as an unintended consequence,
removes asserts.
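The stripping behaviour is easy to demonstrate with subprocesses: -O silently compiles the assert away, and -OO additionally drops docstrings. A sketch:

```python
import subprocess
import sys

CODE = "assert False, 'never checked'; print('still running')"

# Default mode: the assert fires and the process exits with an error.
p = subprocess.run([sys.executable, "-c", CODE],
                   capture_output=True, text=True)
print(p.returncode != 0)    # True (AssertionError)

# -O: the assert is compiled out, so execution continues.
p = subprocess.run([sys.executable, "-O", "-c", CODE],
                   capture_output=True, text=True)
print(p.stdout.strip())     # still running

# -OO: docstrings are stripped as well.
p = subprocess.run(
    [sys.executable, "-OO", "-c", "def f():\n    'doc'\nprint(f.__doc__)"],
    capture_output=True, text=True)
print(p.stdout.strip())     # None
```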

Cheers,
fijal


Re: [Python-Dev] The pysandbox project is broken

2013-11-16 Thread Maciej Fijalkowski
On Sat, Nov 16, 2013 at 12:12 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On 16 Nov 2013 11:35, Christian Tismer tis...@stackless.com wrote:
 IOW: Do we really need a full abstraction, embedded in a virtual OS, or
 is there already a compromise that suits 98 percent of the common needs?

 I think as a starter, categorizing the expectations of some measure of 
 'secure python'
 would make sense. And I'm asking the people with better knowledge of these 
 matters
 than I have. (and not asking those who don't... ;-) )

 The litany of vulnerability reports against the Java sandbox has long
 confirmed my impression that secure sandboxing is a hard, not
 completely solved problem, best left to better resourced platform
 developers (or at least taking the appropriate steps to benefit from
 their work).

 A self-hosted language runtime level sandbox is, at best, a first line
 of defence that protects against basic, naive attacks. One of the
 assumptions I see from the folks working on operating systems, virtual
 machine and container security is that the sandboxes *will* be
 compromised at some point, so you have to make sure to understand what
 the consequences of those breaches will be, and the best answer is
 they run into the next line of defence, so the only thing they have
 gained is the ability to attack that).

 In terms of in-process sandboxing of CPython (*at all*, let alone
 self-hosted), we're currently missing some key foundational
 components:

 - the ability for a host process to cleanly configure the capabilities
 of an embedded CPython interpreter (that's what PEP 432 is all about)
 - elimination of all of the mechanisms by which hostile untrusted code
 can trigger a segfault in the runtime (any segfault bug can reasonably
 be assumed to be a security vulnerability waiting to be exploited, the
 only question is whether the CPython runtime is part of the exposed
 attack surface, and what the consequences are of compromising the
 runtime). While Victor Stinner's recent work with failmalloc has been
 a big step forward here, as have been various other changes in the
 CPython code base (like adding recursion depth constraints to the
 compiler toolchain), we're still a long way from being able to say
 "CPython cannot be segfaulted by legal Python code that doesn't use
 ctypes or an equivalent FFI library".

 This is why I share Guido's (and the PyPy team's) view that secure,
 cross-platform sandboxing of (C)Python is currently not possible.
 Secure in-process sandboxing is hard even for languages like Lua,
 JavaScript and Java that were designed from the ground up with
 sandboxing in mind - sure, you can lock things down to the point where
 untrusted code assuredly can't do any damage, but it often can't do
 anything *useful* in that state, either.

 By contrast, the PyPy sandbox model which uses a deliberately
 constrained runtime to execute untrusted code in an OS level process
 that is designed to only permit communication with the parent process
 is *exactly* the kind of paranoid defence-in-depth approach that
 should be employed when running untrusted code. Ideally, all of the
 platform level "this child process is not allowed to do anything
 except talk to me over stdin and stdout" restrictions would also be
 brought to bear
 on the sandboxed runtime, so that as yet undiscovered vulnerabilities
 in the PyPy sandbox don't result in a system compromise.

 Anyone interested in sandboxing of Python code would be well-advised
 to direct their efforts towards the parent process bindings for
 http://doc.pypy.org/en/latest/sandbox.html, as well as identifying the
 associated platform specific settings to lock out the child process
 from all system access except communication with the parent process
 over the standard streams.

Note, Nick, that running the untrusted code in a child process (as
opposed to having two different Pythons running in the same process)
is really not a limitation of the approach. It's just that it's a
proof of concept; various other options are also possible, but no one
seems interested in pursuing them. The additional OS-level blocking
really only guards against potential segfaults, since we know that
no I/O is possible from the inner process. A JIT-less PyPy sandbox
can be made very secure by locking the executable pages as
non-writable (we know the code does not do any I/O).
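The last point can be illustrated at the Python level with the mmap module on Unix: a page mapped without PROT_WRITE refuses modification, which is the same OS protection-bit mechanism that locking a sandbox's code pages would rely on. An illustrative sketch only, not PyPy's actual sandbox code:

```python
import mmap

# Map one anonymous page read-only (the prot argument is Unix-specific).
page = mmap.mmap(-1, mmap.PAGESIZE, prot=mmap.PROT_READ)

head = page[:4]
print(head)                      # b'\x00\x00\x00\x00' -- reading works

write_error = None
try:
    page[0] = 0x41               # any write attempt is refused
except TypeError as exc:
    write_error = exc

print(write_error is not None)   # True: the mapping is locked down
page.close()
```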

Cheers,
fijal

