from:"Maciej Fijalkowski"

The easiest version is to have global numbering (as opposed to local).

Anyway, I would strongly suggest getting some benchmarks done and
showing performance benefits first, because you don't want PEPs to be
final when you don't exactly know the details.

On Wed, Jan 20, 2016 at 7:02 PM, Yury Selivanov  wrote:
> On 2016-01-18 5:43 PM, Victor Stinner wrote:
>>
>> Is someone opposed to this PEP 509?
>>
>> The main complain was the change on the public Python API, but the PEP
>> doesn't change the Python API anymore.
>>
>> I'm not aware of any remaining issue on this PEP.
>
>
> Victor,
>
> I've been experimenting with the PEP to implement a per-opcode
> cache in ceval loop (I'll share my progress on that in a few
> days).  This allows to significantly speedup LOAD_GLOBAL and
> LOAD_METHOD opcodes, to the point, where they don't require
> any dict lookups at all.  Some macro-benchmarks (such as
> chameleon_v2) demonstrate impressive ~10% performance boost.
>
> I rely on your dict->ma_version to implement cache invalidation.
>
> However, besides guarding against version change, I also want
> to guard against the dict being swapped for another dict, to
> avoid situations like this:
>
>
> def foo():
> print(bar)
>
> exec(foo.__code__, {'bar': 1}, {})
> exec(foo.__code__, {'bar': 2}, {})
>
>
> What I propose is to add a pointer "ma_extra" (same 64bits),
> which will be set to NULL for most dict instances (instead of
> ma_version).  "ma_extra" can then point to a struct that has a
> globally unique dict ID (uint64), and a version tag (unit64).
> A macro like PyDict_GET_ID and PyDict_GET_VERSION could then
> efficiently fetch the version/unique ID of the dict for guards.
>
> "ma_extra" would also make it easier for us to extend dicts
> in the future.
>
> Yury
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 509: Add a private version to dict

On Wed, Jan 20, 2016 at 7:22 PM, Brett Cannon  wrote:
>
>
> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov  wrote:
>>
>> On 2016-01-18 5:43 PM, Victor Stinner wrote:
>> > Is someone opposed to this PEP 509?
>> >
>> > The main complain was the change on the public Python API, but the PEP
>> > doesn't change the Python API anymore.
>> >
>> > I'm not aware of any remaining issue on this PEP.
>>
>> Victor,
>>
>> I've been experimenting with the PEP to implement a per-opcode
>> cache in ceval loop (I'll share my progress on that in a few
>> days).  This allows to significantly speedup LOAD_GLOBAL and
>> LOAD_METHOD opcodes, to the point, where they don't require
>> any dict lookups at all.  Some macro-benchmarks (such as
>> chameleon_v2) demonstrate impressive ~10% performance boost.
>
>
> Ooh, now my brain is trying to figure out the design of the cache. :)
>
>>
>>
>> I rely on your dict->ma_version to implement cache invalidation.
>>
>> However, besides guarding against version change, I also want
>> to guard against the dict being swapped for another dict, to
>> avoid situations like this:
>>
>>
>>  def foo():
>>  print(bar)
>>
>>  exec(foo.__code__, {'bar': 1}, {})
>>  exec(foo.__code__, {'bar': 2}, {})
>>
>>
>> What I propose is to add a pointer "ma_extra" (same 64bits),
>> which will be set to NULL for most dict instances (instead of
>> ma_version).  "ma_extra" can then point to a struct that has a
>> globally unique dict ID (uint64), and a version tag (unit64).
>> A macro like PyDict_GET_ID and PyDict_GET_VERSION could then
>> efficiently fetch the version/unique ID of the dict for guards.
>>
>> "ma_extra" would also make it easier for us to extend dicts
>> in the future.
>
>
> Why can't you simply use the id of the dict object as the globally unique
> dict ID? It's already globally unique amongst all Python objects which makes
> it inherently unique amongst dicts.
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>

Brett, you need two things - the ID of the dict and the version tag.
What we do in pypy is we have a small object (called, surprisingly,
VersionTag()) and we use the ID of that. That way you can change the
version id of an existing dict and have only one field.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 509: Add a private version to dict

On Wed, Jan 20, 2016 at 8:00 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
>
> On 2016-01-20 1:36 PM, Maciej Fijalkowski wrote:
>>
>> On Wed, Jan 20, 2016 at 7:22 PM, Brett Cannon <br...@python.org> wrote:
>>>
>>>
>>> On Wed, 20 Jan 2016 at 10:11 Yury Selivanov <yselivanov...@gmail.com>
>>> wrote:
>
> [..]
>>>>
>>>> "ma_extra" would also make it easier for us to extend dicts
>>>> in the future.
>>>
>>>
>>> Why can't you simply use the id of the dict object as the globally unique
>>> dict ID? It's already globally unique amongst all Python objects which
>>> makes
>>> it inherently unique amongst dicts.
>>>
>>>
>> Brett, you need two things - the ID of the dict and the version tag.
>> What we do in pypy is we have a small object (called, surprisingly,
>> VersionTag()) and we use the ID of that. That way you can change the
>> version id of an existing dict and have only one field.
>
>
>
> Yeah, that's essentially what I propose with ma_extra.
>
> Yury

The trick is we use only one field :-)

you're proposing to have both fields - version tag and dict id. Why
not just use the id of the object (without any fields)?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 509: Add a private version to dict

there is also the problem that you don't want it on all dicts. So
having two extra words is more to pay than having extra objects (also,
comparison is cheaper for guards)

On Wed, Jan 20, 2016 at 8:23 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
>
> On 2016-01-20 2:09 PM, Maciej Fijalkowski wrote:
>>>
>>> >
>>
>> You don't free a version tag that's stored in the guard. You store the
>> object and not id
>
>
> Ah, got it.  Although for my current cache design it would be
> more memory efficient to use the dict itself to store its own
> unique id and tag, hence my "ma_extra" proposal.  In any case,
> the current "ma_version" proposal is flawed :(
>
> Yury
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 509: Add a private version to dict

On Wed, Jan 20, 2016 at 8:08 PM, Yury Selivanov <yselivanov...@gmail.com> wrote:
>
> On 2016-01-20 2:02 PM, Maciej Fijalkowski wrote:
>>
>> On Wed, Jan 20, 2016 at 8:00 PM, Yury Selivanov <yselivanov...@gmail.com>
>> wrote:
>>
> [..]
>>>>
>>>> Brett, you need two things - the ID of the dict and the version tag.
>>>> What we do in pypy is we have a small object (called, surprisingly,
>>>> VersionTag()) and we use the ID of that. That way you can change the
>>>> version id of an existing dict and have only one field.
>>>
>>> Yeah, that's essentially what I propose with ma_extra.
>>>
>>> Yury
>>
>> The trick is we use only one field :-)
>>
>> you're proposing to have both fields - version tag and dict id. Why
>> not just use the id of the object (without any fields)?
>
>
> What if your old dict is GCed, its "VersionTag()" (1) object is
> freed, and you have a new dict, for which a new "VersionTag()" (2)
> object happens to be allocated at the same address as (1)?
>
> Yury
>

You don't free a version tag that's stored in the guard. You store the
object and not id
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] _PyThreadState_Current

2016-01-18 Thread Maciej Fijalkowski

seems to work thanks.

That said, I would love to have PyThreadState_Get equivalent that
would let me handle the NULL.

On Mon, Jan 18, 2016 at 9:31 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
> Good point
>
> On Mon, Jan 18, 2016 at 9:25 PM, Victor Stinner
> <victor.stin...@gmail.com> wrote:
>> Hum, you can try to lie and define Py_BUILD_CORE?
>>
>> Victor
>>
>> 2016-01-18 21:18 GMT+01:00 Maciej Fijalkowski <fij...@gmail.com>:
>>> Hi
>>>
>>> change in between 3.5.0 and 3.5.1 (hiding _PyThreadState_Current and
>>> pyatomic.h) broke vmprof. The problem is that as a profile, vmprof can
>>> really encounter _PyThreadState_Current being null, while crashing an
>>> interpreter is a bit not ideal in this case.
>>>
>>> Any chance, a) _PyThreadState_Current can be restored in visibility?
>>> b) can I get a better API to get it in case it can be NULL, but also
>>> in 3.5 (since it works in 3.5.0 and breaks in 3.5.1)
>>>
>>> Cheers,
>>> fijal
>>> ___
>>> Python-Dev mailing list
>>> Python-Dev@python.org
>>> https://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe: 
>>> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] _PyThreadState_Current

2016-01-18 Thread Maciej Fijalkowski

Good point

On Mon, Jan 18, 2016 at 9:25 PM, Victor Stinner
<victor.stin...@gmail.com> wrote:
> Hum, you can try to lie and define Py_BUILD_CORE?
>
> Victor
>
> 2016-01-18 21:18 GMT+01:00 Maciej Fijalkowski <fij...@gmail.com>:
>> Hi
>>
>> change in between 3.5.0 and 3.5.1 (hiding _PyThreadState_Current and
>> pyatomic.h) broke vmprof. The problem is that as a profile, vmprof can
>> really encounter _PyThreadState_Current being null, while crashing an
>> interpreter is a bit not ideal in this case.
>>
>> Any chance, a) _PyThreadState_Current can be restored in visibility?
>> b) can I get a better API to get it in case it can be NULL, but also
>> in 3.5 (since it works in 3.5.0 and breaks in 3.5.1)
>>
>> Cheers,
>> fijal
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: 
>> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-11 Thread Maciej Fijalkowski

Hi Victor.

You know that pypy does this stuff without changing and exposing
python semantics right? We have a version dict that does not leak
abstractions to the user.

In general, doing stuff like that where there is a public API that
leaks details of certain optimizations makes it harder and harder for
optimizing compilers to do their job properly, if you want to do
something slightly different.

Can we make this happen (as you noted in the prior art) WITHOUT
changing ANY of the things exposed to the user?

On Mon, Jan 11, 2016 at 6:49 PM, Victor Stinner
 wrote:
> Hi,
>
> After a first round on python-ideas, here is the second version of my
> PEP. The main changes since the first version are that the dictionary
> version is no more exposed at the Python level and the field type now
> also has a size of 64-bit on 32-bit platforms.
>
> The PEP is part of a serie of 3 PEP adding an API to implement a
> static Python optimizer specializing functions with guards. The second
> PEP is currently discussed on python-ideas and I'm still working on
> the third PEP.
>
> Thanks to Red Hat for giving me time to experiment on this.
>
>
> HTML version:
> https://www.python.org/dev/peps/pep-0509/
>
>
> PEP: 509
> Title: Add a private version to dict
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
>
>
> Abstract
> 
>
> Add a new private version to builtin ``dict`` type, incremented at each
> change, to implement fast guards on namespaces.
>
>
> Rationale
> =
>
> In Python, the builtin ``dict`` type is used by many instructions. For
> example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
> global namespace, or in the builtins namespace (two dict lookups).
> Python uses ``dict`` for the builtins namespace, globals namespace, type
> namespaces, instance namespaces, etc. The local namespace (namespace of
> a function) is usually optimized to an array, but it can be a dict too.
>
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime. Implementing optimizations respecting the Python
> semantics requires to detect when "something changes": we will call
> these checks "guards".
>
> The speedup of optimizations depends on the speed of guard checks. This
> PEP proposes to add a version to dictionaries to implement fast guards
> on namespaces.
>
> Dictionary lookups can be skipped if the version does not change which
> is the common case for most namespaces. The performance of a guard does
> not depend on the number of watched dictionary entries, complexity of
> O(1), if the dictionary version does not change.
>
> Example of optimization: copy the value of a global variable to function
> constants.  This optimization requires a guard on the global variable to
> check if it was modified. If the variable is modified, the variable must
> be loaded at runtime when the function is called, instead of using the
> constant.
>
> See the `PEP 510 -- Specialized functions with guards
> `_ for the concrete usage of
> guards to specialize functions and for the rationale on Python static
> optimizers.
>
>
> Guard example
> =
>
> Pseudo-code of an fast guard to check if a dictionary entry was modified
> (created, updated or deleted) using an hypothetical
> ``dict_get_version(dict)`` function::
>
> UNSET = object()
>
> class GuardDictKey:
> def __init__(self, dict, key):
> self.dict = dict
> self.key = key
> self.value = dict.get(key, UNSET)
> self.version = dict_get_version(dict)
>
> def check(self):
> """Return True if the dictionary entry did not changed."""
>
> # read the version field of the dict structure
> version = dict_get_version(self.dict)
> if version == self.version:
> # Fast-path: dictionary lookup avoided
> return True
>
> # lookup in the dictionary
> value = self.dict.get(self.key, UNSET)
> if value is self.value:
> # another key was modified:
> # cache the new dictionary version
> self.version = version
> return True
>
> # the key was modified
> return False
>
>
> Usage of the dict version
> =
>
> Specialized functions using guards
> --
>
> The `PEP 510 -- Specialized functions with guards
> `_ proposes an API to support
> specialized functions with guards. It allows to implement static
> optimizers for Python without breaking the Python semantics.
>
> Example of a static

Re: [Python-Dev] PEP 509: Add a private version to dict

2016-01-11 Thread Maciej Fijalkowski

On Mon, Jan 11, 2016 at 9:56 PM, Victor Stinner
<victor.stin...@gmail.com> wrote:
> Le 11 janv. 2016 8:09 PM, "Maciej Fijalkowski" <fij...@gmail.com> a écrit :
>> Hi Victor.
>>
>> You know that pypy does this stuff without changing and exposing
>> python semantics right? We have a version dict that does not leak
>> abstractions to the user.
>
> The PEP adds a field to the C structure PyDictObject. Are you asking me to
> hide it from the C structure?
>
> The first version of my PEP added a public read-only property at Python
> level, but I changed the PEP. See the alternatives section for more detail.
>
> Victor

I asked you to hide it from python, read the wrong version :-)

Cool!
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Idea: Dictionary references

2015-12-17 Thread Maciej Fijalkowski

You can very easily implement this with version tags on the globals
dictionaries - means that the dictionaries have versions and the guard
checking if everything is ok just checks the version tag on globals.

Generally speaking, such optimizations have been done in the past
(even in places like pypy, but also in literature) and as soon as we
have dynamic compilation (and FAT is a form of it), you can do such
tricks.

On Thu, Dec 17, 2015 at 3:48 PM, Steven D'Aprano  wrote:
> On Thu, Dec 17, 2015 at 12:53:13PM +0100, Victor Stinner quoted:
>> 2015-12-17 11:54 GMT+01:00 Franklin? Lee :
>
>> > Each function keeps an indirect, automagically updated
>> > reference to the current value of the names they use,
>
> Isn't that a description of globals()? If you want to look up a name
> "spam", you grab an indirect reference to it:
>
> globals()["spam"]
>
> which returns the current value of the name "spam".
>
>
>> > and will never need to look things up again.[*]
>
> How will this work?
>
> Naively, it sounds to me like Franklin is suggesting that on every
> global assignment, the interpreter will have to touch every single
> function in the module to update that name. Something like this:
>
> # on a global assignment
> spam = 23
>
> # the interpreter must do something like this:
> for function in module.list_of_functions:
> if "spam" in function.__code__.__global_names__:
> function.__code__.__global_names__["spam"] = spam
>
> As I said, that's a very naive way to implement this. Unless you have
> something much cleverer, I think this will be horribly slow.
>
> And besides, you *still* need to deal with the case that the name isn't
> a global at all, but in the built-ins namespace.
>
>
> --
> Steve
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Avoiding CPython performance regressions

Hi David.

Any reason you run a tiny tiny subset of benchmarks?

On Tue, Dec 1, 2015 at 5:26 PM, Stewart, David C
 wrote:
>
>
> From: Fabio Zadrozny >
> Date: Tuesday, December 1, 2015 at 1:36 AM
> To: David Stewart 
> >
> Cc: "R. David Murray" >, 
> "python-dev@python.org" 
> >
> Subject: Re: [Python-Dev] Avoiding CPython performance regressions
>
>
> On Mon, Nov 30, 2015 at 3:33 PM, Stewart, David C 
> > wrote:
>
> On 11/30/15, 5:52 AM, "Python-Dev on behalf of R. David Murray" 
> 
>  on behalf of rdmur...@bitdance.com> wrote:
>
>>
>>There's also an Intel project posted about here recently that checks
>>individual benchmarks for performance regressions and posts the results
>>to python-checkins.
>
> The description of the project is at https://01.org/lp - Python results are 
> indeed sent daily to python-checkins. (No results for Nov 30 and Dec 1 due to 
> Romania National Day holiday!)
>
> There is also a graphic dashboard at http://languagesperformance.intel.com/
>
> Hi Dave,
>
> Interesting, but I'm curious on which benchmark set are you running? From the 
> graphs it seems it has a really high standard deviation, so, I'm curious to 
> know if that's really due to changes in the CPython codebase / issues in the 
> benchmark set or in how the benchmarks are run... (it doesn't seem to be the 
> benchmarks from https://hg.python.org/benchmarks/ right?).
>
> Fabio – my advice to you is to check out the daily emails sent to 
> python-checkins. An example is 
> https://mail.python.org/pipermail/python-checkins/2015-November/140185.html. 
> If you still have questions, Stefan can answer (he is copied).
>
> The graphs are really just a manager-level indicator of trends, which I find 
> very useful (I have it running continuously on one of the monitors in my 
> office) but core developers might want to see day-to-day the effect of their 
> changes. (Particular if they thought one was going to improve performance. 
> It's nice to see if you get community confirmation).
>
> We do run nightly a subset of https://hg.python.org/benchmarks/ and run the 
> full set when we are evaluating our performance patches.
>
> Some of the "benchmarks" really do have a high standard deviation, which 
> makes them hardly very useful for measuring incremental performance 
> improvements, IMHO. I like to see it spelled out so I can tell whether I 
> should be worried or not about a particular delta.
>
> Dave
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Avoiding CPython performance regressions

On Tue, Dec 1, 2015 at 9:04 PM, Stewart, David C
<david.c.stew...@intel.com> wrote:
> On 12/1/15, 10:56 AM, "Maciej Fijalkowski" <fij...@gmail.com> wrote:
>
>
>
>>Hi David.
>>
>>Any reason you run a tiny tiny subset of benchmarks?
>
> We could always run more. There are so many in the full set in 
> https://hg.python.org/benchmarks/ with such divergent results that it seems 
> hard to see the forest because there are so many trees. I'm more interested 
> in gradually adding to the set rather than the huge blast of all of them in 
> daily email. Would you disagree?
>
> Part of the reason that I monitor ssbench so closely on Python 2 is that 
> Swift is a major element in cloud computing (and OpenStack in particular) and 
> has ~70% of its cycles in Python.

Last time I checked, Swift was quite a bit faster under pypy :-)

>
> We are really interested in workloads which are representative of the way 
> Python is used by a lot of people and which produce repeatable results. (and 
> which are open source). Do you have a suggestions?

You know our benchmark suite (https://bitbucket.org/pypy/benchmarks),
we're gradually incorporating what people report. That means that
(Typically) it'll be open source library benchmarks, if they get to
the point of writing some. I have for example coming django ORM
benchmark, can show you if you want. I don't think there is a
"representative benchmark" or maybe even "representative set", also
because open source code tends to be higher quality and less
spaghetti-like than closed source code that I've seen, but we're
adding and adding.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Avoiding CPython performance regressions

Hi

Thanks for doing the work! I'm on of the pypy devs and I'm very
interested in seeing this getting somewhere. I must say I struggle to
read the graph - is red good or is red bad for example?

I'm keen to help you getting anything you want to run it repeatedly.

PS. The intel stuff runs one benchmark in a very questionable manner,
so let's maybe not rely on it too much.

On Mon, Nov 30, 2015 at 3:52 PM, R. David Murray  wrote:
> On Mon, 30 Nov 2015 09:02:12 -0200, Fabio Zadrozny  wrote:
>> Note that uploading the data to SpeedTin should be pretty straightforward
>> (by using https://github.com/fabioz/pyspeedtin, so, the main issue would be
>> setting up o machine to run the benchmarks).
>
> Thanks, but Zach almost has this working using codespeed (he's still
> waiting on a review from infrastructure, I think).  The server was not in
> fact running; a large part of what Zach did was to get that server set up.
> I don't know what it would take to export the data to another consumer,
> but if you want to work on that I'm guessing there would be no objection.
> And I'm sure there would be no objection if you want to get involved
> in maintaining the benchmark server!
>
> There's also an Intel project posted about here recently that checks
> individual benchmarks for performance regressions and posts the results
> to python-checkins.
>
> --David
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Avoiding CPython performance regressions

On Tue, Dec 1, 2015 at 11:49 AM, Fabio Zadrozny <fabi...@gmail.com> wrote:
>
> On Tue, Dec 1, 2015 at 6:36 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Hi
>>
>> Thanks for doing the work! I'm on of the pypy devs and I'm very
>> interested in seeing this getting somewhere. I must say I struggle to
>> read the graph - is red good or is red bad for example?
>>
>> I'm keen to help you getting anything you want to run it repeatedly.
>>
>> PS. The intel stuff runs one benchmark in a very questionable manner,
>> so let's maybe not rely on it too much.
>
>
> Hi Maciej,
>
> Great, it'd be awesome having data on multiple Python VMs (my latest target
> is really having a way to compare across multiple VMs/versions easily and
> help each implementation keep a focus on performance). Ideally, a single,
> dedicated machine could be used just to run the benchmarks from multiple VMs
> (one less variable to take into account for comparisons later on, as I'm not
> sure it'd be reliable to normalize benchmark data from different machines --
> it seems Zach was the one to contact from that, but if there's such a
> machine already being used to run PyPy, maybe it could be extended to run
> other VMs too?).
>
> As for the graph, it should be easy to customize (and I'm open to
> suggestions). In the case, as it is, red is slower and blue is faster (so,
> for instance in
> https://www.speedtin.com/reports/1_CPython27x_Performance_Over_Time,  the
> fastest CPython version overall was 2.7.3 -- and 2.7.1 was the baseline).
> I've updated the comments to make it clearer (and changed the second graph
> to compare the latest against the fastest version (2.7.rc11 vs 2.7.3) for
> the individual benchmarks.
>
> Best Regards,
>
> Fabio

There is definitely a machine available. I suggest you ask
python-infra list for access. It definitely can be used to run more
than just pypy stuff. As for normalizing across multiple machines -
don't even bother. Different architectures make A LOT of difference,
especially with cache sizes and whatnot, that seems to have different
impact on different loads.

As for graph - I like the split on the benchmarks and a better
description (higher is better) would be good.

I have a lot of ideas about visualizations, pop in on IRC, I'm happy
to discuss :-)

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Benchmark results across all major Python implementations

2015-11-16 Thread Maciej Fijalkowski

Hi Brett

Any thoughts on improving the benchmark set (I think all of
{cpython,pypy,pyston} introduced new benchmarks to the set).
"speed.python.org" becoming a thing is generally stopped on "noone
cares enough to set it up".

Cheers,
fijal


On Mon, Nov 16, 2015 at 9:18 PM, Brett Cannon  wrote:
> I gave the opening keynote at PyCon CA and then gave the same talk at PyData
> NYC on the various interpreters of Python (Jupyter notebook of my
> presentation can be found at bit.ly/pycon-ca-keynote; no video yet). I
> figured people here might find the benchmark numbers interesting so I'm
> sharing the link here.
>
> I'm still hoping someday speed.python.org becomes a thing so I never have to
> spend so much time benchmarking so may Python implementations ever again and
> this sort of thing is just part of what we do to keep the implementation
> ecosystem healthy.
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Second milestone of FAT Python

2015-11-04 Thread Maciej Fijalkowski

How do you check that someone did not e.g. bind something different to "len"?

On Wed, Nov 4, 2015 at 8:50 AM, Victor Stinner  wrote:
> Hi,
>
> I'm writing a new "FAT Python" project to try to implement optimizations in
> CPython (inlining, constant folding, move invariants out of loops, etc.)
> using a "static" optimizer (not a JIT). For the background, see the thread
> on python-ideas:
> https://mail.python.org/pipermail/python-ideas/2015-October/036908.html
>
> See also the documentation:
> https://hg.python.org/sandbox/fatpython/file/tip/FATPYTHON.rst
> https://hg.python.org/sandbox/fatpython/file/tip/ASTOPTIMIZER.rst
>
> I implemented the most basic optimization to test my code: replace calls to
> builtin functions (with constant arguments) with the result. For example,
> len("abc") is replaced with 3. I reached the second milestone: it's now
> possible to run the full Python test suite with these optimizations enabled.
> It confirms that the optimizations don't break the Python semantic.
>
> Example:
> ---
 def func():
> ... return len("abc")
> ...
 import dis
 dis.dis(func)
>   2   0 LOAD_GLOBAL  0 (len)
>   3 LOAD_CONST   1 ('abc')
>   6 CALL_FUNCTION1 (1 positional, 0 keyword pair)
>   9 RETURN_VALUE
>
 len(func.get_specialized())
> 1
 specialized=func.get_specialized()[0]
 dis.dis(specialized['code'])
>   2   0 LOAD_CONST   1 (3)
>   3 RETURN_VALUE
 len(specialized['guards'])
> 2
>
 func()
> 3
>
 len=lambda obj: "mock"
 func()
> 'mock'
 func.get_specialized()
> []
> ---
>
> The function func() has specialized bytecode which returns directly 3
> instead of calling len("abc"). The specialized bytecode has two guards
> dictionary keys: builtins.__dict__['len'] and globals()['len']. If one of
> these keys is modified, the specialized bytecode is simply removed (when the
> function is called) and the original bytecode is executed.
>
>
> You cannot expect any speedup at this milestone, it's just to validate the
> implementation. You can only get speedup if you implement *manually*
> optimizations. See for example posixpath.isabs() which inlines manually the
> call to the _get_sep() function. More optimizations will be implemented in
> the third milestone. I don't know yet if I will be able to implement
> constant folding, function inlining and/or moving invariants out of loops.
>
>
> Download, compile and test FAT Python with:
>
> hg clone http://hg.python.org/sandbox/fatpython
> ./configure && make && ./python -m test test_astoptimizer test_fat
>
>
> Currently, only 24 functions are specialized in the standard library.
> Calling a builtin function with constant arguments in not common (it was
> expected, it's only the first step for my optimizer). But 161 functions are
> specialized in tests.
>
>
> To be honest, I had to modify some tests to make them pass in FAT mode. But
> most changes are related to the .pyc filename, or to the exact size in bytes
> of dictionary objects.
>
> FAT Python is still experimental. Currently, the main bug is that the AST
> optimizer can optimize a call to a function which is not the expected
> builtin function. I already started to implement code to understand
> namespaces (detect global and local variables), but it's not enough yet to
> detect when a builtin is overriden. See TODO.rst for known bugs and
> limitations.
>
> Victor
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Second milestone of FAT Python

2015-11-04 Thread Maciej Fijalkowski

Uh, sorry, misread your full mail, scratch that

On Wed, Nov 4, 2015 at 9:07 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
> How do you check that someone did not e.g. bind something different to "len"?
>
> On Wed, Nov 4, 2015 at 8:50 AM, Victor Stinner <victor.stin...@gmail.com> 
> wrote:
>> Hi,
>>
>> I'm writing a new "FAT Python" project to try to implement optimizations in
>> CPython (inlining, constant folding, move invariants out of loops, etc.)
>> using a "static" optimizer (not a JIT). For the background, see the thread
>> on python-ideas:
>> https://mail.python.org/pipermail/python-ideas/2015-October/036908.html
>>
>> See also the documentation:
>> https://hg.python.org/sandbox/fatpython/file/tip/FATPYTHON.rst
>> https://hg.python.org/sandbox/fatpython/file/tip/ASTOPTIMIZER.rst
>>
>> I implemented the most basic optimization to test my code: replace calls to
>> builtin functions (with constant arguments) with the result. For example,
>> len("abc") is replaced with 3. I reached the second milestone: it's now
>> possible to run the full Python test suite with these optimizations enabled.
>> It confirms that the optimizations don't break the Python semantic.
>>
>> Example:
>> ---
>>>>> def func():
>> ... return len("abc")
>> ...
>>>>> import dis
>>>>> dis.dis(func)
>>   2   0 LOAD_GLOBAL  0 (len)
>>   3 LOAD_CONST   1 ('abc')
>>   6 CALL_FUNCTION1 (1 positional, 0 keyword pair)
>>   9 RETURN_VALUE
>>
>>>>> len(func.get_specialized())
>> 1
>>>>> specialized=func.get_specialized()[0]
>>>>> dis.dis(specialized['code'])
>>   2   0 LOAD_CONST   1 (3)
>>   3 RETURN_VALUE
>>>>> len(specialized['guards'])
>> 2
>>
>>>>> func()
>> 3
>>
>>>>> len=lambda obj: "mock"
>>>>> func()
>> 'mock'
>>>>> func.get_specialized()
>> []
>> ---
>>
>> The function func() has specialized bytecode which returns directly 3
>> instead of calling len("abc"). The specialized bytecode has two guards
>> dictionary keys: builtins.__dict__['len'] and globals()['len']. If one of
>> these keys is modified, the specialized bytecode is simply removed (when the
>> function is called) and the original bytecode is executed.
>>
>>
>> You cannot expect any speedup at this milestone, it's just to validate the
>> implementation. You can only get speedup if you implement *manually*
>> optimizations. See for example posixpath.isabs() which inlines manually the
>> call to the _get_sep() function. More optimizations will be implemented in
>> the third milestone. I don't know yet if I will be able to implement
>> constant folding, function inlining and/or moving invariants out of loops.
>>
>>
>> Download, compile and test FAT Python with:
>>
>> hg clone http://hg.python.org/sandbox/fatpython
>> ./configure && make && ./python -m test test_astoptimizer test_fat
>>
>>
>> Currently, only 24 functions are specialized in the standard library.
>> Calling a builtin function with constant arguments in not common (it was
>> expected, it's only the first step for my optimizer). But 161 functions are
>> specialized in tests.
>>
>>
>> To be honest, I had to modify some tests to make them pass in FAT mode. But
>> most changes are related to the .pyc filename, or to the exact size in bytes
>> of dictionary objects.
>>
>> FAT Python is still experimental. Currently, the main bug is that the AST
>> optimizer can optimize a call to a function which is not the expected
>> builtin function. I already started to implement code to understand
>> namespaces (detect global and local variables), but it's not enough yet to
>> detect when a builtin is overriden. See TODO.rst for known bugs and
>> limitations.
>>
>> Victor
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] compatibility for C-accelerated types

2015-10-20 Thread Maciej Fijalkowski

For what is worth, that level of differences already exists on pypy
and it's really hard to get the *exact* same semantics if things are
implemented in python vs C or the other way around.

Example list of differences (which I think OrderedDict already breaks
if moved to C):

* do methods like items call special methods like __getitem__ (I think
it's undecided anyway)

* what happens if you take a method and rebind it to another subclass,
does it automatically become a method (there are differences between
built in and pure python)

* atomicity of operations. Some operations used to be non-atomic in
Python will be atomic now.

I personally think those (and the __class__ issue) are unavoidable

On Mon, Oct 19, 2015 at 11:47 PM, Serhiy Storchaka  wrote:
> On 20.10.15 00:00, Guido van Rossum wrote:
>>
>> Apart from Serhiy's detraction of the 3.5 bug report there wasn't any
>> discussion in this thread. I also don't really see any specific
>> questions, so maybe you don't have any. Are you just asking whether it's
>> okay to merge your code? Or are you asking for more code review?
>
>
> I think Eric ask whether it's okay to have some incompatibility between
> Python and C implementations.
>
> 1. Is it okay to have a difference in effect of __class__ assignment. Pure
> Python and extension classes have different restrictions. For example
> (tested example this time) following code works with Python implementation
> in 3.4, but fails with C implementation in 3.5:
>
> from collections import OrderedDict
> od = OrderedDict()
> class D(dict): pass
>
> od.__class__ = D
>
> 2. Is it okay to use obj.__class__ in Python implementation and type(obj) in
> C implementation for the sake of code simplification? Can we ignore subtle
> differences?
>
> 3. In general, is it okay to have some incompatibility between Python and C
> implementations for the sake of code simplification, and where the border
> line lies?
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] An example of Python 3 promotion attitude

2015-10-06 Thread Maciej Fijalkowski

There was a discussion a while ago about python 3 and the attitude on
social media and there was a lack of examples. Here is one example:

https://www.reddit.com/r/Python/comments/3nl5ut/ninite_the_popular_website_to_install_essential/

According to some people, it is everybodys job to promote python 3 and
force people to upgrade. This is really not something I enjoy (people
telling me pypy should promote python 3 - it's not really our job).

Now I sometimes feel that there is not enough sentiment in python-dev
to distance from such ideas. It *is* python-dev job to promote
python3, but it's also python-dev job sometimes to point out that
whatever helps in promoting the python ecosystem (e.g. in case of pypy
is speed) is a good enough reason to do those things.

I wonder what are other people ideas about that.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Issue #25256: Add sys.debug_build?

2015-10-02 Thread Maciej Fijalkowski

Speaking of other python implementations - why would you even care?
(the pypy debug build has very different properties and does very
different stuff for example). I would be very happy to have this
clearly marked as implementation-dependent and that's why it would be
cool to not be in sys (there are already 5 symbols there for this
reason, so hasattr totalrefcount is cool enough)

On Fri, Oct 2, 2015 at 2:19 PM, Victor Stinner  wrote:
> 2015-10-02 13:16 GMT+02:00 Nir Soffer :
>> Whats wrong with:
>>
> sysconfig.get_config_var('Py_DEBUG')
>> 0
>
> Again, refer to my first message "On the Internet, I found various
> recipes to check if Python is compiled is debug mode. Sadly, some of
> them are not portable."
>
> I don't think that sysconfig.get_config_var('Py_DEBUG') will work on
> other Python implementations.
>
> On Windows, there is no such file like "Makefile" used to fill
> syscofig.get_config_vars() :-( sysconfig._init_non_posix() only fills
> a few variables like BINDIR or INCLUDEPY, but not Py_DEBUG.
>
> Victor
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-14 Thread Maciej Fijalkowski

Hey Raymond

I'm sorry you got insulted, that was not my intention. I suppose
something like "itertools objects are implemented as classes
internally, which means they're subclassable like other builtin types"
is an improvement to documentation.

On Mon, Sep 14, 2015 at 12:17 AM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 13, 2015, at 3:09 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Well, fair enough, but the semantics of "whatever happens to happen
>> because we decided subclassing is a cool idea" is possibly the worst
>> answer to those questions.
>
> It's hard to read this in any way that isn't insulting.
>
> It was subclassable because a) it was a class, 2) type/class unification was
> pushing us in the direction of making builtin types more like regular classes
> (which are subclassable), and 3) because it seemed potentially useful
> to users (and apparently it has been because users are subclassing it).
>
> FWIW, the code was modeled on what was done for enumerate() and
> reversed() where I got a lot of coaching and review from Tim Peters,
> Alex Martelli, Fredrik Lundh, and other python luminaries of the day.
>
>
>> Ideally, make it non-subclassable. If you
>> want to have it subclassable, then please have defined semantics as
>> opposed to undefined.
>
> No, I'm not going to change a 13 year-old API and break existing user code
> just because you've gotten worked-up about it.
>
> FWIW, the semantics wouldn't even be defined in the itertools docs.
> It is properly in some section that describes what happens to any C type
> that defines sets the Py_TPFLAGS_BASETYPE flag.   In general, all of
> the exposed dunder methods are overridable or extendable by subclassers.
>
>
> Raymond
>
>
> P.S.  Threads like this are why I've developed an aversion to python-dev.
> I've answered your questions with respect and candor. I've been sympathetic
> to your unique needs as someone building an implementation of a language
> that doesn't have a spec.  I was apologetic that the docs which have been
> helpful to users weren't precise enough for your needs.
>
> In return, you've suggested that my first contributions to Python were
> irresponsible and based on doing whatever seemed cool.
>
> In fact, the opposite is the case.  I spent a full summer researching how 
> similar
> tools were used in other languages and fitting them into Python in a way that
> supported known use cases.  I raised the standard of the Python docs by
> including rough python equivalent code, showing sample inputs and outputs,
> building a quick navigation and summary section as the top of the docs,
> adding a recipes section, making thorough unittests, and getting input from 
> Alex,
> Tim, and Fredrik (Guido also gave high level advice on the module design).
>
> I'm not inclined to go on with this thread. Your questions have been answered
> to the extent that I remember the answers.  If you have a doc patch you want
> to submit, please assign it to me on the tracker.  I would be happy to review 
> it.
>
>
>
>
>
>
>
>
>
>
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Maciej Fijalkowski

On Fri, Sep 11, 2015 at 1:48 AM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 10, 2015, at 3:23 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> I would like to know what are the semantics if you subclass something
>> from itertools (e.g. islice).
>>
>> Right now it's allowed and people do it, which is why the
>> documentation is incorrect. It states "equivalent to: a function-or a
>> generator", but you can't subclass whatever it is equivalent to, which
>> is why in PyPy we're unable to make it work in pure python.
>>
>> I would like some clarification on that.
>
> The docs should say "roughly equivalent to" not "exactly equivalent to".
> The intended purpose of the examples in the itertools docs is to use
> pure python code to help people better understand each tool.  It is not
> is intended to dictate that tool x is a generator or is a function.
>
> The intended semantics are that the itertools are classes (not functions
> and not generators).  They are intended to be sub-classable (that is
> why they have Py_TPFLAGS_BASETYPE defined).

Ok, so what's completely missing from the documentation is what *are*
the semantics of subclasses of those classes? Can you override any
magic methods? Can you override next (which is or isn't a magic method
depending how you look)? Etc.

The documentation on this is completely missing and it's left guessing
with "whatever cpython happens to be doing".
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Maciej Fijalkowski

On Sun, Sep 13, 2015 at 5:46 PM, Raymond Hettinger
<raymond.hettin...@gmail.com> wrote:
>
>> On Sep 13, 2015, at 3:49 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>>> The intended semantics are that the itertools are classes (not functions
>>> and not generators).  They are intended to be sub-classable (that is
>>> why they have Py_TPFLAGS_BASETYPE defined).
>>
>> Ok, so what's completely missing from the documentation is what *are*
>> the semantics of subclasses of those classes? Can you override any
>> magic methods? Can you override next (which is or isn't a magic method
>> depending how you look)? Etc.
>>
>> The documentation on this is completely missing and it's left guessing
>> with "whatever cpython happens to be doing".
>
> The reason it is underspecified is that this avenue of development was
> never explored (not thought about, planned, used, tested, or documented).
> IIRC, the entire decision process for having Py_TPFLAGS_BASETYPE
> boiled down to a single question:  Was there any reason to close this
> door and make the itertools not subclassable?
>
> For something like NoneType, there was a reason to be unsubclassable;
> otherwise, the default choice was to give users maximum flexibility
> (the itertools were intended to be a generic set of building blocks,
> forming what Guido termed an "iterator algebra").
>
> As an implementor of another version of Python, you are reasonably
> asking the question, what is the specification for subclassing semantics?
> The answer is somewhat unsatisfying -- I don't know because I've
> never thought about it.  As far as I can tell, this question has never
> come up in the 13 years of itertools existence and you may be the
> first person to have ever cared about this.
>
>
> Raymond

Well, fair enough, but the semantics of "whatever happens to happen
because we decided subclassing is a cool idea" is possibly the worst
answer to those questions. Ideally, make it non-subclassable. If you
want to have it subclassable, then please have defined semantics as
opposed to undefined.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] semantics of subclassing things from itertools

2015-09-10 Thread Maciej Fijalkowski

Hi

I would like to know what are the semantics if you subclass something
from itertools (e.g. islice).

Right now it's allowed and people do it, which is why the
documentation is incorrect. It states "equivalent to: a function-or a
generator", but you can't subclass whatever it is equivalent to, which
is why in PyPy we're unable to make it work in pure python.

I would like some clarification on that.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-10 Thread Maciej Fijalkowski

On Thu, Sep 10, 2015 at 10:26 AM, Serhiy Storchaka <storch...@gmail.com> wrote:
> On 10.09.15 10:23, Maciej Fijalkowski wrote:
>>
>> I would like to know what are the semantics if you subclass something
>> from itertools (e.g. islice).
>>
>> Right now it's allowed and people do it, which is why the
>> documentation is incorrect. It states "equivalent to: a function-or a
>> generator", but you can't subclass whatever it is equivalent to, which
>> is why in PyPy we're unable to make it work in pure python.
>>
>> I would like some clarification on that.
>
>
> There is another reason why itertools iterators can't be implemented as
> simple generator functions. All iterators are pickleable in 3.x.

maybe the documentation should reflect that? (note that generators are
pickleable on pypy anyway)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] tp_finalize vs tp_del sematics

2015-09-03 Thread Maciej Fijalkowski

On Thu, Sep 3, 2015 at 9:23 AM, Valentine Sinitsyn
 wrote:
> Hi Armin,
>
> On 25.08.2015 13:00, Armin Rigo wrote:
>>
>> Hi Valentine,
>>
>> On 25 August 2015 at 09:56, Valentine Sinitsyn
>>  wrote:

 Yes, I think so.  There is a *highly obscure* corner case: __del__
 will still be called several times if you declare your class with
 "__slots__=()".
>>>
>>>
>>> Even on "post-PEP-0442" Python 3.4+? Could you share a link please?
>>
>>
>> class X(object):
>>  __slots__=() # <= try with and without this
>>  def __del__(self):
>>  global revive
>>  revive = self
>>  print("hi")
>>
>> X()
>> revive = None
>> revive = None
>> revive = None
>
> By accident, I found a solution to this puzzle:
>
> class X(object):
> __slots__ = ()
>
> class Y(object):
> pass
>
> import gc
> gc.is_tracked(X())  # False
> gc.is_tracked(Y())  # True
>
> An object with _empty_ slots is naturally untracked, as it can't create back
> references.
>
> Valentine
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

That does not make it ok to have del called several time, does it?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-25 Thread Maciej Fijalkowski


 Interesting.  So pypy (with it's profiling JIT) would be in a similar boat,
 potentially.


PGO and what pypy does have pretty much nothing to do with each other.
I'm not sure what do you mean by similar boat
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Branch Prediction And The Performance Of Interpreters - Don't Trust Folklore

2015-08-10 Thread Maciej Fijalkowski

On Mon, Aug 10, 2015 at 4:44 PM, Larry Hastings la...@hastings.org wrote:


 This just went by this morning on reddit's /r/programming.  It's a paper
 that analyzed Python--among a handful of other languages--to answer the
 question are branch predictors still that bad at the big switch statement
 approach to interpreters?  Their conclusion: no.

 Our simulations [...] show that, as long as the payload in the bytecode
 remains limited and do not feature significant amount of extra indirect
 branches, then the misprediction rate on the interpreter can be even become
 insignificant (less than 0.5 MPKI).

 (MPKI = missed predictions per thousand instructions)

 Their best results were on simulated hardware with state-of-the-art
 prediction algorithms (TAGE and ITTAGE), but they also demonstrate that
 branch predictors in real hardware are getting better quickly.  When running
 the Unladen Swallow test suite on Python 3.3.2, compiled with
 USE_COMPUTED_GOTOS turned off, Intel's Nehalem experienced an average of
 12.8 MPKI--but Sandy Bridge drops that to 3.5 MPKI, and Haswell reduces it
 further to a mere *1.4* MPKI.  (AFAICT they didn't compare against Python
 3.3.2 using computed gotos, either in terms of MPKI or in overall
 performance.)

 The paper is here:

 https://hal.inria.fr/hal-01100647/document


 I suppose I wouldn't propose removing the labels-as-values opcode dispatch
 code yet.  But perhaps that day is in sight!


 /arry

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com


Hi Larry

Please also note that as far as I can tell this mostly applies to x86.
The ARM branch prediction is significantly dumber these days and as
long as python performance is considered on such platforms such tricks
do make the situation better. We found it out doing CPython/PyPy
comparison, where the difference PyPy vs cPython was bigger on ARM and
smaller on x86, despite our ARM assembler that we produce being less
well optimized.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Python automatic optimization

2015-07-23 Thread Maciej Fijalkowski

As far as I can tell, the feedback directed optimizations don't give
much speedup on Python. There is a variety of tools for help: cython,
numba, pypy, numpy etc. if you care about performance of mathematical
operations.

On Thu, Jul 23, 2015 at 9:04 PM, Andrew Steinberg via Python-Dev
python-dev@python.org wrote:
 Hello everybody,

 I am using Python 2.7 as a backbone for some mathematical simulations. I
 recently discovered a tool called AutoFDO and I tried compiling my own
 Python version, but I did not manage to get it working. My question is, will
 sometime in the future Python include this tool?

 Thank you,
 Andrew

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski

I must say I completely fail to understand the procedures under which
python is developed. If the change (unreviewed, just randomly applied)
causes crashes, then surely it should be reverted first and continued
on bug tracker instead of lingering (and the complain sitting on bug
tracker)?

On Tue, Jul 7, 2015 at 10:10 AM, Serhiy Storchaka storch...@gmail.com wrote:
 On 07.07.15 10:42, Serhiy Storchaka wrote:

 On 07.07.15 05:03, raymond.hettinger wrote:

 https://hg.python.org/cpython/rev/c9782a9ac031
 changeset:   96865:c9782a9ac031
 user:Raymond Hettinger pyt...@rcn.com
 date:Mon Jul 06 19:03:01 2015 -0700
 summary:
Tighten-up code in the set iterator to use an entry pointer rather
 than indexing.


 What if so-table was reallocated during the iteration, but so-used is
 left the same? This change looks unsafe to me.


 There is crash reproducer.

 http://bugs.python.org/issue24581


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski

On Tue, Jul 7, 2015 at 2:14 PM, Guido van Rossum gu...@python.org wrote:
 FYI, do we have any indication that Raymond even read the comment? IIRC he
 doesn't regularly read python-dev. I also don't think code review comments
 ought to go to python-dev; the commiters list would seem more appropriate?
 (Though it looks like python-checkins is configured to direct replies to
 python-dev. Maybe we need to revisit that?)

I kind of thought that python does pre-commit reviews (at least seems
to apply to most people), so in case someone is completely exempt from
that, maybe he should read python-dev or wherever the reply is set to?
That also does not explain why a crashing commit has not been
reverted.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Maciej Fijalkowski

On Tue, Jul 7, 2015 at 3:08 PM, Serhiy Storchaka storch...@gmail.com wrote:
 On 07.07.15 15:32, Maciej Fijalkowski wrote:

 I kind of thought that python does pre-commit reviews (at least seems
 to apply to most people), so in case someone is completely exempt from
 that, maybe he should read python-dev or wherever the reply is set to?
 That also does not explain why a crashing commit has not been
 reverted.


 There is no haste. Only developed branch is affected and we have enough time
 to fix it. No buildbots is broken. Just rolling back this changeset can be
 impossible because Raymond committed other changes after it. I'm not sure
 that this changeset is culprit, it can be previous one. Raymond is the most
 experienced person in this file, and writing good fix that conform to
 Raymond's view by other person can take more time than the time that is
 needed to Raymond to awake and read this topic.

Then maybe a good option would be to add the crasher to the test
suite, so the buildbots *are* actually broken showing the problem
exists?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] speed.python.org (was: 2.7 is here until 2020, please don't call it a waste.)

2015-06-04 Thread Maciej Fijalkowski

On Thu, Jun 4, 2015 at 4:32 PM, R. David Murray rdmur...@bitdance.com wrote:
On Thu, 04 Jun 2015 12:55:55 +0200, M.-A. Lemburg m...@egenix.com wrote:
On 04.06.2015 04:08, Tetsuya Morimoto wrote:
If someone were to volunteer to set up and run speed.python.org, I think
we could add some additional focus on performance regressions. Right now,
we don't have any way of reliably and reproducibly testing Python
performance.

I'm very interested in speed.python.org and feel regret that the project is
standing still. I have a mind to contribute something ...

On 03.06.2015 18:59, Maciej Fijalkowski wrote: On Wed, Jun 3, 2015 at 3:49
PM, R. David Murray
I think we should look into getting speed.python.org up and
running for both Python 2 and 3 branches:

https://speed.python.org/

What would it take to make that happen ?

I guess ideal would be some cooperation from some of the cpython devs,
so say someone can setup cpython buildbot

What does set up cpython buildbot mean in this context?

The way it works is dual - there is a program running the benchmarks
(the runner) which is in the pypy case run by the pypy buildbot and
the web side that reports stuff. So someone who has access to cpython
buildbot would be useful.

(I don't seem to have gotten a copy of Maciej's message, at least not
yet.)

OK, so what you are saying is that speed.python.org will run a buildbot
slave so that when a change is committed to cPython, a speed run will be
triggered? Is the runner a normal buildbot slave, or something
custom? In the normal case the master controls what the slave
runs...but regardless, you'll need to let us know how the slave
invocation needs to be configured on the master.

Ideally nightly (benchmarks take a while). The setup for pypy looks like this:

https://bitbucket.org/pypy/buildbot/src/5fa1f1a4990f842dfbee416c4c2e2f6f75d451c4/bot2/pypybuildbot/builds.py?at=default#cl-734

so fairly easy. This already generates a json file that you can plot.
We can setup an upload automatically too.

Ok, so there's interest and we have at least a few people who are
willing to help.

Now we need someone to take the lead on this and form a small
project group to get everything implemented. Who would be up
to such a task ?

The speed project already has a mailing list, so you could use
that for organizing the details.

If it's a low volume list I'm willing to sign up, but regardless I'm
willing to help with the buildbot setup on the CPython side. (As soon
as my credential-update request gets through infrastructure, at least :)

--David
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-03 Thread Maciej Fijalkowski

On Wed, Jun 3, 2015 at 11:38 AM, M.-A. Lemburg m...@egenix.com wrote:
 On 02.06.2015 21:07, Maciej Fijalkowski wrote:
 Hi

 There was a PSF-sponsored effort to improve the situation with the
 https://bitbucket.org/pypy/codespeed2/src being written (thank you
 PSF). It's not better enough than codespeed that I would like, but
 gives some opportunities.

 That said, we have a benchmark machine for benchmarking cpython and I
 never deployed nightly benchmarks of cpython for a variety of reasons.

 * would be cool to get a small VM to set up the web front

 * people told me that py3k is only interesting, so I did not set it up
 for py3k because benchmarks are mostly missing

 I'm willing to set up a nightly speed.python.org using nightly build
 on python 2 and possible python 3 if there is an interest. I need
 support from someone maintaining python buildbot to setup builds and a
 VM to set up stuff, otherwise I'm good to go

 DISCLAIMER: I did facilitate in codespeed rewrite that was not as
 successful as I would have hoped. I did not receive any money from the
 PSF on that though.

 I think we should look into getting speed.python.org up and
 running for both Python 2 and 3 branches:

  https://speed.python.org/

 What would it take to make that happen ?

I guess ideal would be some cooperation from some of the cpython devs,
so say someone can setup cpython buildbot
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-03 Thread Maciej Fijalkowski

On Wed, Jun 3, 2015 at 3:49 PM, R. David Murray rdmur...@bitdance.com wrote:
 On Wed, 03 Jun 2015 12:04:10 +0200, Maciej Fijalkowski fij...@gmail.com 
 wrote:
 On Wed, Jun 3, 2015 at 11:38 AM, M.-A. Lemburg m...@egenix.com wrote:
  On 02.06.2015 21:07, Maciej Fijalkowski wrote:
  Hi
 
  There was a PSF-sponsored effort to improve the situation with the
  https://bitbucket.org/pypy/codespeed2/src being written (thank you
  PSF). It's not better enough than codespeed that I would like, but
  gives some opportunities.
 
  That said, we have a benchmark machine for benchmarking cpython and I
  never deployed nightly benchmarks of cpython for a variety of reasons.
 
  * would be cool to get a small VM to set up the web front
 
  * people told me that py3k is only interesting, so I did not set it up
  for py3k because benchmarks are mostly missing
 
  I'm willing to set up a nightly speed.python.org using nightly build
  on python 2 and possible python 3 if there is an interest. I need
  support from someone maintaining python buildbot to setup builds and a
  VM to set up stuff, otherwise I'm good to go
 
  DISCLAIMER: I did facilitate in codespeed rewrite that was not as
  successful as I would have hoped. I did not receive any money from the
  PSF on that though.
 
  I think we should look into getting speed.python.org up and
  running for both Python 2 and 3 branches:
 
   https://speed.python.org/
 
  What would it take to make that happen ?

 I guess ideal would be some cooperation from some of the cpython devs,
 so say someone can setup cpython buildbot

 What does set up cpython buildbot mean in this context?

The way it works is dual - there is a program running the benchmarks
(the runner) which is in the pypy case run by the pypy buildbot and
the web side that reports stuff. So someone who has access to cpython
buildbot would be useful.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 2.7 is here until 2020, please don't call it a waste.

2015-06-02 Thread Maciej Fijalkowski

Hi

There was a PSF-sponsored effort to improve the situation with the
https://bitbucket.org/pypy/codespeed2/src being written (thank you
PSF). It's not better enough than codespeed that I would like, but
gives some opportunities.

That said, we have a benchmark machine for benchmarking cpython and I
never deployed nightly benchmarks of cpython for a variety of reasons.

* would be cool to get a small VM to set up the web front

* people told me that py3k is only interesting, so I did not set it up
for py3k because benchmarks are mostly missing

I'm willing to set up a nightly speed.python.org using nightly build
on python 2 and possible python 3 if there is an interest. I need
support from someone maintaining python buildbot to setup builds and a
VM to set up stuff, otherwise I'm good to go

DISCLAIMER: I did facilitate in codespeed rewrite that was not as
successful as I would have hoped. I did not receive any money from the
PSF on that though.

Cheers,
fijal


On Mon, Jun 1, 2015 at 1:14 PM, M.-A. Lemburg m...@egenix.com wrote:
 On 01.06.2015 12:44, Armin Rigo wrote:
 Hi Larry,

 On 31 May 2015 at 01:20, Larry Hastings la...@hastings.org wrote:
 p.s. Supporting this patch also helps cut into PyPy's reported performance
 lead--that is, if they ever upgrade speed.pypy.org from comparing against
 Python *2.7.2*.

 Right, we should do this upgrade when 2.7.11 is out.

 There is some irony in your comment which seems to imply PyPy is
 cheating by comparing with an old Python 2.7.2: it is inside a thread
 which started because we didn't backport performance improvements to
 2.7.x so far.

 Just to convince myself, I just ran a performance comparison.  I ran
 the same benchmark suite as speed.pypy.org, with 2.7.2 against 2.7.10,
 both freshly compiled with no configure options at all.  The
 differences are usually in the noise, but range from +5% to... -60%.
 If anything, this seems to show that CPython should take more care
 about performance regressions.  If someone is interested:

 * raytrace-simple is 1.19 times slower
 * bm_mako is 1.29 times slower
 * spitfire_cstringio is 1.60 times slower
 * a number of other benchmarks are around 1.08.

 The 7.0x faster number on speed.pypy.org would be significantly
 *higher* if we upgraded the baseline to 2.7.10 now.

 If someone were to volunteer to set up and run speed.python.org,
 I think we could add some additional focus on performance
 regressions. Right now, we don't have any way of reliably
 and reproducibly testing Python performance.

 Hint: The PSF would most likely fund such adventures :-)

 --
 Marc-Andre Lemburg
 eGenix.com

 Professional Python Services directly from the Source  (#1, Jun 01 2015)
 Python Projects, Coaching and Consulting ...  http://www.egenix.com/
 mxODBC Plone/Zope Database Adapter ...   http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/
 

 : Try our mxODBC.Connect Python Database Interface for free ! ::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Computed Goto dispatch for Python 2

2015-05-28 Thread Maciej Fijalkowski

 I'm -1 on the idea because:

 * Performance improvements are not bug fixes
 * The patch doesn't make the migration process from Python 2 to Python 3 
 easier

And this is why people have been porting Python applications to Go.
Maybe addressing Python performance and making Python (2 or 3) a
better language/platform would mitigate that.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] ctypes module

2015-04-08 Thread Maciej Fijalkowski

I presume the reason was that noone wants to maintain code for the
case where there are no buildbots available and there is no
development time available. You are free to put back in the files and
see if they work (they might not), but such things are usually removed
if they're a maintenance burden. I would be happy to assist you with
finding someone willing to do commercial maintenance of ctypes for
itanium, but asking python devs to do it for free is a bit too much.

Cheers,
fijal

On Tue, Apr 7, 2015 at 9:58 PM, Cristi Fati cristifa...@gmail.com wrote:
 Hi all,

 Not sure whether you got this question, or this is the right distribution
 list:

 Intel has deprecated Itanium architecture, and Windows also deprecated its
 versions(currently 2003 and 2008) that run on IA64.

 However Python (2.7.3) is compilable on Windows IA64, but ctypes module
 (1.1.0) which is now part of Python is not (the source files have been
 removed). What was the reason for its disablement?

 I am asking because an older version of ctypes (1.0.2) which came as a
 separate extension module (i used to compile it with Python 2.4.5) was
 available for WinIA64; i found (and fixed) a nasty buffer overrun in it.

 Regards,
 Cristi Fati.


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] ctypes module

2015-04-08 Thread Maciej Fijalkowski

for the record libffi supports itanium officially (but as usual I'm
very skeptical how well it works on less used platforms)
https://sourceware.org/libffi/

On Wed, Apr 8, 2015 at 1:32 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On 8 April 2015 at 20:36, Maciej Fijalkowski fij...@gmail.com wrote:
 I presume the reason was that noone wants to maintain code for the
 case where there are no buildbots available and there is no
 development time available. You are free to put back in the files and
 see if they work (they might not), but such things are usually removed
 if they're a maintenance burden. I would be happy to assist you with
 finding someone willing to do commercial maintenance of ctypes for
 itanium, but asking python devs to do it for free is a bit too much.

 As a point of reference, even Red Hat dropped Itanium support for
 RHEL6+ - you have to go all the way back to RHEL5 to find a version we
 still support running on Itanium.

 For most of CPython, keeping it running on arbitrary architectures
 often isn't too difficult, as libc abstracts away a lot of the
 hardware details. libffi (and hence ctypes) are notable exceptions to
 that :)

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

2015-03-25 Thread Maciej Fijalkowski

On Tue, Mar 24, 2015 at 11:31 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 12 March 2015 at 17:44, Paul Moore p.f.mo...@gmail.com wrote:
 On 12 March 2015 at 17:26, Brett Cannon br...@python.org wrote:
 I'm all for ditching our 'libffi_msvc' in favor of adding libffi as
 another 'external' for the Windows build.  I have managed to get
 _ctypes to build on Windows using vanilla libffi sources, prepared
 using their configure script from within Git Bash and built with our
 usual Windows build system (properly patched).  Unfortunately, making
 things usable will take some work on ctypes itself, which I'm not
 qualified to do. I'm happy to pass on my procedure and patches for
 getting to the point of successful compilation to anyone who feels up
 to fixing the things that are broken.


 So it seems possible to use upstream libffi but will require some work.

 I'd be willing to contemplate helping out on the Windows side of
 things, if nobody else steps up (with the proviso that I have little
 free time, and I'm saying this without much idea of what's involved
 :-)) If Zachary can give a bit more detail on what the work on ctypes
 is, and/or put what he has somewhere that I could have a look at, that
 might help.

 One thing that seems to be an issue. On Windows, ctypes detects if the
 FFI call used the wrong number of arguments off the stack, and raises
 a ValueError if it does. The tests rely on that behaviour. But it's
 based on ffi_call() returning a value, which upstream libffi doesn't
 do. As far as I can tell (not that the libffi docs are exactly
 comprehensive...) there's no way of getting that information from
 upstream libffi.

 What does Unix ctypes do when faced with a call being made with the
 wrong number of arguments? On Windows, using upstream libffi and
 omitting the existing check, it seems to crash the Python process,
 which obviously isn't good. But the test that fails is
 Windows-specific, and short of going through all the tests looking for
 one that checks passing the wrong number of arguments and isn't
 platform-specific, I don't know how Unix handles this.

 Can anyone on Unix tell me if a ctypes call with the wrong number of
 arguments returns ValueError on Unix? Something like strcmp() (with no
 args) should do as a test, I guess...

 If there's a way Unix handles this, I can see about replicating it on
 Windows. But if there isn't, I fear we could always need a patched
 libffi to maintain the interface we currently have...

 Thanks,
 Paul

Linux crashes. The mechanism for detecting the number of arguments is
only available on windows (note that this is a band-aid anyway, since
if your arguments are of the wrong kind you segfault anyway). We do
have two copies of libffi one for windows one for unix anyway, don't
we?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

2015-03-12 Thread Maciej Fijalkowski

On Thu, Mar 12, 2015 at 8:35 PM, Ned Deily n...@acm.org wrote:
 In article
 CAP1=2w7cx5jpqv_pr61rqs1ubusjf5f6kg0cd-qcwr2+9ij...@mail.gmail.com,
 For UNIX OSs we could probably rely on the system libffi then. What's the
 situation on OS X? Anyone know if it has libffi, or would be need to be
 pulled in to be used like on Windows?

 Ronald (in http://bugs.python.org/issue23534):
 On OSX the internal copy of libffi that's used is based on the one in
 PyObjC, which in turn is based on the version of libffi on
 opensource.apple.com (IIRC with some small patches that fix minor issues
 found by the PyObjC testsuite).

 --
  Ned Deily,
  n...@acm.org

From pypy experience, libffi installed on OS X tends to just work (we
never had any issues with those)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] (ctypes example) libffi embedded in CPython

On Wed, Mar 11, 2015 at 8:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 19:05:57 +0100
 Antoine Pitrou solip...@pitrou.net wrote:
  
   But they are not ctypes. For example, cffi wouldn't be obvious to use
   for interfacing with non-C code, since it requires you to write C-like
   declarations.
 
  You mean like Fortran? Or what precisely?

 Any toolchain that can generate native code. It can be Fortran, but it
 can also be code generated at runtime without there being any external
 declaration. Having to generate C declarations for such code would be
 a distraction.

 For instance, you can look at the compiler example that Eli wrote using
 llvmlite. It implements a JIT compiler for a toy language. The
 JIT-compiled function is then declared and called using a simple ctypes
 declaration:

 https://github.com/eliben/pykaleidoscope/blob/master/chapter7.py#L937

 Regards

 Antoine.

It might be a matter of taste, but I don't find declaring C functions
any more awkward than using strange interface that ctypes comes with.
the equivalent in cffi would be ffi.cast(double (*)(), x)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Wed, Mar 11, 2015 at 11:34 PM, Victor Stinner
victor.stin...@gmail.com wrote:

 Le 11 mars 2015 18:29, Brett Cannon br...@python.org a écrit :
 I'm going to propose a somewhat controversial idea: let's deprecate the
 ctypes module.

 In the past I tried to deprecate many functions or modules because they are
 rarely or never used. Many developers prefered to keep them. By the way, I
 still want to remove plat-xxx modules like IN or CDROM :-)

 Getopt was deprecated when optparse was added to the stdlib. Then optparse
 was deprecated when argparse was added to the stdlib.

 Cython and cffi are not part of the stdlib and can be hard to install on
 some platforms. Ctypes is cool because it doesn't require C headers nor a C
 compiler.

 Is it possible to use cffi without a C compiler/headers as easily than
 ctypes?

yes, it has two modes, one that does that and the other that does
extra safety at the cost of a C compiler
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Wed, Mar 11, 2015 at 8:31 PM, Wes Turner wes.tur...@gmail.com wrote:

 On Mar 11, 2015 12:55 PM, Maciej Fijalkowski fij...@gmail.com wrote:

 On Wed, Mar 11, 2015 at 7:50 PM, Antoine Pitrou solip...@pitrou.net
 wrote:
  On Wed, 11 Mar 2015 17:27:58 +
  Brett Cannon br...@python.org wrote:
 
  Did anyone ever step forward to do this? I'm a bit worried about the
  long-term viability of ctypes if we don't have a maintainer or at least
  someone making sure we are staying up-to-date with upstream libffi. The
  ctypes module is a dangerous thing, so having a chunk of C code that
  isn't
  being properly maintained seems to me to make it even more dangerous.
 
  Depends what you call dangerous. C code doesn't rot quicker than pure
  Python code :-) Also, libffi really offers a wrapper around platform
  ABIs, which rarely change.

 And yet, lesser known ABIs in libffi contain bugs (as we discovered
 trying to work there with anything else than x86 really). Also there
 *are* ABI differencies that change slowly over time (e.g. requiring
 stack to be 16 byte aligned)

 Are there tests for this?


What do you mean? The usual failure mode is will segfault every now
and again if the moon is in the right position (e.g. the stack
alignment thing only happens if the underlaying function uses certain
SSE instructions that compilers emit these days in certain
circumstances)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Wed, Mar 11, 2015 at 8:05 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 19:54:58 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:
 
  Depends what you call dangerous. C code doesn't rot quicker than pure
  Python code :-) Also, libffi really offers a wrapper around platform
  ABIs, which rarely change.

 And yet, lesser known ABIs in libffi contain bugs (as we discovered
 trying to work there with anything else than x86 really). Also there
 *are* ABI differencies that change slowly over time (e.g. requiring
 stack to be 16 byte aligned)

 Well, sure. The point is, such bugs are unlikely to appear at a fast
 rate... Also, I don't understand why libffi issues would affect cffi
 any less than it affects ctypes, at least in the compiler-less mode of
 operation.

My point here was only about shipping own libffi vs using the system
one (and it does affect cffi equally with or without compiler)


  We now have things like cffi and Cython for people who need
  to interface with C code. Both of those projects are maintained. And they
  are not overly difficult to work with.
 
  But they are not ctypes. For example, cffi wouldn't be obvious to use
  for interfacing with non-C code, since it requires you to write C-like
  declarations.

 You mean like Fortran? Or what precisely?

 Any toolchain that can generate native code. It can be Fortran, but it
 can also be code generated at runtime without there being any external
 declaration. Having to generate C declarations for such code would be
 a distraction.

 Of course, if cffi gains the same ability as ctypes (namely to lookup
 a function and declare its signature without going through the
 FFI.cdef() interface), that issue disappears.

 As a side note, ctypes has a large number of users, so even if it were
 deprecated that wouldn't be a good reason to stop maintaining it.

 And calling cffi simple while it relies on a parser of the C language
 (which would then have to be bundled with Python) is a bit misleading
 IMO.

 Regards

 Antoine.
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Wed, Mar 11, 2015 at 7:50 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 17:27:58 +
 Brett Cannon br...@python.org wrote:

 Did anyone ever step forward to do this? I'm a bit worried about the
 long-term viability of ctypes if we don't have a maintainer or at least
 someone making sure we are staying up-to-date with upstream libffi. The
 ctypes module is a dangerous thing, so having a chunk of C code that isn't
 being properly maintained seems to me to make it even more dangerous.

 Depends what you call dangerous. C code doesn't rot quicker than pure
 Python code :-) Also, libffi really offers a wrapper around platform
 ABIs, which rarely change.

And yet, lesser known ABIs in libffi contain bugs (as we discovered
trying to work there with anything else than x86 really). Also there
*are* ABI differencies that change slowly over time (e.g. requiring
stack to be 16 byte aligned)


 I'm going to propose a somewhat controversial idea: let's deprecate the
 ctypes module.

 This is gratuitous.

I'm +1 on deprecating ctypes


 We now have things like cffi and Cython for people who need
 to interface with C code. Both of those projects are maintained. And they
 are not overly difficult to work with.

 But they are not ctypes. For example, cffi wouldn't be obvious to use
 for interfacing with non-C code, since it requires you to write C-like
 declarations.

You mean like Fortran? Or what precisely?

 I don't understand why cffi would be safer than ctypes. At least not in
 the operation mode where it doesn't need to invoke a C compiler.
 Cython is a completely different beast, it requires a separate
 compilation pass which makes it useless in some situations.


Our main motivation for safer is comes with less magic and less
gotchas, which also means does less stuff. It's also smaller.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Thu, Mar 12, 2015 at 12:20 AM, Brett Cannon br...@python.org wrote:


 On Wed, Mar 11, 2015 at 6:03 PM Paul Moore p.f.mo...@gmail.com wrote:

 On 11 March 2015 at 21:45, Maciej Fijalkowski fij...@gmail.com wrote:
  Is it possible to use cffi without a C compiler/headers as easily than
  ctypes?
 
  yes, it has two modes, one that does that and the other that does
  extra safety at the cost of a C compiler

 So if someone were to propose a practical approach to including cffi
 into the stdlib, *and* assisting the many Windows projects using
 ctypes for access to the Windows API [1], then there may be a
 reasonable argument for deprecating ctypes. But nobody seems to be
 doing that, rather the suggestion appears to be just to deprecate a
 widely used part of the stdlib offering no migration path :-(


 You're ignoring that it's not maintained, which is the entire reason I
 brought this up. No one seems to want to touch the code. Who knows what
 improvements, bugfixes, etc. exist upstream in libffi that we lack because
 no one wants to go through and figure it out. If someone would come forward
 and help maintain it then I have no issue with it sticking around.

It's a bit worse than that. Each time someone wants to touch the code
(e.g. push back the upstream libffi fixes), there is we need to
review it, but there is noone to do it, noone knows how it works,
don't touch it kind of feedback, which leads to disincentives to
potential maintainers. I would be likely willing to rip off the libffi
from CPython as it is for example (and just use the upstream one)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

On Thu, Mar 12, 2015 at 12:31 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 11 Mar 2015 23:10:14 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:
 
  Well, sure. The point is, such bugs are unlikely to appear at a fast
  rate... Also, I don't understand why libffi issues would affect cffi
  any less than it affects ctypes, at least in the compiler-less mode of
  operation.

 My point here was only about shipping own libffi vs using the system
 one (and it does affect cffi equally with or without compiler)

 So what? If ctypes used the system libffi as cffi does, it would by
 construction be at least portable as cffi is.  The only reason the
 bundled libffi was patched at some point was to be *more* portable than
 vanilla libffi is.

 So, really, I don't see how switching from ctypes to cffi solves any of
 this.

You're missing my point. Ripping off the libffi from CPython is a good
idea to start with. Maybe deprecating ctypes is *also* a good idea,
but it's a separate discussion point. It certainly does not solve the
libffi problem.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Maciej Fijalkowski

Not all your examples are good.

* float(x) calls __float__ (not __int__)

* re.group requires __eq__ (and __hash__)

* I'm unsure about OSError

* the % thing at the very least works on pypy

On Mon, Mar 9, 2015 at 8:07 AM, Serhiy Storchaka storch...@gmail.com wrote:
 On 09.03.15 06:33, Ethan Furman wrote:

 I guess it could boil down to:  if IntEnum was not based on 'int', but
 instead had the __int__ and __index__ methods
 (plus all the other __xxx__ methods that int has), would it still be a
 drop-in replacement for actual ints?  Even when
 being used to talk to non-Python libs?


 If you don't call isinstance(x, int) (PyLong_Check* in C).

 Most conversions from Python to C implicitly call __index__ or __int__, but
 unfortunately not all.

 float(Thin(42))
 42.0
 float(Wrap(42))
 Traceback (most recent call last):
   File stdin, line 1, in module
 TypeError: float() argument must be a string or a number, not 'Wrap'

 '%*s' % (Thin(5), 'x')
 'x'
 '%*s' % (Wrap(5), 'x')
 Traceback (most recent call last):
   File stdin, line 1, in module
 TypeError: * wants int

 OSError(Thin(2), 'No such file or directory')
 FileNotFoundError(2, 'No such file or directory')
 OSError(Wrap(2), 'No such file or directory')
 OSError(__main__.Wrap object at 0xb6fe81ac, 'No such file or directory')

 re.match('(x)', 'x').group(Thin(1))
 'x'
 re.match('(x)', 'x').group(Wrap(1))
 Traceback (most recent call last):
   File stdin, line 1, in module
 IndexError: no such group

 And to be ideal drop-in replacement IntEnum should override such methods as
 __eq__ and __hash__ (so it could be used as mapping key). If all methods
 should be overridden to quack as int, why not take an int?



 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-08 Thread Maciej Fijalkowski

I'm working on vmprof (github.com/vmprof/vmprof-python) which works
for both cpython and pypy (pypy has special support, cpython is
patched on-the fly)

On Sun, Feb 8, 2015 at 6:39 AM, Gregory P. Smith g...@krypto.org wrote:
 To get at the Python thread state from a signal handler (using 2.7 as a
 reference here; but i don't believe 3.4 has changed this part much) you need
 to modify the interpreter to expose pystate.c's autoTLSkey and thread.c's
 struct key as well as keyhead and keymutex.

 From there, in your signal handler you must try to acquire the newly exposed
 keymutex and do nothing if you were unable to get it.  If you did acquire it
 (rare not to), you can walk the keyhead list looking for autoTLSkey to find
 the current valid thread state.

 I had an intern (hi Elena!) write a signal sampling based low overhead
 Python CPU profiler based on that last summer. I believe there are still
 bugs to shaken out (if they are even possible to fix... Armin's comments are
 true: signal handler code is super limited). I am stating this here because
 I want someone to pester me at PyCon if I haven't released our work as a
 proof of concept by then. The important take away: From what I could figure
 out, you need to modify the CPython interpreter to be more amenable to such
 introspection.

 A downside of a signal based profiler: *ALL* of the EINTR mishandling bugs
 within the Python interpreter, stdlib, and your own code will show up in
 your application. So until those are fixed (hooray for Antoine's PEP!), it
 may not be practical for use on production processes which is sort of the
 entire point of a low overhead sampling profiler...

 I'd like to get a buildbot setup that runs the testsuite while a continual
 barrage of signals are being generated. We really don't stress test that
 stuff (as evidence by the EINTR mishandling issues that are rampant) as
 non-fatal signals are so rare for most things... until they aren't.

 As a side note and encouragement: I wonder what PyPy could do for
 dynamically enabled and disabled low overhead CPU profiling. (take that as a
 hint that I want someone else to get extremely creative!)

 -gps

 On Sat Feb 07 2015 at 1:34:26 PM Greg Ewing greg.ew...@canterbury.ac.nz
 wrote:

 Maciej Fijalkowski wrote:
  However, you can't access thread
  locals from signal handlers (since in some cases it mallocs, thread
  locals are built lazily if you're inside the .so, e.g. if python is
  built with --shared)

 You might be able to use Py_AddPendingCall to schedule
 what you want done outside the context of the signal
 handler.

 The call will be made by the main thread, though,
 so if you need to access the frame of whatever thread
 was running when the signal occured, you will have
 to track down its PyThreadState somehow and get the
 frame from there. Not sure what would be involved
 in doing that.

 --
 Greg
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/greg%40krypto.org


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-08 Thread Maciej Fijalkowski

Hi Francis

Feel free to steal most of vmprof code, it should generally work
without requiring to patch cpython (python 3 patches appreciated :-).
As far as timer goes - it seems not to be going anywhere, I would
rather use a background thread or something

On Sun, Feb 8, 2015 at 10:03 PM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 2015-02-08 4:01 GMT-05:00 Maciej Fijalkowski fij...@gmail.com:

 I'm working on vmprof (github.com/vmprof/vmprof-python) which works
 for both cpython and pypy (pypy has special support, cpython is
 patched on-the fly)


 This looks interesting. I'm working on a profiler that is similar, but not
 based on timer. Instead, the signal is generated when an hardware
 performance counter overflows. It required a special linux kernel module,
 and the tracepoint is recorded using LTTng-UST.

 https://github.com/giraldeau/perfuser
 https://github.com/giraldeau/perfuser-modules
 https://github.com/giraldeau/python-profile-ust

 This is of course very experimental, requires a special setup, an I don't
 even know if it's going to produce good results. I'll report the results in
 the coming weeks.

 Cheers,

 Francis Giraldeau
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-07 Thread Maciej Fijalkowski

On Sat, Feb 7, 2015 at 12:48 AM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 2015-02-06 6:04 GMT-05:00 Armin Rigo ar...@tunes.org:

 Hi,

 On 6 February 2015 at 08:24, Maciej Fijalkowski fij...@gmail.com wrote:
  I don't think it's safe to assume f_code is properly filled by the
  time you might read it, depending a bit where you find the frame
  object. Are you sure it's not full of garbage?


 Yes, before discussing how to do the utf8 decoding, we should realize
 that it is really unsafe code starting from the line before.  From a
 signal handler you're only supposed to read data that was written to
 volatile fields.  So even PyEval_GetFrame(), which is done by
 reading the thread state's frame field, is not safe: this is not a
 volatile.  This means that the compiler is free to do crazy things
 like *first* write into this field and *then* initialize the actual
 content of the frame.  The uninitialized content may be garbage, not
 just NULLs.


 Thanks for these comments. Of course accessing frames withing a signal
 handler is racy. I confirm that code encoded in non-ascii is not accessible
 from the uft8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes
 the data and caches it in the unicode object. Later access returns the byte
 buffer without memory allocation and re-encoding.

 I think it is possible to solve both safety problems by registering a
 handler with PyPyEval_SetProfile(). On function entry, the handler will call
 PyUnicode_AsUTF8() on the required frame members to make sure the utf8
 encoded string is available. Then, we increment the refcount of the frame
 and assign it to a thread local pointer. On function return, the refcount is
 decremented. These operations occurs in the normal context and they are not
 racy. The signal handler will use the thread local frame pointer instead of
 calling PyEval_GetFrame(). Does that sounds good?

 Thanks again for your feedback!

 Francis

You still didn't explain what are you trying to achieve nor adressed
armins questions about volatile. However, you can't access thread
locals from signal handlers (since in some cases it mallocs, thread
locals are built lazily if you're inside the .so, e.g. if python is
built with --shared)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-05 Thread Maciej Fijalkowski

Hi Francis

I don't think it's safe to assume f_code is properly filled by the
time you might read it, depending a bit where you find the frame
object. Are you sure it's not full of garbage?

Besides, are you writing a profiler, or what exactly are you doing?

On Fri, Feb 6, 2015 at 1:27 AM, Francis Giraldeau
francis.girald...@gmail.com wrote:
 I need to access frame members from within a signal handler for tracing
 purpose. My first attempt to access co_filename was like this (omitting
 error checking):

 PyFrameObject *frame = PyEval_GetFrame();
 PyObject *ob = PyUnicode_AsUTF8String(frame-f_code-co_filename)
 char *str = PyBytes_AsString(ob)

 However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
 which is not reentrant. If the signal handler nest over PyObject_Malloc(),
 it causes a segfault, and it could also deadlock.

 Instead, I access members directly:
 char *str = PyUnicode_DATA(frame-f_code-co_filename);
 size_t len = PyUnicode_GET_DATA_SIZE(frame-f_code-co_filename);

 Is it safe to assume that unicode objects co_filename and co_name are always
 UTF-8 data for loaded code? I looked at the PyTokenizer_FromString() and it
 seems to convert everything to UTF-8 upfront, and I would like to make sure
 this assumption is valid.

 Thanks!

 Francis

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski

Hi Guido.

I *think* part of the reason why our implementation works is that
machines are significantly different than at the times of Knuth.
Avoiding cache misses is a very effective way to improve performance
these days.

Cheers,
fijal

On Sat, Jan 24, 2015 at 7:39 PM, Guido van Rossum gu...@python.org wrote:
 Wow, very cool. When I implemented the very first Python dict (cribbing from
 an algorithm in Knuth) I had no idea that 25 years later there would still
 be ways to improve upon it! I've got a feeling Knuth probably didn't expect
 this either...

 On Sat, Jan 24, 2015 at 2:51 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Sat, Jan 24, 2015 at 12:50 PM, Maciej Fijalkowski fij...@gmail.com
 wrote:
  Hi
 
  I would like to point out that we implemented rhettingers idea in PyPy
  that makes all the dicts ordered by default and we don't have any
  adverse performance effects (in fact, there is quite significant
  memory saving coming from it). The measurments on CPython could be
  different, but in principle OrderedDict can be implemented as
  efficiently as normal dict.
 
  Writeup:
  http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html
 
  Previous discussion:
  https://mail.python.org/pipermail/python-dev/2012-December/123028.html
 
  Cheers,
  fijal

 also as a sidenote: PEP should maybe mention that PyPy is already
 supporting it, a bit by chance
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/guido%40python.org




 --
 --Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski

On Sat, Jan 24, 2015 at 12:50 PM, Maciej Fijalkowski fij...@gmail.com wrote:
 Hi

 I would like to point out that we implemented rhettingers idea in PyPy
 that makes all the dicts ordered by default and we don't have any
 adverse performance effects (in fact, there is quite significant
 memory saving coming from it). The measurments on CPython could be
 different, but in principle OrderedDict can be implemented as
 efficiently as normal dict.

 Writeup: 
 http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html

 Previous discussion:
 https://mail.python.org/pipermail/python-dev/2012-December/123028.html

 Cheers,
 fijal

also as a sidenote: PEP should maybe mention that PyPy is already
supporting it, a bit by chance
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] PEP 468 (Ordered kwargs)

2015-01-24 Thread Maciej Fijalkowski

Hi

I would like to point out that we implemented rhettingers idea in PyPy
that makes all the dicts ordered by default and we don't have any
adverse performance effects (in fact, there is quite significant
memory saving coming from it). The measurments on CPython could be
different, but in principle OrderedDict can be implemented as
efficiently as normal dict.

Writeup: 
http://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html

Previous discussion:
https://mail.python.org/pipermail/python-dev/2012-December/123028.html

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] More compact dictionaries with faster iteration

2015-01-01 Thread Maciej Fijalkowski

On Wed, Dec 31, 2014 at 3:12 PM, Serhiy Storchaka storch...@gmail.com wrote:
 On 10.12.12 03:44, Raymond Hettinger wrote:

 The current memory layout for dictionaries is
 unnecessarily inefficient.  It has a sparse table of
 24-byte entries containing the hash value, key pointer,
 and value pointer.

 Instead, the 24-byte entries should be stored in a
 dense table referenced by a sparse table of indices.


 FYI PHP 7 will use this technique [1]. In conjunction with other
 optimizations this will decrease memory consumption of PHP hashtables up to
 4 times.

up to 4 times is a bit of a stretch, given that most of their
savings come from:

* saving on the keeping of ordering
* other optimizations in zval

None of it applies to python

PHP does not implement differing sizes of ints in key dict, which
makes memory saving php-only (if we did the same thing as PHP, we
would save more or less nothing, depending on how greedy you are with
the list overallocation)

We implemented the same strategy in PyPy as of last year, testing it
to become a default dict and OrderedDict for PyPy in the next
release.

Cheers,
fijal

PS. I wonder who came up with the idea first, PHP or rhettinger and
who implemented it first (I'm pretty sure it was used in hippy before
it was used in Zend PHP)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

2014-12-19 Thread Maciej Fijalkowski

On Thu, Dec 18, 2014 at 10:36 PM, Jim J. Jewett jimjjew...@gmail.com wrote:


 On Thu, Dec 18, 2014, at 14:13, Maciej Fijalkowski wrote:
 ... http://bugs.python.org/issue23085 ...
 is there any reason any more for libffi being included in CPython?

 [And why a fork, instead of just treating it as an external dependency]

 Benjamin Peterson responded:

 It has some sort of Windows related patches. No one seems to know
 whether they're still needed for newer libffi. Unfortunately, ctypes
 doesn't currently have a maintainer.

 Are any of the following false?

 (1)  Ideally, we would treat it as an external dependency.

 (2)  At one point, it was intentionally forked to get in needed
 patches, including at least some for 64 bit windows with MSVC.

 (3)  Upstream libffi maintenance has picked back up.

 (4)  Alas, that means the switch merge would not be trivial.

 (5)  In theory, we could now switch to the external version.
 [In particular, does libffi have a release policy such that we
 could assume the newest released version is safe, so long as
 our integration doesn't break?]

 (6)  By its very nature, libffi changes are risky and undertested.
 At the moment, that is also true of its primary user, ctypes.

 (7)  So a switch is OK in theory, but someone has to do the
 non-trivial testing and merging, and agree to support both libffi
 and and ctypes in the future.  Otherwise, stable wins.

 (8)  The need for future support makes this a bad candidate for
 patches wanted/bug bounty/GSoC.

 -jJ

I would like to add that not doing anything is not a good strategy
either, because you accumulate bugs that get fixed upstream (I'm
pretty sure all the problems from cpython got fixed in upstream
libffi, but not all libffi fixes made it to cpython).
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski

After reading this http://bugs.python.org/issue23085 and remembering
struggling having our own patches into cpython's libffi (but not into
libffi itself), I wonder, is there any reason any more for libffi
being included in CPython?

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski

On Thu, Dec 18, 2014 at 9:17 PM, Steve Dower steve.do...@microsoft.com wrote:
 Maciej Fijalkowski wrote:
 After reading this http://bugs.python.org/issue23085 and remembering 
 struggling
 having our own patches into cpython's libffi (but not into libffi itself), I
 wonder, is there any reason any more for libffi being included in CPython?

 We use it for ctypes, so there's certainly still a need. Are you asking 
 whether we need a fork of it as opposed to treating it like an external (like 
 OpenSSL)?

yes (why is there a copy of libffi in the cpython source). And I'm
asking not why it landed there, but why it is still there
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] libffi embedded in CPython

2014-12-18 Thread Maciej Fijalkowski

well, the problem is essentially that libffi gets patched (e.g. for
ARM) and it does not make it's way to CPython quickly. This is
unlikely to be a security issue (for a variety of reasons, including
ctypes), but it's still an issue I think. Segfaults related to e.g.
stack alignment are hard to debug

On Thu, Dec 18, 2014 at 9:30 PM, Benjamin Peterson benja...@python.org wrote:


 On Thu, Dec 18, 2014, at 14:13, Maciej Fijalkowski wrote:
 After reading this http://bugs.python.org/issue23085 and remembering
 struggling having our own patches into cpython's libffi (but not into
 libffi itself), I wonder, is there any reason any more for libffi
 being included in CPython?

 It has some sort of Windows related patches. No one seems to know
 whether they're still needed for newer libffi. Unfortunately, ctypes
 doesn't currently have a maintainer.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-02 Thread Maciej Fijalkowski

On Mon, Jun 2, 2014 at 10:43 AM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-06-01 10:11 GMT+02:00 Steven D'Aprano st...@pearwood.info:
 My feeling is that the CPython standard library should be written for
 CPython,

 Right. PyPy, Jython and IronPython already have their own standard
 library when they need a different implement.

 PyPy: lib_pypy directory (lib-python is the CPython stdlib):
 https://bitbucket.org/pypy/pypy/src/ac52eb7059d0b8d001a2103774917cf7396f/lib_pypy/?at=default

it's for stuff that's in CPython implemented in C, not a
reimplementation of python stuff. we patched the most obvious
CPython-specific hacks, but it's a loosing battle, you guys will go
way out of your way to squeeze extra 2% by doing very obscure hacks.


 Jython: Lib directory (lib-python is the CPython stdlib):
 https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/?at=default

 IronPython: IronPython.Modules directory:
 http://ironpython.codeplex.com/SourceControl/latest#IronPython_Main/Languages/IronPython/IronPython.Modules/

 See for example the _fsum.py module of Jython:
 https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/_fsum.py?at=default

 Victor
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Language Summit notes

2014-04-11 Thread Maciej Fijalkowski

On Fri, Apr 11, 2014 at 2:22 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 11 April 2014 10:36, Armin Rigo ar...@tunes.org wrote:
 This would be superficial, but change the perception of CFFI to be a
 preprocessor that produces C extension modules.

 Thanks, that clarification helps a lot. Does this mean that API-mode
 CFFI is competing with things like swig (which is not used much these
 days, as far as I know) and Cython (which is used a lot in the numeric
 community)? (ABI-mode CFFI is obviously directly competing with
 ctypes).

Yes.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Maciej Fijalkowski

On Thu, Mar 27, 2014 at 10:11 AM, Stephen J. Turnbull
step...@xemacs.org wrote:
 Nick Coghlan writes:

On 27 Mar 2014 07:02, Guido van Rossum gu...@python.org wrote:
   Actually, the first step is publish it on PyPI, the second is to
   get a fair number of happy users there. The bar for getting something
   included into the stdlib is pretty high

   The why not a third party module? bar also got a fair bit higher
   with Python 3.4 - by bundling pip, we have deliberately made third
   party modules easier to consume, thus weakening the convenience
   argument that applies to stdlib inclusion.

 Maybe.  That depends on if you care about the convenience of folks who
 have to get new modules past Corporate Security, but it's easier to
 get an upgrade of the whole shebang.  I don't think it's ever really
 been resolved whether they're a typical case that won't go away or a
 special group whose special needs should be considered.

 Steve

And random pieces of C included in the standard library can be
shuffled under the carpet under the disguise of upgrade or what are
you suggesting?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Maciej Fijalkowski

On Thu, Mar 27, 2014 at 11:07 AM, Paul Moore p.f.mo...@gmail.com wrote:
 On 27 March 2014 08:16, Maciej Fijalkowski fij...@gmail.com wrote:
 And random pieces of C included in the standard library can be
 shuffled under the carpet under the disguise of upgrade or what are
 you suggesting?

 The sort of thing that happens is that the relevant approvers will
 accept python-dev as a trusted supplier and then Python upgrades are
 acceptable subject to review of the changes, etc. For a new module,
 there is a whole other level of questions around how do we trust the
 person who developed the code, do we need to do a full code review,
 etc?

 It's a bit unfair to describe the process as random pieces of C
 being shuffled under the carpet. (Although there probably are
 environments where that is uncomfortably close to the truth :-()

 Paul

I just find my company is stupid so let's work around it by putting
stuff to python standard library unacceptable argument for python-dev
and all the python community.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

2014-03-21 Thread Maciej Fijalkowski

On Wed, Mar 19, 2014 at 11:43 PM, Nick Coghlan ncogh...@gmail.com wrote:

 On 20 Mar 2014 07:38, Nick Coghlan ncogh...@gmail.com wrote:

 Correct, but I think this discussion has established that how many times
 dict lookup calls __eq__ on the key is one such thing. In CPython, it
 already varies based on:

 - dict contents (due to the identity check and the distribution of entries
 across hash buckets)
 - pointer size (due to the hash bucket distribution differing between 32
 bit and 64 bit builds)
 - dict tuning parameters (there are some settings in the dict
 implementation that affect when dicts resize up and down, etc, which can
 mean the hash bucket distribution may already change without much notice in
 feature releases)

 I just realised that hash randomisation also comes into play here - the
 distribution of entries across hash buckets is inherently variable between
 runs for any key types that rely directly or indirectly on a randomised
 hash.

 Cheers,
 Nick.


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com


at the end of the day we settled for dicts with str int or identity
keys, so we're perfectly safe
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Tue, Mar 18, 2014 at 11:19 PM, Paul Moore p.f.mo...@gmail.com wrote:
 On 18 March 2014 19:46, Maciej Fijalkowski fij...@gmail.com wrote:
 A question: how far away will this optimization apply?

 if x in d:
 do_this()
 do_that()
 do_something_else()
 spam = d[x]

 it depends what those functions do. JIT will inline them and if
 they're small, it should work (although a modification of a different
 dict is illegal, since aliasing is not proven), but at some point
 it'll give up (Note that it'll also give up with a call to C releasing
 GIL since some other thread can modify it).

 Surely in the presence of threads the optimisation is invalid anyway
 as other threads can run in between each opcode (I don't know how
 you'd phrase that in a way that wasn't language dependent other than
 everywhere :-)) so

 if x in d:
 # HERE
 spam = d[x]

 d can be modified at HERE. (If d is a local variable, obviously the
 chance that another thread has access to d is a lot lower, but do you
 really do that level of alias tracking?)

 Paul

not in the case of JIT that *knows* where the GIL can be released. We
precisely make it not possible every few bytecodes to avoid such
situations.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Wed, Mar 19, 2014 at 2:42 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 18 Mar 2014 09:52:05 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 We're thinking about doing an optimization where say:

 if x in d:
return d[x]

 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 I don't think it's ok. If the author of the code had wanted only one
 lookup, they would have written:

   try:
   return d[x]
   except KeyError:
   pass

 I agree that an __eq__ method with side effects is rather bad, of
 course.
 What you could do is instruct people that the latter idiom (EAFP)
 performs better on PyPy.

 Regards

 Antoine.

I would like to point out that instructing people does not really
work. Besides, other examples like this:

if d[x] = 3:
   d[x] += 1 don't really work.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Wed, Mar 19, 2014 at 3:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 19 Mar 2014 15:09:04 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 I would like to point out that instructing people does not really
 work. Besides, other examples like this:

 if d[x] = 3:
d[x] += 1 don't really work.

 That's a good point. But then, perhaps PyPy should analyze the __eq__
 method and decide whether it's likely to have side effects or not (the
 answer can be hard-coded for built-in types such as str).

 Regards

 Antoine.

Ok. But then how is it valid to have is fast-path?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Wed, Mar 19, 2014 at 3:26 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 19 Mar 2014 15:21:16 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:

 On Wed, Mar 19, 2014 at 3:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
  On Wed, 19 Mar 2014 15:09:04 +0200
  Maciej Fijalkowski fij...@gmail.com wrote:
 
  I would like to point out that instructing people does not really
  work. Besides, other examples like this:
 
  if d[x] = 3:
 d[x] += 1 don't really work.
 
  That's a good point. But then, perhaps PyPy should analyze the __eq__
  method and decide whether it's likely to have side effects or not (the
  answer can be hard-coded for built-in types such as str).
 
  Regards
 
  Antoine.

 Ok. But then how is it valid to have is fast-path?

 What do you mean?


I mean that dict starts with is before calling __eq__, so the number
of calls to __eq__ can as well be zero.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Wed, Mar 19, 2014 at 8:38 AM, Kevin Modzelewski k...@dropbox.com wrote:
 Sorry, I definitely didn't mean to imply that this kind of optimization is
 valid on arbitrary subscript expressions; I thought we had restricted
 ourselves to talking about builtin dicts.  If we do, I think this becomes a
 discussion about what subset of the semantics of CPython's builtins are
 language-specified vs implementation-dependent; my argument is that just
 because something results in an observable behavioral difference doesn't
 necessarily mean that it's a change in language semantics, if it's just a
 change in the implementation-dependent behavior.


 On Tue, Mar 18, 2014 at 9:54 PM, Stephen J. Turnbull step...@xemacs.org
 wrote:

 Kevin Modzelewski writes:

   I think in this case, though, if we say for the sake of argument
   that the guaranteed semantics of a dictionary lookup are zero or

 I don't understand the point of that argument.  It's simply false that
 semantics are guaranteed, and all of the dunders might be user
 functions.

   more calls to __hash__ plus zero or more calls to __eq__, then two
   back-to-back dictionary lookups wouldn't have any observable
   differences from doing only one, unless you start to make
   assumptions about the behavior of the implementation.

 That's false.  The inverse is true: you should allow the possibility of
 observable differences, unless you make assumptions about the behavior
 (implying there are none).

   To me there seems to be a bit of a gap between seeing a dictionary
   lookup and knowing the exact sequence of user-functions that get
   called, far more than for example something like a  b.

 The point here is that we *know* that there may be a user function
 (the dunder that implements []) being called, and it is very hard to
 determine that that function is pure.

 Your example of a caching hash is exactly the kind of impure function
 that one would expect, but who knows what might be called -- there
 could be a reference to a database on Mars involved (do we have a
 vehicle on Mars at the moment? anyway...), which calls a pile of
 Twisted code, and has latencies of many seconds.

 So Steven is precisely right -- in order to allow this optimization,
 it would have to be explicitly allowed.

 Like Steven, I have no strong feeling against it, but then, I don't
 have a program talking to a deep space vehicle in my near future.
 Darn it! :-(



 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com


we're discussing builtin dicts
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Intricacies of calling eq

Hi

I have a question about calling __eq__ in some cases.

We're thinking about doing an optimization where say:

if x in d:
   return d[x]

where d is a dict would result in only one dict lookup (the second one
being constant folded away). The question is whether it's ok to do it,
despite the fact that it changes the semantics on how many times
__eq__ is called on x.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Tue, Mar 18, 2014 at 11:35 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On 18 March 2014 17:52, Maciej Fijalkowski fij...@gmail.com wrote:
 Hi

 I have a question about calling __eq__ in some cases.

 We're thinking about doing an optimization where say:

 if x in d:
return d[x]

 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 I'll assume the following hold:

 - we're only talking about true builtin dicts (the similarity between
 __contains__ and __getitem__ can't be assumed otherwise)

yes

 - guards will trigger if d is mutated (e.g. by another thread) between
 the containment check and the item retrieval

yes


 Semantically, what you propose is roughly equivalent to reinterpreting
 the look-before-you-leap version to the exception handling based
 fallback:

 try:
 return d[x]
 except KeyError:
 pass

 For a builtin dict and any *reasonable* x, those two operations will
 behave the same way. Differences arise only if x.__hash__ or x.__eq__
 is defined in a way that most people would consider unreasonable.

 For an optimisation that actually changes the language semantics like
 that, though, I would expect it to be buying a significant payoff in
 speed, especially given that most cases where the key lookup is known
 to be a bottleneck can already be optimised by hand.

 Cheers,
 Nick.

the payoff is significant. Note that __eq__ might not be called at all
(since dicts check identity first). It turns out not all people write
reasonable code and we can't expect them to micro-optimize by hand. It
also covers cases that are hard to optimize, like:

if d[x]  3:
   d[x] += 1

etc.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Tue, Mar 18, 2014 at 1:18 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Tue, Mar 18, 2014 at 05:05:56AM -0400, Terry Reedy wrote:
 On 3/18/2014 3:52 AM, Maciej Fijalkowski wrote:
 Hi
 
 I have a question about calling __eq__ in some cases.
 
 We're thinking about doing an optimization where say:
 
 if x in d:
 return d[x]

 if d.__contains__(x): return d.__getitem__(x)

 [Aside: to be pedantic, Python only calls dunder methods on the class,
 not the instance, in response to operators and other special calls. That
 is, type(d).__contains__ rather than d.__contains__, etc. And to be even
 more pedantic, that's only true for new-style classes.]


 I do not see any requirement to call x.__eq__ any particular number of
 times. The implementation of d might always call somekey.__eq__(x). The
 concept of sets (and dicts) requires coherent equality comparisons.

 To what extent does Python the language specify that user-defined
 classes must be coherent? How much latitude to shoot oneself in the foot
 should the language allow?

 What counts as coherent can depend on the types involved. For instance,
 I consider IEEE-754 Not-A-Numbers to be coherent, albeit weird. Python
 goes only so far to accomodate NANs: while it allows a NAN to test
 unequal even to itself (`NAN == NAN` returns False), containers are
 allowed to assume that instances are equal to themselves (`NAN in {NAN}`
 returns True). This was discussed in great detail a few years ago, and
 if I recall correctly, the conclusion was that containers can assume
 that their elements are reflexive (they equal themselves), but equality
 == cannot make the same assumption and bypass calling __eq__.


 where d is a dict would result in only one dict lookup (the second one
 being constant folded away). The question is whether it's ok to do it,
 despite the fact that it changes the semantics on how many times
 __eq__ is called on x.

 A __eq__ that has side-effects violates the intended and expected
 semanitics of __eq__.

 Nevertheless, an __eq__ with side-effects is legal Python and may in
 fact be useful.

 It's a tricky one... I don't know how I feel about short-cutting normal
 Python semantics for speed. On the one hand, faster is good. But on the
 other hand, it makes it harder to reason about code when things go
 wrong. Why is my __eq__ method not being called?


 --
 Steven
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com

note that this is specifically about dicts, where __eq__ will be
called undecided number of times anyway (depending on collisions in
hash/buckets which is implementation specific to start with)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Intricacies of calling eq

On Tue, Mar 18, 2014 at 4:21 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Tue, Mar 18, 2014 at 01:21:05PM +0200, Maciej Fijalkowski wrote:

 note that this is specifically about dicts, where __eq__ will be
 called undecided number of times anyway (depending on collisions in
 hash/buckets which is implementation specific to start with)

 Exactly. Using a __eq__ method with side-effects is a good way to find
 out how many collisions your dict has :-)

 But specifically with your example,

 if x in d:
 return d[x]

 my sense of this is that it falls into the same conceptual area as the
 identity optimization for checking list or set containment: slightly
 unclean, but justified. Provided d is an actual built-in dict, and it
 hasn't been modified between one call and the next, I think it would be
 okay to optimize the second lookup d[x].

 A question: how far away will this optimization apply?

 if x in d:
 do_this()
 do_that()
 do_something_else()
 spam = d[x]

it depends what those functions do. JIT will inline them and if
they're small, it should work (although a modification of a different
dict is illegal, since aliasing is not proven), but at some point
it'll give up (Note that it'll also give up with a call to C releasing
GIL since some other thread can modify it).
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.traceback]

On Mon, Mar 10, 2014 at 12:10 PM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-03-08 16:30 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
 How about fixing cyclic gc to deal with __del__ instead? That sounds
 like an awful change to the semantics.

 Hum? That's the purpose of the PEP 442 which is implemented in Python 3.4.

 As I wrote, it's not enough to fix all issues.

 Usually, I see an explicit call to gc.collect() as a workaround to a
 deeper issue. I prefer to modify my program to run smoothly without
 explict garbage collection.

 That's why I would prefer to avoid creating reference cycles from the 
 beginning.

 Victor

It was agreed long time ago that the immediate finalization is an
implementation specific detail and it's not guaranteed. You should not
rely on __del__s being called timely one way or another. Why would you
require this for the program to work correctly in the particular
example of __traceback__?

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.traceback]

On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
victor.stin...@gmail.com wrote:
 2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
 It was agreed long time ago that the immediate finalization is an
 implementation specific detail and it's not guaranteed. You should not
 rely on __del__s being called timely one way or another. Why would you
 require this for the program to work correctly in the particular
 example of __traceback__?

 For asyncio, it's very useful to see unhandled exceptions as early as
 possible. Otherwise, your program is blocked and you don't know why.

 Guido van Rossum suggests to run gc.collect() regulary:
 http://code.google.com/p/tulip/issues/detail?id=42

 Victor

twisted goes around it by attaching errback by hand. Would that work for tulip?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.traceback]

On Mon, Mar 10, 2014 at 7:35 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 10, 2014 at 10:30 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
 victor.stin...@gmail.com wrote:
  2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
  It was agreed long time ago that the immediate finalization is an
  implementation specific detail and it's not guaranteed. You should not
  rely on __del__s being called timely one way or another. Why would you
  require this for the program to work correctly in the particular
  example of __traceback__?
 
  For asyncio, it's very useful to see unhandled exceptions as early as
  possible. Otherwise, your program is blocked and you don't know why.
 
  Guido van Rossum suggests to run gc.collect() regulary:
  http://code.google.com/p/tulip/issues/detail?id=42
 
  Victor

 twisted goes around it by attaching errback by hand. Would that work for
 tulip?


 Can you describe that idea in more detail?

Essentially, instead of relying on deferred to be garbage collected,
you attach an errback like this:

deferred.addErrback(callback_that_writes_to_log)

so in case of a failure, you get a traceback directly in the callback
immediately, without relying on garbage collection.

I'm sorry if I'm using twisted nomenclature here (it's also awfully
off-topic for python-dev), but making programs rely on refcounting
sounds like a bad idea for us (PyPy).

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.traceback]

On Mon, Mar 10, 2014 at 7:49 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 10, 2014 at 10:39 AM, Maciej Fijalkowski fij...@gmail.com
 wrote:

 On Mon, Mar 10, 2014 at 7:35 PM, Guido van Rossum gu...@python.org
 wrote:
  On Mon, Mar 10, 2014 at 10:30 AM, Maciej Fijalkowski fij...@gmail.com
  wrote:
 
  On Mon, Mar 10, 2014 at 3:23 PM, Victor Stinner
  victor.stin...@gmail.com wrote:
   2014-03-10 13:11 GMT+01:00 Maciej Fijalkowski fij...@gmail.com:
   It was agreed long time ago that the immediate finalization is an
   implementation specific detail and it's not guaranteed. You should
   not
   rely on __del__s being called timely one way or another. Why would
   you
   require this for the program to work correctly in the particular
   example of __traceback__?
  
   For asyncio, it's very useful to see unhandled exceptions as early as
   possible. Otherwise, your program is blocked and you don't know why.
  
   Guido van Rossum suggests to run gc.collect() regulary:
   http://code.google.com/p/tulip/issues/detail?id=42
  
   Victor
 
  twisted goes around it by attaching errback by hand. Would that work
  for
  tulip?
 
 
  Can you describe that idea in more detail?

 Essentially, instead of relying on deferred to be garbage collected,
 you attach an errback like this:

 deferred.addErrback(callback_that_writes_to_log)

 so in case of a failure, you get a traceback directly in the callback
 immediately, without relying on garbage collection.

 I'm sorry if I'm using twisted nomenclature here (it's also awfully
 off-topic for python-dev), but making programs rely on refcounting
 sounds like a bad idea for us (PyPy).


 IIUC the problem that Victor is trying to solve is what to do if nobody
 thought to attach an errback. Tulip makes a best effort to still log a
 traceback. We've found this very helpful (just as it is helpful that Python
 prints a traceback when synchronous code raises an exception and no except
 clause caught it).

 The best effort relies on GC. I am guessing that refcounting makes the
 traceback appear sooner, but there would be other ways to force it, like
 occasionally calling gc.collect() during idle times (presumably during busy
 times it will be called soon enough. :-)

 --
 --Guido van Rossum (python.org/~guido)

I agree this sounds like a solution. However I'm very skeptical about
changing details of __traceback__ and frames, just in order to make
refcounting work (since it would create something that would not work
on pypy for example).

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] What is the precise problem? [was: Reference cycles in Exception.traceback]

2014-03-08 Thread Maciej Fijalkowski

On Sat, Mar 8, 2014 at 5:14 PM, Victor Stinner victor.stin...@gmail.com wrote:
 2014-03-08 14:33 GMT+01:00 Antoine Pitrou solip...@pitrou.net:
 Ok, it's actually quite trivial. The whole chain is kept alive by the
 fut global variable. If you arrange for it to be disposed of:

   fut = asyncio.Future()
   asyncio.Task(func(fut))
   del fut
   [etc.]

 then the problem disappears: as soon as gc.collect() happens, the
 MyObject instance is destroyed, the future is collected, and the
 future's traceback is printed out.

 Well, the problem is more general than this specific example. I would
 like to implement a general solution which would not hold references
 to local variables, to destroy objects when Python exits the except
 block.

 It looks like a exception summary containing only data to format the
 traceback would fit asyncio needs. If you don't want it in the
 traceback module, I will try to implement it in asyncio.

 It would be nice to provide an exception summary in the traceback
 module, because it looks like reference cycles related to exception
 and/or traceback is a common issue (see the list of links I gave in a
 previous email).

 Victor

How about fixing cyclic gc to deal with __del__ instead? That sounds
like an awful change to the semantics.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

On Tue, Feb 25, 2014 at 11:13 AM, Victor Stinner
victor.stin...@gmail.com wrote:
 Hi,

 2014-02-25 8:53 GMT+01:00 Nick Coghlan ncogh...@gmail.com:
 I've checked these, and noted the relevant hg.python.org links on the
 tracker issue at http://bugs.python.org/issue20246

 Would it be possible to have a table with all known Python security
 vulnerabilities and the Python versions which are fixed? Bonus point
 if we provide a link to the changeset fixing it for each branch. Maybe
 put this table on http://www.python.org/security/ ?

 Last issues:
 - hash DoS

is this fixed?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

On Tue, Feb 25, 2014 at 3:01 PM, Donald Stufft don...@stufft.io wrote:

 On Feb 25, 2014, at 7:59 AM, Maciej Fijalkowski fij...@gmail.com wrote:

 On Tue, Feb 25, 2014 at 11:13 AM, Victor Stinner
 victor.stin...@gmail.com wrote:
 Hi,

 2014-02-25 8:53 GMT+01:00 Nick Coghlan ncogh...@gmail.com:
 I've checked these, and noted the relevant hg.python.org links on the
 tracker issue at http://bugs.python.org/issue20246

 Would it be possible to have a table with all known Python security
 vulnerabilities and the Python versions which are fixed? Bonus point
 if we provide a link to the changeset fixing it for each branch. Maybe
 put this table on http://www.python.org/security/ ?

 Last issues:
 - hash DoS

 is this fixed?
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 https://mail.python.org/mailman/options/python-dev/donald%40stufft.io

 It is in 3.4.

Oh, I thought security fixes go to all python releases.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()

On Tue, Feb 25, 2014 at 3:06 PM, Chris Angelico ros...@gmail.com wrote:
 On Tue, Feb 25, 2014 at 11:59 PM, Maciej Fijalkowski fij...@gmail.com wrote:
 Last issues:
 - hash DoS

 is this fixed?

 Yes, hash randomization was added as an option in 2.7.3 or 2.7.4 or
 thereabouts, and is on by default in 3.3+. You do have to set an
 environment variable for 2.7 (and I think 2.6 got that too (??)), as
 it can break code.

No, the hash randomization is broken, it does not provide enough
randomness (without changing the hash function which only happened in
3.4+)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Python Remote Code Execution in socket.recvfrom_into()