date:20161010

[Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Larry Hastings



The documentation for PyWeakref_GetObject() states:

   Return value: Borrowed reference.

   Return the referenced object from a weak reference, ref. If the
   referent is no longer live, returns Py_None.

Given that the weakref doesn't have a reference to the object--merely a 
weak reference, different thing--whose reference is it borrowing?


FWIW, yes, this is playing merry hell with the Gilectomy.  If there are 
two threads, and one calls PyWeakref_GetObject(obj), and there's only 
one reference to obj, and the other thread blows it away... now what?  
It's my contention that this API is simply untenable under the 
Gilectomy, and that it needs to change to returning a new (strong) 
reference.



//arry/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Devin Jeanpierre

> It's my contention that this API is simply untenable under the Gilectomy

Yeah, I agree with your contention. The only circumstance PyWeakref_GetObject
would still be valid without a GIL is if you happened to know you had a
reference elsewhere that was keeping it alive. But if you did, you probably
wouldn't need to use the weakref. If you don't, then with GILectomy you
have no thread-safe options.

Trivial bikeshedding:  rather than change the existing API, wouldn't it be
better to add a new API, and not support the old one in GILectomized
CPython? Introducing memory leaks would be a little unfortunate, and it's
hard to audit C code for correct use of refcounted pointers even *before*
you change the ownership of returned pointers across versions.

w.r.t. the exact question:

> whose reference is it borrowing?

I think this is a red herring sort of question. "borrowed" only means
"unowned". But anyway, we borrowed from the weakref. (Borrowing from
somebody doesn't imply they own a reference -- we can borrow from a
borrowed reference, for example, and this is common.)

The term "borrowed" is supposed to imply a sensible scope during which
you're free to use the object, and weakrefs don't have that (except for
what is granted by the GIL), so this does sound wacky. I bet it was for
performance.

-- Devin
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Larry Hastings




On 10/10/2016 11:05 AM, Devin Jeanpierre wrote:

> whose reference is it borrowing?

I think this is a red herring sort of question. "borrowed" only means 
"unowned". But anyway, we borrowed from the weakref. (Borrowing from 
somebody doesn't imply they own a reference -- we can borrow from a 
borrowed reference, for example, and this is common.)


Huh?  In all other circumstances, a "borrowed" reference is exactly 
that: X has a reference, and you are relying on X's reference to keep 
the object alive.  Borrowing from a borrowed reference is simply a chain 
of these; Z borrows from Y, Y borrows from X, and X is the original 
person who did the incref.  But you're borrowing from something 
specific, somebody who the API guarantees has a legitimate reference on 
the object and won't drop it while you're using it.  I bet for every 
other spot in the API I can tell you from whom you're borrowing the 
reference.


In contrast, the "borrowed" reference returned by PyWeakRef_GetObject() 
seems to be "borrowed" from some unspecified entity.  The fact that the 
object is live in Python directly implies that, yes, *somebody* must 
have a reference, somewhere.  But ISTM (and apparently you) that this is 
relying on the GIL preventing that unknown other actor from dropping 
their reference while you've borrow it.  A guarantee that the 
post-Gilectomy Python interpreter can no longer make!


In any case, I see nothing in the documentation that suggests "borrowed 
only means unowned" as you suggest.  In contrast, the documentation 
seems to suggest that the metaphor is how I understood it; that when you 
"borrow" a reference, there is another object who has a reference and 
you're relying on their reference to keep the object alive.


   https://docs.python.org/3.6/extending/extending.html#reference-counts



//arry/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings  wrote:
> Huh?  In all other circumstances, a "borrowed" reference is exactly that: X
> has a reference, and you are relying on X's reference to keep the object
> alive.  Borrowing from a borrowed reference is simply a chain of these; Z
> borrows from Y, Y borrows from X, and X is the original person who did the
> incref.  But you're borrowing from something specific, somebody who the API
> guarantees has a legitimate reference on the object and won't drop it while
> you're using it.  I bet for every other spot in the API I can tell you from
> whom you're borrowing the reference.

Okay. Here's a test:

PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index)
Return value: Borrowed reference.

Presumably you own a reference to the list itself before you call
this, and the list has a reference to its items. But suppose another
thread clear()s the list immediately after you call this; whose
reference are you borrowing now? The list's is gone.

Or is this another API that will have to change post gilectomy?

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Greg Ewing


Larry Hastings wrote:
In contrast, the "borrowed" reference returned by PyWeakRef_GetObject() 
seems to be "borrowed" from some unspecified entity.


It's a delocalised quantum reference, borrowing a little
bit from all strong references in existence. :-)

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Xavier Morel

> On 2016-10-10, at 11:05 , Devin Jeanpierre  wrote:
> The term "borrowed" is supposed to imply a sensible scope during which you're 
> free to use the object, and weakrefs don't have that (except for what is 
> granted by the GIL), so this does sound wacky. I bet it was for performance.

Especially as it handles both getting an object from a weakref and
checking whether the weakref is still alive.

OTOH it could be an enshrined bug, http://bugs.python.org/issue520087
fixed a discrepancy between the doc and the implementation by matching
the doc to the implementation (of returning a borrowed ref').

Also of note, pypy developers have been reporting issues with that
specific API since ~2010[0][1], and IIRC they have added a
PyWeakref_LockObject to cpyext.

[0] http://bugs.python.org/issue8578
[1] http://bugs.python.org/issue16602#msg177272___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Fred Drake

On Mon, Oct 10, 2016 at 4:17 AM, Larry Hastings  wrote:
> Given that the weakref doesn't have a reference to the object--merely a weak
> reference, different thing--whose reference is it borrowing?

As others have said, it doesn't really matter who's reference it was;
just that there was another at the time it was returned.  Clearly it
can't be considered valid once additional Python code might be run.

> FWIW, yes, this is playing merry hell with the Gilectomy.  If there are two
> threads, and one calls PyWeakref_GetObject(obj), and there's only one
> reference to obj, and the other thread blows it away... now what?  It's my
> contention that this API is simply untenable under the Gilectomy, and that
> it needs to change to returning a new (strong) reference.

+1 for this change.


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] pythonhosted.org upload no longer works

2016-10-10 Thread Giampaolo Rodola'

This is what I've bumped into just now:

python setup.py upload_sphinx --upload-dir=docs/_build/html
running upload_sphinx
Submitting documentation to https://upload.pypi.org/legacy/
Upload failed (410): Uploading documentation is no longer supported, we
recommend using https://readthedocs.org/.

Personally I prefer pythonhosted over readthedocs as I can provide the same
style as official Python's doc (see https://pythonhosted.org/psutil/) and,
most importantly, because I can automatize doc upload just by running "make
doc-upload".
Is there a reason upload functionality has been disabled?
Is pythonhosted.org going to be dismissed or something?

-- 
Giampaolo - http://grodola.blogspot.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] pythonhosted.org upload no longer works

2016-10-10 Thread Paul Moore

On 10 October 2016 at 14:34, Giampaolo Rodola'  wrote:
> This is what I've bumped into just now:
>
> python setup.py upload_sphinx --upload-dir=docs/_build/html
> running upload_sphinx
> Submitting documentation to https://upload.pypi.org/legacy/
> Upload failed (410): Uploading documentation is no longer supported, we
> recommend using https://readthedocs.org/.
>
> Personally I prefer pythonhosted over readthedocs as I can provide the same
> style as official Python's doc (see https://pythonhosted.org/psutil/) and,
> most importantly, because I can automatize doc upload just by running "make
> doc-upload".
> Is there a reason upload functionality has been disabled?
> Is pythonhosted.org going to be dismissed or something?

This was discussed on distutils-sig back in May 2015. The thread
starts here: 
https://mail.python.org/pipermail/distutils-sig/2015-May/026327.html.
One of the actions in the final proposal
(https://mail.python.org/pipermail/distutils-sig/2015-May/026381.html)
included a step to contact projects that were using pythonhosted.org.
Did you not get any contact?

It might be worth picking this up on distutils-sig, as that's where
the original discussion occurred. I've copied this reply to there, and
suggest followups go to that list.

Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky

Hi,

I posted here a while back asking what you guys thought about implementing
radix sort for strings in list.sort(). You gave me a lot of reasons why
that would be a bad idea. However, it got me thinking, and I came up with
something that you may find interesting.

First, some simple benchmark results (numbers are seconds, check out the
extension module at https://github.com/embg/python-fast-listsort.git):

*** 1e3 ints ***
F.fastsort(): 0.00018930435180664062
F.sort(): 0.0002830028533935547
*** 1e3 strings ***
F.fastsort(): 0.0003533363342285156
F.sort(): 0.00044846534729003906
*** 1e7 ints ***
F.fastsort(): 5.479267358779907
F.sort(): 8.063318014144897
*** 1e7 strings ***
F.fastsort(): 9.992833137512207
F.sort(): 13.730914115905762

The optimization uses the fact that, in practice, we are almost always
sorting keys of the same type (note: not objects of the same type, *keys*
of the same type; we could have a complex key function like str that takes
many different types, but the keys are still all the same type). So we can
just do one typecheck up front and then do unsafe comparisons during the
sort (if our initial check passes). Specifically, we check that for all the
PyObject* in saved_ob_item, the ob->ob_type are the same. Then we replace
PyObject_RichCompare with ob_type->tp_richcompare. Additionally, we can
check for the very common cases of ints and strings and give optimized
comparison functions for those cases. It might not seem like this would
save a lot of time, but it turns out that PyObject_RichCompare is a massive
clusterf**k that has to test tons of type stuff before it can actually
compare. Cutting that out ends up saving a *lot* of time, as the benchmark
demonstrates.

What do you think? I'm planning on writing this up into a patch, but wanted
to get some feedback on the implementation and ideas for improvement first.

Thanks,
Elliot
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] pythonhosted.org upload no longer works

2016-10-10 Thread Giampaolo Rodola'

Thanks Paul,
no I wasn't aware of that, and I will subscribe to distutils-sig from now
on.
I don't remember ever receiving a warrant about this decision but it's
entirely possible I may have forgot. =)
So long story short is that I will have to move the doc on readthedocs,
correct?
Does pythonhosted provide an automatic redirect or I'll have to upload a
static page which states the doc has been moved?

On Mon, Oct 10, 2016 at 4:41 PM, Paul Moore  wrote:

> On 10 October 2016 at 14:34, Giampaolo Rodola'  wrote:
> > This is what I've bumped into just now:
> >
> > python setup.py upload_sphinx --upload-dir=docs/_build/html
> > running upload_sphinx
> > Submitting documentation to https://upload.pypi.org/legacy/
> > Upload failed (410): Uploading documentation is no longer supported, we
> > recommend using https://readthedocs.org/.
> >
> > Personally I prefer pythonhosted over readthedocs as I can provide the
> same
> > style as official Python's doc (see https://pythonhosted.org/psutil/)
> and,
> > most importantly, because I can automatize doc upload just by running
> "make
> > doc-upload".
> > Is there a reason upload functionality has been disabled?
> > Is pythonhosted.org going to be dismissed or something?
>
> This was discussed on distutils-sig back in May 2015. The thread
> starts here: https://mail.python.org/pipermail/distutils-sig/2015-
> May/026327.html.
> One of the actions in the final proposal
> (https://mail.python.org/pipermail/distutils-sig/2015-May/026381.html)
> included a step to contact projects that were using pythonhosted.org.
> Did you not get any contact?
>
> It might be worth picking this up on distutils-sig, as that's where
> the original discussion occurred. I've copied this reply to there, and
> suggest followups go to that list.
>
> Paul
>



-- 
Giampaolo - http://grodola.blogspot.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Bernardo Sulzbach


On 10/10/2016 03:18 AM, Elliot Gorokhovsky wrote:

Hi,

I posted here a while back asking what you guys thought about
implementing radix sort for strings in list.sort(). You gave me a lot of
reasons why that would be a bad idea. However, it got me thinking, and I
came up with something that you may find interesting.

First, some simple benchmark results (numbers are seconds, check out the
extension module at https://github.com/embg/python-fast-listsort.git):

*** 1e3 ints ***
F.fastsort(): 0.00018930435180664062
F.sort(): 0.0002830028533935547
*** 1e3 strings ***
F.fastsort(): 0.0003533363342285156
F.sort(): 0.00044846534729003906
*** 1e7 ints ***
F.fastsort(): 5.479267358779907
F.sort(): 8.063318014144897
*** 1e7 strings ***
F.fastsort(): 9.992833137512207
F.sort(): 13.730914115905762



The numbers are good. How does this interfere with sorting very small 
lists? Obviously, even if it does not help with very small lists, we can 
always put a threshold on the size and take this path or not.


--
Bernardo Sulzbach
http://www.mafagafogigante.org/
mafagafogiga...@mafagafogigante.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Chris Angelico

On Mon, Oct 10, 2016 at 5:18 PM, Elliot Gorokhovsky
 wrote:
> First, some simple benchmark results (numbers are seconds, check out the
> extension module at https://github.com/embg/python-fast-listsort.git):
>
> *** 1e3 ints ***
> F.fastsort(): 0.00018930435180664062
> F.sort(): 0.0002830028533935547
> *** 1e3 strings ***
> F.fastsort(): 0.0003533363342285156
> F.sort(): 0.00044846534729003906
> *** 1e7 ints ***
> F.fastsort(): 5.479267358779907
> F.sort(): 8.063318014144897
> *** 1e7 strings ***
> F.fastsort(): 9.992833137512207
> F.sort(): 13.730914115905762
>
> The optimization uses the fact that, in practice, we are almost always
> sorting keys of the same type

(Might be a topic for python-ideas rather than python-dev.)

Can you pick up a standard benchmark suite and use that instead of
these simplistic operations? How much slower are sorts of
heterogeneous lists - what's the impact/cost on those? If a large test
suite ("make test" if you don't have a nice benchmark handy - the
CPython test suite is a solid mass of code) shows an overall slowdown,
this probably isn't worth pursuing; if it shows a minor but
insignificant increase, there might be something worth looking into;
and if it shows a huge increase... well, to be honest, if it shows a
huge increase, I'd suspect a fault in the measurements, because
sorting isn't a significant part of most test suites :)

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Distutils] pythonhosted.org upload no longer works

2016-10-10 Thread Donald Stufft

You’re using the new PyPI (either by getting a new default from a new version 
of Python or Twine OR explicitly switching to it yourself) which has this 
disabled. You can still upload to legacy PyPI by switching the default back. 
That will restore the ability to upload docs to pythonhosted.org.

We do plan on adding the ability to provide an automatic redirect to the new 
PyPI, but it’s not there yet.

> On Oct 10, 2016, at 11:17 AM, Giampaolo Rodola'  wrote:
> 
> Thanks Paul,
> no I wasn't aware of that, and I will subscribe to distutils-sig from now on. 
> I don't remember ever receiving a warrant about this decision but it's 
> entirely possible I may have forgot. =)
> So long story short is that I will have to move the doc on readthedocs, 
> correct?
> Does pythonhosted provide an automatic redirect or I'll have to upload a 
> static page which states the doc has been moved?
> 
> On Mon, Oct 10, 2016 at 4:41 PM, Paul Moore  > wrote:
> On 10 October 2016 at 14:34, Giampaolo Rodola'  > wrote:
> > This is what I've bumped into just now:
> >
> > python setup.py upload_sphinx --upload-dir=docs/_build/html
> > running upload_sphinx
> > Submitting documentation to https://upload.pypi.org/legacy/ 
> > 
> > Upload failed (410): Uploading documentation is no longer supported, we
> > recommend using https://readthedocs.org/ .
> >
> > Personally I prefer pythonhosted over readthedocs as I can provide the same
> > style as official Python's doc (see https://pythonhosted.org/psutil/ 
> > ) and,
> > most importantly, because I can automatize doc upload just by running "make
> > doc-upload".
> > Is there a reason upload functionality has been disabled?
> > Is pythonhosted.org  going to be dismissed or 
> > something?
> 
> This was discussed on distutils-sig back in May 2015. The thread
> starts here: 
> https://mail.python.org/pipermail/distutils-sig/2015-May/026327.html 
> .
> One of the actions in the final proposal
> (https://mail.python.org/pipermail/distutils-sig/2015-May/026381.html 
> )
> included a step to contact projects that were using pythonhosted.org 
> .
> Did you not get any contact?
> 
> It might be worth picking this up on distutils-sig, as that's where
> the original discussion occurred. I've copied this reply to there, and
> suggest followups go to that list.
> 
> Paul
> 
> 
> 
> -- 
> Giampaolo - http://grodola.blogspot.com 
> 
> ___
> Distutils-SIG maillist  -  distutils-...@python.org 
> 
> https://mail.python.org/mailman/listinfo/distutils-sig 
> 

—
Donald Stufft



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Distutils] pythonhosted.org upload no longer works

2016-10-10 Thread Guido van Rossum

Honestly I like readthedocs a lot, and I actually *don't* like docs
that look too much like the standard Python docs -- it should be clear
to readers (subliminally, through page style, not just be parsing the
URL) that they're reading third party docs.

On Mon, Oct 10, 2016 at 9:06 AM, Donald Stufft  wrote:
> You’re using the new PyPI (either by getting a new default from a new
> version of Python or Twine OR explicitly switching to it yourself) which has
> this disabled. You can still upload to legacy PyPI by switching the default
> back. That will restore the ability to upload docs to pythonhosted.org.
>
> We do plan on adding the ability to provide an automatic redirect to the new
> PyPI, but it’s not there yet.
>
> On Oct 10, 2016, at 11:17 AM, Giampaolo Rodola'  wrote:
>
> Thanks Paul,
> no I wasn't aware of that, and I will subscribe to distutils-sig from now
> on.
> I don't remember ever receiving a warrant about this decision but it's
> entirely possible I may have forgot. =)
> So long story short is that I will have to move the doc on readthedocs,
> correct?
> Does pythonhosted provide an automatic redirect or I'll have to upload a
> static page which states the doc has been moved?
>
> On Mon, Oct 10, 2016 at 4:41 PM, Paul Moore  wrote:
>>
>> On 10 October 2016 at 14:34, Giampaolo Rodola'  wrote:
>> > This is what I've bumped into just now:
>> >
>> > python setup.py upload_sphinx --upload-dir=docs/_build/html
>> > running upload_sphinx
>> > Submitting documentation to https://upload.pypi.org/legacy/
>> > Upload failed (410): Uploading documentation is no longer supported, we
>> > recommend using https://readthedocs.org/.
>> >
>> > Personally I prefer pythonhosted over readthedocs as I can provide the
>> > same
>> > style as official Python's doc (see https://pythonhosted.org/psutil/)
>> > and,
>> > most importantly, because I can automatize doc upload just by running
>> > "make
>> > doc-upload".
>> > Is there a reason upload functionality has been disabled?
>> > Is pythonhosted.org going to be dismissed or something?
>>
>> This was discussed on distutils-sig back in May 2015. The thread
>> starts here:
>> https://mail.python.org/pipermail/distutils-sig/2015-May/026327.html.
>> One of the actions in the final proposal
>> (https://mail.python.org/pipermail/distutils-sig/2015-May/026381.html)
>> included a step to contact projects that were using pythonhosted.org.
>> Did you not get any contact?
>>
>> It might be worth picking this up on distutils-sig, as that's where
>> the original discussion occurred. I've copied this reply to there, and
>> suggest followups go to that list.
>>
>> Paul
>
>
>
>
> --
> Giampaolo - http://grodola.blogspot.com
>
> ___
> Distutils-SIG maillist  -  distutils-...@python.org
> https://mail.python.org/mailman/listinfo/distutils-sig
>
>
>
> —
> Donald Stufft
>
>
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Guido van Rossum

Modified +1: you can't change the behavior of the existing API, but
you can deprecate it and introduce a better one with a different name.
We'll have until Python 4.0 to carry through the deprecation anyways.
And I doubt this is the only C API change needed for happy gil-free
coding...

On Mon, Oct 10, 2016 at 5:29 AM, Fred Drake  wrote:
> On Mon, Oct 10, 2016 at 4:17 AM, Larry Hastings  wrote:
>> Given that the weakref doesn't have a reference to the object--merely a weak
>> reference, different thing--whose reference is it borrowing?
>
> As others have said, it doesn't really matter who's reference it was;
> just that there was another at the time it was returned.  Clearly it
> can't be considered valid once additional Python code might be run.
>
>> FWIW, yes, this is playing merry hell with the Gilectomy.  If there are two
>> threads, and one calls PyWeakref_GetObject(obj), and there's only one
>> reference to obj, and the other thread blows it away... now what?  It's my
>> contention that this API is simply untenable under the Gilectomy, and that
>> it needs to change to returning a new (strong) reference.
>
> +1 for this change.
>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.
> "A storm broke loose in my mind."  --Albert Einstein
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/guido%40python.org



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread MRAB


On 2016-10-10 10:45, Chris Angelico wrote:

On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings  wrote:

Huh?  In all other circumstances, a "borrowed" reference is exactly that: X
has a reference, and you are relying on X's reference to keep the object
alive.  Borrowing from a borrowed reference is simply a chain of these; Z
borrows from Y, Y borrows from X, and X is the original person who did the
incref.  But you're borrowing from something specific, somebody who the API
guarantees has a legitimate reference on the object and won't drop it while
you're using it.  I bet for every other spot in the API I can tell you from
whom you're borrowing the reference.


Okay. Here's a test:

PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index)
Return value: Borrowed reference.

Presumably you own a reference to the list itself before you call
this, and the list has a reference to its items. But suppose another
thread clear()s the list immediately after you call this; whose
reference are you borrowing now? The list's is gone.

Or is this another API that will have to change post gilectomy?


Couldn't the reference to the list also be borrowed?

If you lookup something in a dict, it'll be a borrowed reference.

If the dict is globals() and there's no GIL, another thread could delete 
the item while your code had the borrowed reference.


It looks like there might be a lot that will need to changed post gilectomy!

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Paul Moore

On 10 October 2016 at 17:49, MRAB  wrote:
> If you lookup something in a dict, it'll be a borrowed reference.
>
> If the dict is globals() and there's no GIL, another thread could delete the
> item while your code had the borrowed reference.
>
> It looks like there might be a lot that will need to changed post gilectomy!

It seems to me that the whole concept of a borrowed reference may be
risky in a post-GIL world. There may be occasional cases where it's
possible to prove that borrowing is safe, but I suspect they'll be
pretty rare.

Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 3:49 AM, MRAB  wrote:
> On 2016-10-10 10:45, Chris Angelico wrote:
>>
>> On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings 
>> wrote:
>>>
>>> Huh?  In all other circumstances, a "borrowed" reference is exactly that:
>>> X
>>> has a reference, and you are relying on X's reference to keep the object
>>> alive.  Borrowing from a borrowed reference is simply a chain of these; Z
>>> borrows from Y, Y borrows from X, and X is the original person who did
>>> the
>>> incref.  But you're borrowing from something specific, somebody who the
>>> API
>>> guarantees has a legitimate reference on the object and won't drop it
>>> while
>>> you're using it.  I bet for every other spot in the API I can tell you
>>> from
>>> whom you're borrowing the reference.
>>
>>
>> Okay. Here's a test:
>>
>> PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index)
>> Return value: Borrowed reference.
>>
>> Presumably you own a reference to the list itself before you call
>> this, and the list has a reference to its items. But suppose another
>> thread clear()s the list immediately after you call this; whose
>> reference are you borrowing now? The list's is gone.
>>
>> Or is this another API that will have to change post gilectomy?
>>
> Couldn't the reference to the list also be borrowed?
>
> If you lookup something in a dict, it'll be a borrowed reference.
>
> If the dict is globals() and there's no GIL, another thread could delete the
> item while your code had the borrowed reference.
>
> It looks like there might be a lot that will need to changed post gilectomy!

If you're trying to manipulate a list, you should probably own a
reference to it. Otherwise, you're taking great risks. The trouble is
that there's time enough for a context switch between PyList_GetItem
and even an immediately-following incref, so you could have a junk
pointer despite your best efforts.

Contrast:

PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i)
Return value: New reference.

If you use sequence protocol, you are guaranteed to have a valid
reference to the object. To be sure, the item might no longer be in
the sequence by the time you look at it... but you know you're not
looking at junk memory. With PyList_GetItem, it would be possible for
you to have a dud pointer.

Also contrast:

PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos)
Return value: Borrowed reference.

As long as you own a reference to the tuple itself, you can be
confident that the borrowed reference will still be valid. There's no
way that the object's refcount can validly hit zero so long as the
tuple exists. (I'm ignoring PyTuple_SetItem here, which basically is
for initialization.) It's only with lists that you have this risk.

PyObject* PyDict_GetItem(PyObject *p, PyObject *key)
Return value: Borrowed reference.

... okay, it's only with lists and dictionaries. Here come the other
lot called 'Python', I guess.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 4:03 AM, Paul Moore  wrote:
> On 10 October 2016 at 17:49, MRAB  wrote:
>> If you lookup something in a dict, it'll be a borrowed reference.
>>
>> If the dict is globals() and there's no GIL, another thread could delete the
>> item while your code had the borrowed reference.
>>
>> It looks like there might be a lot that will need to changed post gilectomy!
>
> It seems to me that the whole concept of a borrowed reference may be
> risky in a post-GIL world. There may be occasional cases where it's
> possible to prove that borrowing is safe, but I suspect they'll be
> pretty rare.

Not really. If you have an object that forever owns a reference to
another object, it's safe to return borrowed references. A quick
search for 'borrowed' showed up these perfectly safe[1] examples:

PyObject* PyFunction_GetGlobals(PyObject *op)
Return value: Borrowed reference.
-- a function's __globals__ attribute is read-only
-- though its __code__ is not, so possibly PyFunction_GetCode may be dangerous??

PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos)
Return value: Borrowed reference.
-- as mentioned previously, tuples are immutable and thus safe

PyObject* PyMethod_Function(PyObject *meth)
Return value: Borrowed reference.
PyObject* PyMethod_Self(PyObject *meth)
Return value: Borrowed reference.
-- a bound method's __func__ and __self__ attributes are read-only

In these cases, it's easy to see what reference you're borrowing, and
it's not a problem.

ChrisA

[1] As far as I know!
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Nathaniel Smith

On Mon, Oct 10, 2016 at 10:03 AM, Paul Moore  wrote:
> On 10 October 2016 at 17:49, MRAB  wrote:
>> If you lookup something in a dict, it'll be a borrowed reference.
>>
>> If the dict is globals() and there's no GIL, another thread could delete the
>> item while your code had the borrowed reference.
>>
>> It looks like there might be a lot that will need to changed post gilectomy!
>
> It seems to me that the whole concept of a borrowed reference may be
> risky in a post-GIL world. There may be occasional cases where it's
> possible to prove that borrowing is safe, but I suspect they'll be
> pretty rare.

I don't think it's that bad...

In a post-GIL world, the problem cases we're talking about like
deleting items from containers all require some sort of locking, even
before we start thinking about borrowed references. If two threads
start mucking around resizing the internals of a list or dict at the
same time, then you are unlikely to go to space today. IIRC to handle
this gilectomy adds per-object mutexes that you have to hold whenever
you're mucking around with that object's internals.

If we say that borrowing reference from a dict is one of the things
that counts as mucking about with that dict, and thus requires you to
hold the dict lock for as long as you hold the borrowed reference,
then all should be well.

I assume Larry is way ahead of us on this, which is why he's suddenly
confusing us all by emphasizing that when you borrow a reference you
should know who you're borrowing it from, so you know which lock to
hold.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Paul Moore

On 10 October 2016 at 18:50, Nathaniel Smith  wrote:
> I assume Larry is way ahead of us on this,

Yeah, I'd imagine you're right on that :-)
Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread MRAB

On 2016-10-10 18:50, Nathaniel Smith wrote:

On Mon, Oct 10, 2016 at 10:03 AM, Paul Moore  wrote:
> On 10 October 2016 at 17:49, MRAB  wrote:
>> If you lookup something in a dict, it'll be a borrowed reference.
>>
>> If the dict is globals() and there's no GIL, another thread could delete the
>> item while your code had the borrowed reference.
>>
>> It looks like there might be a lot that will need to changed post gilectomy!
>
> It seems to me that the whole concept of a borrowed reference may be
> risky in a post-GIL world. There may be occasional cases where it's
> possible to prove that borrowing is safe, but I suspect they'll be
> pretty rare.

I don't think it's that bad...

In a post-GIL world, the problem cases we're talking about like
deleting items from containers all require some sort of locking, even
before we start thinking about borrowed references. If two threads
start mucking around resizing the internals of a list or dict at the
same time, then you are unlikely to go to space today. IIRC to handle
this gilectomy adds per-object mutexes that you have to hold whenever
you're mucking around with that object's internals.

If we say that borrowing reference from a dict is one of the things
that counts as mucking about with that dict, and thus requires you to
hold the dict lock for as long as you hold the borrowed reference,
then all should be well.

I assume Larry is way ahead of us on this, which is why he's suddenly
confusing us all by emphasizing that when you borrow a reference you
should know who you're borrowing it from, so you know which lock to
hold.

Instead of locking the object, could we keep the GIL, but have it 
normally released?

A thread could then still call a function such as PyWeakref_GetObject() 
that returns a borrowed reference, but only if it's holding the GIL. It 
would be able to INCREF the reference before releasing the GIL again.

There could still be a new function that returned a strong reference.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Random832

On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
> Instead of locking the object, could we keep the GIL, but have it 
> normally released?
> 
> A thread could then still call a function such as PyWeakref_GetObject() 
> that returns a borrowed reference, but only if it's holding the GIL. It 
> would be able to INCREF the reference before releasing the GIL again.

So, what stops the other thread which never asks for the GIL from
blowing away the reference? Or is this a special kind of lock that you
can "assert isn't locked" without locking it for yourself, and
INCREF/DECREF does so?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 5:24 AM, Random832  wrote:
> On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
>> Instead of locking the object, could we keep the GIL, but have it
>> normally released?
>>
>> A thread could then still call a function such as PyWeakref_GetObject()
>> that returns a borrowed reference, but only if it's holding the GIL. It
>> would be able to INCREF the reference before releasing the GIL again.
>
> So, what stops the other thread which never asks for the GIL from
> blowing away the reference? Or is this a special kind of lock that you
> can "assert isn't locked" without locking it for yourself, and
> INCREF/DECREF does so?

"assert isn't locked" is pretty cheap (race conditions wouldn't be
applicable here, as you would have to completely obtain the GIL before
even attempting a dangerous operation), but what would INCREF/DECREF
do if the GIL is locked by another thread?

Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory
deallocation lock". While it is held (by any thread - it's a guard,
more than a lock), Py_DECREF will not actually deallocate memory -
objects can fall to zero references without being wiped. Once the
lock/guard is freed/cleared, anything that had fallen to zero is now
deallocated. This probably would mean stuffing them onto a list of
"doomed objects", and upon release of the guard, any doomed objects
that still have no refs would get deallocated.

That would, by definition, make *all* borrowed refs legal; you can
either use them and finish, or INCREF them before releasing the
deallocation lock, and either way, you have a guarantee that (a)
you're not messing with junk memory, and (b) it really is the object
you think it is. There should be no risk of race conditions, because
you would completely claim the deallocation lock before calling
something that gives you a borrowed ref, meaning that the whole time
the ref is borrowed, you must have that lock.

But I'm guessing other people have thought far more about this than I have.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread MRAB


On 2016-10-10 20:36, Chris Angelico wrote:

On Tue, Oct 11, 2016 at 5:24 AM, Random832  wrote:

On Mon, Oct 10, 2016, at 14:04, MRAB wrote:

Instead of locking the object, could we keep the GIL, but have it
normally released?

A thread could then still call a function such as PyWeakref_GetObject()
that returns a borrowed reference, but only if it's holding the GIL. It
would be able to INCREF the reference before releasing the GIL again.


So, what stops the other thread which never asks for the GIL from
blowing away the reference? Or is this a special kind of lock that you
can "assert isn't locked" without locking it for yourself, and
INCREF/DECREF does so?


"assert isn't locked" is pretty cheap (race conditions wouldn't be
applicable here, as you would have to completely obtain the GIL before
even attempting a dangerous operation), but what would INCREF/DECREF
do if the GIL is locked by another thread?

Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory
deallocation lock". While it is held (by any thread - it's a guard,
more than a lock), Py_DECREF will not actually deallocate memory -
objects can fall to zero references without being wiped. Once the
lock/guard is freed/cleared, anything that had fallen to zero is now
deallocated. This probably would mean stuffing them onto a list of
"doomed objects", and upon release of the guard, any doomed objects
that still have no refs would get deallocated.

That would, by definition, make *all* borrowed refs legal; you can
either use them and finish, or INCREF them before releasing the
deallocation lock, and either way, you have a guarantee that (a)
you're not messing with junk memory, and (b) it really is the object
you think it is. There should be no risk of race conditions, because
you would completely claim the deallocation lock before calling
something that gives you a borrowed ref, meaning that the whole time
the ref is borrowed, you must have that lock.

But I'm guessing other people have thought far more about this than I have.

I have to admit that the GIL thing was just a vague feeling I had, and I 
hadn't thought too deeply about it, but I think that you've crystallised 
it. :-)


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky

I'd like to reply to both questions with one email, if that's all right.

Bernardo Sulzbach asked, "How does this interfere with sorting very small
lists?", and ChrisA asked, "How much slower are sorts of heterogeneous
lists?". I modified the benchmark to answer both questions, the former at
the top, and the latter at the bottom (see the git repo):

*** 10 ints ***
F.fastsort(): 2.86102294921875e-06
F.sort(): 3.337860107421875e-06
*** 10 strings ***
F.fastsort(): 1.6689300537109375e-06
F.sort(): 1.6689300537109375e-06
*** 1e3 ints ***
F.fastsort(): 0.00013589859008789062
F.sort(): 0.00018906593322753906
*** 1e3 strings ***
F.fastsort(): 0.0002529621124267578
F.sort(): 0.0002772808074951172
*** 1e7 ints ***
F.fastsort(): 5.472854375839233
F.sort(): 7.8826072216033936
*** 1e7 strings ***
F.fastsort(): 10.104042053222656
F.sort(): 13.139304876327515
*** 1e7 ints + 1 float (to disable the optimization while keeping the
precheck)***
F.fastsort(): 7.57237982749939
F.sort(): 7.666172504425049

ChrisA suggested I also try "make test" or something to get a more
realistic benchmark. I will do that once I implement this as a patch, right
now it's an extension module that subclasses list, so I can't just drop it
into existing code without modification.

Let me know if you have any other questions/suggestions!

On Mon, Oct 10, 2016 at 12:18 AM Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

> Hi,
>
> I posted here a while back asking what you guys thought about implementing
> radix sort for strings in list.sort(). You gave me a lot of reasons why
> that would be a bad idea. However, it got me thinking, and I came up with
> something that you may find interesting.
>
> First, some simple benchmark results (numbers are seconds, check out the
> extension module at https://github.com/embg/python-fast-listsort.git):
>
> *** 1e3 ints ***
> F.fastsort(): 0.00018930435180664062
> F.sort(): 0.0002830028533935547
> *** 1e3 strings ***
> F.fastsort(): 0.0003533363342285156
> F.sort(): 0.00044846534729003906
> *** 1e7 ints ***
> F.fastsort(): 5.479267358779907
> F.sort(): 8.063318014144897
> *** 1e7 strings ***
> F.fastsort(): 9.992833137512207
> F.sort(): 13.730914115905762
>
> The optimization uses the fact that, in practice, we are almost always
> sorting keys of the same type (note: not objects of the same type, *keys*
> of the same type; we could have a complex key function like str that takes
> many different types, but the keys are still all the same type). So we can
> just do one typecheck up front and then do unsafe comparisons during the
> sort (if our initial check passes). Specifically, we check that for all the
> PyObject* in saved_ob_item, the ob->ob_type are the same. Then we replace
> PyObject_RichCompare with ob_type->tp_richcompare. Additionally, we can
> check for the very common cases of ints and strings and give optimized
> comparison functions for those cases. It might not seem like this would
> save a lot of time, but it turns out that PyObject_RichCompare is a massive
> clusterf**k that has to test tons of type stuff before it can actually
> compare. Cutting that out ends up saving a *lot* of time, as the benchmark
> demonstrates.
>
> What do you think? I'm planning on writing this up into a patch, but
> wanted to get some feedback on the implementation and ideas for improvement
> first.
>
> Thanks,
> Elliot
>
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Nathaniel Smith

On Mon, Oct 10, 2016 at 1:42 PM, Elliot Gorokhovsky
 wrote:
> *** 10 strings ***
> F.fastsort(): 1.6689300537109375e-06
> F.sort(): 1.6689300537109375e-06

I think something has gone wrong with your timing harness...

For accurately timing microbenchmarks, you should use timeit, or
better yet Victor Stinner's perf package:
  https://perf.readthedocs.io/

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Larry Hastings



On 10/10/2016 06:37 PM, Guido van Rossum wrote:

Modified +1: you can't change the behavior of the existing API, but
you can deprecate it and introduce a better one with a different name.
We'll have until Python 4.0 to carry through the deprecation anyways.
And I doubt this is the only C API change needed for happy gil-free
coding...


First, "deprecate" won't work for these semantics for the Gilectomy 
branch.  I simply cannot safely support the semantics of the existing 
function.  All call sites need to change.


Second, consider that every function that returns a borrowed 
reference--PyDict_GetItem(), PyList_GetItem()--has to change too, to 
return an actual (non-borrowed) reference.  It's going to be a major 
upheaval in the C API.



For now I'm going to leave the names as-is and just change the semantics 
everywhere.  If this approach really works, and if we decide we want to 
merge it back into trunk--and those are both still big if's--we can 
revisit this decision.



I haven't yet ruled out abandoning reference counting completely and 
going to 100% tracing garbage collecting,



//arry/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Larry Hastings



On 10/10/2016 09:36 PM, Chris Angelico wrote:

Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory
deallocation lock". While it is held (by any thread - it's a guard,
more than a lock), Py_DECREF will not actually deallocate memory -
objects can fall to zero references without being wiped. Once the
lock/guard is freed/cleared, anything that had fallen to zero is now
deallocated. This probably would mean stuffing them onto a list of
"doomed objects", and upon release of the guard, any doomed objects
that still have no refs would get deallocated.


If this worked, this would actually be really easy with my current 
"buffered reference counting" approach.  I literally already have a list 
of doomed objects, which a separate thread queues up for each thread to 
process later.


(This was a necessary part of "buffered reference counting", in which 
reference count changes are stored in a transaction log and later 
committed by a single thread.  Since there's only one thread making 
reference count changes, there's no contention, so it doesn't have to 
use atomic incr/decr, which is a big performance win.)


But I don't think this fixes the problem.  Consider:

1. Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed
   reference.  Thread A then gets suspended, before it has a chance to
   Py_INCREF(Q).
2. Thread B does L.clear(), the reference count of Q goes to 0, Q is
   added to the doomed objects list.
3. Thread A gets to run again.  It does Py_INCREF(Q); Q's refcount is
   now back up to 1, as if Q had been resurrected.
4. At some point in the future the "memory deallocation lock" is
   released and Q is deleted.

If you say "well, just look at the reference count, and if it's 1 throw 
it off the doomed objects list", keep in mind that I no longer have 
accurate real-time reference counts.  These hacks where we play games 
with the reference count are mostly removed in my branch.


Also, I don't know when it would ever be safe to release the "memory 
deallocation lock".  Just because it's safe for your thread doesn't mean 
it's safe for another thread.  And if you do it on a thread-by-thread 
basis, in the above example it might be safe from thread B's perspective 
to release its "memory deallocation lock", but as illustrated that can 
have an effect on thread A.


I appreciate the brainstorming but I'm not currently sanguine about this 
idea,



//arry/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky

Hm... that is strange, but I don't think there's anything wrong with the
way I'm timing, though I agree perf/timeit would be better. I ran the
benchmark a couple of times and the numbers seem to exactly line up
something like one in five times; perhaps not that crazy considering
they're executing nearly the same code?

Anyway, benchmarking technique aside, the point is that it it works well
for small lists (i.e. doesn't affect performance).

On Mon, Oct 10, 2016 at 2:53 PM Nathaniel Smith  wrote:

> On Mon, Oct 10, 2016 at 1:42 PM, Elliot Gorokhovsky
>  wrote:
> > *** 10 strings ***
> > F.fastsort(): 1.6689300537109375e-06
> > F.sort(): 1.6689300537109375e-06
>
> I think something has gone wrong with your timing harness...
>
> For accurately timing microbenchmarks, you should use timeit, or
> better yet Victor Stinner's perf package:
>   https://perf.readthedocs.io/
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 7:42 AM, Elliot Gorokhovsky
 wrote:
> ChrisA suggested I also try "make test" or something to get a more realistic
> benchmark. I will do that once I implement this as a patch, right now it's
> an extension module that subclasses list, so I can't just drop it into
> existing code without modification.

Oh, okay. If it's done that way, there's (in a way) a guarantee that
it won't worsen anything, because you have to explicitly request the
new behaviour. So if you KNOW that you're going to be doing a ton of
string-only sorting, you can call on this new list subclass, and if
you're not, you simply don't.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings  wrote:
> But I don't think this fixes the problem.  Consider:
>
> Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed reference.
> Thread A then gets suspended, before it has a chance to Py_INCREF(Q).
> Thread B does L.clear(), the reference count of Q goes to 0, Q is added to
> the doomed objects list.
> Thread A gets to run again.  It does Py_INCREF(Q); Q's refcount is now back
> up to 1, as if Q had been resurrected.
> At some point in the future the "memory deallocation lock" is released and Q
> is deleted.
>
> If you say "well, just look at the reference count, and if it's 1 throw it
> off the doomed objects list", keep in mind that I no longer have accurate
> real-time reference counts.  These hacks where we play games with the
> reference count are mostly removed in my branch.

That's exactly what I would have said, because I was assuming that
refcounts would be accurate. I'm not sure what you mean by "play games
with", but it sounds like you have a much more sophisticated system
going on than I had in my head, and you'll need a correspondingly
sophisticated solution to this problem.

Like I said, others have thought a LOT more about this than I have.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Nathaniel Smith

On Mon, Oct 10, 2016 at 2:16 PM, Elliot Gorokhovsky
 wrote:
> Hm... that is strange, but I don't think there's anything wrong with the way
> I'm timing, though I agree perf/timeit would be better. I ran the benchmark
> a couple of times and the numbers seem to exactly line up something like one
> in five times; perhaps not that crazy considering they're executing nearly
> the same code?

No, computer clocks are precise enough, and CPUs are wonky enough
(cache effects, etc.), that it should be effectively impossible to get
the same timing result twice in row, even for running exactly the same
code. I'm not sure what's going on, but it's weird. Making these kinds
of measurements is much more complicated than it looks and you really
need to use something like timeit or perf if you want trustworthy
results. Fortunately, they're easy to use :-)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Greg Ewing


On Tue, Oct 11, 2016 at 4:03 AM, Paul Moore  wrote:


If you have an object that forever owns a reference to
another object, it's safe to return borrowed references.


Even if there are such cases, it seems we're going to
have to be a lot more careful about identifying them.

It might be safer not to have any APIs that return
borrowed references, or at least name them in a way
that make it clear they're dangerous and only to be
used with extreme care.

Having things like PyDict_GetItem where the *normal*
version of the API returns a borrowed reference seems
like a footgun.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Greg Ewing


Nathaniel Smith wrote:

IIRC to handle
this gilectomy adds per-object mutexes that you have to hold whenever
you're mucking around with that object's internals.


What counts as "mucking around with the object's internals",
though?

If I do the C equivalent of:

   x = somedict[5]
   x.dosomething()

am I mucking around with the internals of somedict?
Intuitively, I would not think so. But the way
PyDict_GetItem currently works, this would be
dangerous.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Greg Ewing


Random832 wrote:

Or is this a special kind of lock that you
can "assert isn't locked" without locking it for yourself, and
INCREF/DECREF does so?


I don't think that would work. It might be unlocked at
the moment you test it, but someone might lock it between
then and the following INCREF/DECREF.

Locking is like pregnancy -- you can't have just a
little bit of it. :-)

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Guido van Rossum

On Mon, Oct 10, 2016 at 1:57 PM, Larry Hastings  wrote:
>
> On 10/10/2016 06:37 PM, Guido van Rossum wrote:
>
> Modified +1: you can't change the behavior of the existing API, but
> you can deprecate it and introduce a better one with a different name.
> We'll have until Python 4.0 to carry through the deprecation anyways.
> And I doubt this is the only C API change needed for happy gil-free
> coding...
>
>
> First, "deprecate" won't work for these semantics for the Gilectomy branch.
> I simply cannot safely support the semantics of the existing function.  All
> call sites need to change.

Oh, in the gilectomy you can fix all call sites. Even in CPython. But
3rd party extensions should not have this API removed in Python 3.7.

> Second, consider that every function that returns a borrowed
> reference--PyDict_GetItem(), PyList_GetItem()--has to change too, to return
> an actual (non-borrowed) reference.  It's going to be a major upheaval in
> the C API.

Yeah, this has been discussed to death in the rest of the thread.

> For now I'm going to leave the names as-is and just change the semantics
> everywhere.  If this approach really works, and if we decide we want to
> merge it back into trunk--and those are both still big if's--we can revisit
> this decision.

Changing something as subtle as this to an API sounds really bad even
in an experimental branch. But it's your branch and you can do what
you want (then again, why did you bring it up here? :-).

> I haven't yet ruled out abandoning reference counting completely and going
> to 100% tracing garbage collecting,

I guess that would have its own set of pros and cons.

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread MRAB


On 2016-10-10 22:14, Larry Hastings wrote:


On 10/10/2016 09:36 PM, Chris Angelico wrote:

Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory
deallocation lock". While it is held (by any thread - it's a guard,
more than a lock), Py_DECREF will not actually deallocate memory -
objects can fall to zero references without being wiped. Once the
lock/guard is freed/cleared, anything that had fallen to zero is now
deallocated. This probably would mean stuffing them onto a list of
"doomed objects", and upon release of the guard, any doomed objects
that still have no refs would get deallocated.


If this worked, this would actually be really easy with my current
"buffered reference counting" approach.  I literally already have a list
of doomed objects, which a separate thread queues up for each thread to
process later.

(This was a necessary part of "buffered reference counting", in which
reference count changes are stored in a transaction log and later
committed by a single thread.  Since there's only one thread making
reference count changes, there's no contention, so it doesn't have to
use atomic incr/decr, which is a big performance win.)

But I don't think this fixes the problem.  Consider:

 1. Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed
reference.  Thread A then gets suspended, before it has a chance to
Py_INCREF(Q).
 2. Thread B does L.clear(), the reference count of Q goes to 0, Q is
added to the doomed objects list.
 3. Thread A gets to run again.  It does Py_INCREF(Q); Q's refcount is
now back up to 1, as if Q had been resurrected.
 4. At some point in the future the "memory deallocation lock" is
released and Q is deleted.

If you say "well, just look at the reference count, and if it's 1 throw
it off the doomed objects list", keep in mind that I no longer have
accurate real-time reference counts.  These hacks where we play games
with the reference count are mostly removed in my branch.

Also, I don't know when it would ever be safe to release the "memory
deallocation lock".  Just because it's safe for your thread doesn't mean
it's safe for another thread.  And if you do it on a thread-by-thread
basis, in the above example it might be safe from thread B's perspective
to release its "memory deallocation lock", but as illustrated that can
have an effect on thread A.

The deallocation lock could be a counter. Doomed objects would be 
collectable when it's zero.



I appreciate the brainstorming but I'm not currently sanguine about this
idea,



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Chris Angelico

On Tue, Oct 11, 2016 at 9:52 AM, MRAB  wrote:
>> Also, I don't know when it would ever be safe to release the "memory
>> deallocation lock".  Just because it's safe for your thread doesn't mean
>> it's safe for another thread.  And if you do it on a thread-by-thread
>> basis, in the above example it might be safe from thread B's perspective
>> to release its "memory deallocation lock", but as illustrated that can
>> have an effect on thread A.
>>
> The deallocation lock could be a counter. Doomed objects would be
> collectable when it's zero.

Yeah, which is why I described it as a "guard". As long as you have
atomic increment/decrement (ie if two threads simultaneously try to
increment it, it WILL go up by two), it should be fine. Well, that
part should, anyway.

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] [RELEASE] Python 3.6.0b2 is now available

2016-10-10 Thread Ned Deily

On behalf of the Python development community and the Python 3.6 release
team, I'm happy to announce the availability of Python 3.6.0b2. 3.6.0b2
is the second of four planned beta releases of Python 3.6, the next major
release of Python, and marks the end of the feature development phase
for 3.6.

Among the new major new features in Python 3.6 are:

* PEP 468 - Preserving the order of **kwargs in a function
* PEP 487 - Simpler customization of class creation
* PEP 495 - Local Time Disambiguation
* PEP 498 - Literal String Formatting
* PEP 506 - Adding A Secrets Module To The Standard Library
* PEP 509 - Add a private version to dict
* PEP 515 - Underscores in Numeric Literals
* PEP 519 - Adding a file system path protocol
* PEP 520 - Preserving Class Attribute Definition Order
* PEP 523 - Adding a frame evaluation API to CPython
* PEP 524 - Make os.urandom() blocking on Linux (during system startup)
* PEP 525 - Asynchronous Generators (provisional)
* PEP 526 - Syntax for Variable Annotations (provisional)
* PEP 528 - Change Windows console encoding to UTF-8 (provisional)
* PEP 529 - Change Windows filesystem encoding to UTF-8 (provisional)
* PEP 530 - Asynchronous Comprehensions

Please see "What’s New In Python 3.6" for more information:

https://docs.python.org/3.6/whatsnew/3.6.html

You can find Python 3.6.0b2 here:

https://www.python.org/downloads/release/python-360b2/

Beta releases are intended to give the wider community the opportunity
to test new features and bug fixes and to prepare their projects to
support the new feature release. We strongly encourage maintainers of
third-party Python projects to test with 3.6 during the beta phase and
report issues found to bugs.python.org as soon as possible. While the
release is feature complete entering the beta phase, it is possible that
features may be modified or, in rare cases, deleted up until the start
of the release candidate phase (2016-12-05). Our goal is have no changes
after rc1. To achieve that, it will be extremely important to get as
much exposure for 3.6 as possible during the beta phase. Please keep in
mind that this is a preview release and its use is not recommended for
production environments

The next planned release of Python 3.6 will be 3.6.0b3, currently
scheduled for 2016-10-31. More information about the release schedule
can be found here:

https://www.python.org/dev/peps/pep-0494/

--
  Ned Deily
  n...@python.org -- []

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Steven D'Aprano

On Mon, Oct 10, 2016 at 09:16:32PM +, Elliot Gorokhovsky wrote:

> Anyway, benchmarking technique aside, the point is that it it works well
> for small lists (i.e. doesn't affect performance).

You've been shown that there is something suspicious about your 
benchmarking technique, something that suggests that the timing results 
aren't trustworthy. Until you convince us that your timing results are 
reliable and trustworthy, you shouldn't be drawing *any* conclusions 
about your fastsort versus the standard sort.

-- 
Steve
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Guido van Rossum

So maybe someone should explain to Elliott *why* his own benchmarks
are not trustworthy, rather than just repeat "use perf or timeit".
Actually, there are two things: (a) when something new comes along it
*always* needs to prove beyond a shadow of a doubt that it is actually
an improvement and not a timing artifact or a trick; (b) you can't
time sorting 10 values *once* and get a useful result. You have to do
it many times. And you have to make sure that creating a list of 10
random values isn't taken as part of your test -- that's tricky since
random() isn't all that fast; but it has to be done.

Although Elliott had it coming when he used needlessly offensive
language in his first post.

On Mon, Oct 10, 2016 at 5:09 PM, Steven D'Aprano  wrote:
> On Mon, Oct 10, 2016 at 09:16:32PM +, Elliot Gorokhovsky wrote:
>
>> Anyway, benchmarking technique aside, the point is that it it works well
>> for small lists (i.e. doesn't affect performance).
>
> You've been shown that there is something suspicious about your
> benchmarking technique, something that suggests that the timing results
> aren't trustworthy. Until you convince us that your timing results are
> reliable and trustworthy, you shouldn't be drawing *any* conclusions
> about your fastsort versus the standard sort.
>
>
> --
> Steve
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/guido%40python.org

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Nathaniel Smith

On Mon, Oct 10, 2016 at 3:27 PM, Greg Ewing  wrote:
> Nathaniel Smith wrote:
>>
>> IIRC to handle
>> this gilectomy adds per-object mutexes that you have to hold whenever
>> you're mucking around with that object's internals.
>
>
> What counts as "mucking around with the object's internals",
> though?
>
> If I do the C equivalent of:
>
>x = somedict[5]
>x.dosomething()
>
> am I mucking around with the internals of somedict?
> Intuitively, I would not think so. But the way
> PyDict_GetItem currently works, this would be
> dangerous.

I guess you already have to be aware of this to some extent -- e.g.,
the C version of this code is dangerous, even though in Python it
would be fine:

  x = somedict[5]
  del somedict
  x.dosomething()

But Larry pointed out to me separately that while the per-object locks
exist and can be used, the they don't help as much as you'd hope
because the goal is to *not* require C API users to change their code.
So it's safe to have two simultaneous calls to PyDict_Clear, or to any
other single API call, because each PyDict_Clear call wil *internally*
takes the lock, so they won't stomp on each other. But any code
sequence that makes multiple C API calls and assumes that the world
won't shift under its feet becomes problematic. E.g. pre-GILectomy you
can do

Py_ssize_t count = PyList_Size(list);
for (int i = 0; i < count; ++i) {
PyObject *obj = PyList_Get_Item(list, i);
/* code which blithely proceeds without checking for obj == NULL */
...
}

but post-GILectomy, even putting aside the borrowing issues, this code
might segfault if someone shrinks the list when you aren't looking. I
can't see any way to avoid breaking the API here -- it seems like
callers will just have to take a lock or otherwise update their code.

I guess you can think of borrowed references as a particularly common
and insidious form of this problem -- so the tactical question is
whether it's better to look for a general solution like teaching
everyone to take locks, or try to attack borrowed references
specifically.

(Full API compatibility is never going to be preserved with the
GILectomy anyway -- if nothing else C extensions currently use the GIL
to protect internal globals, and that will have to be replaced by some
kind of more specific locking. So the open question is more like, can
the porting burden be kept low enough to make it viable.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Tim Peters

[please restrict follow-ups to python-ideas]

Let's not get hung up on meta-discussion here - I always thought "massive
clusterf**k" was a precise technical term anyway ;-)

While timing certainly needs to be done more carefully, it's obvious to me
that this approach _should_ pay off significantly when it applies.
Comparisons are extraordinarily expensive in Python, precisely because of
the maze of test-and-branch code it requires just to figure out which
bottom-level comparison function to invoke each time.  That's why I spent
months of my life (overall) devising a sequence of sorting algorithms for
Python that reduced the number of comparisons needed.

Note that when Python's current sort was adopted in Java, they still kept a
quicksort variant for "unboxed" builtin types.  The adaptive merge sort
incurs many overheads that often cost more than they save unless
comparisons are in fact very expensive compared to the cost of pointer
copying (and in Java comparison of unboxed types is cheap).  Indeed, for
native numeric types, where comparison is dirt cheap, quicksort generally
runs faster than mergesort despite that the former does _more_ comparisons
(because mergesort does so much more pointer-copying).

I had considered something "like this" for Python 2, but didn't pursue it
because comparison was defined between virtually any two types (34 < [1],
etc), and people were careless about that (both by design and by
accident).  In Python 3, comparison "blows up" for absurdly mixed types, so
specializing for homogeneously-typed lists is a more promising idea on the
face of it.

The comparisons needed to determine _whether_ a list's objects have a
common type is just len(list)-1 C-level pointer comparisons, and so goes
fast.  So I expect that, when it applies, this would speed even sorting an
already-ordered list with at least 2 elements.

For a mixed-type list with at least 2 elements, it will always be pure
loss.  But (a) I expect such lists are uncommon (and especially uncommon in
Python 3); and (b) a one-time scan doing C-level pointer comparisons until
finding a mismatched type is bound to be a relatively tiny cost compared to
the expense of all the "rich comparisons" that follow.

So +1 from me on pursuing this.

Elliot, please:

- Keep this on python-ideas.  python-dev is for current issues in Python
development, not for speculating about changes.

- Open an issue on the tracker:  https://bugs.python.org/

- At least browse the info for developers:
https://docs.python.org/devguide/

- Don't overlook Lib/test/sortperf.py.  As is, it should be a good test of
what your approach so far _doesn't_ help, since it sorts only lists of
floats (& I don't think you're special-casing them).  If the timing results
it reports aren't significantly hurt (and I expect they won't be), then add
specialization for floats too and gloat about the speedup :-)

- I expect tuples will also be worth specializing (complex sort keys are
often implemented as tuples).

Nice start! :-)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky

So here's a simple attempt at taking lots of measurements just using
time.time() with lists of ints. The results are great, if they are valid
(which I leave to you to judge); even for lists with just one element, it's
16% faster! The columns are length, number of trials, and percent
improvement:

Integer benchmarks
1 1000 1-fastsort()/sort(): 0.14896401672523163
5 200 1-fastsort()/sort(): 0.2915859704616376
10 100 1-fastsort()/sort(): 0.3576859149132914
1000 1 1-fastsort()/sort(): 0.3230363920681264
100 10 1-fastsort()/sort(): 0.292595011903772

And here's the benchmark script:
from fastlist import FastList
import random; random.seed(42)
from time import time

def bench_list(size,  trials):
L = [random.randrange(size) for _ in range(size)]
trials = int(trials); size = int(size)

fastlist_time = 0
for _ in range(trials):
F = FastList(L)
start = time()
F.fastsort()
fastlist_time += time() - start


defaultlist_time = 0
for _ in range(trials):
F = FastList(L)
start = time()
F.sort()
defaultlist_time += time() - start


print(size, trials,
  "1-fastsort()/sort():", 1-fastlist_time/defaultlist_time)

print("Integer benchmarks")
bench_list(1, 1e7)
bench_list(5, 1e7/5)
bench_list(10, 1e7/10)
bench_list(1000, 1e7/1000)
bench_list(100, 1e7/1e6)

Is that a valid way to benchmark the implementation?

On Mon, Oct 10, 2016 at 8:15 PM Elliot Gorokhovsky <
elliot.gorokhov...@gmail.com> wrote:

> Right - sorry about that last bit!
>
> I couldn't figure out how to use timeit/perf for this because the list has
> to be reinitialized each time it gets sorted. So with time.time() I can
> just wrap the sorting and cut the initialization out (and I could always
> throw it in a for loop to do it as many times as necessary). But with
> timeit/perf, as I understand, you just give it a number of iterations and a
> code snippet and it loops for you. So I don't see how I could get it to cut
> out the initialization. There's an option to provide setup code, of course,
> but I need to set up before each trial, not just before the loop.
>
> Regardless, one could always wrap the whole contribution with a length
> test and disable for lists with less than, say, 1000 elements. And though
> for the above reason I couldn't figure out how to benchmark properly for
> small lists, it's clear that for large lists this gives a real improvement
> with very little change in the code: the fact is that it really doesn't
> make sense to do all this typechecking nlogn times when we can just get it
> over with once at the beginning. The cost is very, very small, as
> demonstrated by the last benchmark (which is for a 1e7 list, so it is
> probably more valid with my crude technique), so heterogeneous lists don't
> get slowed down perceptibly.
>
> On Mon, Oct 10, 2016 at 8:06 PM Guido van Rossum  wrote:
>
> So maybe someone should explain to Elliott *why* his own benchmarks
>
> are not trustworthy, rather than just repeat "use perf or timeit".
>
> Actually, there are two things: (a) when something new comes along it
>
> *always* needs to prove beyond a shadow of a doubt that it is actually
>
> an improvement and not a timing artifact or a trick; (b) you can't
>
> time sorting 10 values *once* and get a useful result. You have to do
>
> it many times. And you have to make sure that creating a list of 10
>
> random values isn't taken as part of your test -- that's tricky since
>
> random() isn't all that fast; but it has to be done.
>
>
>
> Although Elliott had it coming when he used needlessly offensive
>
> language in his first post.
>
>
>
> On Mon, Oct 10, 2016 at 5:09 PM, Steven D'Aprano 
> wrote:
>
> > On Mon, Oct 10, 2016 at 09:16:32PM +, Elliot Gorokhovsky wrote:
>
> >
>
> >> Anyway, benchmarking technique aside, the point is that it it works well
>
> >> for small lists (i.e. doesn't affect performance).
>
> >
>
> > You've been shown that there is something suspicious about your
>
> > benchmarking technique, something that suggests that the timing results
>
> > aren't trustworthy. Until you convince us that your timing results are
>
> > reliable and trustworthy, you shouldn't be drawing *any* conclusions
>
> > about your fastsort versus the standard sort.
>
> >
>
> >
>
> > --
>
> > Steve
>
> > ___
>
> > Python-Dev mailing list
>
> > Python-Dev@python.org
>
> > https://mail.python.org/mailman/listinfo/python-dev
>
> > Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>
>
>
>
>
>
> --
>
> --Guido van Rossum (python.org/~guido)
>
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Elliot Gorokhovsky

Right - sorry about that last bit!

I couldn't figure out how to use timeit/perf for this because the list has
to be reinitialized each time it gets sorted. So with time.time() I can
just wrap the sorting and cut the initialization out (and I could always
throw it in a for loop to do it as many times as necessary). But with
timeit/perf, as I understand, you just give it a number of iterations and a
code snippet and it loops for you. So I don't see how I could get it to cut
out the initialization. There's an option to provide setup code, of course,
but I need to set up before each trial, not just before the loop.

Regardless, one could always wrap the whole contribution with a length test
and disable for lists with less than, say, 1000 elements. And though for
the above reason I couldn't figure out how to benchmark properly for small
lists, it's clear that for large lists this gives a real improvement with
very little change in the code: the fact is that it really doesn't make
sense to do all this typechecking nlogn times when we can just get it over
with once at the beginning. The cost is very, very small, as demonstrated
by the last benchmark (which is for a 1e7 list, so it is probably more
valid with my crude technique), so heterogeneous lists don't get slowed
down perceptibly.

On Mon, Oct 10, 2016 at 8:06 PM Guido van Rossum  wrote:

> So maybe someone should explain to Elliott *why* his own benchmarks
> are not trustworthy, rather than just repeat "use perf or timeit".
> Actually, there are two things: (a) when something new comes along it
> *always* needs to prove beyond a shadow of a doubt that it is actually
> an improvement and not a timing artifact or a trick; (b) you can't
> time sorting 10 values *once* and get a useful result. You have to do
> it many times. And you have to make sure that creating a list of 10
> random values isn't taken as part of your test -- that's tricky since
> random() isn't all that fast; but it has to be done.
>
> Although Elliott had it coming when he used needlessly offensive
> language in his first post.
>
> On Mon, Oct 10, 2016 at 5:09 PM, Steven D'Aprano 
> wrote:
> > On Mon, Oct 10, 2016 at 09:16:32PM +, Elliot Gorokhovsky wrote:
> >
> >> Anyway, benchmarking technique aside, the point is that it it works well
> >> for small lists (i.e. doesn't affect performance).
> >
> > You've been shown that there is something suspicious about your
> > benchmarking technique, something that suggests that the timing results
> > aren't trustworthy. Until you convince us that your timing results are
> > reliable and trustworthy, you shouldn't be drawing *any* conclusions
> > about your fastsort versus the standard sort.
> >
> >
> > --
> > Steve
> > ___
> > Python-Dev mailing list
> > Python-Dev@python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Distutils] pythonhosted.org upload no longer works

2016-10-10 Thread tritium-list



> -Original Message-
> From: Distutils-SIG [mailto:distutils-sig-bounces+tritium-
> list=sdamon@python.org] On Behalf Of Guido van Rossum
> Sent: Monday, October 10, 2016 12:27 PM
> To: Donald Stufft 
> Cc: Distutils ; Giampaolo Rodola'
> ; python-dev 
> Subject: Re: [Distutils] [Python-Dev] pythonhosted.org upload no longer
> works
> 
> Honestly I like readthedocs a lot, and I actually *don't* like docs
> that look too much like the standard Python docs -- it should be clear
> to readers (subliminally, through page style, not just be parsing the
> URL) that they're reading third party docs.
> 

That’s quite tangential to pythonhosted.org hosting documentation... But that 
is a consequence of the theme used by docs.python.org/2/ using the default 
sphinx theme.  That problem was already solved for python3 docs - they don’t 
use the default theme.  I really don’t see how this is related at all to the 
uploading of docs to pythonhosted.org.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Distutils] pythonhosted.org upload no longer works

2016-10-10 Thread Guido van Rossum

I only mentioned it because the OP mentioned the theme as one of his
reasons to prefer one over the other.

On Mon, Oct 10, 2016 at 8:16 PM,   wrote:
>
>
>> -Original Message-
>> From: Distutils-SIG [mailto:distutils-sig-bounces+tritium-
>> list=sdamon@python.org] On Behalf Of Guido van Rossum
>> Sent: Monday, October 10, 2016 12:27 PM
>> To: Donald Stufft 
>> Cc: Distutils ; Giampaolo Rodola'
>> ; python-dev 
>> Subject: Re: [Distutils] [Python-Dev] pythonhosted.org upload no longer
>> works
>>
>> Honestly I like readthedocs a lot, and I actually *don't* like docs
>> that look too much like the standard Python docs -- it should be clear
>> to readers (subliminally, through page style, not just be parsing the
>> URL) that they're reading third party docs.
>>
>
> That’s quite tangential to pythonhosted.org hosting documentation... But that 
> is a consequence of the theme used by docs.python.org/2/ using the default 
> sphinx theme.  That problem was already solved for python3 docs - they don’t 
> use the default theme.  I really don’t see how this is related at all to the 
> uploading of docs to pythonhosted.org.
>



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Nick Coghlan

On 11 October 2016 at 08:49, Guido van Rossum  wrote:
> On Mon, Oct 10, 2016 at 1:57 PM, Larry Hastings  wrote:
>> For now I'm going to leave the names as-is and just change the semantics
>> everywhere.  If this approach really works, and if we decide we want to
>> merge it back into trunk--and those are both still big if's--we can revisit
>> this decision.
>
> Changing something as subtle as this to an API sounds really bad even
> in an experimental branch. But it's your branch and you can do what
> you want (then again, why did you bring it up here? :-).

I'm guessing Larry was hoping someone else would see a possible
solution that he'd missed.

On that front, while I don't see an alternative to making borrowed
references real references in a GIL-free Python, I do think we may
need to keep the borrowed/new distinction for the sake of folks
writing cross-version compatible code. To explain why, consider these
scenarios:

* CPython with a GIL:
  - new references MUST be decref'ed
  - borrowed references MUST NOT be decref'ed

* CPython without a GIL:
  - new references MUST be decref'ed
  - borrowed references MUST be decref'ed

Assuming that the GILectomy does end up needing to implicitly treat
all borrowed references as new references, that suggests we're going
to need APIs like "PyBorrowed_DECREF" to provide the "only decref when
built without the GIL" semantics, and the rules for GIL-independent
source code would become:

- new references must be decref'ed using the normal APIs
- borrowed references must be decref'ed using the PyBorrowed* APIs
(with a no-op compatibility shim for older GIL-only CPython versions)

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Optimizing list.sort() by checking type in advance

2016-10-10 Thread Greg Ewing


Elliot Gorokhovsky wrote:
I ran the 
benchmark a couple of times and the numbers seem to exactly line up 
something like one in five times; perhaps not that crazy considering 
they're executing nearly the same code?


Could this be a result of a time being measured in seconds
somewhere and then divided down?


*** 1e7 ints + 1 float (to disable the optimization while keeping the 
precheck)***
F.fastsort(): 7.57237982749939
F.sort(): 7.666172504425049


This result looks a bit suspicious too -- it's hard to see
how fastsort could be faster even when the optimisation
is not being used.

--
Greg

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Fwd: Re: PyWeakref_GetObject() borrows its reference from... whom?

2016-10-10 Thread Larry Hastings



Hit "Reply" instead of "Reply All" last night, oops.  Forwarding to the 
list for posterity's sakes.


/
/arry/

 Forwarded Message 
Subject: 	Re: [Python-Dev] PyWeakref_GetObject() borrows its reference 
from... whom?

Date:   Mon, 10 Oct 2016 23:01:10 +0200
From:   Larry Hastings 
To: Nathaniel Smith 


On 10/10/2016 07:50 PM, Nathaniel Smith wrote:

If we say that borrowing reference from a dict is one of the things
that counts as mucking about with that dict, and thus requires you to
hold the dict lock for as long as you hold the borrowed reference,
then all should be well.


That's not how locking works in the Gilectomy right now.  If you call 
PyDict_GetItem(), it locks the dict at the beginning, then looks up the 
thingy, then releases the lock just before returning.  It's hard for me 
to imagine how the dict would magically know when it could drop the 
borrowed reference returned by PyDict_GetItem().




I assume Larry is way ahead of us on this,


I would say "that's your first mistake!" but I bet you've made others.  
Anyway, tbh most of the time I feel awfully unqualified to work on the 
Gilectomy.



Fools rush in,


//arry/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

52 matches

Mail list logo