[issue28638] Optimize namedtuple creation

2017-09-10 Thread Raymond Hettinger

Raymond Hettinger added the comment:


New changeset 8b57d7363916869357848e666d03fa7614c47897 by Raymond Hettinger in 
branch 'master':
bpo-28638: Optimize namedtuple() creation time by minimizing use of exec() 
(#3454)
https://github.com/python/cpython/commit/8b57d7363916869357848e666d03fa7614c47897


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28638] Optimize namedtuple creation

2017-09-10 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue28638] Optimize namedtuple creation

2017-09-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Microbenchmark for caching docstrings:

$ ./python -m perf timeit -s "from collections import namedtuple; names = 
['field%d' % i for i in range(1000)]" -- "namedtuple('A', names)"

With sys.intern(): Mean +- std dev: 3.57 ms +- 0.05 ms
With Python-level caching: Mean +- std dev: 3.25 ms +- 0.05 ms
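For readers unfamiliar with the two variants being timed, here is a minimal sketch of how "Python-level caching" of field docstrings might differ from sys.intern(). The helper names (_doc_cache, cached_doc, interned_doc) are hypothetical and are not taken from the patch under discussion.

```python
import sys

# Hypothetical module-level cache: field index -> docstring.
_doc_cache = {}

def cached_doc(index):
    """Return the shared docstring for a field index, building it only once."""
    try:
        return _doc_cache[index]
    except KeyError:
        doc = _doc_cache[index] = f'Alias for field number {index}'
        return doc

def interned_doc(index):
    """Build the string on every call, then let sys.intern() deduplicate it."""
    return sys.intern(f'Alias for field number {index}')
```

The cached variant skips the string formatting on a hit, which is one plausible reason for the small difference in the timings above.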

--




[issue28638] Optimize namedtuple creation

2017-09-08 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Side-note: Some of the objections to a C level namedtuple implementation appear 
to be based on the maintenance hurdle, and others have noted that a 
structseq-based namedtuple might be an option. I have previously attempted to 
write a C replacement for namedtuple that dynamically created a StructSequence. 
I ran into a roadblock due to PyStructSequence_NewType (the API that exists to 
allow creation of runtime defined structseq) being completely broken (#28709).

If the struct sequence API were fixed, it should be a *lot* easier to implement 
a C level namedtuple with minimal work, removing some of the maintenance 
objections by simply reducing the amount of custom code involved.

The testnewtype.c code attached to #28709 (that demonstrates the bug) is 66 
lines of code, and implements a basic C level namedtuple creator function (full 
support omitted for brevity, but aside from _source, most of it would be easy). 
I'd expect a finished version to be low three digit lines of custom code, a 
third or less of what the cnamedtuple project needed to write the whole thing 
from scratch.

--
nosy: +josh.r




[issue28638] Optimize namedtuple creation

2017-09-08 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
pull_requests: +3451




[issue28638] Optimize namedtuple creation

2017-08-26 Thread Ethan Smith

Changes by Ethan Smith :


--
nosy: +Ethan Smith




[issue28638] Optimize namedtuple creation

2017-07-20 Thread Raymond Hettinger

Raymond Hettinger added the comment:

>  it would be *nice* to not only optimize the creation 
> but also attribute access by name

FWIW, once the property/itemgetter pair are instantiated in the NT class, the 
actual lookup runs through them at C speed (no pure python steps).  There is 
not much fluff here.
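As an illustration of the property/itemgetter pair mentioned above, a hand-written stand-in for a generated two-field class might look like this. It is a sketch of the pattern, not the code that namedtuple() itself emits; the class name Point is an assumption.

```python
from operator import itemgetter

class Point(tuple):
    """Hand-written stand-in for a generated two-field named tuple."""
    __slots__ = ()

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    # Each field is a property whose getter is a C-implemented itemgetter,
    # so p.x runs roughly property.__get__ -> itemgetter -> tuple indexing,
    # with no Python-level function in between.
    x = property(itemgetter(0), doc='Alias for field number 0')
    y = property(itemgetter(1), doc='Alias for field number 1')

p = Point(1, 2)
```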

--




[issue28638] Optimize namedtuple creation

2017-07-19 Thread Guido van Rossum

Guido van Rossum added the comment:

Yeah, it looks like the standard `_pickle` and `pickle` solution would work
here.

--




[issue28638] Optimize namedtuple creation

2017-07-19 Thread STINNER Victor

STINNER Victor added the comment:

General note about this issue: while the issue title is "Optimize namedtuple 
creation", it would be *nice* to not only optimize the creation but also 
attribute access by name:
http://bugs.python.org/issue28638#msg298499

Maybe we can have a very fast C implementation using structseq, and a fast 
Python implementation (faster than the current Python implementation) fallback 
for non-CPython.

--




[issue28638] Optimize namedtuple creation

2017-07-19 Thread INADA Naoki

INADA Naoki added the comment:

I didn't say "let's not do it".
I just want to focus on the pure Python implementation in this issue,
because this thread is too long already.
Feel free to open a new issue about a C implementation.

Even if a C implementation is added later, the pure Python optimization
can boost PyPy performance. 
(https://github.com/python/cpython/pull/2736#issuecomment-316014866)

--




[issue28638] Optimize namedtuple creation

2017-07-19 Thread Giampaolo Rodola'

Giampaolo Rodola' added the comment:

> While "40x faster" is 10x faster than "4x faster", a C 
> implementation can boost only CPython and makes maintenance harder.

As a counter argument against "let's not do it because it'll be harder to 
maintain" I'd like to point out that namedtuple API is already kind of over 
engineered (see: "verbose", "rename", "module" and "_source") and as such it 
seems likely it will remain pretty much the same in the future. So why not 
treat namedtuple like any other basic data structure, boost its internal 
implementation and simply use the existing unit tests to make sure there are no 
regressions? It seems the same barrier does not apply to tuples, lists and sets.

> Of course, 1.9x faster attribute access 
> (http://bugs.python.org/issue28638#msg298499) is attractive.

It is indeed and it makes a huge difference in situations like busy loops. E.g. 
in case of asyncio 1.9x faster literally means being able to serve twice the 
number of reqs/sec:
https://github.com/python/cpython/blob/3e2ad8ec61a322370a6fbdfb2209cf74546f5e08/Lib/asyncio/selector_events.py#L523

--




[issue28638] Optimize namedtuple creation

2017-07-19 Thread INADA Naoki

INADA Naoki added the comment:

I want to focus on the pure Python implementation in this issue.

While "40x faster" is 10x faster than "4x faster", a C implementation
can boost only CPython and makes maintenance harder.

And sometimes that extra 10x is not so important.
For example, say application startup takes 1sec and namedtuple
creation takes 0.4sec of that 1sec:

  4x faster: 1sec -> 0.7sec  (-30%)
 40x faster: 1sec -> 0.61sec (-39%)

In this case, "4x faster" saves 0.3sec and the extra 10x saves
only 0.09sec more.
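The figures above follow from speeding up only the namedtuple share of startup while the rest stays fixed. A small sketch of that arithmetic (the function name is made up for illustration):

```python
def startup_after_speedup(total, namedtuple_part, speedup):
    """Startup time when only the namedtuple share becomes `speedup` times faster."""
    return (total - namedtuple_part) + namedtuple_part / speedup

for factor in (4, 40):
    new_total = startup_after_speedup(1.0, 0.4, factor)
    # 4x:  0.6 + 0.4 / 4  = 0.70
    # 40x: 0.6 + 0.4 / 40 = 0.61
    print(f'{factor}x faster: 1sec -> {new_total:.2f}sec')
```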

Of course, 1.9x faster attribute access 
(http://bugs.python.org/issue28638#msg298499) is attractive.
But this issue is too long already.

--




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Jelle Zijlstra

Jelle Zijlstra added the comment:

Thanks Joe! I adapted your benchmark suite to also run my implementation. See 
https://github.com/JelleZijlstra/cnamedtuple/commit/61b6fbf4de37f8131ab43c619593327004974e52
 for the code and results. The results are consistent with what we've seen 
before.

Joe's cnamedtuple is about 40x faster for class creation than the current 
implementation, and my PR only speeds class creation up by 4x. That difference 
is big enough that I think we should seriously consider using the C 
implementation.

--




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Jelle Zijlstra

Jelle Zijlstra added the comment:

I benchmarked some common namedtuple operations with the following script:

#!/bin/bash
echo 'namedtuple creation'
./python -m timeit -s 'from collections import namedtuple' 'x = namedtuple("x", 
["a", "b", "c"])'

echo 'namedtuple instantiation'
./python -m timeit -s 'from collections import namedtuple; x = namedtuple("x", 
["a", "b", "c"])' 'x(1, 2, 3)'

echo 'namedtuple attribute access'
./python -m timeit -s 'from collections import namedtuple; x = namedtuple("x", 
["a", "b", "c"]); i = x(1, 2, 3)' 'i.a'

echo 'namedtuple _make'
./python -m timeit -s 'from collections import namedtuple; x = namedtuple("x", 
["a", "b", "c"])' 'x._make((1, 2, 3))'


--
With my patch as it stands now I get:

$ ./ntbenchmark.sh 
namedtuple creation
2000 loops, best of 5: 101 usec per loop
namedtuple instantiation
500000 loops, best of 5: 477 nsec per loop
namedtuple attribute access
5000000 loops, best of 5: 59.9 nsec per loop
namedtuple _make
500000 loops, best of 5: 430 nsec per loop


--
With unpatched CPython master I get:

$ ./ntbenchmark.sh 
namedtuple creation
500 loops, best of 5: 409 usec per loop
namedtuple instantiation
500000 loops, best of 5: 476 nsec per loop
namedtuple attribute access
5000000 loops, best of 5: 60 nsec per loop
namedtuple _make
1000000 loops, best of 5: 389 nsec per loop


So creating a class is about 4x faster (similar to the benchmarks various other 
people have run) and calling _make() is 10% slower. That's probably because of 
the line "if len(result) != cls._num_fields:" in my implementation, which would 
have been something like "if len(result) != 3" in the exec-based implementation.
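To make the difference concrete, here is a sketch of the two styles of length check in _make(). Both classes are hand-written illustrations with assumed names; neither is copied from the patch or from the exec-based template.

```python
class PointDynamic(tuple):
    """Non-exec style: the expected length is looked up on the class per call."""
    __slots__ = ()
    _num_fields = 2

    @classmethod
    def _make(cls, iterable):
        result = tuple.__new__(cls, iterable)
        if len(result) != cls._num_fields:   # attribute lookup on every call
            raise TypeError(f'Expected {cls._num_fields} arguments, got {len(result)}')
        return result


class PointLiteral(tuple):
    """Template style: the length is written into the source as a literal."""
    __slots__ = ()

    @classmethod
    def _make(cls, iterable):
        result = tuple.__new__(cls, iterable)
        if len(result) != 2:                 # constant compiled into the bytecode
            raise TypeError(f'Expected 2 arguments, got {len(result)}')
        return result
```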

I also cProfiled class creation with my patch. These are results for creating 
10000 3-element namedtuple classes:

         390005 function calls in 2.793 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.053    0.000    2.826    0.000 :1(make_nt)
    10000    1.099    0.000    2.773    0.000 /home/jelle/qython/cpython/Lib/collections/__init__.py:380(namedtuple)
    10000    0.948    0.000    0.981    0.000 {built-in method builtins.exec}
   100000    0.316    0.000    0.316    0.000 {method 'format' of 'str' objects}
    10000    0.069    0.000    0.220    0.000 {method 'join' of 'str' objects}
    40000    0.071    0.000    0.152    0.000 /home/jelle/qython/cpython/Lib/collections/__init__.py:439()
    10000    0.044    0.000    0.044    0.000 {built-in method builtins.repr}
    30000    0.033    0.000    0.033    0.000 {method 'startswith' of 'str' objects}
    40000    0.031    0.000    0.031    0.000 {method 'isidentifier' of 'str' objects}
    40000    0.025    0.000    0.025    0.000 {method '__contains__' of 'frozenset' objects}
    10000    0.022    0.000    0.022    0.000 {method 'replace' of 'str' objects}
    10000    0.022    0.000    0.022    0.000 {built-in method sys._getframe}
    30000    0.020    0.000    0.020    0.000 {method 'add' of 'set' objects}
    20000    0.018    0.000    0.018    0.000 {built-in method builtins.len}
    10000    0.013    0.000    0.013    0.000 {built-in method builtins.isinstance}
    10000    0.009    0.000    0.009    0.000 {method 'get' of 'dict' objects}

So about 35% of time is still spent in the exec() call to create __new__. 
Another 10% is in .format() calls, so using f-strings instead of .format() 
might also be worth it.
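A sketch of the approach being profiled: exec() is kept only for __new__, so that the constructor has a real signature, and the rest of the class is built without it. The helper name make_new and the one-line template are assumptions made for this illustration, not the text of the PR.

```python
def make_new(field_names):
    """Build a __new__ with a real signature by exec()-ing one small function."""
    arg_list = ', '.join(field_names)
    # An f-string is used here instead of str.format().
    source = f'def __new__(_cls, {arg_list}): return _tuple_new(_cls, ({arg_list},))'
    namespace = {'_tuple_new': tuple.__new__}
    exec(source, namespace)
    return namespace['__new__']

# Everything else is ordinary class creation with no exec() involved.
Point = type('Point', (tuple,), {'__slots__': (), '__new__': make_new(['x', 'y'])})
p = Point(1, y=2)
```

Because the generated function has named parameters, keyword arguments and the usual "missing argument" errors keep working.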

--




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Joe Jevnik

Joe Jevnik added the comment:

I added a benchmark suite (using Victor's perf utility) to cnamedtuple. The 
results are here: https://github.com/llllllllll/cnamedtuple#benchmarks

To summarize: type creation is much faster; instance creation and named 
attribute access are a bit faster.

--
nosy: +llllllllll




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Giampaolo Rodola'

Giampaolo Rodola' added the comment:

> Should we consider a C-based implementation like 
> https://github.com/llllllllll/cnamedtuple? 
> It could improve speed even more, but would be harder to maintain and
> test and harder to keep compatible. My sense is that it's not worth
> it unless benchmarks show a really dramatic difference.

I've just filed a ticket for this: 
https://github.com/llllllllll/cnamedtuple/issues/7

--




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
stage: resolved -> patch review




[issue28638] Optimize namedtuple creation

2017-07-18 Thread Christoph Reiter

Christoph Reiter added the comment:

Why not just do the following:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> Point._source
"from collections import namedtuple\nPoint = namedtuple('Point', ['x', 'y'])\n"
>>> 

The docs make it seem as if the primary use case of the _source attribute is
to serialize the definition. Returning a source which produces a class with
different performance/memory characteristics goes against that.

--
nosy: +lazka




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Guido van Rossum

Guido van Rossum added the comment:

Thanks Raymond and Jelle.

The bar for a reimplementation in C is much higher (though we'll have to agree 
that Jelle's version is fast enough before we reject it).

The bar for backporting this to 3.6 is much higher as well and I think it's not 
worth disturbing the peace (people depend on the craziest things staying the 
same between bugfix releases, but for feature releases they have reasons to do 
thorough testing).

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Jelle Zijlstra

Changes by Jelle Zijlstra :


--
resolution: rejected -> 




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Jelle Zijlstra

Jelle Zijlstra added the comment:

Should we consider a C-based implementation like 
https://github.com/llllllllll/cnamedtuple? It could improve speed even more, 
but would be harder to maintain and test and harder to keep compatible. My 
sense is that it's not worth it unless benchmarks show a really dramatic 
difference.

As for Raymond's list of goals, my PR now preserves _source and verbose=True 
and the test suite passes. I think the only docs change needed is in the 
description for _source 
(https://docs.python.org/3/library/collections.html#collections.somenamedtuple._source),
 which is no longer "used to create the named tuple class". I'll add that to my 
PR. I haven't done anything towards the last two goals yet.

Should the change be applied to 3.6? It is fully backwards compatible, but 
perhaps the change is too disruptive to be included in the 3.6 series at this 
point.

--
resolution:  -> rejected




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Re-opening per discussion on python-dev.

Goals:

* Extend Jelle's patch to incorporate lazy support for "_source" and "verbose" 
so that the API is unchanged from the user's point of view.

* Make sure the current test suite still passes and that the current docs 
remain valid.

* Get better measurements of benefits so we know what is actually being 
achieved.

* Test to see if there are new positive benefits for PyPy and Jython as well.
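One way the first goal (lazy "_source") could be approached is a descriptor that renders the text only when the attribute is read. This is a sketch under that assumption; the LazySource class and its abbreviated output are hypothetical and do not reproduce the real template.

```python
class LazySource:
    """Non-data descriptor that renders source text only when it is asked for."""

    def __init__(self, typename, field_names):
        self.typename = typename
        self.field_names = tuple(field_names)

    def __get__(self, instance, owner=None):
        # Hypothetical, heavily abbreviated rendering of the class source.
        fields = ', '.join(repr(name) for name in self.field_names)
        return f'class {self.typename}(tuple):\n    _fields = ({fields},)\n'


class Point(tuple):
    __slots__ = ()
    # No string is built at class-creation time; only the descriptor is stored.
    _source = LazySource('Point', ['x', 'y'])
```

With this shape the cost of building the string is paid only by code that actually reads Point._source.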

--
resolution: rejected -> 
status: closed -> open




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Guido van Rossum

Guido van Rossum added the comment:

On python-dev Raymond agreed to reopen the issue and consider Jelle's 
implementation (https://github.com/python/cpython/pull/2736).

--
nosy: +gvanrossum




[issue28638] Optimize namedtuple creation

2017-07-17 Thread STINNER Victor

STINNER Victor added the comment:

> It's possible to expose StructSeq somewhere.

Hum, when I mentioned structseq: my idea was more to reimplement
namedtuple using the existing structseq code, since structseq is well
tested and very fast.

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Giampaolo Rodola'

Changes by Giampaolo Rodola' :


--
nosy: +giampaolo.rodola




[issue28638] Optimize namedtuple creation

2017-07-17 Thread INADA Naoki

INADA Naoki added the comment:

I respect Raymond's rejection.  But I want to write down why I like Jelle's 
approach.

Currently, functools is the only module which is very popular.
But leaving this as it is means every new namedtuple makes startup time about 
0.6ms slower.

This is also a problem for applications that depend heavily on namedtuple.
Creating a namedtuple is more than 15 times slower than creating a normal 
class. That overhead is neither predictable nor reasonable.
More than once I have profiled application startup time and found that
namedtuple accounts for a non-negligible percentage.
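A rough way to reproduce this kind of comparison on your own interpreter is sketched below. The ratio depends heavily on the Python version (namedtuple creation was reworked after this thread), so no expected numbers are given; the helper names are made up.

```python
import timeit
from collections import namedtuple

def make_plain_class():
    # Creating an ordinary (empty) tuple subclass.
    return type('Point', (tuple,), {'__slots__': ()})

def make_namedtuple():
    return namedtuple('Point', ['x', 'y'])

# Each measurement is the total time in seconds for 1000 creations.
plain = min(timeit.repeat(make_plain_class, number=1000, repeat=3))
named = min(timeit.repeat(make_namedtuple, number=1000, repeat=3))

print(f'plain class: {plain * 1000:.1f} usec per creation')
print(f'namedtuple:  {named * 1000:.1f} usec per creation')
print(f'ratio:       {named / plain:.1f}x')
```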

It's possible to keep `_source` with Jelle's approach. `_source` can be an 
equivalent source rather than the exact source that was eval()ed.
I admit it's not ideal. But all namedtuple users
and all Python implementations can benefit from it.

It's possible to expose StructSeq somewhere.  It can make it faster to
import `functools`.
But it's ugly too if applications and libraries try it first
and fall back to namedtuple.
And when it is used widely, other Python implementations will be forced
to implement it.

That's why I want collections.namedtuple overhead to be reasonable and 
predictable.

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread STINNER Victor

STINNER Victor added the comment:

> So structseq is 1.9x faster than namedtuple to get an attribute by name.

Oops, I wrote it backward: So namedtuple is 1.9x slower than structseq to get 
an attribute by name.

(1.9x slower doesn't mean 1.9x faster, sorry.)

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread STINNER Victor

STINNER Victor added the comment:

> Speed isn't everything, and it certainly isn't adequate justification for 
> breaking public APIs that have been around for years.

What about the memory usage?

> See my old issue #19640 (...)

msg203271:

"""
I found this issue while using my tracemalloc module to analyze the memory 
consumption of Python. On the Python test suite, the _source attribute is the 
5th line allocating the most memory:

/usr/lib/python3.4/collections/__init__.py: 676.2 kB
"""
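For anyone who wants to repeat this kind of measurement, a minimal tracemalloc sketch follows. The number of classes (200) and the field names are arbitrary choices for the example, and the output will differ between Python versions.

```python
import tracemalloc
from collections import namedtuple

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Create a batch of namedtuple classes so their allocations show up clearly.
classes = [namedtuple(f'T{i}', ['x', 'y', 'z']) for i in range(200)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the memory growth by source line and print the biggest contributors.
top = after.compare_to(before, 'lineno')
for stat in top[:5]:
    print(stat)
```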

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread STINNER Victor

STINNER Victor added the comment:

Benchmark comparing collections.namedtuple to structseq, to get an attribute:

* Getting an attribute by name (obj.attr):
  Mean +- std dev: [name_structseq] 24.1 ns +- 0.5 ns -> [name_namedtuple] 45.7 
ns +- 1.9 ns: 1.90x slower (+90%)
* Getting an attribute by its integer index (obj[0]):
  (not significant)

So structseq is 1.9x faster than namedtuple to get an attribute by name.


haypo@speed-python$ ./bin/python3  -m perf timeit -s "from collections import 
namedtuple; Point=namedtuple('Point', 'x y'); p=Point(1,2)" "p.x" 
--duplicate=1024 -o name_namedtuple.json
Mean +- std dev: 45.7 ns +- 1.9 ns
haypo@speed-python$ ./bin/python3  -m perf timeit -s "from collections import 
namedtuple; Point=namedtuple('Point', 'x y'); p=Point(1,2)" "p[0]" 
--duplicate=1024 -o int_namedtuple.json
Mean +- std dev: 17.6 ns +- 0.0 ns


haypo@speed-python$ ./bin/python3  -m perf timeit -s "from sys import flags" 
"flags.debug" --duplicate=1024 -o name_structseq.json
Mean +- std dev: 24.1 ns +- 0.5 ns
haypo@speed-python$ ./bin/python3  -m perf timeit -s "from sys import flags" 
"flags[0]" --duplicate=1024 -o int_structseq.json
Mean +- std dev: 17.6 ns +- 0.2 ns

---

Getting an attribute by its integer index is as fast as tuple:

haypo@speed-python$ ./bin/python3  -m perf timeit --inherit=PYTHONPATH -s 
"p=(1,2)" "p[0]" --duplicate=1024 -o int_tuple.json
.
Mean +- std dev: 17.6 ns +- 0.0 ns
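The same four measurements can be approximated with the standard timeit module, as sketched here. This is not the perf tool used above, it is less precise, and on newer CPython versions the gap between namedtuple and structseq may look different.

```python
import sys
import timeit
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(1, 2)
flags = sys.flags          # a struct sequence implemented in C

statements = [
    ('namedtuple by name', 'p.x'),
    ('namedtuple by index', 'p[0]'),
    ('structseq by name', 'flags.debug'),
    ('structseq by index', 'flags[0]'),
]

results = {}
for label, stmt in statements:
    best = min(timeit.repeat(stmt, globals={'p': p, 'flags': flags},
                             number=100_000, repeat=3))
    results[label] = best / 100_000 * 1e9   # nanoseconds per access
    print(f'{label}: {results[label]:.1f} ns')
```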

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Nick Coghlan

Nick Coghlan added the comment:

Yes, I'm saying you need a really long justification to explain why you want to 
break backwards compatibility solely for a speed increase.

For namedtuple instances, the leading underscore does *NOT* indicate a private 
attribute - it's just there to avoid colliding with field names.
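A short example of why the underscore matters: because the helper API lives under underscore-prefixed names, words like "make" or "replace" stay available as field names. The Job type below is invented for this illustration.

```python
from collections import namedtuple

# 'make', 'replace', 'asdict' and 'fields' are all legal field names because
# the public helpers are spelled _make, _replace, _asdict and _fields.
Job = namedtuple('Job', ['make', 'replace', 'asdict', 'fields'])

job = Job._make(['gcc', True, False, 4])
updated = job._replace(fields=8)
```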

Speed isn't everything, and it certainly isn't adequate justification for 
breaking public APIs that have been around for years.

Now, you can either escalate that argument to python-dev, and try to convince 
Guido to overrule Raymond on this point, *or* you can look at working out a 
Python level API to dynamically define PyStructSequence subclasses. That won't 
be entirely straightforward (as my recollection is that structseq is designed 
to build on static C structs), but if you're successful, it will give you 
something that should be faster than namedtuple in every way, not just at 
definition time.

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread STINNER Victor

STINNER Victor added the comment:

Sorry, I don't have much data at this point, but it's not the first time that I 
noticed that namedtuple is super slow. We have much more efficient code like 
structseq in C. Why not reuse it at least in our stdlib modules?

About the _source attribute, honestly, I'm not aware of anyone using it. I 
don't think that the fact that a *private* attribute is documented should 
prevent us from making Python faster.

I already noticed the _source attribute when I studied the Python memory usage. 
See my old issue #19640: "Drop _source attribute of namedtuple (waste memory)", 
I later changed the title to "Dynamically generate the _source attribute of 
namedtuple to save memory".

About "Python startup time doesn't matter", this is just plain wrong. Multiple 
core developers spent a lot of time on optimizing exactly that. Tell me if you 
really need a long rationale to work on that.

While I'm not sure about Naoki's exact optimization, I agree about the issue 
title: "Optimize namedtuple creation", and I like the idea of keeping the issue 
open to find a solution.

--
nosy: +haypo




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Nick Coghlan

Nick Coghlan added the comment:

Check the issue history - the issue has been rejected by Raymond, and then 
reopened for further debate by other core developers multiple times.

That's not a reasonable approach to requesting reconsideration of a module/API 
maintainer's design decision.

I acknowledge that those earlier reopenings weren't by you, but the issue 
should still remain closed until *Raymond* agrees to reconsider it (and given 
the alternative option of instead making the lower overhead PyStructSequence 
visible at the Python level, I'd be surprised if he does).

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Nick Coghlan

Nick Coghlan added the comment:

There's a path for escalation when you disagree with the decision of a 
module/API maintainer (in this case, Raymond): bringing the issue closure up on 
python-dev for wider discussion.

It *isn't* repeatedly reopening the issue after they have already made their 
decision and attempting to pester them into changing their mind.

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Just because I disagree with you doesn't mean I'm pestering anyone.  Can you 
stop being so obnoxious?

--




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Nick Coghlan

Nick Coghlan added the comment:

So unless and until he gets overruled by Guido, Raymond's decision to reject 
the proposed change stands.

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Nick, can you stop closing an issue where the discussion hasn't been settled?  
This isn't civil.

--
resolution: rejected -> 
stage: resolved -> 
status: closed -> open




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Nick Coghlan

Nick Coghlan added the comment:

Folks, you're talking about removing a *public*, *documented* API from the 
standard library. The onus would thus be on you to prove *lack* of use, *and* 
provide adequate justification for the compatibility break, not on anyone else 
to prove that it's "sufficiently popular" to qualify for the standard backwards 
compatibility guarantees. Those guarantees apply by default and are only broken 
for compelling reasons - that's why we call them guarantees.

Don't be fooled by the leading underscore - that's an artifact of how 
namedtuple avoids colliding with arbitrary field names, not an indicator that 
this is a private API: 
https://docs.python.org/3/library/collections.html#collections.somenamedtuple._source

"It would be faster" isn't adequate justification, since speed increases only 
matter in code that has been identified as a bottleneck, and startup time in 
general (let alone namedtuple definitions in particular) is rarely the 
bottleneck.

So please, just stop, and find a more productive way of expending your energy 
(such as by making PyStructSequence available via the "types" module, since 
that also allows for C level micro-optimizations when *used*, not just at 
definition time).

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I disagree with the rejection of this request.  The idea that "_source is an 
essential feature" should be backed by usage statistics instead of being 
hand-waved as rejection cause.

--
nosy: +pitrou
resolution: rejected -> 
stage: resolved -> 
status: closed -> open




[issue28638] Optimize namedtuple creation

2017-07-17 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed




[issue28638] Optimize namedtuple creation

2017-07-16 Thread Nick Coghlan

Nick Coghlan added the comment:

I agree with Raymond here - the standard library's startup benchmarks are *NOT* 
normal code execution paths, since normal code execution is dominated by the 
actual operation being performed, and hence startup micro-optimizations vanish 
into the noise.

Accordingly, we should *not* be redesigning existing standard interfaces simply 
for the sake of allowing them to be used during startup without significantly 
slowing down the interpreter startup benchmark.

By contrast, it *is* entirely OK to introduce specialised types specifically 
for internal use (including during startup), and only making them available at 
the Python level through the types module (e.g. types.MappingProxyType, 
types.SimpleNamespace).

At the moment, the internal PyStructSequence type used to define sys.flags, 
sys.version_info, etc *isn't* exposed that way, so efforts to allow the use of 
namedtuple-style interfaces in modules that don't want to use namedtuple itself 
would likely be better directed towards making that established type available 
and usable through the types module, rather than towards altering namedtuple.
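For context, this is what the existing struct sequences and the types-module helpers named above look like from Python code. The sketch only uses objects that already exist; it does not show a way to define new struct sequence types from Python, which is the missing piece being discussed.

```python
import sys
import types

info = sys.version_info            # a struct sequence, defined in C
major_by_name = info.major         # attribute access by name
major_by_index = info[0]           # the same value by position

# Struct sequences are tuple subclasses, just like named tuples.
is_tuple = isinstance(info, tuple)

# Types that are already exposed through the types module.
ns = types.SimpleNamespace(x=1, y=2)
proxy = types.MappingProxyType({'x': 1})
```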

That approach would have the potential to solve both the interpreter startup 
optimisation problem (as the "types" module mainly just exposes things defined 
by the interpreter implementation, not new Python level classes), *and* provide 
an alternate option for folks that have pre-emptively decided that namedtuple 
is going to be "too slow" for their purposes without actually measuring the 
relative performance in the context of their application.

--




[issue28638] Optimize namedtuple creation

2017-07-16 Thread Raymond Hettinger

Raymond Hettinger added the comment:

> creates only one necessary backwards compatibility break 
> (we no longer have _source).

IMO, this is an essential feature.  It allows people to easily build their own 
variants, to divorce the generated code from the generator, and to fully 
understand what named tuples do (that is in part why we get so few questions 
about how they work).

You all seem to be in a rush to redesign code that has been stable and has 
served the needs of users well for a very long time.  This all seems to be 
driven by a relentless desire for micro-optimizations regardless of actual need.

BTW, none of the new contributors seem to be aware of named tuple's history.  
It was an amalgamation of many separate implementations that had sprung up in 
the wild (it was being reinvented many times).  It was posted as an ASPN recipe 
and went through a long period of maturation that incorporated the suggestions 
of over a dozen engineers based on use in the field.  It went through further 
refinement when examined and discussed on python-dev, incorporating reviews 
from Guido, Alex, and Tim.  Since that time, the tool has been broadly 
deployed and has met the needs of enormous numbers of users. Its use is 
considered a best practice.  The code and API have been maintained and improved 
at an intentionally slow and careful pace.

I really, really do not want to significantly revise the stable code and 
undermine the premise of its implementation so that you can save a few 
micro-seconds in the load of some module.  That is contrary to our optimization 
philosophy for CPython.  

As is, the code is very understandable, easy to maintain, easy to understand, 
easy to create variants, easy to verify that it is bug free. It works great for 
CPython, IronPython, PyPy, and Jython without modification.

--




[issue28638] Optimize namedtuple creation

2017-07-16 Thread Jelle Zijlstra

Changes by Jelle Zijlstra :


--
pull_requests: +2796




[issue28638] Optimize namedtuple creation

2017-07-16 Thread INADA Naoki

INADA Naoki added the comment:

I like your idea.  Would you make a pull request?

--
resolution: rejected -> 
status: closed -> open
title: Creating namedtuple is too slow to be used in common stdlib (e.g. 
functools) -> Optimize namedtuple creation
