[issue26280] ceval: Optimize list

2016-02-29 Thread Zach Byrne

Zach Byrne added the comment:

The new patch "subscr2" removes the tuple block, and addresses Victor's 
comments. This one looks a little faster, down to 0.0215 usec for the same test.

--
Added file: http://bugs.python.org/file42049/subscr2.patch

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26280>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26280] ceval: Optimize list

2016-02-18 Thread Zach Byrne

Zach Byrne added the comment:

Is it worth handling the exception, or should we just let it take the slow 
path and get caught by PyObject_GetItem()? We're still making sure the index 
is in bounds.

Also, where would be an appropriate place to put a macro for adjusting negative 
indices?
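
The control flow being discussed (adjust a negative index, check bounds, otherwise defer to the generic lookup) can be modelled in a few lines of Python. This is only a sketch of the idea, not the C patch; the name `subscr_fast_path` is made up for illustration.

```python
def subscr_fast_path(container, index):
    """Model of a list[int] fast path that falls back to the generic lookup."""
    # Only exact lists with exact int indices qualify for the fast path.
    if type(container) is list and type(index) is int:
        i = index
        if i < 0:
            # Adjust a negative index once; in C this is what a shared
            # macro would do (hypothetical, no such macro is named here).
            i += len(container)
        if 0 <= i < len(container):
            return list.__getitem__(container, i)
        # Out of bounds: do not raise here, fall through to the slow path.
    # Slow path: the generic lookup raises IndexError/KeyError itself.
    return container[index]
```

Letting the out-of-bounds case fall through keeps the error handling in one place, at the cost of a second bounds check on the failing path.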

--




[issue26280] ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7

2016-02-16 Thread Zach Byrne

Zach Byrne added the comment:

Here's a patch that looks like Victor's from the duplicate, but with tuples 
covered as well. I ran some straightforward micro-benchmarks but haven't 
bothered running the benchmark suite yet. Unsurprisingly, the optimized paths 
are faster, and the others take a penalty.

[0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = [1,2,3,4,5,6]" "l[3]"
1000 loops, best of 3: 0.0306 usec per loop
[0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = [1,2,3,4,5,6]" "l[3]"
1000 loops, best of 3: 0.0243 usec per loop

[0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = (1,2,3,4,5,6)" "l[3]"
1000 loops, best of 3: 0.0291 usec per loop
[0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = (1,2,3,4,5,6)" "l[3]"
1000 loops, best of 3: 0.0241 usec per loop

[0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = 'asdfasdf'" "l[3]"
1000 loops, best of 3: 0.034 usec per loop
[0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = 'asdfasdf'" "l[3]"
1000 loops, best of 3: 0.0366 usec per loop

[0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = [1,2,3,4,5,6]" "l[:3]"
1000 loops, best of 3: 0.124 usec per loop
[0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = [1,2,3,4,5,6]" "l[:3]"
1000 loops, best of 3: 0.125 usec per loop
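
For anyone repeating these measurements, the four command lines can be driven from one script with the standard `timeit` module. This is a convenience sketch; the helper names and the default loop count are my own choices, and it has to be run once under each binary to compare them.

```python
import timeit

# (setup, statement) pairs copied from the command lines above.
CASES = {
    "list index": ("l = [1,2,3,4,5,6]", "l[3]"),
    "tuple index": ("l = (1,2,3,4,5,6)", "l[3]"),
    "str index": ("l = 'asdfasdf'", "l[3]"),
    "list slice": ("l = [1,2,3,4,5,6]", "l[:3]"),
}


def best_usec_per_loop(stmt, setup, number=100_000, repeat=3):
    """Return the best-of-`repeat` time per loop, in microseconds."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


def run_cases(number=100_000):
    """Time every case and return {case name: usec per loop}."""
    return {
        name: best_usec_per_loop(stmt, setup, number=number)
        for name, (setup, stmt) in CASES.items()
    }


if __name__ == "__main__":
    for name, usec in run_cases().items():
        print(f"{name}: {usec:.4f} usec per loop")
```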

--
Added file: http://bugs.python.org/file41939/subscr1.patch




[issue26280] ceval: Optimize [] operation similarly to CPython 2.7

2016-02-05 Thread Zach Byrne

Zach Byrne added the comment:

I'm attaching output from a selection of the benchmarks. I'm counting 
non-builtins and slices, but for everything, not just lists and tuples.

Quick observation: math workloads seem list heavy, text workloads seem dict 
heavy, and tuples are usually somewhere in the middle.

--
Added file: http://bugs.python.org/file41826/subscr_stats.txt




[issue26280] ceval: Optimize [] operation similarly to CPython 2.7

2016-02-04 Thread Zach Byrne

Zach Byrne added the comment:

OK, I've started on the instrumenting. Thanks for that head start; it would 
have taken me a while to figure out where to call the stats dump function from. 
Fun fact: BINARY_SUBSCR is called 717 times when starting Python.

--




[issue26280] ceval: Optimize [] operation similarly to CPython 2.7

2016-02-04 Thread Zach Byrne

Zach Byrne added the comment:

I'll put together something comprehensive in a bit, but here's a quick preview:

$ ./python
Python 3.6.0a0 (default, Feb  4 2016, 20:08:03) 
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
Total BINARY_SUBSCR calls: 726
List BINARY_SUBSCR calls: 36
Tuple BINARY_SUBSCR calls: 103
Dict BINARY_SUBSCR calls: 227
Unicode BINARY_SUBSCR calls: 288
Bytes BINARY_SUBSCR calls: 68
[-1] BINARY_SUBSCR calls: 0

$ python bm_elementtree.py -n 100 --timer perf_counter
...[snip]...
Total BINARY_SUBSCR calls: 1078533
List BINARY_SUBSCR calls: 513
Tuple BINARY_SUBSCR calls: 1322
Dict BINARY_SUBSCR calls: 1063075
Unicode BINARY_SUBSCR calls: 13150
Bytes BINARY_SUBSCR calls: 248
[-1] BINARY_SUBSCR calls: 0

Lib/test$ ../../python -m unittest discover
...[snip]...^C <== I got bored waiting
KeyboardInterrupt
Total BINARY_SUBSCR calls:  4732885
List BINARY_SUBSCR calls:   1418730
Tuple BINARY_SUBSCR calls:  1300717
Dict BINARY_SUBSCR calls:   1151766
Unicode BINARY_SUBSCR calls: 409924
Bytes BINARY_SUBSCR calls:   363029
[-1] BINARY_SUBSCR calls: 26623

So dict seems to be the winner here
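
To compare the three runs on the same footing, the raw counts can be turned into shares of the total. The sketch below only re-uses the figures printed above; the dictionary layout and function names are illustrative.

```python
# BINARY_SUBSCR counts copied from the three runs above.
RUNS = {
    "interpreter startup": {
        "total": 726, "list": 36, "tuple": 103,
        "dict": 227, "unicode": 288, "bytes": 68,
    },
    "bm_elementtree": {
        "total": 1078533, "list": 513, "tuple": 1322,
        "dict": 1063075, "unicode": 13150, "bytes": 248,
    },
    "unittest discover (interrupted)": {
        "total": 4732885, "list": 1418730, "tuple": 1300717,
        "dict": 1151766, "unicode": 409924, "bytes": 363029,
    },
}


def shares(counts):
    """Return each container type's fraction of the total calls."""
    total = counts["total"]
    return {kind: n / total for kind, n in counts.items() if kind != "total"}


def top_type(counts):
    """Return the container type with the largest share."""
    fractions = shares(counts)
    return max(fractions, key=fractions.get)


if __name__ == "__main__":
    for run, counts in RUNS.items():
        kind = top_type(counts)
        print(f"{run}: {kind} ({shares(counts)[kind]:.1%})")
```

Run on the figures above, this reports unicode as the largest share at startup, dict for bm_elementtree, and list for the interrupted test run.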

--
keywords: +patch
Added file: http://bugs.python.org/file41814/26280_stats.diff




[issue26280] ceval: Optimize [] operation similarly to CPython 2.7

2016-02-04 Thread Zach Byrne

Zach Byrne added the comment:

One thing I forgot to do was count slices.

--




[issue26280] ceval: Optimize [] operation similarly to CPython 2.7

2016-02-03 Thread Zach Byrne

Zach Byrne added the comment:

Yury,
Are you going to tackle this one, or would you like me to?

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne

Zach Byrne added the comment:

> Could you please take a look at the updated patch?
Looks ok to me, for whatever that's worth.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21955>
___



[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne

Zach Byrne added the comment:

> I don't understand what this table means (why 4 columns?). Can you explain 
> what you did?

Yury suggested running perf.py twice with the binaries swapped. So "faster" 
and "slower" underneath "Baseline Reference" are runs where the unmodified 
python binary was the first argument to perf.py, and "Modified Reference" is 
where the patched binary was the first argument.

i.e. "perf.py -r -b all python patched_python" vs. "perf.py -r -b all 
patched_python python"

bench_results.txt has the actual output in it, and the "slower in the right 
column" comment was referring to the contents of that file, not the table. 
Sorry for the confusion.
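
To make the flipping concrete, here is a tiny sketch of how a verdict from either run maps onto "what happened to the patched build". It assumes perf.py describes its second binary relative to its first; the function name and the "baseline"/"modified" labels are mine.

```python
def verdict_for_patched(verdict, first_argument):
    """Translate a perf.py verdict into a statement about the patched build.

    Assumption: perf.py reports how its second binary compares to its first.
    `first_argument` is "baseline" (unmodified python first) or "modified"
    (patched python first).
    """
    if verdict not in ("faster", "slower"):
        raise ValueError(f"unexpected verdict: {verdict!r}")
    if first_argument == "baseline":
        # The patched build is the one being described; keep the verdict.
        return verdict
    if first_argument == "modified":
        # The unmodified build is being described; flip for the patched one.
        return "slower" if verdict == "faster" else "faster"
    raise ValueError(f"unexpected first argument: {first_argument!r}")
```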

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-03 Thread Zach Byrne

Zach Byrne added the comment:

I ran 6 benchmarks on my work machine (not the same one as the last set) 
overnight: two with just the BINARY_ADD change, two with the BINARY_SUBSCR 
change, and two with both.
I'm attaching the output from all my benchmark runs, but here are the 
highlights.
In this table I've flipped the results for running the modified build as the 
reference, but in the new attachment, slower in the right column means faster, 
I think :)
|---------------|------------------------------------|----------------------------------|
| Build         | Baseline Reference                 | Modified Reference               |
|---------------|-------------------|----------------|-------------------|--------------|
|               | Faster            | Slower         | Faster            | Slower       |
|---------------|-------------------|----------------|-------------------|--------------|
| BINARY_ADD    | chameleon_v2      | etree_parse    | chameleon_v2      | call_simple  |
|               | chaos             | nbody          | fannkuch          | nbody        |
|               | django            | normal_startup | normal_startup    | pickle_dict  |
|               | etree_generate    | pickle_dict    | nqueens           | regex_v8     |
|               | fannkuch          | pickle_list    | regex_compile     |              |
|               | formatted_logging | regex_effbot   | spectral_norm     |              |
|               | go                |                | unpickle_list     |              |
|               | json_load         |                |                   |              |
|               | regex_compile     |                |                   |              |
|               | simple_logging    |                |                   |              |
|               | spectral_norm     |                |                   |              |
|---------------|-------------------|----------------|-------------------|--------------|
| BINARY_SUBSCR | chameleon_v2      | call_simple    | 2to3              | etree_parse  |
|               | chaos             | go             | call_method_slots | json_dump_v2 |
|               | etree_generate    | pickle_list    | chaos             | pickle_dict  |
|               | fannkuch          | telco          | fannkuch          |              |
|               | fastpickle        |                | formatted_logging |              |
|               | hexiom2           |                | go                |              |
|               | json_load         |                | hexiom2           |              |
|               | mako_v2           |                | mako_v2           |              |
|               | meteor_contest    |                | meteor_contest    |              |
|               | nbody             |                | nbody             |              |
|               | regex_v8          |                | normal_startup    |              |
|               | spectral_norm     |                | nqueens           |              |
|               |                   |                | pickle_list       |              |
|               |                   |                | simple_logging    |              |
|               |                   |                | spectral_norm     |              |
|               |                   |                | telco             |              |
|---------------|-------------------|----------------|-------------------|--------------|
| BOTH          | chameleon_v2      | call_simple    | chameleon_v2      | fastpickle   |
|               | chaos             | etree_parse    | chaos             | pickle_dict  |
|               | etree_generate    | pathlib        | etree_generate    | pickle_list  |
|               | etree_process     | pickle_list    | etree_process     | telco        |
|               | fannkuch          |                | fannkuch          |              |
|               | fastunpickle      |                | float             |              |
|               | float             |                | formatted_logging |              |
|               | formatted_logging |                | go                |              |
|               | hexiom2           |                | hexiom2           |              |
|               | nbody             |                | nbody             |              |
|               | nqueens           |                | normal_startup    |              |
|               | regex_v8          |                | nqueens           |              |
|               | spectral_norm

[issue21955] ceval.c: implement fast path for integers with a single digit

2016-02-02 Thread Zach Byrne

Zach Byrne added the comment:

I took another look at this, and tried applying it to 3.6 and running the 
latest benchmarks. It applied cleanly, and the benchmark results were similar; 
this time unpack_sequence and spectral_norm were slower. Spectral norm makes 
sense: it's doing lots of FP addition. The unpack_sequence instruction looks 
like it already has optimizations for unpacking lists and tuples onto the 
stack, and running dis on the test showed that it's completely dominated by 
calls to unpack_sequence, load_fast, and store_fast, so I still don't know 
what's going on there.
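
For reference, this is roughly how such a check can be done with the `dis` module. The function below is a stand-in I wrote for illustration, not the benchmark's actual code, and the exact opcode mix varies between Python versions.

```python
import dis
from collections import Counter


def unpack_pair(seq):
    # Tuple unpacking of a non-constant value compiles to UNPACK_SEQUENCE.
    a, b = seq
    return a, b


def opcode_histogram(func):
    """Count how often each opcode name appears in func's bytecode."""
    return Counter(ins.opname for ins in dis.get_instructions(func))


if __name__ == "__main__":
    for opname, count in opcode_histogram(unpack_pair).most_common():
        print(f"{opname}: {count}")
```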

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2016-01-11 Thread Zach Byrne

Zach Byrne added the comment:

Anybody still looking at this? I can take another stab at it if it's still in 
scope.

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2016-01-11 Thread Zach Byrne

Zach Byrne added the comment:

> Can you figure why unpack_sequence and other benchmarks were slower?
I didn't look really closely. A few of the slower ones were floating-point 
heavy, which would incur the slow path penalty, but I can dig into 
unpack_sequence.
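
The "slow path penalty" can be pictured with a small Python model: every addition pays for the type check, but only small ints benefit from it. This is a sketch of the idea, not the patch; `DIGIT_BITS = 30` is an assumption (the digit size on typical 64-bit builds) and the function names are made up.

```python
DIGIT_BITS = 30  # assumption: 30-bit digits, as on typical 64-bit builds


def is_single_digit(value):
    """True for exact ints whose magnitude fits in one internal digit."""
    return type(value) is int and abs(value) < (1 << DIGIT_BITS)


def add_with_fast_path(left, right, counters):
    """Add two objects, recording which path the operation took."""
    if is_single_digit(left) and is_single_digit(right):
        counters["fast"] += 1
        return left + right  # stands in for the inlined C addition
    # Floats, big ints and everything else paid for the check above
    # and still go through the generic addition.
    counters["slow"] += 1
    return left + right
```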

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2015-03-18 Thread Zach Byrne

Zach Byrne added the comment:

I haven't looked at it since I posted the benchmark results for 21955_2.patch.

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2014-07-23 Thread Zach Byrne

Zach Byrne added the comment:

I ran the whole benchmark suite. There are a few that are slower: 
call_method_slots, float, pickle_dict, and unpack_sequence.

Report on Linux zach-vbox 3.2.0-24-generic-pae #39-Ubuntu SMP Mon May 21 18:54:21 UTC 2012 i686 i686
Total CPU cores: 1

### 2to3 ###
24.789549 - 24.809551: 1.00x slower

### call_method_slots ###
Min: 1.743554 - 1.780807: 1.02x slower
Avg: 1.751735 - 1.792814: 1.02x slower
Significant (t=-26.32)
Stddev: 0.00576 - 0.01823: 3.1660x larger

### call_method_unknown ###
Min: 1.828094 - 1.739625: 1.05x faster
Avg: 1.852225 - 1.806721: 1.03x faster
Significant (t=2.28)
Stddev: 0.01874 - 0.24320: 12.9783x larger

### call_simple ###
Min: 1.353581 - 1.263386: 1.07x faster
Avg: 1.397946 - 1.302046: 1.07x faster
Significant (t=24.28)
Stddev: 0.03667 - 0.03154: 1.1629x smaller

### chaos ###
Min: 1.199377 - 1.115550: 1.08x faster
Avg: 1.230859 - 1.146573: 1.07x faster
Significant (t=16.24)
Stddev: 0.02663 - 0.02525: 1.0544x smaller

### django_v2 ###
Min: 2.682884 - 2.633110: 1.02x faster
Avg: 2.747521 - 2.690486: 1.02x faster
Significant (t=9.90)
Stddev: 0.02744 - 0.03010: 1.0970x larger

### fastpickle ###
Min: 1.751475 - 1.597340: 1.10x faster
Avg: 1.771805 - 1.613533: 1.10x faster
Significant (t=64.81)
Stddev: 0.01177 - 0.01263: 1.0727x larger

### float ###
Min: 1.254858 - 1.293067: 1.03x slower
Avg: 1.336045 - 1.365787: 1.02x slower
Significant (t=-3.30)
Stddev: 0.04851 - 0.04135: 1.1730x smaller

### json_dump_v2 ###
Min: 17.871819 - 16.968647: 1.05x faster
Avg: 18.428747 - 17.483397: 1.05x faster
Significant (t=4.10)
Stddev: 1.60617 - 0.27655: 5.8078x smaller

### mako ###
Min: 0.241614 - 0.231678: 1.04x faster
Avg: 0.253730 - 0.240585: 1.05x faster
Significant (t=8.93)
Stddev: 0.01912 - 0.01327: 1.4417x smaller

### mako_v2 ###
Min: 0.225664 - 0.213179: 1.06x faster
Avg: 0.234850 - 0.225984: 1.04x faster
Significant (t=10.12)
Stddev: 0.01379 - 0.01391: 1.0090x larger

### meteor_contest ###
Min: 0.777612 - 0.758924: 1.02x faster
Avg: 0.799580 - 0.780897: 1.02x faster
Significant (t=3.97)
Stddev: 0.02482 - 0.02212: 1.1221x smaller

### nbody ###
Min: 0.969724 - 0.883935: 1.10x faster
Avg: 0.996416 - 0.918375: 1.08x faster
Significant (t=12.65)
Stddev: 0.02426 - 0.03627: 1.4951x larger

### nqueens ###
Min: 1.142745 - 1.128195: 1.01x faster
Avg: 1.296659 - 1.162443: 1.12x faster
Significant (t=2.75)
Stddev: 0.34462 - 0.02680: 12.8578x smaller

### pickle_dict ###
Min: 1.433264 - 1.467394: 1.02x slower
Avg: 1.468122 - 1.506908: 1.03x slower
Significant (t=-7.20)
Stddev: 0.02695 - 0.02691: 1.0013x smaller

### raytrace ###
Min: 5.454853 - 5.538799: 1.02x slower
Avg: 5.530943 - 5.676983: 1.03x slower
Significant (t=-8.64)
Stddev: 0.05152 - 0.10791: 2.0947x larger

### regex_effbot ###
Min: 0.205875 - 0.194776: 1.06x faster
Avg: 0.28 - 0.198759: 1.06x faster
Significant (t=5.10)
Stddev: 0.01305 - 0.01112: 1.1736x smaller

### regex_v8 ###
Min: 0.141628 - 0.133819: 1.06x faster
Avg: 0.147024 - 0.140053: 1.05x faster
Significant (t=2.72)
Stddev: 0.01163 - 0.01388: 1.1933x larger

### richards ###
Min: 0.734472 - 0.727501: 1.01x faster
Avg: 0.760795 - 0.743484: 1.02x faster
Significant (t=3.50)
Stddev: 0.02778 - 0.02127: 1.3061x smaller

### silent_logging ###
Min: 0.344678 - 0.336087: 1.03x faster
Avg: 0.357982 - 0.347361: 1.03x faster
Significant (t=2.76)
Stddev: 0.01992 - 0.01852: 1.0755x smaller

### simple_logging ###
Min: 1.104831 - 1.072921: 1.03x faster
Avg: 1.146844 - 1.117068: 1.03x faster
Significant (t=4.02)
Stddev: 0.03552 - 0.03848: 1.0833x larger

### spectral_norm ###
Min: 1.710336 - 1.688910: 1.01x faster
Avg: 1.872578 - 1.738698: 1.08x faster
Significant (t=2.35)
Stddev: 0.40095 - 0.03331: 12.0356x smaller

### tornado_http ###
Min: 0.849374 - 0.852209: 1.00x slower
Avg: 0.955472 - 0.916075: 1.04x faster
Significant (t=4.82)
Stddev: 0.07059 - 0.04119: 1.7139x smaller

### unpack_sequence ###
Min: 0.30 - 0.20: 1.52x faster
Avg: 0.000164 - 0.000174: 1.06x slower
Significant (t=-13.11)
Stddev: 0.00011 - 0.00013: 1.2256x larger

### unpickle_list ###
Min: 1.333952 - 1.212805: 1.10x faster
Avg: 1.373228 - 1.266677: 1.08x faster
Significant (t=16.32)
Stddev: 0.02894 - 0.03597: 1.2428x larger
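
The "N.NNx faster/slower" figures can be re-derived from the two timings on each line. The sketch below assumes each line reads "baseline - patched" (inferred from the figures, not from perf.py's source); the function name is illustrative.

```python
def describe_change(base, patched):
    """Format the ratio between two timings the way the report above does."""
    if patched <= base:
        return f"{base / patched:.2f}x faster"
    return f"{patched / base:.2f}x slower"


if __name__ == "__main__":
    # Min timings copied from the report above.
    print("call_method_slots:", describe_change(1.743554, 1.780807))
    print("call_simple:", describe_change(1.353581, 1.263386))
    print("fastpickle:", describe_change(1.751475, 1.597340))
```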

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2014-07-21 Thread Zach Byrne

Zach Byrne added the comment:

I did something similar to BINARY_SUBSCR after looking at the 2.7 source as 
Raymond suggested. Hopefully I got my binaries straight this time :) The new 
patch includes Victor's inlining and my new subscript changes.

Platform of campaign orig:
Python version: 3.5.0a0 (default:c8ce5bca0fcd+, Jul 15 2014, 18:11:28) [GCC 4.6.3]
Timer precision: 6 ns
Date: 2014-07-21 20:28:30

Platform of campaign patch:
Python version: 3.5.0a0 (default:c8ce5bca0fcd+, Jul 21 2014, 20:21:20) [GCC 4.6.3]
Timer precision: 20 ns
Date: 2014-07-21 20:28:39

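-------------------+-------------+---------------
Tests              |        orig |          patch
-------------------+-------------+---------------
1+2                |  118 ns (*) |  103 ns (-13%)
1+2 ran 100 times  | 7.28 us (*) | 5.93 us (-19%)
x[1]               |  120 ns (*) |   98 ns (-19%)
x[1] ran 100 times | 7.35 us (*) | 5.31 us (-28%)
-------------------+-------------+---------------
Total              | 14.9 us (*) | 11.4 us (-23%)
-------------------+-------------+---------------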

--
Added file: http://bugs.python.org/file36021/21955_2.patch




[issue21955] ceval.c: implement fast path for integers with a single digit

2014-07-16 Thread Zach Byrne

Zach Byrne added the comment:

Well, don't I feel silly. I confirmed both my regression and the inline speedup 
using the benchmark Victor added. I wonder if I got my binaries backwards in my 
first test...

--




[issue21955] ceval.c: implement fast path for integers with a single digit

2014-07-15 Thread Zach Byrne

Zach Byrne added the comment:

So I'm trying something pretty similar to Victor's pseudo-code and just using 
timeit to look for speedups:
timeit('x+x', 'x=10', number=1000)
before:
1.193423141393
1.1988609210002323
1.1998214110003573
1.206968028999654
1.2065417159997196

after:
1.1698650090002047
1.170515890227
1.1752884750003432
1.174481861933
1.1741297110002051
1.176042264782

Small improvement. Haven't looked at optimizing BINARY_SUBSCR yet.

--
keywords: +patch
nosy: +zbyrne
Added file: http://bugs.python.org/file35961/21955.patch




[issue21323] CGI HTTP server not running scripts from subdirectories

2014-07-11 Thread Zach Byrne

Zach Byrne added the comment:

Done and done.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21323
___



[issue21323] CGI HTTP server not running scripts from subdirectories

2014-07-08 Thread Zach Byrne

Zach Byrne added the comment:

Hi, I'm new. I wrote a test for nested directories under cgi-bin and got that 
to pass without failing the test added for 19435 by undoing most of the changes 
to run_cgi() but building path from the values in self.cgi_info. Thoughts?
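
A rough sketch of the idea, for readers who don't want to open the patch: start from a (directory, remainder) pair and keep moving path components into the directory part while they name real directories. The helper name `split_cgi_path`, the injected `is_dir` predicate, and the assumption that cgi_info is such a pair are all mine; this is not code from 21323.patch.

```python
def split_cgi_path(cgi_info, is_dir):
    """Split a CGI request path into (script directory, script + extra path).

    `cgi_info` is assumed to be a (directory, remainder) pair, and
    `is_dir` is a hypothetical callable that says whether a URL path
    maps to a directory on disk.
    """
    dir_part, rest = cgi_info
    path = dir_part + "/" + rest
    # Look for the next "/" after the current directory part.
    i = path.find("/", len(dir_part) + 1)
    while i >= 0:
        next_dir, next_rest = path[:i], path[i + 1:]
        if not is_dir(next_dir):
            break
        # The next component is a directory: descend into it.
        dir_part, rest = next_dir, next_rest
        i = path.find("/", len(dir_part) + 1)
    return dir_part, rest
```

With known directories `/cgi-bin` and `/cgi-bin/sub`, the pair `("/cgi-bin", "sub/script.py/extra")` splits into `("/cgi-bin/sub", "script.py/extra")`.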

--
keywords: +patch
nosy: +zbyrne
Added file: http://bugs.python.org/file35908/21323.patch
