[issue19087] bytearray front-slicing not optimized

STINNER Victor Mon, 30 Sep 2013 14:52:55 -0700

STINNER Victor added the comment:

I adapted my micro-benchmark to measure the speedup: bench_bytearray2.py.
Results with bytea_slice2.patch applied:


Common platform:
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)
Platform: Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow
Python unicode implementation: PEP 393
Timer: time.perf_counter
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer precision: 40 ns

Platform of campaign original:
Date: 2013-09-30 23:39:31
Python version: 3.4.0a2+ (default:687dd81cee3b, Sep 30 2013, 23:39:27) [GCC 
4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=687dd81cee3b tag=tip branch=default date="2013-09-29 22:18 
+0200"

Platform of campaign patched:
Date: 2013-09-30 23:38:55
Python version: 3.4.0a2+ (default:687dd81cee3b+, Sep 30 2013, 23:30:35) [GCC 
4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=687dd81cee3b+ tag=tip branch=default date="2013-09-29 22:18 
+0200"

------------------------+-------------+------------
non regression          |    original |     patched
------------------------+-------------+------------
concatenate 10**1 bytes |  1.1 us (*) |     1.14 us
concatenate 10**3 bytes |     46.9 us | 46.8 us (*)
concatenate 10**5 bytes | 4.66 ms (*) |     4.71 ms
concatenate 10**7 bytes |  478 ms (*) |      483 ms
------------------------+-------------+------------
Total                   |  482 ms (*) |      488 ms
------------------------+-------------+------------

----------------------------+-------------------+-------------
deleting front, append tail |          original |      patched
----------------------------+-------------------+-------------
buffer 10**1 bytes          |        639 ns (*) | 689 ns (+8%)
buffer 10**3 bytes          |        682 ns (*) | 723 ns (+6%)
buffer 10**5 bytes          |   3.54 us (+428%) |   671 ns (*)
buffer 10**7 bytes          | 900 us (+107128%) |   840 ns (*)
----------------------------+-------------------+-------------
Total                       |  905 us (+30877%) |  2.92 us (*)
----------------------------+-------------------+-------------
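
For readers without the attached script, the "deleting front, append tail"
pattern can be sketched roughly as below. This is not bench_bytearray2.py:
the function name front_delete_append and the loops/chunk parameters are
assumptions made for illustration, and the absolute timings will differ
from the table above.

```python
import time


def front_delete_append(size, loops=1000, chunk=10):
    """Return the average time in seconds of one
    `del buf[:chunk]; buf += tail` step on a bytearray of `size` bytes.

    Hypothetical helper for illustration, not the attached benchmark.
    """
    buf = bytearray(size)
    tail = b"x" * chunk
    start = time.perf_counter()
    for _ in range(loops):
        del buf[:chunk]   # drop bytes at the front
        buf += tail       # append the same number of bytes at the tail
    elapsed = time.perf_counter() - start
    # The buffer length is unchanged: each step removes and adds `chunk` bytes.
    assert len(buf) == size
    return elapsed / loops


if __name__ == "__main__":
    for exp in (1, 3, 5, 7):
        per_op = front_delete_append(10 ** exp)
        print("buffer 10**%d bytes: %.0f ns" % (exp, per_op * 1e9))
```

If front deletion copies the remaining bytes, the per-step time grows with
the buffer size, as in the "original" column; if it does not, the time
should stay roughly flat, as in the "patched" column.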

----------------------------+------------------+------------
Summary                     |         original |     patched
----------------------------+------------------+------------
non regression              |       482 ms (*) |      488 ms
deleting front, append tail | 905 us (+30877%) | 2.92 us (*)
----------------------------+------------------+------------
Total                       |       483 ms (*) |      488 ms
----------------------------+------------------+------------

@Serhiy: I see "zero" difference in the append loop micro-benchmark. I added 
the final cast to bytes().
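
A minimal sketch of an append loop with a final cast to bytes(), assuming
the benchmark builds the buffer with repeated += and converts it once at
the end. The name append_loop and the chunk parameter are hypothetical;
the attached script may be organised differently.

```python
import time


def append_loop(total, chunk=10):
    """Build about `total` bytes with repeated `+=` on a bytearray, then
    cast the result to bytes. Returns (result, elapsed_seconds).

    Hypothetical helper for illustration, not the attached benchmark.
    """
    piece = b"x" * chunk
    buf = bytearray()
    start = time.perf_counter()
    for _ in range(total // chunk):
        buf += piece          # append at the tail only
    result = bytes(buf)       # the final cast is part of the timed region
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    for exp in (1, 3, 5):
        result, elapsed = append_loop(10 ** exp)
        print("concatenate 10**%d bytes: %.1f us" % (exp, elapsed * 1e6))
```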

@Antoine: Your patch rocks, about 300x faster on the front-deletion benchmark 
(905 us vs 2.92 us in total)! (I don't care about the 8% slowdown in the 
nanosecond timings.)

----------
Added file: http://bugs.python.org/file31929/bench_bytearray2.py

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue19087>
_______________________________________