Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 22b56b0b8619 by Victor Stinner in branch 'default':
Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args
and str.format(args)
http://hg.python.org/cpython/rev/22b56b0b8619
--
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 6abab1a103a6 by Victor Stinner in branch 'default':
Issue #14744: Fix compilation on Windows
http://hg.python.org/cpython/rev/6abab1a103a6
--
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset df0144f68d76 by Victor Stinner in branch 'default':
Issue #14744: Fix compilation on Windows (part 2)
http://hg.python.org/cpython/rev/df0144f68d76
--
STINNER Victor victor.stin...@gmail.com added the comment:
report_windows7: Comparison of str%args and str.format() on Windows 7.
* Python 2.7 (64 bits)
* Python 3.2 (64 bits), narrow (UTF-16)
* Python 3.3 (*32* bits), PEP 393
The benchmark is not fair because Python 3.3 is compiled in 32
Changes by STINNER Victor victor.stin...@gmail.com:
--
resolution: - fixed
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14744
___
STINNER Victor victor.stin...@gmail.com added the comment:
So, do you have any comments or complaints? Or can I commit the patch?
On 24 May 2012 at 11:57, STINNER Victor rep...@bugs.python.org wrote:
STINNER Victor victor.stin...@gmail.com added the comment:
For Python 3.3, _PyUnicodeWriter
Serhiy Storchaka storch...@gmail.com added the comment:
So, do you have any comments or complaints? Or can I commit the patch?
I beg your pardon, I will do a review and additional benchmarks today.
So far I have to say that it is better to use the stringlib approach than the
massive macros, which
STINNER Victor victor.stin...@gmail.com added the comment:
So far I have to say that it is better to use the stringlib
approach, than the massive macros, which are more difficult
to read and edit.
Ah, you don't like the two macros in longobject.c. Functions to write digits
into a string may be
Serhiy Storchaka storch...@gmail.com added the comment:
I just sent you a patch which does not use any macros or stringlib.
--
STINNER Victor victor.stin...@gmail.com added the comment:
Functions to write digits into a string may be appropriate
in the stringlib.
Oh, stringlib is specific to unicodeobject.c: it cannot be used outside.
--
Serhiy Storchaka storch...@gmail.com added the comment:
For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API
and PyAccu API in almost all cases, with a speedup between 30% and 100%. But
there are some cases where the _PyUnicodeWriter API is slower:
Perhaps most of
STINNER Victor victor.stin...@gmail.com added the comment:
For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API
and PyAccu API in almost all cases, with a speedup between 30% and 100%. But
there are some cases where the _PyUnicodeWriter API is slower:
Perhaps most of
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25686/REPORT_32BIT_3.3
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25688/REPORT_64BIT_3.3
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25689/faa88c50a3d2.diff
STINNER Victor victor.stin...@gmail.com added the comment:
Because I don't know what should be tested, I wrote a lot of tests in the
bench_str.py script. To run the benchmark, use:
./python benchmark.py --file=FILE script bench_str.py
Then to compare results:
./python benchmark.py compare_to
Antoine Pitrou pit...@free.fr added the comment:
When posting benchmark numbers, can you please only compare patched against
unpatched? I don't think we care about performance compared to 3.2 or 2.7 here,
and it would make things more readable.
--
STINNER Victor victor.stin...@gmail.com added the comment:
For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and
PyAccu API in almost all cases, with a speedup between 30% and 100%. But there
are some cases where the _PyUnicodeWriter API is slower:
fmt=x={};
STINNER Victor victor.stin...@gmail.com added the comment:
When posting benchmark numbers, can you please only compare
patched against unpatched?
Here you have: REPORT_64BIT_PATCH.
--
Added file: http://bugs.python.org/file25690/REPORT_64BIT_PATCH
Changes by STINNER Victor victor.stin...@gmail.com:
Removed file: http://bugs.python.org/file25689/faa88c50a3d2.diff
STINNER Victor victor.stin...@gmail.com added the comment:
faster-format.patch: Patch for Python 3.3 optimizing str%args and
str.format(args), use _PyUnicodeWriter deeper in formatting. The patch uses
different optimizations:
* if the result is just a string, copy the string by reference,
STINNER Victor victor.stin...@gmail.com added the comment:
I created a new repository to optimize str.format and str%args.
--
hgrepos: +125
Antoine Pitrou pit...@free.fr added the comment:
When it's possible to not overallocate, the speed up is around 10% for
short strings (I suppose that it's much better for longer strings).
Well, please post a benchmark for long strings, then :-)
I think 10% on a micro-benchmark is not worth the
STINNER Victor victor.stin...@gmail.com added the comment:
When it's possible to not overallocate, the speed up is around 10% for
short strings (I suppose that it's much better for longer strings).
Well, please post a benchmark for long strings, then :-)
Ok, here you have. I don't understand
Serhiy Storchaka storch...@gmail.com added the comment:
A not quite honest counterexample:
./python -m timeit -s "f='[{}]'.format;s='A'*100" "f(s)"
Python 3.3: 100 loops, best of 3: 1.67 usec per loop
Python 3.3 + dont_overallocate.patch: 10 loops, best of 3: 2.01 usec per loop
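The same kind of measurement can be scripted with the standard timeit module. This is only a minimal sketch: the helper name time_format, the loop counts and the extra lengths 50 and 200 are assumptions made for illustration, not part of the benchmark used in this issue.

```python
import timeit


def time_format(length, number=100_000, repeat=3):
    """Return the best per-call time, in microseconds, of '[{}]'.format(s).

    `length` is the length of the formatted argument. `number` and `repeat`
    are arbitrary choices for this sketch, not values taken from the thread.
    """
    timer = timeit.Timer(
        "f(s)",
        setup="f = '[{}]'.format; s = 'A' * %d" % length,
    )
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    # 100 is the argument length used in the counterexample; the two
    # neighbouring lengths show whether that particular size stands out.
    for length in (50, 100, 200):
        print("len=%d: %.3f usec per call" % (length, time_format(length)))
```

Running it once on an unpatched build and once on a patched build gives numbers comparable to the two timeit lines above.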
--
Antoine Pitrou pit...@free.fr added the comment:
When it's possible to not overallocate, the speed up is around 10% for
short strings (I suppose that it's much better for longer strings).
Well, please post a benchmark for long strings, then :-)
Ok, here you have. I don't understand why
STINNER Victor victor.stin...@gmail.com added the comment:
A not quite honest counterexample
I agree, this example is not honest :-) It's because of the magical value 100
used as initial size of the buffer. The speed is the same for shorter or longer
strings.
--
Serhiy Storchaka storch...@gmail.com added the comment:
It seems to me that the proposed changes are too tricky and too dirty for such
a modest gain. It seems to me, this effect can be achieved easier
(special-casing %s and {} to return str(arg)?).
If you want to get really impressive
STINNER Victor victor.stin...@gmail.com added the comment:
Do you have anything more interesting than fmt=%s ?
and
It seems to me that the proposed changes are too tricky and too dirty for
such a modest gain.
To be honest, I didn't write dont_overallocate.patch to speed up formatting
Antoine Pitrou pit...@free.fr added the comment:
I will rewrite my format_writer-2.patch based on
dont_overallocate.patch. It looks like you are waiting for the full
patch.
Well, there's no point in committing the first patch if the second one
doesn't give an interesting speedup.
--
STINNER Victor victor.stin...@gmail.com added the comment:
To prepare a deeper change, here is a first simple patch. Change how the size
of the _PyUnicodeWriter buffer is handled:
* overallocate by 100% (instead of 25%) and allocate at least 100 characters
* don't overallocate when possible
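To get a feel for what these two rules change, here is a simplified model in Python. It is not the C code of _PyUnicodeWriter: the function names, the reading of "when possible" as "when the last chunk is being written", and the absence of a minimum size under the older 25% policy are all assumptions of this sketch.

```python
def new_capacity(needed, overallocate, ratio=1.0, min_length=100):
    """Capacity to allocate when the buffer must hold `needed` characters.

    ratio=1.0 models "overallocate by 100%", ratio=0.25 the older 25%.
    """
    if not overallocate:
        # Exact size: no wasted memory and nothing to shrink afterwards.
        return needed
    return max(needed + int(needed * ratio), min_length)


def count_reallocations(chunk_lengths, ratio=1.0, min_length=100):
    """Count the buffer (re)allocations needed to append all chunks in order."""
    capacity = 0
    used = 0
    reallocations = 0
    last = len(chunk_lengths) - 1
    for index, length in enumerate(chunk_lengths):
        used += length
        if used > capacity:
            # Overallocating for the final chunk is pointless (assumption of
            # this model): nothing will be written after it.
            capacity = new_capacity(
                used,
                overallocate=(index != last),
                ratio=ratio,
                min_length=min_length,
            )
            reallocations += 1
    return reallocations


if __name__ == "__main__":
    chunks = [10] * 50  # fifty writes of ten characters each
    print("100% + minimum of 100:", count_reallocations(chunks))
    print("25%, no minimum:", count_reallocations(chunks, ratio=0.25, min_length=0))
```

In this model the larger growth factor and the minimum size reduce the number of reallocations for many small writes, while a single write allocates exactly what it needs.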
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file25558/benchmark.py
Serhiy Storchaka storch...@gmail.com added the comment:
Here is a new patch using _PyUnicodeWriter directly in longobject.c.
Maybe it is worth doing this in a separate issue?
decimal digits) is 17% faster with my patch version 2 compared to tip,
and 38% faster compared to Python 3.3 before my
STINNER Victor victor.stin...@gmail.com added the comment:
Inlining may be removed to simplify the code
The attached inline_unicode_writer.patch inlines the code and also calls
unicode_writer_prepare() only once for each argument in PyUnicode_Format(). The
patch removes
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 6c8a117f8966 by Victor Stinner in branch 'default':
Issue #14744: Inline unicode_writer_write_char() and unicode_write_str()
http://hg.python.org/cpython/rev/6c8a117f8966
--
nosy: +python-dev
STINNER Victor victor.stin...@gmail.com added the comment:
_PyUnicodeWriter in long_to_decimal_string() for example.
long_to_decimal_string() already creates a string of known size. How
can _PyUnicodeWriter help here?
x={}.format(123) uses a temporary buffer for 123. Using
Mark Dickinson dicki...@gmail.com added the comment:
Issue3451 looks much more promising for int formatting. But it will take
a lot of time to carefully check this.
I disagree: Issue 3451 is about *asymptotically* fast base conversion, and the
changes proposed there are only going to kick
Serhiy Storchaka storch...@gmail.com added the comment:
x={}.format(123) uses a temporary buffer for 123.
This, apparently, is inevitable. I doubt that it can be optimized considerably
without making str(int) slower (which is optimal for the current
algorithm). Note that the more complex formatting
STINNER Victor victor.stin...@gmail.com added the comment:
Filling the ASCII buffer and then copying it can be cheaper than using
_PyUnicodeWriter with a general non-ASCII string.
Here is a new patch using _PyUnicodeWriter directly in longobject.c.
According to my benchmark (see below), formatting a
Antoine Pitrou pit...@free.fr added the comment:
According to my benchmark (see below), formatting a small number (5
decimal digits) is 17% faster with my patch version 2 compared to tip,
and 38% faster compared to Python 3.3 before my optimizations on str%
tuples or str.format(). Creating a
New submission from STINNER Victor victor.stin...@gmail.com:
Since 7be716a47e9d (issue #14716), str.format() uses the unicode_writer API.
I propose to continue the work in this direction to avoid more temporary
buffers.
Python 3.3:
100 loops, best of 3: 0.573 usec per loop
10 loops,
STINNER Victor victor.stin...@gmail.com added the comment:
Comments on the patch.
-PyAPI_FUNC(PyObject *) _PyComplex_FormatAdvanced(PyObject *obj,
+PyAPI_FUNC(int) _PyComplex_FormatWriter(PyObject *obj,
Even if it is a private function, I prefer to rename it because its API does
change.
/*
Serhiy Storchaka storch...@gmail.com added the comment:
If this patch is accepted, it's possible to go even deeper: use _PyUnicodeWriter
in long_to_decimal_string() for example.
long_to_decimal_string() already creates a string of known size. How
can _PyUnicodeWriter help here?
Issue3451