On 30.05.12 14:26, Victor Stinner wrote:
> I implemented something like that, and it was inefficient and very
> complex. See for example the (incomplete) patch for str%args attached
> to issue #14687:
> http://bugs.python.org/file25413/pyunicode_format-2.patch

I have seen and commented on this patch.
>> The "two steps" method is not promising: parsing the format string
>> twice is slower than other methods.
>
> The "1.5 steps" method is more promising -- first parse the format string in
> an efficient internal representation, and then allocate the output string
> and then write characters (or e
On 30.05.12 01:44, Victor Stinner wrote:
The "two steps" method is not promising: parsing the format string
twice is slower than other methods.
The "1.5 steps" method is more promising -- first parse the format
string in an efficient internal representation, and then allocate the
output strin
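For concreteness, here is a minimal sketch of the "two steps" idea
using only the public PEP 393 C API (concat_two_steps is a made-up
helper, not code from the patch; concatenation is used because both
passes are trivial there):

    #include <Python.h>

    /* "Two steps": step 1 computes the total length and the widest
       character; step 2 allocates the result once, in its canonical
       width, and copies the characters in. */
    static PyObject *
    concat_two_steps(PyObject **items, Py_ssize_t n)
    {
        Py_ssize_t length = 0, pos, i;
        Py_UCS4 maxchar = 0;
        PyObject *result;

        /* Step 1: length and maximum character of the output. */
        for (i = 0; i < n; i++) {
            Py_UCS4 ch = PyUnicode_MAX_CHAR_VALUE(items[i]);
            length += PyUnicode_GET_LENGTH(items[i]);
            if (ch > maxchar)
                maxchar = ch;
        }

        /* Step 2: allocate with the right kind, then write. */
        result = PyUnicode_New(length, maxchar);
        if (result == NULL)
            return NULL;
        pos = 0;
        for (i = 0; i < n; i++) {
            Py_ssize_t k = PyUnicode_GET_LENGTH(items[i]);
            if (PyUnicode_CopyCharacters(result, pos, items[i], 0, k) < 0) {
                Py_DECREF(result);
                return NULL;
            }
            pos += k;
        }
        return result;
    }

For str%args the first step is the expensive part -- each argument has
to be formatted just to learn its length -- which is why redoing that
work in step 2 loses, and why keeping the step-1 results around
("1.5 steps") looks better.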
On Wed, May 30, 2012 at 8:44 AM, Victor Stinner wrote:
> I also compared str%args and str.format() with Python 2.7 (byte
> strings), 3.2 (UTF-16 or UCS-4) and 3.3 (PEP 393): Python 3.3 is as
> fast as Python 2.7 and sometimes faster! (Whereas Python 3.2 is 10 to
> 30% slower than Python 2 in general.)
Hi,

> * Use a Py_UCS4 buffer and then convert to the canonical form (ASCII,
>   UCS1 or UCS2). Approach taken by io.StringIO. io.StringIO is not
>   only used for writing, but also for reading, so a Py_UCS4 buffer is
>   a good compromise.
> * PyAccu API: an optimized version of the chunks=[]; for ...:
>   chunks.append(item); ''.join(chunks) pattern.
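As an illustration, a minimal sketch of the Py_UCS4-buffer approach
(upper_via_ucs4 is a hypothetical helper, not io.StringIO code): build
into a fixed-width Py_UCS4 scratch array, then let
PyUnicode_FromKindAndData() scan it and allocate the result in its
narrowest canonical form:

    #include <Python.h>

    /* Uppercase a string via a Py_UCS4 scratch buffer.  Writes are
       simple because every slot is 4 bytes wide; the final conversion
       picks ASCII/UCS1/UCS2/UCS4 as appropriate. */
    static PyObject *
    upper_via_ucs4(PyObject *str)
    {
        Py_ssize_t i, len = PyUnicode_GET_LENGTH(str);
        PyObject *result;
        Py_UCS4 *buf = PyMem_New(Py_UCS4, len);

        if (buf == NULL)
            return PyErr_NoMemory();
        for (i = 0; i < len; i++)
            buf[i] = Py_UNICODE_TOUPPER(PyUnicode_READ_CHAR(str, i));
        result = PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND,
                                           buf, len);
        PyMem_Free(buf);
        return result;
    }

The extra scan inside PyUnicode_FromKindAndData() is the price paid
for the simple fixed-width writes.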
On 04.05.12 02:45, Victor Stinner wrote:
> * Two steps: compute the length and maximum character of the output
>   string, allocate the output string, and then write the characters.
>   str%args used this method.
> * Optimistic approach: start with an ASCII buffer, and enlarge and
>   widen (to UCS2 and then UCS4) the buffer when needed.
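A toy sketch of the optimistic approach (my own illustration, not
Victor's code; if I remember right, the internal _PyUnicodeWriter
added during 3.3 development implements this idea for real). Widening
recreates the buffer one kind wider and copies what was already
written; enlarging on overflow is omitted for brevity:

    #include <Python.h>

    typedef struct {
        PyObject *buffer;    /* preallocated with PyUnicode_New() */
        Py_ssize_t pos;      /* characters written so far */
        Py_UCS4 maxchar;     /* maxchar the buffer was created with,
                                initially 127 (ASCII) */
    } optimistic_writer;     /* hypothetical type */

    static int
    writer_write_char(optimistic_writer *w, Py_UCS4 ch)
    {
        if (ch > w->maxchar) {
            /* Widen: allocate a string of the wider kind and copy
               the characters already written. */
            PyObject *wider = PyUnicode_New(
                PyUnicode_GET_LENGTH(w->buffer), ch);
            if (wider == NULL)
                return -1;
            if (PyUnicode_CopyCharacters(wider, 0, w->buffer, 0,
                                         w->pos) < 0) {
                Py_DECREF(wider);
                return -1;
            }
            Py_DECREF(w->buffer);
            w->buffer = wider;
            w->maxchar = ch;
        }
        PyUnicode_WRITE(PyUnicode_KIND(w->buffer),
                        PyUnicode_DATA(w->buffer), w->pos, ch);
        w->pos++;
        return 0;
    }

The bet is that most strings stay ASCII, so the widening copy almost
never runs and the common case does no extra work at all.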
> Various notes:
> * PyUnicode_READ() is slower than reading a Py_UNICODE array.
> * Some decoders unroll the main loop to process 4 or 8 bytes (32- or
>   64-bit CPUs) at each step.
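That unrolling trick is easy to show in isolation: a pure-C sketch
(ascii_prefix is a made-up name) that tests one machine word at a time
for bytes >= 0x80 before falling back to a byte loop, roughly what an
ASCII fast path in a UTF-8 decoder does:

    #include <stddef.h>
    #include <string.h>

    /* 0x8080...80: the high bit of every byte in a word, portable to
       32- and 64-bit builds. */
    #define ASCII_MASK (((size_t)-1 / 0xFF) * 0x80)

    /* Return the number of leading ASCII bytes in s[0..n). */
    static size_t
    ascii_prefix(const char *s, size_t n)
    {
        size_t i = 0;

        /* Fast path: whole words with no high bit set. */
        while (i + sizeof(size_t) <= n) {
            size_t word;
            memcpy(&word, s + i, sizeof(word));  /* alignment-safe */
            if (word & ASCII_MASK)
                break;
            i += sizeof(size_t);
        }
        /* Slow path: remaining bytes one at a time. */
        while (i < n && (unsigned char)s[i] < 0x80)
            i++;
        return i;
    }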
> I am interested if you know other tricks to optimize Unicode strings
> in Python, or if you are interested in working on this topic.