[issue25870] textwrap is very slow on long words without spaces

2015-12-15 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann added the comment:

That should probably be

   lines = [x[n*64:(n+1)*64] for n in range(((len(x)-1)//64)+1)]

to avoid adding an empty line when the last line is full,
which once again shows why people prefer to use standard libraries
for this kind of work.
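
As a rough illustration of the difference (a sketch only; the function names
split_original and split_corrected and the width parameter are made up here and
do not appear in the thread), the two expressions can be compared on an input
whose length is an exact multiple of 64:

```python
def split_original(x, width=64):
    # Range bound as first suggested in the thread: (len(x) // width) + 1
    return [x[n * width:(n + 1) * width] for n in range((len(x) // width) + 1)]


def split_corrected(x, width=64):
    # Adjusted range bound: ((len(x) - 1) // width) + 1
    return [x[n * width:(n + 1) * width] for n in range(((len(x) - 1) // width) + 1)]


full = 'x' * 128  # exactly two full lines of 64 characters

# The first form produces a third, empty chunk for this input.
print([len(chunk) for chunk in split_original(full)])

# The adjusted form stops after the last full line.
print([len(chunk) for chunk in split_corrected(full)])
```

For inputs whose length is not a multiple of the width, both forms should give
the same result.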

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25870] textwrap is very slow on long words without spaces

2015-12-15 Thread R. David Murray

R. David Murray added the comment:

This has already been fixed in issue 22687.  It was deemed a performance 
improvement for an edge case and was not backported.

I don't see the advantage of using textwrap to split up base64 encoded strings, 
by the way.  The module isn't designed for doing line splitting; it is designed 
for doing text wrapping where blanks matter.  For your application I would just 
do:

   lines = [x[n*64:(n+1)*64] for n in range((len(x)//64)+1)]

--
components:  -Benchmarks, Extension Modules, Regular Expressions
nosy: +r.david.murray
resolution:  -> duplicate
status: open -> closed
superseder:  -> horrible performance of textwrap.wrap() with a long word




[issue25870] textwrap is very slow on long words without spaces

2015-12-15 Thread Martin Panter

Martin Panter added the comment:

There is a standard library function for that ;) the step argument to range():

lines = (result[n:n + 64] for n in range(0, len(result), 64))
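
A minimal, self-contained sketch of the same idea follows. The helper name
chunk, the width parameter, and the sample input are assumptions made for
illustration, and a list is built here instead of the generator shown above:

```python
import base64


def chunk(s, width=64):
    # Step through the string in strides of `width`; the final slice is
    # simply shorter when len(s) is not a multiple of `width`.
    return [s[n:n + width] for n in range(0, len(s), width)]


# Hypothetical sample input: 1024 arbitrary bytes, base64-encoded.
encoded = base64.b64encode(bytes(range(256)) * 4).decode('ascii')
lines = chunk(encoded)

# Joining the chunks back together gives the original string.
print(''.join(lines) == encoded)
```

Because range() with a step never starts a slice at or past len(s), this form
yields no trailing empty line when the last line is full, and an empty list for
an empty string.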

--
nosy: +martin.panter




[issue25870] textwrap is very slow on long words without spaces

2015-12-15 Thread Bernhard M. Wiedemann

New submission from Bernhard M. Wiedemann:

Many Python scripts use textwrap to break base64-encoded strings from openssl 
into lines - e.g. https://bugs.launchpad.net/python-keystoneclient/+bug/1404402
and https://github.com/diafygi/acme-tiny/blob/master/acme_tiny.py#L166

Steps To Reproduce:
time python -c "import textwrap; textwrap.wrap('x'*9000, 64)"

This has a complexity of O(n^2), meaning wrapping 18000 chars takes 4 times as 
long as 9000.

One known workaround is to use
textwrap.wrap('x'*9000, 64, break_on_hyphens=False)

This also has O(n^2) complexity, but is around 2000 times faster.
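
The comparison above could be scripted roughly as follows. This is only a
sketch: the variable names are made up, and the measured times depend heavily
on the Python version, so the gap described here may not show up on newer
interpreters.

```python
import textwrap
import timeit

text = 'x' * 9000  # one long "word" with no spaces or hyphens

# For input without hyphens, both calls should produce the same lines.
default_lines = textwrap.wrap(text, 64)
workaround_lines = textwrap.wrap(text, 64, break_on_hyphens=False)

# Time a single run of each variant.
t_default = timeit.timeit(lambda: textwrap.wrap(text, 64), number=1)
t_workaround = timeit.timeit(
    lambda: textwrap.wrap(text, 64, break_on_hyphens=False), number=1)

print('default:                %.6f s' % t_default)
print('break_on_hyphens=False: %.6f s' % t_workaround)
```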

--
components: Benchmarks, Extension Modules, Library (Lib), Regular Expressions
messages: 256461
nosy: bmwiedemann, brett.cannon, ezio.melotti, mrabarnett, pitrou
priority: normal
severity: normal
status: open
title: textwrap is very slow on long words without spaces
type: performance
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4




[issue25870] textwrap is very slow on long words without spaces

2015-12-15 Thread R. David Murray

R. David Murray added the comment:

Oh, good call.  I forgot about step.

--
