This post could be called an addendum to #336 
<https://github.com/leo-editor/leo-editor/issues/336>, but it's important 
for any pythonista doing string manipulation.

Previously, i.check was string-oriented.  It passed strings to all its 
helpers.  Each helper then immediately called g.splitLines to convert the 
string to a list, created a result list, and then returned ''.join(result). 

This is crazy. Besides being clumsy, the code creates large numbers of 
*large* strings.  The gc will have to recycle them all!

Now, i.check splits lines *once*, on entry, and thereafter does all 
checking using *lists* of lines. This is *much* faster, and stresses the gc 
*much* less.
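
A minimal sketch of the list-oriented style, for illustration only. This is 
not Leo's actual i.check code: the helper names are hypothetical, and 
str.splitlines(True) stands in for g.splitLines.

```python
# Hypothetical helpers: each takes a list of lines and returns a list of lines.
# No helper splits or joins a large string.

def rstrip_lines(lines):
    """Strip trailing whitespace from each line, keeping its newline."""
    return [z.rstrip() + '\n' if z.endswith('\n') else z.rstrip() for z in lines]

def drop_blank_lines(lines):
    """Remove lines that contain only whitespace."""
    return [z for z in lines if z.strip()]

def check(s1, s2):
    """Split each string once, on entry, then work only with lists of lines."""
    lines1 = drop_blank_lines(rstrip_lines(s1.splitlines(True)))
    lines2 = drop_blank_lines(rstrip_lines(s2.splitlines(True)))
    return lines1 == lines2
```

The only place a large string appears is the argument to check itself; 
everything after the initial split passes lists around.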

Building a new *list* from existing lines creates no new strings: the new 
list just refers to the same string objects. (CPython automatically 
"interns" only certain strings, such as identifiers and some compile-time 
constants, so interning is not what saves the work here.) New strings are 
created only for lines that actually change, and those strings are *short*. 
This is important.

Lists themselves are just lists of references to strings. For example, 
rearranging a list of strings *never* generates any new strings!
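
A quick way to check this for yourself, as a small sketch (the sample text 
and variable names are made up for the demonstration):

```python
text = "def spam():\n    pass\n\nx = 1\n"
lines = text.splitlines(True)  # split once

# Rearranged copies of the list.
reordered = sorted(lines)
backwards = lines[::-1]

# Every element of the new lists is the very same object as an element
# of the original list: the lists differ, the strings are shared.
original_ids = {id(z) for z in lines}
shares_all = all(id(z) in original_ids for z in reordered + backwards)
```

Here shares_all is True, and backwards[0] is lines[-1] holds as well: only 
the references were copied.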

In short, *using lists of (short) strings is fundamentally better than 
splitting and joining large strings*.

Happily, using lists of strings usually simplifies the code, thanks to 
python's superb list comprehensions. If you don't grok comprehensions, you 
are missing one of python's most powerful and beautiful features.
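
For readers new to comprehensions, here is a small, hypothetical example 
(not taken from Leo's code base) of the same filter written both ways:

```python
def code_only_loop(lines):
    """Explicit loop: keep lines that are neither blank nor comments."""
    result = []
    for z in lines:
        if z.strip() and not z.lstrip().startswith('#'):
            result.append(z)
    return result

def code_only(lines):
    """The same filter as a single list comprehension."""
    return [z for z in lines if z.strip() and not z.lstrip().startswith('#')]

sample = ["import os\n", "\n", "  # comment\n", "x = 1\n"]
```

Both functions return ['import os\n', 'x = 1\n'] for sample, and neither 
creates a new string.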

Edward
