Work on #927 <https://github.com/leo-editor/leo-editor/issues/924>
(performance questions) has revealed some spectacular performance gains.
The new code is in the "perf" branch.
Simply inlining g.isUnicode inside g.toUnicode makes a huge difference.
The new code is:

if isPython3:
    def isUnicode(s):
        return isinstance(s, str)
else:
    def isUnicode(s):
        return isinstance(s, types.UnicodeType)

# This inlining makes a huge difference.
# It saves most calls to _toUnicode and g.isUnicode!
if isPython3:
    def toUnicode(s, encoding='utf-8', reportErrors=False):
        '''Convert a non-unicode string with the given encoding to unicode.'''
        return s if isinstance(s, str) else _toUnicode(s, encoding, reportErrors)
else:
    def toUnicode(s, encoding='utf-8', reportErrors=False):
        '''Convert a non-unicode string with the given encoding to unicode.'''
        return s if isinstance(s, types.UnicodeType) else _toUnicode(s, encoding, reportErrors)

def _toUnicode(s, encoding, reportErrors):
    if not encoding:
        encoding = 'utf-8'
    #
    # These are the only significant calls to s.decode in Leo.
    # Tracing these calls directly yields thousands of calls.
    try:
        s = s.decode(encoding, 'strict')
    except (UnicodeDecodeError, UnicodeError):
        # https://wiki.python.org/moin/UnicodeDecodeError
        s = s.decode(encoding, 'replace')
        if reportErrors:
            g.trace(g.callers())
            g.error("toUnicode: Error converting %s...from %s encoding to unicode" % (
                s[: 200], encoding))
    except AttributeError:
        # May be a QString.
        s = g.u(s)
    return s

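To make the effect of the inlining concrete, here is a minimal Python 3 sketch. It is not Leo's code: the names (is_unicode, to_unicode_old, to_unicode_new, count_calls) are made up for illustration, and to_unicode_old is only my assumption about the general shape of the code before the change, namely that every call went through a helper function.

```python
# Python 3 sketch; all names are illustrative, not Leo's actual code.
calls = {"is_unicode": 0, "slow_path": 0}

def is_unicode(s):
    calls["is_unicode"] += 1
    return isinstance(s, str)

def _to_unicode(s, encoding):
    calls["slow_path"] += 1
    return s.decode(encoding or "utf-8", "replace")

def to_unicode_old(s, encoding="utf-8"):
    # Assumed pre-change shape: an extra function call for every string.
    return s if is_unicode(s) else _to_unicode(s, encoding)

def to_unicode_new(s, encoding="utf-8"):
    # Inlined test: str input returns at once, with no helper call.
    return s if isinstance(s, str) else _to_unicode(s, encoding)

def count_calls(fn, samples):
    """Run fn over samples and return a copy of the call counters."""
    calls["is_unicode"] = calls["slow_path"] = 0
    for s in samples:
        fn(s)
    return dict(calls)

if __name__ == "__main__":
    samples = ["headline"] * 1000 + [b"bytes from a file"]
    print(count_calls(to_unicode_old, samples))
    print(count_calls(to_unicode_new, samples))
```

In this sketch the helper is called once per string in the old version and never in the new one, while the slow decode path is reached only for the single bytes value in both.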
*Before stats for g.isUnicode*
145181 g.isUnicode:__init__,read_words,add_expanded_line,toUnicode
66369 g.isUnicode:__get_h,headString,headString,toUnicode
52902 g.isUnicode:get_UNL,__get_h,headString,headString
11400 g.isUnicode:munge,os_path_normpath,toUnicodeFileEncoding,toUnicode
8924 g.isUnicode:shortFileName,os_path_basename,toUnicodeFileEncoding,toUnicode
7744 g.isUnicode:get_directives_dict,__get_b,bodyString,bodyString
7488 g.isUnicode:<listcomp>,os_path_expanduser,toUnicodeFileEncoding,toUnicode
7449 g.isUnicode:anyAtFileNodeName,findAtFileName,headString,toUnicode
6986 g.isUnicode:v_element_visitor,v_element_visitor,v_element_visitor,v_element_visitor
6539 g.isUnicode:os_path_finalize_join,<listcomp>,os_path_expandExpression,toUnicode
5643 g.isUnicode:get_directives_dict,__get_h,headString,headString
4841 g.isUnicode:isAnyAtFileNode,isAnyAtFileNode,headString,toUnicode
4797 g.isUnicode:idle_check_commander,isAnyAtFileNode,isAnyAtFileNode,headString
4580 g.isUnicode:os_path_finalize_join,os_path_join,toUnicodeFileEncoding,toUnicode
4565 g.isUnicode:anyAtFileNodeName,findAtFileName,skip_id,toUnicode
4044 g.isUnicode:anyAtFileNodeName,anyAtFileNodeName,findAtFileName,headString
3405 g.isUnicode:isAnyAtFileNode,anyAtFileNodeName,findAtFileName,headString
2489 g.isUnicode:createAllButtons,__get_h,headString,headString
2280 g.isUnicode:parse,parse,feed,characters
2054 g.isUnicode:visitNode,__get_h,headString,headString
1866 g.isUnicode:putLine,putCodeLine,onl,os
1747 g.isUnicode:visitNode,parseHeadline,skip_id,toUnicode
1715 g.isUnicode:putBody,putLine,putCodeLine,os
1302 g.isUnicode:parseHeadline,__get_h,headString,headString
*After stats for g.isUnicode*
52902 g.isUnicode:get_UNL,__get_h,headString,headString
7459 g.isUnicode:get_directives_dict,__get_b,bodyString,bodyString
6986 g.isUnicode:v_element_visitor,v_element_visitor,v_element_visitor,v_element_visitor
5423 g.isUnicode:get_directives_dict,__get_h,headString,headString
3919 g.isUnicode:anyAtFileNodeName,anyAtFileNodeName,findAtFileName,headString
2489 g.isUnicode:createAllButtons,__get_h,headString,headString
2280 g.isUnicode:parse,parse,feed,characters
2132 g.isUnicode:idle_check_commander,isAnyAtFileNode,isAnyAtFileNode,headString
2054 g.isUnicode:visitNode,__get_h,headString,headString
1866 g.isUnicode:putLine,putCodeLine,onl,os
1715 g.isUnicode:putBody,putLine,putCodeLine,os
1550 g.isUnicode:isAnyAtFileNode,anyAtFileNodeName,findAtFileName,headString
1302 g.isUnicode:parseHeadline,__get_h,headString,headString
*After stats for g._toUnicode*
10 g._toUnicode:read,openFileForReading,readFileToUnicode,toUnicode
7 g._toUnicode:readAll,readOneAtCleanNode,read_at_clean_lines,toUnicode
6 g._toUnicode:openLeoFile,getLeoFile,readFile,toUnicode
5 g._toUnicode:read,read_into_root,readFileIntoString,toUnicode
3 g._toUnicode:read_into_root,readFileIntoString,getPythonEncodingFromString,toUnicode
2 g._toUnicode:__init__,__init__,read_words,toUnicode
1 g._toUnicode:createOutline,init_import,readFileIntoString,toUnicode
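This is remarkable: in the listings above, every g.isUnicode entry that
came through toUnicode is gone, and g._toUnicode itself is reached only 34
times. Next, I'll investigate why so many positions are being created.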
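For anyone who wants to gather similar numbers, here is a rough sketch of counting calls by caller chain. It is not the tracing code used for the stats above. It assumes CPython (it relies on sys._getframe), and the names caller_chain, stats, report, is_unicode, to_unicode and head_string are stand-ins I made up for the example.

```python
import sys
from collections import Counter

stats = Counter()  # key: "function:caller,chain"; value: number of calls

def caller_chain(depth=4, skip=2):
    """Return up to `depth` caller names, outermost first.

    skip=2 starts at the caller of the function being instrumented:
    frame 0 is caller_chain, frame 1 is the instrumented function.
    """
    names = []
    frame = sys._getframe(skip)  # CPython-specific
    while frame is not None and len(names) < depth:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return ",".join(reversed(names))

def is_unicode(s):
    # Instrumented stand-in for g.isUnicode; depth=2 keeps keys short.
    stats["isUnicode:" + caller_chain(depth=2)] += 1
    return isinstance(s, str)

def to_unicode(s):
    return s if is_unicode(s) else s.decode("utf-8")

def head_string(h):
    return to_unicode(h)

def report(counter):
    """Format the counts, largest first, one line per caller chain."""
    return ["%d %s" % (n, key) for key, n in counter.most_common()]

if __name__ == "__main__":
    for _ in range(3):
        head_string("a headline")
    print("\n".join(report(stats)))
```

Counting by chain rather than by function alone is what shows which callers are worth fixing: one hot chain can account for most of the calls.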
Edward