I spent most of yesterday studying the performance of Vitalije's prototype
code.

The only truly important performance metric is how long it takes
miniTkLeo.py to load a substantial .leo file. I changed this file so it
loads my private leoPy.leo file if no file is given on the command line.

On my machine, it takes 0.6 to 0.7 seconds to load this file *and all
external files*. This performance is why Vitalije and I are excited about
the code.

Running cProfile directly on miniTkLeo.py doesn't help, because the script
uses Python's threading and queue modules: the loading happens in a worker
thread, and cProfile only profiles the thread it is started in. Instead, I
added profiling code to the loadex function, like this:
    # Imports needed at the top of the file.
    import cProfile
    import os
    import pstats

    def loadex():
        '''The target of threading.Thread.'''
        if 0: # Change to 1 to profile the code.
            cProfile.runctx('loadex_helper()',
                globals(),
                locals(),
                'profile_stats', # 'profile-%s.out' % process_name
            )
            print('===== writing profile_stats')
            p = pstats.Stats('profile_stats')
            p.strip_dirs().sort_stats('tottime').print_stats(50)
                # .print_stats('leoDataModel.py', 50)
        else:
            loadex_helper()

    def loadex_helper():
        # ltmbytes, fname and G come from the enclosing code.
        ltm2 = LeoTreeModel.frombytes(ltmbytes)
        loaddir = os.path.dirname(fname)
        loadExternalFiles(ltm2, loaddir)
        G.q.put(ltm2)

With statistics enabled, the load time on my machine is 0.9 seconds,
instead of 0.6 to 0.7 seconds.
This code produces the following statistics, edited to show only the
highlights:

1. Limited to leoDataModel.py:

TotTime:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8212    0.267    0.000    0.586    0.000 leoDataModel.py:1233(load_derived_file)
     8212    0.023    0.000    0.611    0.000 leoDataModel.py:1569(viter)
     8047    0.017    0.000    0.024    0.000 leoDataModel.py:1327(set_node)

Calls:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    16691    0.003    0.000    0.006    0.000 leoDataModel.py:37(parPosIter)
     8212    0.268    0.000    0.587    0.000 leoDataModel.py:1233(load_derived_file)
     8212    0.023    0.000    0.612    0.000 leoDataModel.py:1569(viter)
     8047    0.017    0.000    0.023    0.000 leoDataModel.py:1327(set_node)
      971    0.000    0.000    0.001    0.000 leoDataModel.py:293(parents)
  806/165    0.001    0.000    0.001    0.000 leoDataModel.py:412(updateParentSize) (in replaceNode)

2. Including all methods:

TotTime:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8212    0.272    0.000    0.594    0.000 leoDataModel.py:1233(load_derived_file)
   626060    0.220    0.000    0.220    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
   232802    0.036    0.000    0.036    0.000 {method 'startswith' of 'str' objects}
      165    0.021    0.000    0.030    0.000 {method 'read' of '_io.TextIOWrapper' objects}

Calls:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   626060    0.221    0.000    0.221    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
   232802    0.036    0.000    0.036    0.000 {method 'startswith' of 'str' objects}
   167416    0.014    0.000    0.014    0.000 {method 'append' of 'list' objects}
   110420    0.008    0.000    0.008    0.000 {built-in method builtins.len}
    95453    0.007    0.000    0.007    0.000 {method 'isspace' of 'str' objects}
    35554    0.006    0.000    0.006    0.000 {method 'group' of '_sre.SRE_Match' objects}
    16357    0.001    0.000    0.001    0.000 {method 'random' of '_random.Random' objects}
     8906    0.002    0.000    0.002    0.000 {method 'pop' of 'list' objects}
     8562    0.006    0.000    0.006    0.000 {method 'join' of 'str' objects}
     2204    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}

This is remarkable. To a first approximation, only load_derived_file
(together with the regex matching it does) matters. None of the other helper
functions or generators contributes any substantial time to the overall load
time.
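
For anyone who wants to reproduce the two kinds of listing above (sorted by
tottime, sorted by call count, and restricted to one name), here is a minimal
sketch using only the standard cProfile and pstats modules. The scan function
and the profile_views helper are hypothetical stand-ins, not part of the
prototype; the real code would profile loadex_helper instead.

```python
import cProfile
import io
import pstats
import re

PATTERN = re.compile(r'^#@')  # hypothetical pattern, for illustration only


def scan(lines):
    """Hypothetical stand-in for a line-oriented loader."""
    hits = 0
    for line in lines:
        if PATTERN.match(line):
            hits += 1
    return hits


def profile_views(func, *args):
    """Profile func(*args) in the current thread and return text reports."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()

    def report(sort_key, *restrictions):
        # Write the listing to a string instead of stdout.
        stream = io.StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.strip_dirs().sort_stats(sort_key).print_stats(*restrictions)
        return stream.getvalue()

    return {
        'tottime': report('tottime', 50),          # top 50 by own time
        'ncalls': report('ncalls', 50),            # top 50 by call count
        'restricted': report('tottime', 'scan', 50),  # regex filter, then top 50
    }


if __name__ == '__main__':
    views = profile_views(scan, ['#@+node', 'body text'] * 1000)
    for name, text in views.items():
        print('=====', name)
        print(text)
```

A string passed to print_stats is treated as a regular expression and matched
against the filename:lineno(function) column, which is how a listing can be
limited to a single file such as leoDataModel.py.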

*Summary*

load_derived_file is incredibly fast. When loading .leo files, only its
performance matters.

For this function only, the speed of attribute access may be crucial.
Converting section references to functions in load_derived_file may slow
the code by changing local refs to nonlocal refs.
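
To make the local/nonlocal point concrete, here is a small sketch. The two
functions are hypothetical and far simpler than load_derived_file; they only
show that a variable shared with a nested function is accessed through a
closure cell (LOAD_DEREF) rather than as a plain local. Whether that, plus the
extra function call per line, costs anything measurable would have to be
timed on the real code.

```python
import dis
import types


def inline_version(lines):
    """Count sentinel lines with all state in plain locals."""
    count = 0
    for line in lines:
        if line.startswith('#@'):
            count += 1
    return count


def nested_version(lines):
    """Same logic, but the body is moved into a nested helper."""
    count = 0

    def handle(line):
        nonlocal count  # count is now a closure cell, not a plain local
        if line.startswith('#@'):
            count += 1

    for line in lines:
        handle(line)
    return count


def nested_code(func):
    """Return the code object of the first function nested inside func."""
    return next(c for c in func.__code__.co_consts
                if isinstance(c, types.CodeType))


def opnames(code):
    """Return the set of bytecode instruction names used by code."""
    return {ins.opname for ins in dis.get_instructions(code)}


if __name__ == '__main__':
    print('inline uses LOAD_DEREF:',
          'LOAD_DEREF' in opnames(inline_version.__code__))
    print('nested helper uses LOAD_DEREF:',
          'LOAD_DEREF' in opnames(nested_code(nested_version)))
```

Wrapping both versions in timeit.timeit on a realistic list of lines would
show how large the difference is on a given Python version.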
Edward