Re: Documentation on new data model for Leo outlines

2018-05-21 Thread Edward K. Ream
On Sun, May 20, 2018 at 1:08 PM, vitalije  wrote:


> To be fair, I must point out that Leo also performs undo/redo logic,
> which my prototype still doesn't. But taking a full snapshot of the tree
> in the new model was measured before, and it used to take about 20ms for
> an outline of about 8300 nodes (LeoPyRef.leo).
>
> With that snapshot time added, the new model is still at least 100 times
> faster than Leo in the worst case (for Leo) and about 6 times faster in
> the easiest case for Leo (with all nodes collapsed).
>

I look forward to seeing the code.  Is it available?

> Even if changing the Leo tree is an O(1) operation, it seems that the
> overall operation is much worse.
>

One of my dreams is to support huge outlines, so that, for example, one
could imagine the human genome project using Leo outlines.

> What still remains to be done in my prototype is importing files and
> reading at-auto files, which is a wide area considering how many
> different languages Leo supports. I will try to write the Python,
> Markdown, and reST importers next because they are used in Leo itself,
> and Leo's installation folder has files that I can use for testing and
> for comparing with the Leo importers.
>
> For the JavaScript importer, I have an idea to use Node modules to parse
> the numerous JavaScript dialects. Existing Node modules like Babel have
> already solved the problem of different dialects and can produce the
> data Leo needs to create a well-shaped outline. And those Node modules
> are very fast: they analyze hundreds of .js files in less than 10s.
>

I look forward to seeing these importers.

Edward

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to leo-editor+unsubscr...@googlegroups.com.
To post to this group, send email to leo-editor@googlegroups.com.
Visit this group at https://groups.google.com/group/leo-editor.
For more options, visit https://groups.google.com/d/optout.


Re: Documentation on new data model for Leo outlines

2018-05-20 Thread vitalije

On Thursday, May 17, 2018 at 9:20:33 PM UTC+2, Matt Wilkie wrote:
>
>> I have just published first two articles about what I have done so far
>> in this mini-project. Here are the links:
>>
>  [...]
>
>> All comments are welcome.
>>
>
> Vitalije, I'm working my way through your posts on that site from the
> beginning. It's slow going because I understand little, given my shallow
> coding background. However, I want to say that I enjoy them and am
> learning things. Top of mind at the moment is your recognition that your
> initial attempts to communicate an idea didn't work, and that a)
> building a demo might be effective, and b) choosing something that
> wasn't currently in discussion but had been before would invite deeper
> inspection and involvement.
>
> Thanks,
>
> Matt
>

Matt Wilkie, thanks for these kind comments. I am glad you like it.

I have almost finished a small GUI that uses the new LeoTreeModel as its
data model. Reading and writing external files work correctly. I have
made methods for all the tree operations that I could think of:
promote/demote; move node up, down, left, and right; move node to any
given position. Now here are some comparisons with the present Leo:

A node with the following body:

import timeit
def f():
    c.demote()
    c.promote()
def test(fun, num):
    t1 = timeit.timeit(fun, number=num)/num*1000
    g.es('%s avg:%.1fms' % (fun.__name__, t1))
def f2():
    c.moveOutlineDown()
    c.moveOutlineUp()

test(f, 10)
test(f2, 10)

Make it the last node in the outline, then make a clone of it, and insert
three empty nodes after it with headlines a, b, and c. Let those five
nodes be the last five top-level children of the outline. Select the
second clone (the one closer to the end) and execute the script. The
results depend on the size of the visible outline. I tried with
LeoPy.leo; at first, with all nodes collapsed, I got the following
results:
f avg:126.6ms
f2 avg:160.9ms

When a similar script was executed on the same nodes in my GUI prototype:
demote/promote average: 3.3ms
up/down average: 3.8ms

Here is the script:

def speedtest(x):
    import timeit
    pos = ltm.positions[-4]
    def f1():
        ltm.promote(pos)
        draw_tree(tree, ltm)
        ltm.promote_children(pos)
        draw_tree(tree, ltm)
    def f2():
        ltm.move_node_down(pos)
        draw_tree(tree, ltm)
        ltm.move_node_up(pos)
        draw_tree(tree, ltm)

    t1 = timeit.timeit(f1, number=100)/100*1000
    t2 = timeit.timeit(f2, number=100)/100*1000
    print('demote/promote average: %.1fms\n' % t1)
    print('up/down average: %.1fms\n' % t2)

Just executing expand-all once took too much time (I didn't measure it,
but more than 20s). After that, I reduced the number of timeit executions
to just 1 and got the following results (with the fully expanded tree of
LeoPy.leo):

f avg:4374.2ms
f2 avg:4385.8ms

Performing the same operations in the prototype GUI, expanding all nodes
was instantaneous, and timing 100 repetitions gave the following results:
demote/promote average: 16.4ms
up/down average: 19.3ms

As you can see, performance definitely depends on the number of visible
nodes, but it is nevertheless much faster than the current Leo
implementation.

To be fair, I must point out that Leo also performs undo/redo logic,
which my prototype still doesn't. But taking a full snapshot of the tree
in the new model was measured before, and it used to take about 20ms for
an outline of about 8300 nodes (LeoPyRef.leo).

With that snapshot time added, the new model is still at least 100 times
faster than Leo in the worst case (for Leo) and about 6 times faster in
the easiest case for Leo (with all nodes collapsed).
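The pickle/unpickle timings reported later in this thread suggest the
snapshot is simply a pickle of the model's plain-data containers. Here is
a minimal sketch of snapshot-based undo under that assumption (names like
take_snapshot are illustrative, not the prototype's actual API):

```python
import pickle

def take_snapshot(tree_data):
    """Serialize the whole tree state; tree_data is assumed to be
    plain Python containers (dicts, lists, strings, tuples)."""
    return pickle.dumps(tree_data, protocol=pickle.HIGHEST_PROTOCOL)

def restore_snapshot(blob):
    """Rebuild the tree state from a snapshot."""
    return pickle.loads(blob)

# Snapshot before a destructive operation, then undo it.
tree = {'positions': ['gnx1', 'gnx2'],
        'attrs': {'gnx1': ('h', 'b', [], [], 1)}}
undo_stack = [take_snapshot(tree)]
tree['positions'].append('gnx3')           # mutate the model
tree = restore_snapshot(undo_stack.pop())  # undo restores the old state
```

Because the snapshot is a full serialized copy, restoring it cannot be
corrupted by later mutations of the live tree.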

Even if changing the Leo tree is an O(1) operation, the overall operation
seems much worse. Increasing the number of visible nodes from 28 to 8349
(about 300 times more) caused the time to grow from 0.1s to 4s, or about
40 times. This of course is not a rigorous proof, but for practical
purposes it will do. The new model, on the other hand, slows down only by
a factor of 6 when the number of visible nodes increases 300 times.

What still remains to be done in my prototype is importing files and
reading at-auto files, which is a wide area considering how many
different languages Leo supports. I will try to write the Python,
Markdown, and reST importers next because they are used in Leo itself,
and Leo's installation folder has files that I can use for testing and
for comparing with the Leo importers.

For the JavaScript importer, I have an idea to use Node modules to parse
the numerous JavaScript dialects. Existing Node modules like Babel have
already solved the problem of different dialects and can produce the
data Leo needs to create a well-shaped outline. And those Node modules
are very fast: they analyze hundreds of .js files in less than 10s.

And it is most likely that every user who wants to use Leo for editing
JavaScript has already installed Node.

Vitalije.



Re: Documentation on new data model for Leo outlines

2018-05-17 Thread Matt Wilkie

>
> > > col = (pos.x - grid.minx) / (grid.maxx - grid.minx) * grid.cols 
> > > 
> > +10! 
>
> Given the +10 :) I should add I've been using this lib.: 
>
> https://pypi.org/project/addict/#description 
>
 
Thanks!



Re: Documentation on new data model for Leo outlines

2018-05-17 Thread Matt Wilkie

>
> I have just published first two articles about what I have done so far in 
> this mini-project. Here are the links:
>
 [...]

> All comments are welcome.
>
 
Vitalije, I'm working my way through your posts on that site from the
beginning. It's slow going because I understand little, given my shallow
coding background. However, I want to say that I enjoy them and am
learning things. Top of mind at the moment is your recognition that your
initial attempts to communicate an idea didn't work, and that a) building
a demo might be effective, and b) choosing something that wasn't
currently in discussion but had been before would invite deeper
inspection and involvement.

Thanks,

Matt



Re: Documentation on new data model for Leo outlines

2018-05-17 Thread Terry Brown
On Thu, 17 May 2018 12:00:30 -0700 (PDT)
Matt Wilkie  wrote:

> > Veering off topic I wish there was a more general solution for dot
> > access to dict members, I understand all the arguments against, but
> > often writing mathematical expressions...
> > 
> > col = (pos['x'] - grid['minx']) / (grid['maxx'] - grid['minx']) *
> > grid['cols']
> > 
> > vs.
> > 
> > col = (pos.x - grid.minx) / (grid.maxx - grid.minx) * grid.cols
> > 
> +10!

Given the +10 :) I should add that I've been using this lib:

https://pypi.org/project/addict/#description

I would have picked a different name, but `pip install addict` is
easier than managing my own copy / version of this functionality.

A minor point, I did notice that

d = Dict(aDict)

seems to deepcopy aDict, whereas

d = Dict()
d.update(aDict)

behaves as expected (altering values in d alters the corresponding
value in aDict).

Cheers -Terry



Re: Documentation on new data model for Leo outlines

2018-05-17 Thread Matt Wilkie
The rest of this thread is over my head, but:

Veering off topic I wish there was a more general solution for dot 
> access to dict members, I understand all the arguments against, but 
> often writing mathematical expressions... 
>
> col = (pos['x'] - grid['minx']) / (grid['maxx'] - grid['minx']) * 
> grid['cols'] 
>
> vs. 
>
> col = (pos.x - grid.minx) / (grid.maxx - grid.minx) * grid.cols 
>

+10! 



Re: Documentation on new data model for Leo outlines

2018-05-16 Thread Edward K. Ream
On Tue, May 15, 2018 at 5:04 PM, vitalije  wrote:

> About script compatibility and rewriting the Position class, I was
> thinking that the simplest thing would be to just let them be as they
> are now.
>
[...]

> Both models can easily coexist. We can advise all users to write their
> scripts using new patterns and strategies to take advantage of the new
> model. But they can keep old scripts if they like.
>

An interesting approach.  I'll have more to say about this particular
topic in my next post.

Edward



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread vitalije
Thanks for all your comments and suggestions. I agree that using integers
to access the elements of the attrs list is not very readable. There will
be time to discuss it further once the whole model has settled down a
bit. It is still evolving, so I haven't seen much point in polishing it
yet.

Today I didn't progress very much because I had lots of interruptions. I
have made a small GUI example. But before I show it, I wish to make a few
more outline commands, like promote/demote children and move up, left,
right, down. I hope that tomorrow I will be able to finish this prototype
and publish it so that everyone can test it.

About script compatibility and rewriting the Position class, I was
thinking that the simplest thing would be to just let them be as they are
now. Instead of changing Position and VNode, we can change the
execute-script command. Before executing a script, we can rebuild the
whole tree with legacy VNodes and Positions, which should not take very
long (less than 100ms, I expect). Legacy scripts will run in the legacy
environment, and after they finish we can synchronize the data in the new
model, which is also a very fast operation. That way there will be no
need for major changes in the legacy code, and we can start to port
functionality piece by piece to the new model. Both models can easily
coexist. We can advise all users to write their scripts using new
patterns and strategies to take advantage of the new model. But they can
keep old scripts if they like.
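The bridge described above can be sketched in a few lines. The functions
build_legacy_tree and sync_new_model below are hypothetical names, not
real Leo APIs, and trivial stubs stand in for the real machinery:

```python
def build_legacy_tree(ltm):
    # Stub: would rebuild legacy VNodes/Positions from the new model
    # (the post estimates this takes less than ~100ms).
    return {'nodes': list(ltm['nodes'])}

def sync_new_model(ltm, legacy_c):
    # Stub: would fold legacy-tree mutations back into the new model.
    ltm['nodes'] = list(legacy_c['nodes'])

def execute_legacy_script(ltm, script):
    legacy_c = build_legacy_tree(ltm)
    exec(script, {'c': legacy_c})   # legacy script sees only the old API
    sync_new_model(ltm, legacy_c)   # fast resync afterwards

ltm = {'nodes': ['a', 'b']}
execute_legacy_script(ltm, "c['nodes'].append('c')")
```

The point of the design is isolation: the legacy script never touches the
new model directly, so the two representations only have to agree at the
synchronization boundary.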

Anyway, I am still exploring what else may be improved in the new model.

Vitalije



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Edward K. Ream
On Tuesday, May 15, 2018 at 6:36:15 AM UTC-5, Edward K. Ream wrote:

all existing Position methods and generators must have exactly the same 
> *effect* as before.  Many (All?) Position methods and generators will 
> need to be rewritten.  That's fine. The new code will be simpler than the 
> old code.
>

Rewriting the Position methods will be straightforward, as I'll now 
explain. 

From Vitalije's Theory of Operation: "Suppose that the model keeps a list
of all nodes in outline order. Positions would be represented by indexes
in this list."

This makes p.moveToThreadNext() and p.moveToThreadBack() trivial: either
increment or decrement the index, with appropriate bounds checks.

p.threadNext() is just self.copy().moveToThreadNext(). It (and all other 
"copy" methods) will not have to change.
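A minimal sketch of that scheme, with illustrative names (this is not
Leo's actual implementation; the model is reduced to a bare list of nodes
in outline order):

```python
from types import SimpleNamespace

class Position:
    """Sketch of a position as an index into the outline-order list."""

    def __init__(self, model, index):
        self.model = model
        self.index = index

    def moveToThreadNext(self):
        # Outline order makes "thread next" a bounds-checked increment.
        if self.index + 1 < len(self.model.positions):
            self.index += 1
        else:
            self.index = -1  # moved off the end: position is invalid
        return self

    def moveToThreadBack(self):
        if self.index > 0:
            self.index -= 1
        else:
            self.index = -1
        return self

    def copy(self):
        return Position(self.model, self.index)

    def threadNext(self):
        # Unchanged pattern: copy, then move the copy.
        return self.copy().moveToThreadNext()

# A stand-in model: positions in outline order.
m = SimpleNamespace(positions=['node0', 'node1', 'node2'])
p = Position(m, 0)
q = p.threadNext()  # a new position; p itself is unchanged
```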

Methods like p.moveToVisNext() are conceptually harder, but it may be
possible to use the existing p.moveToVisNext as is. We shall see...

*Summary*

Representing a position as an index will probably simplify all Position
methods. In any event, there is no doubt whatever that all Position
methods can be rewritten to preserve their existing meaning exactly.

Edward



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Edward K. Ream
On Tue, May 15, 2018 at 8:53 AM, Terry Brown  wrote:

>
> We've discussed this before I think, but returning tuples can be a real
> pain for extensibility.


​I agree. I think Vitalije is aware of the trade-offs.  For now, I'm happy
to let him choose whatever style he likes.

Edward



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Terry Brown
On Tue, 15 May 2018 08:04:21 -0500
"Edward K. Ream"  wrote:

> On Tue, May 15, 2018 at 6:55 AM, Terry Brown 
> wrote:
> 
> > What about a namedtuple?
>  
> That or a bunch.  Just to be clear, the code itself may change to
> make this moot.

I realized after I posted that it may have some memory impact, though
that's possibly not an issue / very minor.

> Another possibility, which I often favor, is to unpack the tuple in
> place:
> 
> item1, item2, ... itemN = aTuple
> <>
> 
> This pattern appears already in other places in Vitalije's code.
> 
> Edward

We've discussed this before I think, but returning tuples can be a real
pain for extensibility.  Changing

  return foo, bar

to

  return foo, bar, etc

breaks all clients, whereas changing

  return {'foo': x, 'bar': y}

to

  return {'foo': x, 'bar': y, 'etc': z}

is very unlikely to break anything.

But if dicts are a performance issue, a namedtuple might be a good
balance.
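For example, a namedtuple with a defaulted trailing field can grow
without breaking callers that access fields by name (positional
unpacking like `foo, bar = get()` would still break, of course). A
sketch:

```python
from collections import namedtuple

# Original return type.
Result = namedtuple('Result', ['foo', 'bar'])

def get():
    return Result(foo=1, bar=2)

# Extended type: the new trailing field gets a default (the `defaults`
# parameter requires Python 3.7+), so by-name access in old client
# code keeps working unchanged.
Result2 = namedtuple('Result2', ['foo', 'bar', 'etc'], defaults=[None])

def get2():
    return Result2(foo=1, bar=2)

r = get2()
# Old clients still use r.foo and r.bar; new clients may use r.etc.
```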

Veering off topic I wish there was a more general solution for dot
access to dict members, I understand all the arguments against, but
often writing mathematical expressions...

col = (pos['x'] - grid['minx']) / (grid['maxx'] - grid['minx']) * grid['cols']

vs.

col = (pos.x - grid.minx) / (grid.maxx - grid.minx) * grid.cols
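A minimal dot-access wrapper is only a few lines (this sketch raises
KeyError rather than AttributeError on missing keys; packages like the
addict library mentioned in this thread handle such corner cases
properly):

```python
class AttrDict(dict):
    # Route attribute access to dict item access.
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__

# Illustrative values chosen so the formula from above is easy to check.
pos = AttrDict(x=35.0)
grid = AttrDict(minx=10.0, maxx=110.0, cols=4)

col = (pos.x - grid.minx) / (grid.maxx - grid.minx) * grid.cols
```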

Cheers -Terry



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Edward K. Ream
On Tue, May 15, 2018 at 6:55 AM, Terry Brown  wrote:

What about a namedtuple?
>

That or a bunch.  Just to be clear, the code itself may change to make
this moot.

Another possibility, which I often favor, is to unpack the tuple in place:

item1, item2, ... itemN = aTuple
<>

This pattern appears already in other places in Vitalije's code.

Edward



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Terry Brown
On Tue, 15 May 2018 04:36:15 -0700 (PDT)
"Edward K. Ream"  wrote:

> *Readability will not affect performance*
> 
> attrs[gnx] is a tuple [h, b, ps, chn, sz[0]]. The components should
> be accessed via a bunch, or an enum, say,
> 
> e_h = 0
> e_b = 1
> e_ps = 2

What about a namedtuple?

Cheers -Terry



Re: Documentation on new data model for Leo outlines

2018-05-15 Thread Edward K. Ream
On Monday, May 14, 2018 at 9:44:11 AM UTC-5, Edward K. Ream wrote:
>
>
> On Mon, May 14, 2018 at 8:02 AM, vitalije  wrote:
>
>> I have just published first two articles about what I have done so far in 
>> this mini-project. Here are the links:
>>
>>1. Leo tree model - new approach 
>>
>>2. Leo tree model - loading from a file 
>>
>>
>> This is superb work.  I never dreamed the code could be improved so much.
>

Still true, after more reflection. Some comments:


*Strategy*
I have confidence that this project will be a success.  Sitting in the 
bathtub yesterday, I realized that code details don't matter much.  The 
only things that matter are:

1. Existing Leo scripts must not be impacted in *any* way.

In particular, all existing Position methods and generators must have 
exactly the same *effect* as before.  Many (All?) Position methods and 
generators will need to be rewritten.  That's fine. The new code will be 
simpler than the old code.

2. All existing unit tests must pass.

Naturally, unit tests of low-level details of VNodes and Positions can 
change as necessary.


*Summary of code*
I've studied the code more thoroughly.  My high-level summary of the read 
process:

Part 1: Create tuples.

Use xml.etree.ElementTree when reading .leo files.  Use regexs when reading 
external files.

Part 2: Use the tuples to create vnodes.

This is non-trivial, because @others and section references alter the tree 
structure.

I think that's all that most devs will need to know.

*Readability will not affect performance*

attrs[gnx] is a tuple [h, b, ps, chn, sz[0]]. The components should be 
accessed via a bunch, or an enum, say,

e_h = 0
e_b = 1
e_ps = 2
...
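One concrete way to spell those constants is an IntEnum, whose members
index sequences exactly like plain ints (the field order follows the
post's [h, b, ps, chn, sz]; the attrs value here is a made-up example):

```python
from enum import IntEnum

class Attr(IntEnum):
    H = 0    # headline
    B = 1    # body text
    PS = 2   # parents
    CHN = 3  # children
    SZ = 4   # size

# IntEnum members can be used directly as tuple indices.
attrs = {'gnx1': ('headline', 'body text', [], [], 1)}
h = attrs['gnx1'][Attr.H]
```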

To prove that this has no effect on performance, I added a global count, 
gCount of the number of times a[n] was accessed in leoDataModel.py during 
the tests run in test_leo_data_model.py.  Here is the instrumented run, 
with the new count last:

tree size 8137 read in 488.24ms files:164
ok 471 471
Average: 24.82ms
checking write
ok
tree correct 5327576
pickle avg: 20.38ms
upickle avg: 14.93ms
profiling write_all
ok 695.47ms
gCount: 50723

And here is the timeit script showing the time penalty of using an enum
constant, e_gnx, instead of a hard-coded constant to index the a array.

number = 5
setup1 = 'a = [0]'
stmt1 = 'b = a[0]'
n1 = timeit.timeit(stmt=stmt1, setup=setup1, number=number)
n1 = float(n1)

setup2 = 'e_gnx = 0; a = [0]'
stmt2 = 'b = a[e_gnx]'
n2 = timeit.timeit(stmt=stmt2, setup=setup2, number=number)
n2 = float(n2)

print('n1:%8.6f' % n1)
print('n2:%8.6f' % n2)
print('n2-n1: %8.6f' % (n2-n1))

And here is the output on my machine:

n1:0.001246
n2:0.001259
n2-n1: 0.000013

Edward



Re: Documentation on new data model for Leo outlines

2018-05-14 Thread Edward K. Ream
On Mon, May 14, 2018 at 8:02 AM, vitalije  wrote:

> I have just published first two articles about what I have done so far in
> this mini-project. Here are the links:
>
>1. Leo tree model - new approach
>
>2. Leo tree model - loading from a file
>
>
> This is superb work.  I never dreamed the code could be improved so much.

*From #1*:

"The other thing I would try to achieve is some stability of positions in
tree. If a tree is modified some positions may become invalid, but those
positions that point to the nodes that are still part of the tree should be
valid even after the tree has been changed."

Imo, this is worth quite a bit of work.

*From #2*:

import xml.etree.ElementTree as ET

This was new in Python 2.5, but I didn't realize that until just now.  I
am all for replacing the super-ugly sax parser.

*Summary*

I haven't studied this in detail, but what I see looks spot on.

I'm glad I did this morning's prototype.  The get_patterns function looks
like a generalization of my python-only code.

Edward
