Vitalije's new version of c.deletePositionsInList is likely correct. 
Vitalije thinks it is, and he has backed up his belief with a strong 
testing framework. That's good enough for me. I have closed #153 
<https://github.com/leo-editor/leo-editor/issues/1539>9, committed 
Vitalije's code to the 6.2 and devel branches. The new code will be part of 
Leo 6.2.

In this Engineering Notebook post, I shall prove that the new code is 
correct. Neither testing nor assertions that code is "obviously" correct 
are completely convincing.

*What is to be proved*

Vitalije and I both agree that all changes to Leo's outline data must 
ultimately change vnodes. There is nothing else to change! Indeed, 
positions are only views on the "underlying" data.

It is also clear (obvious :-) that the data to be changed are v.parents and 
v.children for v in some set of vnodes. Again, there is nothing else to 
change.

We must show that the particular set of changed vnodes in the new code is 
the correct set of vnodes, and that the changes made to that set are 
correct. This is far from obvious at first glance.

As a commentary on the proof, I'll want to explain how the code handles the 
following situations:

- The list of positions passed to c.deletePositionsInList contains 
duplicate positions.
- The list of positions contains positions that (depending on order) may 
already have been deleted.

An intuitive proof should aid the commentary.

*The new code*

Here is Vitalije's version of c.deletePositionsInList, with comments and 
other details omitted:

def deletePositionsInList(self, aList):
    c = self

    def p2link(p):
        parent_v = p.stack[-1][0] if p.stack else c.hiddenRootNode
        return p._childIndex, parent_v

    links_to_be_cut = sorted(set(map(p2link, aList)), key=lambda x:-x[0])

    # The main loop.
    for i, v in links_to_be_cut:
        child_v = v.children.pop(i)
        child_v.parents.remove(v)

*Assumptions*

We may assume that aList is indeed a list of positions. p2link will throw 
an exception otherwise.

We may also assume that all positions in aList are valid. 
c.deletePositionsInList should check for this.

*The proof*

To begin the proof, note that last three lines, the main loop, have the 
expected "form". That is, when we delete a node, we will:

1. Delete v.children[i] for some v and i. This deleted vnode is child_v.

2. Delete child_v from its parents array.

v.parents is an unordered array, so child_v.parents.remove(v) is all that 
is needed. This line might crash if one of the positions is invalid. It 
will also crash if v is not present in child_v.parents. I'll discuss this 
later.

To complete the proof, we must show that the main loop deletes all and only 
those vnodes implied by the positions in aList. We can tentatively assume 
that the main loop will work properly if that data in links_to_be_cut is 
correct, but there some subtleties lurking here which I'll discuss later.

The following code creates links_to_be_cut:

def p2link(p):
    parent_v = p.stack[-1][0] if p.stack else c.hiddenRootNode
    return p._childIndex, parent_v

links_to_be_cut = sorted(set(map(p2link, aList)), key=lambda x:-x[0])

This is elegant (compressed) code. Let's examine it carefully...

p2link produces tuples (childIndex, parent_v). These become bound to i, v 
in the main loop:

for i, v in links_to_be_cut:
    child_v = v.children.pop(i)
    child_v.parents.remove(v)

So parent_v must be the vnode whose i'th child should be deleted. This line 
does that:

parent_v = p.stack[-1][0] if p.stack else c.hiddenRootNode

That's correct. p.stack entries have the form (v, childIndex). p.stack[-1] 
is the position p's immediate parent vnode, if p has a parent. Otherwise, 
c.hiddenRootNode is the proper parent vnode.

p._childIndex is also the correct value for i.

In short, deleting position p means executing the main loop for (i, v) as 
returned by p2link(p).

*Ordering and filtering*

The code does not use p2link directly. Instead, there is this line:

links_to_be_cut = sorted(set(map(p2link, aList)), key=lambda x:-x[0]


Let's break this into parts, from "inside" to "outside"

- map(p2link, aList) applies p2link to every position in aList. The result 
is a list of tuples (i, v). This list may contain duplicates.

- set(map(p2link, aList) removes those duplicates. Let theSet be 
set(map(p2link, aList).

- Finally, sorted(theSet, key=lambda x:-x[0]) creates a generator 
equivalent to a sorted list.

The generator will deliver the tuples (i, v) in descending order of i, when 
several tuples (i, v) have the same vnode. The generator does not order 
tuples on v, and there appears to be no need to do so. I'll say a few words 
about this below.

It's easy to understand the intent of this sort order. The main loop 
contains this line:

child_v = v.children.pop(i)

The sort order means that pop(i) happens before pop(j) if i > j. This 
ensures that the precomputed values of i won't change in the main loop.

*Completing the proof*

We can see why Vitalije asserts that the code is sound. The tuples (i, v) 
computed from p2link(p) correctly describe the work to be done. Filtering 
and ordering ensure that the work is done once, and in such a way that the 
child indices i will not change during the main loop.

The (i, v) tuples are sorted on i, but not v. aList can contain positions 
in any order, so we know only that for any particular vnode v the main loop 
will handle tuples (i1, v), (i2, v) in the correct order.

There only remains one more question to answer. Does the order in which the 
main loop handles vnodes matter? That is, does it matter if a tuple (i, v) 
describes a position that the main loop has already deleted?

This was, and still is, the source of my uneasiness with this algorithm. 
However, I'm pretty sure there is no actual problem, for the following 
reasons:

1. Because of undo, vnodes never get deleted. They only get unlinked.

2. I can foresee no circumstance in which the order in which particular 
vnodes get unlinked can matter.

In particular, the main loop doesn't actually care whether a vnode has 
already been unlinked. The final question is whether these lines in the 
main loop could ever fail:

child_v = v.children.pop(i)
child_v.parents.remove(v)

Again, the answer is no. The sorting and filtering phase ensures that (i, 
v) are unique. The only cause for concern is the second line. Is v 
guaranteed to be in child_v.parents? Yes, because v.children[i] must exist.

This concludes the proof.

*Discussion*

This is very clever code. You could say that the line:

links_to_be_cut = sorted(set(map(p2link, aList)), key=lambda x:-x[0])

forms the basis of an inductive proof of correctness. The algorithm is 
sound because the tuples (i, v) contain no "harmful" duplicates, and that 
for any particular v, the set of tuples (i, v) appear in descending order 
of i.

It should be straightforward to make this code undoable. The undo code 
would reverse of the main loop, handling the tuples in ascending order of 
i. The redo code would be much like the original main loop.

*Summary*

I have provided a careful proof of correctness. This proof is by no means 
obvious. I welcome all comments and corrections.

The proof shows that vnodes can be handled in any order, provided only that 
for any particular v, the tuples (i, v) must be handled in ascending order 
of i. This resolves the question of whether deleting already deleted 
positions is valid.

It should be straightforward to make c.deletePositionsInList undoable. 
c.deletePositionsInList should ensure that all positions in aList are valid.

Vitalije has done us all a great service in fixing c.deletePositionsInList.

Edward

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/leo-editor/ac81d42d-2dde-444d-b4e0-1eb542d67948%40googlegroups.com.

Reply via email to