@cblake Thanks for your encouragement, it _did_ motivate me. :-) Unfortunately, 
I only found the time this week.

So finally, here's a new take on the recursive tree iterator: 
[https://hg.sr.ht/~sschwarzer/lazytree](https://hg.sr.ht/~sschwarzer/lazytree) 
(clone with 
    
    
    hg clone https://hg.sr.ht/~sschwarzer/lazytree
    
    
    Run

if you want to play with the code.)

The iterator code is in 
[https://hg.sr.ht/~sschwarzer/lazytree/browse/src/lazytree.nim](https://hg.sr.ht/~sschwarzer/lazytree/browse/src/lazytree.nim)
 and tests in 
[https://hg.sr.ht/~sschwarzer/lazytree/browse/tests/test_lazytree.nim](https://hg.sr.ht/~sschwarzer/lazytree/browse/tests/test_lazytree.nim)
 .

Using the iterator: 
    
    
    rootNode = parseXML("<root>...</root>")
    
    for nodeInfo in lazyTree(rootNode):
      # Deleting modifies the tree under `rootNode and also has the effect
      # that the deleted node's child nodes won't be yielded. Of course
      # deleting nodes is optional. In most cases you want to keep the node.
      if nodeInfo.node.tag == "toBeDeleted":
        nodeInfo.parentNode.delete(nodeInfo.childIndex)
      # Do something else with `nodeInfo.node`, which is the actual node.
      # (`parentNode` and `childIndex` are mostly needed to be able to delete
      # a node from the tree.)
      ...
    
    
    Run

Some remarks/thoughts:

  * Contrary to my old implementation, this iterator goes forward. I had only 
used the odd backward iteration because it presumably made the implementation 
easier.
  * The node type is generic. For now, it must satisfy the API of `XMLNode` 
though.
  * I had planned to go with the "typical" iterator design with `hasNextNode` 
and `nextNode` (see 
[https://en.wikipedia.org/wiki/Iterator_pattern#Structure](https://en.wikipedia.org/wiki/Iterator_pattern#Structure)
 ), but I had trouble "coordinating" these similar procs "in my head." I found 
it far more natural to have a proc that returns an `option.some` for a new item 
and `option.none` when the iterator is exhausted.
  * The `while` loop in `func nextNodeInfoOption` used to be a nested `if`, as 
can be seen at 
[https://hg.sr.ht/~sschwarzer/lazytree/browse/src/lazytree.nim?rev=78731f21f65a4f2d889886ff06ce8cbbfb660aa0#L64](https://hg.sr.ht/~sschwarzer/lazytree/browse/src/lazytree.nim?rev=78731f21f65a4f2d889886ff06ce8cbbfb660aa0#L64)
 (here without the node deletion feature). I switched to the `while` loop to 
reduce the redundancy of the outer `elif` and the inner `if`. However, the 
`while` loop doesn't fit the thought process behind the algorithm as nicely 
because it obscures that actually at most two iterations should be required.
  * The iterator detects the deletion of the node by comparing the number of 
children of the parent of the deleted node to the previous call of 
`nextNodeInfoOption`. This check is rather "weak" because it won't detect - and 
handle - a lot of other changes that client code may do. So doing any other 
tree modification beside deleting the currently yielded node may have undefined 
results.
  * I'm not sure whether the helper type `NodeInfo` should be modeled with a 
variant type (as now) or with `Option` s for `parentNode` and `childIndex` 
(which don't apply for the root node of a tree).
  * One could provide a `proc deleteNode(nodeInfo: NodeInfo)` to make the 
deletion easier. (Probably I'll make the change.) I think it's still useful to 
expose at least `parentNode`, so you can delete a node based on the `tag` value 
of the parent, for example.
  * I've been thinking about a variant of the current implementation where I'd 
create a `seq` of iterators for each child node when I start iterating over a 
parent node. This approach should make it easier to control the iteration: 
normally the code would iterate over the child iterators in the `seq`. Deletion 
of a node by client code would make the iterator modify the `seq` and continue 
processing it as before. A downside I see with this approach is that it would 
create `seq` s of iterators for all currently processed nodes in advance. Then 
again, this should only matter if nodes have big numbers of immediate child 
nodes.
  * Working on the code with this new approach was _much_ easier than the 
previous implementation with the stack (or even several). This time, I never 
even thought of using a debugger. :-) Only in one instance I added a 
`debugEcho`, but the presumed bug in the iterator turned out to be in the test. 
:-D



Do you see any edge cases where the algorithm would fail? If you think that a 
code comment doesn't fit the code, this could be an indication. ;-)

Other feedback is welcome, too!

Reply via email to