This memo is a contribution to the overall design discussion needed for true SAX-style event streaming of large data streams made up of many small items.
The unparser doesn't "stream" as yet, because all suspensions are queued until the end of all unparsing. For streaming to truly work, we must trim from the infoset those already-unparsed nodes that are no longer needed, and we must evaluate suspended expressions soon after the data they require becomes available.

While not written up in a design note, the prior design point being considered was enhancing the infoset so that suspensions could be hung onto the infoset node that is blocking them; they would then be reactivated only when that infoset node's state changes. This, however, is a lot of complexity for possibly small gain.

An alternative design point is possible: evaluate suspensions at the same times that we determine infoset nodes can be detached because they are no longer needed. To achieve streaming behavior when parsing or unparsing, one must trim infoset nodes no longer needed from the infoset. A heuristic is that when you are done with array element N, you can delete array element N-1. The assumption here is that expressions aren't going to refer back to element N-2, but only to N-1. Similarly, forward-referencing expressions when unparsing are going to look forward at most from element N to N+1, not beyond that to N+2. If suspensions are queued on the nearest enclosing array element's infoset node, then when that array index is advanced to position N+1, position N-1's suspensions can be evaluated, and it is very likely that they will complete. Those suspensions which do not complete would be moved upward to the next enclosing array element (and eventually to the root element, where they are evaluated after unparsing completes, as in the current (v2.2.0, i.e., 2018-10-02) version). After evaluating the expressions for position N-1, the infoset node can also (if desired) be dropped.
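The advance-then-evaluate-then-drop idea above can be sketched as follows. This is a minimal illustration in Python (Daffodil itself is Scala), and all names here (Suspension, ArrayElementNode, advance_to) are hypothetical, not Daffodil APIs:

```python
class Suspension:
    """A deferred expression; try_evaluate returns True once it completes."""
    def __init__(self, evaluate):
        self._evaluate = evaluate  # callable returning True when unblocked

    def try_evaluate(self):
        return self._evaluate()

class ArrayElementNode:
    def __init__(self, index, parent=None):
        self.index = index
        self.parent = parent       # next enclosing array element (or root)
        self.suspensions = []      # suspensions queued on this node
        self.dropped = False       # set once the node is trimmed

def advance_to(nodes, n):
    """On advancing the array index to position n, evaluate the suspensions
    queued on position n-2 (the node about to be trimmed).  Suspensions that
    remain blocked are escalated to the enclosing node; at the root they are
    retried repeatedly, as in the current (v2.2.0) behavior."""
    victim = nodes.get(n - 2)
    if victim is None:
        return
    for s in victim.suspensions:
        if not s.try_evaluate():
            victim.parent.suspensions.append(s)  # still blocked: push upward
    victim.suspensions.clear()
    victim.dropped = True          # later references get a proper diagnostic
    del nodes[n - 2]
```

A node that is trimmed while carrying a blocked suspension simply hands that suspension to its parent; nothing is lost, the work is just deferred, which is what makes the scheme conservative.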
This dropping/pruning of infoset nodes requires a new state for infoset node references indicating that the node was present, but has been dropped. This state would be used to issue a proper diagnostic if the element is subsequently referenced. The algorithm is conservative in that any suspensions that cannot be evaluated are pushed up, eventually all the way to the root element, where they can then exhibit the deadlock; the root element's suspensions work exactly as they do now: all are evaluated repeatedly until all are complete, or the list is no longer changing and a deadlock exists.

A variation (for parsing) is to simply mark the infoset nodes that can be dropped as "droppable", but leave the actual dropping to the consuming application. That way the consuming application can discard what it doesn't need as soon as possible, while keeping around as much of the infoset as it wants. A callback-style API is also possible, where nodes are dropped by a call to an API-registered callback object method; this method can then do whatever the application requires.

Also: on LIFO/FIFO queues. The queues used for suspensions are LIFO, which is perhaps not as helpful as FIFO order would be. What we want is a LIFO/FIFO combination. When we pull items off, we want FIFO order. When an item must be re-inserted because it was still blocked, we want to add it at the "end", so that it will not be reconsidered until every other element has been attempted. This combined behavior enables short-lived suspensions to typically be evaluated once, rather than repeatedly attempted until they eventually succeed.
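The combined queue discipline above can be sketched with a double-ended queue: take items from the front in FIFO order, and re-insert a still-blocked item at the back so it is not reconsidered until every other suspension has been attempted. This is an illustrative Python sketch (the drain function and its signature are hypothetical, not Daffodil code):

```python
from collections import deque

def drain(queue, max_passes):
    """Attempt each suspension (a callable returning True when it completes);
    re-queue blocked ones at the back.  Stops when the queue empties, or when
    a full pass makes no progress (a deadlock, reported as the root element's
    suspension list does today)."""
    for _ in range(max_passes):
        progressed = False
        for _ in range(len(queue)):
            suspension = queue.popleft()   # FIFO: oldest attempt first
            if suspension():
                progressed = True          # completed; not re-queued
            else:
                queue.append(suspension)   # blocked: goes to the back
        if not queue:
            return True                    # all suspensions completed
        if not progressed:
            return False                   # no change in a full pass: deadlock
    return False
```

Because completed suspensions are simply not re-queued, a short-lived suspension is typically attempted exactly once, while a blocked one waits behind everything else before being retried.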
