> Now the big issue is that Cheney is a semispace collector, which basically 
> means that during collection we need twice the memory, which absolutely sucks.
> Obviously ZmnSCPxj has gone insane and just wants us to spend twice as much 
> memory on the routemap right after shaving only a few bytes off the 
> representations of nodes and channels.
>
> However, nothing in the algorithm actually requires that tospace be in core 
> memory.
> We could instead write the tospace objects into a disk file.
> Cheney "just" requires two pointers, which we can implement simply as opening 
> the tospace file twice, once for append (allocation pointer) and one for 
> read/write (scan pointer).
>
> We need two tospace files, one for node objects and one for channel objects, 
> but in any case, once the Cheney run has completed, we can close the disk 
> files, wait for any pending pathfinding queries to complete (and temporarily 
> block new pathfinding queries), then reload the in-memory arrays from the 
> tospace file(s).
>
> This may make this technique slightly more palatable for lower-power devices, 
> which often still have some slightly larger amount of free disk space 
> compared to memory space.

In fact, this is a cop-out and should be rejected.
Semispace collection is just bad for memory utilization, and lower-resource 
devices do not have memory. or sufficient permanent storage, and it is apparent 
that ZmnSCPxj is just trolling you at this point.

Instead, we should review why qsort, in its in-place array-sorting form, is 
considered faster than mergesort, even though the average number of compares 
and so on are approximately the same.

The reason for that is that qsort, as originally developed, was an in-place 
array-sort, unlike mergesort which is almost impossible to implement as an 
in-place sort without tearing your cognitive substrate out and sacrificing your 
firstorn sentience to the outer gods, and the sheer reduction in cache pressure 
of in-place sorting due to touching smaller amounts of memory is generally why 
qsort tends to beat mergesort in sort contests.

And the reason why qsort *can* be an in-place sort with only constant amount of 
extra space needed to run it, is because (1) the elements of an array are equal 
size and (2) you can always swap two entries of an array in constant space.

Now the original reason why Cheney two-finger semispace is even a *semispace* 
collector, and thus requires twice as much memory while it is running, is that 
objects could be different sizes.
But remember, we came into this with the realization that by using linked lists 
we can make all the objects a constant fixed size, making it far amenable to 
organize them into large arrays of objects.

And just as we observe in qsort, that by swapping the pivot to a known location 
and then swapping it between the two partitions after we have partitioned the 
current array before recursing into the two sub-arrays, we can realize that we 
can make Cheney an in-space algorithm rather than a semispace one, by using 
similar swap operations.

So let us return to the motivating example:

     A --- B --- C --- D
     |   /       |
     |  /        |
     | /         |
     E --- F --- G

Now let us pretend that objects have been allocated willy-nilly in the array of 
nodes, and the nodes are located at random in the array of nodes.
The node A, being "our" own node, is still the first, because frankly any new 
node is going to at least know itself and will thus allocate itself as the 
first node in the node memory space.

    A G B F C E D

We start the Cheney collection in much the same manner, except that the scan 
pointer and the alloc pointer point to the same space/array that the nodes are 
already in.
That is, there is no longer a tospace and fromspace, just a single space where 
the Cheney algorithm works inside.
So we start the scan and alloc pointers like so:

    A G B F C E D
    ^ ^

Now we start by scanning A, which has links to B and E.
We swap B to the node that the alloc pointer is pointed at, then advance the 
alloc pointer:

    A B G F C E D
    ^   ^

Then we swap E to the node that the alloc pointer is pointed at, then advance 
the alloc pointer:

    A B E F C G D
    ^     ^

With this, we have completed scanning A and can advance the scan pointer by one 
unit:

    A B E F C G D
      ^   ^

Now we can pause collection --- the graph is still valid and we can still 
perform routing queries on it --- but let us continue.

We scan B.
We know that A and E have already been collected, because their 
indices/addresses are less than the alloc pointer.
However, C is greater than or equal to the alloc pointer, so we swap it with 
the node in the alloc pointer and advance the alloc pointer:

    A B E C F G D
      ^     ^

There are no other nodes that B is linked to, so we advance the scan pointer:

    A B E C F G D
        ^   ^

E is connected A, B, and F, A and B are below the alloc pointer, but F is at 
the alloc pointer and thus has to be collected as well.
Fortunately for us, F is also *at* the alloc pointer, so we do not need to 
actually perform any swap, just advance the alloc pointer:

    A B E C F G D
        ^     ^

E has no other links, so we advance the scan pointer:

    A B E C F G D
          ^   ^

Continuing the algorithm, we then end up with the result:

    A B E C F D G

Which is the same result we would have gotten if we were using a semispace 
Cheney as well.

-------

Now we might wonder if we can just swap nodes around arbitrarily.
Obviously, if some other thread is accessing the graph, then this is dangerous.
However, if only a single thread or process has access to the graph at any one 
time, we are fine.

As noted, routing queries are read-only and thus cannot possibly violate the 
tri-color invariant, thus we can simply pause the Cheney algorithm after 
advancing the scan pointer in order for the thread to service routing queries, 
gossip updates, or channel closes.
The graph remains functional and correct, though its heuristic might be 
inaccurate (remember, this is a heuristic, you should still use a 
Dijkstra-family algorithm to recover from cases where the heuristic is wrong in 
practice).

Then the only references to the nodes, when transitioning between GC tasks to 
query/mutator tasks, are the table of node IDs to nodes, which we can update as 
well when we swap two nodes.

The gossip handler only needs to handle one case, which is to promote a White 
object to a Gray object if a new channel is created between a Black and a White 
object.
Objects to the left of the scan pointer are Black, Gray is equal or greater to 
the scan pointer but less than the alloc pointer, and White is greater or equal 
to the alloc pointer.
Promoting a White object to Gray is simply done by swapping with the object at 
the alloc pointer, then advancing the alloc pointer, as normal.


With this, we can manage channel objects using a freelist instead of a GC, only 
use GC for the nodes themselves, which is helpful so that channels do not get 
churned around so much.

Further, even once the Cheney algorithm ends (scan == alloc), the nodes beyond 
the scan/alloc pointer are definitely unreachable from this node.
We can mark this point, and routefinding queries for nodes beyond this border 
can just fail immediately with no route.
Any nodes beyond this point might be true garbage, or they may eventually 
reconnect to our island.
This can be differentiated by checking for nodes that have no channels: we move 
nodes with channels to the space where nodes without channels are, and then 
finally discover the point at which we can reset the global allocation pointer 
for new nodes.



Regards,
ZmnSCPxj


_______________________________________________
Lightning-dev mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev

Reply via email to