Good morning Rusty,

> > > > Typical quotes suggested that commodity hardware would take 2 seconds to
> > > > find a route
> > > >  
> > > > Can you provide a reproducible benchmark or further qualify this
> > > > number (2 seconds)?
> >
> > No reproducible benchmark.
>
> OK, on my digital ocean 2-cpu 4GB ram Intel(R) Xeon(R) CPU E5-2630L 0 @
> 2.00GHz (which is pretty old hardware), an unoptimized c-lightning
> implementation returns random routes on mainnet in:
>
> Between 3 to 347 msec, mean 220 msec.
>
> That's forking lightning-cli, querying, printing result and exiting
> (mainnet, 941 successes, 1353 failures, I ignored the times on
> failures since they were usually v fast).
>
> On my Raspberry Pi 2B (compiled with -O3):
>
> Between 21 to 3330 msec, mean 388 msec.

Thank you very much for this testing!

A rule of thumb in UX is "the user remembers your worst-case performance, not 
your average-case performance".
Perhaps a few tries on an RPi to a *really* remote node are what started this 
random, unqualified report of "2 seconds".

My understanding is that Dijkstra, and other non-heuristic-guided algorithms, 
explore a "ball" around the starting node.
Thus, if the other end of the path is far away, the ball to be explored is 
larger and the practical runtime of the algorithm quickly goes up.

A\*, on the other hand, guided by its heuristic, tends to form balls only 
around large obstacles and otherwise explores a much smaller frontier.
This may help reduce the worst-case times.

Currently I am working on an implementation of `getroutequick` for C-Lightning.
Basically, we periodically measure the distance of each node from our node, and 
store this in a per-node cache.
Then, during pathfinding, we run A\* and use the stored distance-to-our-node as 
part of a differential heuristic.
Hopefully, the simple fact that we *have* a heuristic we can feed to A\* will 
help cut down the worst-case runtime of pathfinding.
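
To illustrate the idea (a minimal Python sketch with made-up names, not the 
actual C-Lightning code, and assuming symmetric channel costs for simplicity): 
the cached distance-to-our-node acts as a "landmark", and by the triangle 
inequality |d(n, us) - d(destination, us)| never exceeds d(n, destination), so 
it is a usable h(n):

    import heapq

    def astar_differential(graph, src, dst, dist_to_us):
        # graph: dict node -> list of (neighbor, cost) pairs.
        # dist_to_us: dict node -> periodically-refreshed cost from that
        # node to our own node (the "landmark").  All names hypothetical.
        def h(n):
            # Triangle inequality: |d(n, us) - d(dst, us)| <= d(n, dst),
            # so this never overestimates the remaining cost.
            return abs(dist_to_us[n] - dist_to_us[dst])

        # Queue entries: (f = g + h, g, node, path so far).
        queue = [(h(src), 0, src, [src])]
        done = set()
        while queue:
            f, g, n, path = heapq.heappop(queue)
            if n == dst:
                return g, path
            if n in done:
                continue              # stale duplicate entry; skip it
            done.add(n)
            for m, cost in graph[n]:
                if m not in done:
                    heapq.heappush(queue,
                                   (g + cost + h(m), g + cost, m, path + [m]))
        return None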

Dijkstra, A\*, and another algorithm called "Greedy Best First Search" have 
particular relationships to each other.
Basically, all three algorithms require a priority queue, where nodes are 
sorted in order from least-cost to most-cost.
The source node is put into the priority queue.
The processing loop pops the least-cost node from the priority queue and then 
expands each of the nodes it is connected to, pushing them into the priority 
queue according to their cost (a sketch of this shared loop follows the 
definitions below).
Their difference lies in how the priority used in the priority queue is 
computed:

* Dijkstra: f(n) = g(n)
* A\*: f(n) = g(n) + h(n)
* Greedy Best First: f(n) = h(n)

where:

* g(n) is the total cost from the source to this node
* h(n) is the estimated cost from this node to the destination.
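
As a purely illustrative sketch (Python rather than C), all three algorithms 
can share one loop, with only the priority function f(g, h) differing:

    import heapq

    def best_first_search(graph, src, dst, h, f):
        # graph: dict node -> list of (neighbor, cost) pairs.
        # h: node -> estimated remaining cost to dst.
        # f: priority function over (g, h_value):
        #     Dijkstra:          f = lambda g, hv: g
        #     A*:                f = lambda g, hv: g + hv
        #     Greedy Best First: f = lambda g, hv: hv
        queue = [(f(0, h(src)), 0, src)]
        done = set()
        while queue:
            _, g, n = heapq.heappop(queue)
            if n == dst:
                return g
            if n in done:
                continue              # already expanded via a cheaper path
            done.add(n)
            for m, cost in graph[n]:
                if m not in done:
                    heapq.heappush(queue, (f(g + cost, h(m)), g + cost, m))
        return None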

I have come across very few references to Greedy Best First.
Here is one: 
http://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html#dijkstras-algorithm-and-best-first-search

Greedy Best First scans fewer nodes, but may yield non-optimal paths.
Dijkstra assuredly finds optimal paths, but scans many nodes due to its 
scanning a "ball" around the source.

Of note is that g(n) and h(n) should be "appropriately scaled" to each other 
when used in A\*.
That is, ideally h(n) should use the same units and the same costing estimates 
as the costs of movement between nodes.
If h(n) is scaled down, then A\* behaves closer to Dijkstra (assured optimal 
path, but slow).
If h(n) is scaled up, then A\* behaves closer to Greedy Best First (faster, but 
may yield sub-optimal path).

Indeed, heuristic admissibility is simply the recognition that if we keep h(n) 
from ever exceeding the actual remaining cost to the destination, A\* retains 
Dijkstra's guarantee of finding an optimal path.
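
With the shared loop sketched earlier, this scaling is just a different choice 
of f; for example (toy graph and numbers are made up):

    # Tiny made-up graph, costs in millisatoshi.
    graph = {"A": [("B", 10), ("C", 25)], "B": [("D", 10)],
             "C": [("D", 5)], "D": []}
    h = lambda n: 0 if n == "D" else 5      # crude but admissible estimate

    # Weighted A*: inflating h(n) pushes behaviour toward Greedy Best First
    # (faster, possibly sub-optimal); deflating it pushes toward Dijkstra.
    print(best_first_search(graph, "A", "D", h, f=lambda g, hv: g + 1.5 * hv))
    # prints 20, the cost of the A -> B -> D route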

-----

Priority Queues For Dijkstra and A\*
====================================

Dijkstra (and the related A\* and Greedy Best First) uses a priority queue.
Of note is that Dijkstra tends to require an operation called "decrease 
priority".
This operation is used when a node is reached again via a different path that 
turns out to reduce its f(n).

However, according to this paper: 
http://www3.cs.stonybrook.edu/~rezaul/papers/TR-07-54.pdf

> all Dijkstra-NoDec implementations (i.e., AH-Dij, FBin-Dij, SBin-Dij, Al4-Dij 
> and Seq-Dij) ran at least 1.4 times faster than any Dijkstra-Dec 
> implementation (i.e., BH-Dij, Bin-Dij and Pair-Dij).

Where:

* Dijkstra-NoDec is for implementations whose priority queue did *not* have 
this "decrease priority" operation (i.e. "NoDec")
* Dijkstra-Dec is for implementations whose priority queue *did* have this 
"decrease priority" operation.

That is: having a "decrease priority" operation was a drawback, not a benefit!

Of course, the algorithm has to ensure it does not expand the same node twice.
Often, the "NoDec" variants have to store the visited-ness of a node.

Now the visited-ness of a node can be stored in the structure that also stores 
the f(n) of that node.
f(n) is the priority of the node, and is basically the cost: under Dijkstra, 
the cost of reaching that node; under A\*, the estimated cost of reaching the 
goal node via the path that goes through this node.
And costs in Lightning are measurable in millisatoshis.

The maximum number of satoshis is known to fit in 53 bits.
Millisatoshi amounts are 1000x larger, which requires 10 additional bits.
Thus, 63 bits can fit the largest possible cost (and if a path costs more than 
63 bits can hold, its exact cost is immaterial: nobody can ever afford it, so 
saturating at this maximum value is perfectly valid).
Now, we can add one more bit as a flag meaning "we have pulled this node out of 
the priority queue and expanded its neighbors already", thus fitting into 64 
bits.
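
A hypothetical sketch of such a packing (Python for brevity, though the real 
thing would be a 64-bit field in C):

    COST_BITS = 63
    COST_MAX = (1 << COST_BITS) - 1      # saturate here: nobody can afford more
    EXPANDED_FLAG = 1 << COST_BITS       # "already pulled out and expanded"

    def pack(cost_msat, expanded=False):
        # Any cost too large for 63 bits is unaffordable anyway, so clamp it.
        cost = min(cost_msat, COST_MAX)
        return cost | (EXPANDED_FLAG if expanded else 0)

    def unpack(word):
        return word & COST_MAX, bool(word & EXPANDED_FLAG)

    state = pack(123_456)                # freshly discovered, not yet expanded
    state |= EXPANDED_FLAG               # popped from the queue and expanded
    print(unpack(state))                 # (123456, True)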


_______________________________________________
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
