[Lightning-dev] Handling nonzerobasefee when using Pickhard-Richter algo variants

ZmnSCPxj via Lightning-dev Mon, 30 Aug 2021 03:58:32 -0700

Good morning list,

It seems to me that the cost function defined in Pickhard-Richter
can in fact include a base fee:


    -log(success_probability) + fudging_factor * amount * prop_fee + 
fudging_factor * base_fee

(Rant: why do mathists prefer single-symbol var names, in software
engineering there is a strong injunction **against** single-char
var names, with exceptions for simple numeric iteration loop vars,
I mean keeping track of the meaning of each single-symbol var name
takes up working memory space, which is fairly limited on a wetware
CPU, which is precisely why in software engineering there is an
injunction against single-char var names, srsly.)

It seems to me that the above is "convex", as the `fudging_factor`
for a run will be constant, and the `base_fee` for the channel
would also be constant, and since it seems to me, naively, that
the paper defines "convex" as "the second derivative >= 0 for all
`amount`", and since the derivative of a constant is 0, the above
would still remain convex.
(I am not a mathist and this conjecture might be completely asinine.)

So, it seems to me that the *real* reason for `#zerobasefee` is
not that the **mincostflow** algorithm cannot handle non-zero `base_fee`,
but rather, the **disect** phase afterwards cannot handle non-zero
`base_fee`.

For example, suppose the minflowcost algorithm were instructed to
deliver 3,000msat from `S` to `D`, and returned the following flow:

    S -->3000--> A ->1000-> B
                 |          |
                 |        1000
                 |          v
                 +-->2000-> C ->3000-> D

In the "disect" phase afterwards, the above flow solution would have
to be split into two sub-payments, a 1000 sub-payment `S-A-B-C-D`
and a 2000 sub-payment `S-A-C-D`.

However, this does mean that the base cost for `C->D` and `S->A` in
the above flow will pay *twice* the base cost than what the above
cost function would have computed, because in reality two independent
payments pass through those channels.

Thus, the current Pickhardt-Richter scheme cannot work with non-zero
base fees if it were modified to consider fees, but not due to the
mincostflow algorithm, rather because due to the disect algorithm.

### Converting Non-Zero Base Fees To Proportional Fees

An alternative solution would be to optimize for a *maximum* fee
rather than optimize for the *exact* fee.

We can do this by virtually splitting up the entire payment into
smaller bunches of value, and asking mincostflow to solve individual
bunches rather than individual msats.

For example, we can decide that the smallest practical HTLC would
be 1000 msat.
Let us call this the "payment unit", and the mincostflow algo
solves for paying in these payment units instead of 1msat units.
(Actual implementations would need to have some heuristic or
reasoned rule-of-thumb assumption on what the smallest practical
HTLC would be.)

In our example, suppose we need to send 2001msat from `S` to `D`,
with the payment unit being 1000 msat.
Then we would actually have the mincostflow algorithm work to
deliver 3 payment units (3 * 1000msat) from `S` to `D`.

Let us suppose that the mincostflow algorithm returns the following
flow:

    S -->3-----> A ->1----> B
                 |          |
                 |        1 |
                 |          v
                 +-->2----> C ->3----> D

Actually, what the mincostflow algorithm uses as a cost function
would be:

    -log(success_probability) + fudging_factor * amount * prop_fee * 
payment_unit + fudging_factor * base_fee * amount

    == -log(success_probability) + fudging_factor * amount * (prop_fee * 
payment_unit + base_fee)

    where: amount is in units of 1000msat, our smallest practical HTLC.
           payment_unit is the unit, i.e. 1000msat.

What the above means is that the mincostflow algorithm *allocates*
`3 * base_fee` for `C->D`, since the `amount` flowing through
`C->D` would be 3.
However, when we pass the above flow to the disect algorithm, it
would actually only split this into 2 sub-payments, so the
actual payment plan would only pay `2 * base_fee` for the
`C->D` leg on both sub-payments.

In short, this effectively converts the base fee to a proportional
fee, removing the zerobasfee requirement imposed by the disect
algorithm.

That is, the cost computed by the mincostflow algorithm is really a
maximum cost budget that the subsequent disect algorithm could later
spend.

In effect, this converts the base fee to a proportional fee.

This may be acceptable in practice.
This approximation has a bias against non-zerobasefee --- it would
treat those channels as being far more expensive than they actually
would end up being in an *actual* payment attempt --- but at least
does not *require* zerobasefee.
It would be able to still use non-zerobasefee channels if those are
absolutely required to reach the destination.

This should at least help create a practical payment algorithm that
handles current LN with nonzerobasefee, and provide an impetus
towards making zerobasefee a reality.

Note that this approximation would be non-optimal --- there may
be solutions that provide lower costs than the output of this
approximation would provide.

It may be practical to have the payment unit be adjustable by a
higher-level (but still automated) process.
For example, it may be considered that having more than 500 splits
would be heuristically determined to be impractical, in which
case the payment unit could be the actual amount divided by 500
rounded up.
Then if that results in extreme overestimation of base costs
(i.e. the subsequent disect stage results in far fewer than 500
splits) the payment unit could be adjusted upwards.
Similarly, if that results in a payment plan that has high
base fees, the payment unit could be adjusted downwrds.

--

Similarly, having particular split amounts is helpful to reduce
the ability of intermediate nodes snooping on payment amounts,
so having a particular fixed payment unit may also be useful
nevertheless, even if this approximation is otherwise undesirable.

That is, in the above example, we just send 3 sub-payments
instead of the minimum 2, so that every sub-payment is always
uniformly 1000msat.
This leaks the least amount of information to intermediate nodes,
especially after payment decorrelation is widely deployed.
(this is the same principle for why a uniform unit amount is
needed in practical CoinJoin implementations; it increases the
anonymity set of individual sub-payments.)

This would be helped greatly by payment decorrelation;
intermediate nodes would remain uncertain that multiple
HTLCs with the same amount and the same in-channel and
out-channel are from the same overall payment or not.
Assuming many software uses the same payment unit consistently,
less information can be extracted by surveillors, and the
increased cost may be considered justifiable in paying for
additional privacy.

--

### Whole-flow Payments

Suppose we were to instead modify the LN protocol such that,
instead of MPP having individual sub-payments that split at the
source and join at the destination, we allow the source to
instruct arbitrary intermediate nodes to join and/or split
payments.

That is, suppose LN allowed the source to indicate to arbitrary
forwarding nodes to split payments and join them.
(This would require nearly all nodes to upgrade, which may be
impractical, but let us consider its implications first without
committing to actually doing this.)

A node that is a join node would be instructed to wait for all
incoming HTLCs to arrive before they create any outgoing
HTLCs.
A node that is a split node would make more than one outgoing
HTLCs.
Forwarding nodes can be both join and split nodes.

This would allow the output of the mincostflow algorithm to be
used directly, without the problematic disect stage.
Nodes that have more than one in-flow would be join nodes,
while nodes with more than one out-flow would be split nodes.

Returning to the example:

    S -->3000--> A ->1000-> B
                 |          |
                 |        1000
                 |          v
                 +-->2000-> C ->3000-> D

In the above, `S` sends a *single* payment out on channel
`S->A`, and instructs `A` to split the payment.
Then `S` also instructs `C` to join two payments, resulting
in a single HTLC out on `C->D` if `C` receives on both
`A->C` and `B->C`.

The effect is that, for a single payment plan (presented as a
flow, outputted by a mincostflow algorithm) for each channel
with nonzero flow, there would only be one HTLC that gets paid
for using a single `base_fee`.

Then, the base fee computed by the cost function would be exact
and there would be no underestimation of the base fee for
non-zerobasefee channels.

Let us then consider the practicality of this scheme.

If a hop were to fail to deliver, then some join nodes would
be left hanging, waiting for an incoming HTLC that will not
arrive.

The failure would have to be handled by a split node, when it
receives a `update_fail_htlc`.
That split node cannot propagate the failure backwards to the
source, since it would still have other outgoing HTLCs in
play, which it cannot safely ignore; it can only propagate
the failure backwards if all the other outgoing HTLCs also
fail.

To support this, a split node could give a request-to-fail
signal to its other outgoing HTLCs.
Then, any other join nodes still waiting for an incoming
HTLC (and would not yet have created any outgoing HTLCs,
or for the destination, would not have claimed any incoming
HTLCs) would fail all its incoming HTLCs.
Otherwise, an intermediate node receiving a request-to-fail
signal from any incoming HTLCs would simply propagate
this request-to-fail outwards to all its outgoing HTLCs.

This implies that for this scheme, if *any* hop fails, the
entire payment fails and has to be restarted "from the top".

This is in contrast with the current scheme of splitting
only at the source and joining only at the destination.
In the current LN, if *any* hop of an MPP fails, only the
sub-payment(s) going through that hop will need to be
retried.
Other sub-payment(s) can keep being "in play", and the
source, having all the information needed, only needs to
recompute the failing outgoing payments.

On the other hand, the same information (particular hops are
failing) would be propagated back to the source, and the
source having to recompute the entire flow rather than just
some subset of the payments is not much worse computationally
(work is still dominated by `O((m*m+m*n)*log n)` where `m` is
the channels of the *entire network* and `n` is the nodes
of the *entire network*, amount is just a piddling little
`O(log(u))` term).
It also removes the need to virtually reduce the capacity of
intermediate hops while other sub-payments are in play
(however, a practical implementation would need to consider
the possibility of multiple *simultaneous* outgoing payments
anyway and would still retain this reduction of capacity in
case a new parallel outgoing payment is triggered while one
is still ongoing).

This leads us to some fairly interesting choices:

* [ ] Whole-flow payment is a brilliant idea and we should
  totally do it.  `#deletethezerobasefeedebate`
* [ ] Whole-flow payment is a horrible idea and we should
  totally not do it.
  * [ ] As we can see, the LN community needs to get together
    for better payment success, and supporting whole-flow
    payments is worse than setting `basefee=0`. `#zerobasefee`
  * [ ] As we can see, the current LN provides a vital service
    (partial failure of MPP) and intermediate nodes need to be
    compensated for this service.  `#nonzerobasefee`

Regards,
ZmnSCPxj
_______________________________________________
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev

[Lightning-dev] Handling nonzerobasefee when using Pickhard-Richter algo variants

Reply via email to