Re: [Lightning-dev] Unjamming lightning (new research paper)

2022-11-10 Thread Clara Shikhelman
Hi all,

We are planning a call to discuss this proposal further. It will be on
Monday the 14th, at 7 pm UTC here:
https://meet.jit.si/UnjammingLN

Please let me know if this conflicts with any other Bitcoin event.

Hope to see you all there!

On Thu, Nov 3, 2022 at 1:25 PM Clara Shikhelman wrote:

> Hi list,
>
> We would like to share with you our recent research on jamming in
> Lightning. We propose a combination of unconditional (~ upfront) fees and
> local reputation to fight jamming. We believe this can be a basis for an
> efficient and practical solution that can be implemented in the foreseeable
> future.
>
> The full paper is available [1].
>
> We classify jams into quick (resolve in seconds, mimicking honest
> payments) and slow (remain in-flight for hours or days). Fees
> disincentivize an attack where quick jams are constantly resolved and sent
> again. Reputation, in turn, allows nodes to deprioritize peers who
> consistently forward slow jams.
>
> We believe that our proposal is practical and efficient. In particular, we
> have shown that the additional (unconditional) fees can be as low as 2% of
> the total fee and still fully compensate jamming victims for the lost
> routing revenue. Moreover, the total unconditional fee paid across all
> failed attempts stays low even if the failure rate is reasonably high, so
> the UX burden of paying for failed attempts is also low. A straightforward
> PoC implementation [2] demonstrates one approach to implementing the
> fee-related aspect of our proposal.
>
> The following sections provide more details on our approach and results.
>
> # Jamming
>
> As a reminder, jamming is a DoS attack where a malicious sender initiates
> payments (jams) but delays finalizing them, blocking channels along the
> route until the jams are resolved. Jamming may target liquidity or payment
> slots.
>
> We distinguish between quick and slow jamming. In quick jamming, jams are
> failed and re-sent every few seconds, making them nearly indistinguishable
> from honest payments that happen to fail. In slow jamming, jams remain
> in-flight for hours.
>
> # Unconditional fees
>
> We propose unconditional fees to discourage quick jamming. Currently, jams
> are free because routing nodes don’t charge for failed payment attempts.
> With unconditional fees, however, jamming is no longer free.
>
> Our simulations indicate that unconditional fees don’t have to be high.
> Under certain assumptions about the honest payment flow, a fee increase of
> just 2% (paid upfront) fully compensates a routing node under attack. Our
> simulator is open-source [3]. A PoC implementation [2] demonstrates one
> approach to implementing unconditional fees and requires only minor
> changes.
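>
> As a rough illustration of the mechanics, here is a minimal sketch in Go
> (the 2% split and all names are illustrative for this email, not the PoC’s
> actual API):
>
>   package unjam
>
>   const upfrontRate = 0.02 // unconditional share of the total fee
>
>   // feeQuote splits a total routing fee (in msat) into an unconditional
>   // part, paid whether or not the HTLC settles, and a success-case part
>   // that is only paid if the payment goes through.
>   func feeQuote(totalFeeMsat uint64) (upfront, onSuccess uint64) {
>       upfront = uint64(float64(totalFeeMsat) * upfrontRate)
>       return upfront, totalFeeMsat - upfront
>   }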
>
> We have also considered the UX implications of paying for failed attempts.
> We have concluded that this should not be a deal-breaker, as the total
> unconditional fee paid stays low even if the failure rate is reasonably
> high (even as high as 50%). Privacy and incentives are also discussed in
> the paper.
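>
> As a back-of-the-envelope check (assuming independent attempts and the 2%
> figure from above): with a per-attempt failure rate q, a sender needs
> 1/(1-q) attempts on average, so its expected total upfront spend is
> 0.02/(1-q) of one payment’s fee. Even at q = 0.5, that comes to only 4% of
> the total fee.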
>
> # Reputation
>
> Fees are not very effective in preventing slow jamming: this type of
> attack requires only a few jams, so fees would have to be prohibitively
> high to serve as a deterrent. Instead, we address slow jamming using local
> reputation.
>
> As per our proposal, nodes keep track of their peers’ past behavior. A
> routing node considers a peer “good” if it only forwards honest payments
> that resolve quickly and bring sufficient fee revenue. A peer that forwards
> jams, in contrast, loses reputation. Payments endorsed by a high-reputation
> peer are forwarded on a best-effort basis, while other (“high-risk”)
> payments may only use a predefined quota of liquidity and slots. Unless the
> attacker has built up a reputation in advance, it cannot fully jam a
> channel that keeps at least some liquidity and slots reserved exclusively
> for low-risk payments. Nodes parameterize their channels according to their
> risk tolerance.
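>
> To make the quota idea concrete, here is a rough sketch of the forwarding
> decision in Go (the names and the slots-only quota are simplifications for
> illustration, not a spec):
>
>   package unjam
>
>   // channelState tracks HTLC slot usage on one outgoing channel.
>   type channelState struct {
>       slotsFree     int // free HTLC slots overall
>       riskQuota     int // max slots that high-risk HTLCs may occupy
>       riskSlotsUsed int // high-risk slots currently in use
>   }
>
>   // admit decides whether to forward an incoming HTLC. Payments endorsed
>   // by a high-reputation peer may use any free slot; all other
>   // ("high-risk") payments are confined to the quota, so some slots
>   // always remain reserved for low-risk traffic.
>   func (c *channelState) admit(endorsed bool) bool {
>       if c.slotsFree == 0 {
>           return false
>       }
>       if !endorsed {
>           if c.riskSlotsUsed >= c.riskQuota {
>               return false // quota exhausted; protect reserved slots
>           }
>           c.riskSlotsUsed++
>       }
>       c.slotsFree--
>       return true
>   }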
>
> # Alternatives and Future Work
>
> In this work, we strive for a systematic approach. First, we list five
> properties a potential mitigation strategy should have: effectiveness,
> incentive compatibility, user experience, privacy and security, and ease of
> implementation. Then, we go over the design decisions to be made when
> constructing a countermeasure against jamming. Based on the desired
> criteria and the available options, we converge on a solution.
>
> Multiple approaches to jamming mitigation have been discussed on this list
> and elsewhere. Many of them may well be worth exploring, such as
> resolution-time-dependent fee amounts or stake certificates for reputation
> building. However, we believe that our solution strikes a good balance: it
> addresses the problem in question and is relatively straightforward to
> implement.
>
> We would love to bring this idea closer to implementation, and we plan to
> discuss it at the next spec meeting [4] (Monday, 2022-11-07).

Re: [Lightning-dev] Fat Errors

2022-11-10 Thread Joost Jager
Pushed a golang implementation of the fat errors here:
https://github.com/lightningnetwork/lightning-onion/pull/60

Joost.

On Wed, Oct 19, 2022 at 1:12 PM Joost Jager wrote:

> Hi list,
>
> I wanted to get back to a long-standing issue in Lightning: gaps in error
> attribution. I've posted about this before back in 2019 [1].
>
> Error attribution is important to properly penalize nodes after a payment
> failure occurs. The goal of the penalty is to give the next attempt a
> better chance at succeeding. In the happy failure flow, the sender is able
> to determine the origin of the failure and penalizes a single node or pair
> of nodes.
>
> Unfortunately, it is possible for nodes on the route to hide themselves. If
> they return random data as the failure message, the sender won't know where
> the failure happened. Some senders then penalize all nodes that were part
> of the route [4][5], which may exclude perfectly reliable nodes from being
> used for future payments. Other senders penalize no nodes at all [6][7],
> which allows the offending node to keep the disruption going.
>
> A special case of this is a final node sending back random data. Senders
> that penalize all nodes will keep looking for alternative routes. But
> because each alternative route still ends with that same final node, the
> sender will ultimately penalize all of that node's peers, and possibly much
> of the rest of the network too.
>
> I can think of various reasons for exploiting this weakness. One is just
> plain grievance for whatever reason. Another one is to attract more traffic
> by getting competing routing nodes penalized. Or the goal could be to
> sufficiently mess up reputation tracking of a specific sender node to make
> it hard for that node to make further payments.
>
> Related to this are delays in the path. A node can delay propagating back a
> failure message, and the sender won't be able to determine which node
> caused the delay.
>
> The link at the top of this post [1] describes a way to address both
> unreadable failure messages and delays by letting each node on the route
> append a timestamp and hmac to the failure message. The great challenge is
> to do this in such a way that nodes don’t learn their position in the path.
>
> I'm revisiting this idea, and have prototyped various ways to implement
> it. In the remainder of this post, I will describe the variant that I
> thought works best (so far).
>
> # Failure message format
>
> The basic idea of the new format is to let each node (not just the error
> source) commit to the failure message as it passes it back, by adding an
> hmac. The sender verifies all hmacs upon receipt of the failure message.
> This makes it impossible for any node to modify the failure message without
> implicating itself as a possible source of the modification. It won’t be
> possible for the sender to pinpoint an exact node, because either end of a
> communication channel may have modified the message. Pinpointing a pair of
> nodes, however, is good enough, and is commonly done for regular onion
> failures too.
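>
> To illustrate the commitment idea before getting into the format, here is
> a conceptual sketch in Go (key derivation and the exact bytes covered by
> each hmac are simplified here; the real layout is described below):
>
>   package faterrors
>
>   import (
>       "crypto/hmac"
>       "crypto/sha256"
>   )
>
>   // addCommitment runs at each hop on the way back: the hop computes an
>   // hmac over the failure packet with the key it shares with the sender,
>   // committing to exactly the bytes it passed along.
>   func addCommitment(sharedKey, packet []byte) []byte {
>       mac := hmac.New(sha256.New, sharedKey)
>       mac.Write(packet)
>       return mac.Sum(nil)
>   }
>
>   // verifyHop runs at the sender: packetAtHop is the sender's
>   // reconstruction of the packet as that hop passed it back (possible
>   // because the per-hop transformations are deterministic). The first hop
>   // whose commitment fails to verify localizes the corruption to that hop
>   // or its predecessor.
>   func verifyHop(sharedKey, packetAtHop, commitment []byte) bool {
>       return hmac.Equal(addCommitment(sharedKey, packetAtHop), commitment)
>   }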
>
> On the highest level, the new failure message consists of three parts:
>
> `message` (var len) | `payloads` (fixed len) | `hmacs` (fixed len)
>
> * `message` is the standard onion failure message as described in [2], but
> without the hmac. The hmac is now part of `hmacs` and doesn't need to be
> repeated.
>
> * `payloads` is a fixed-length array that contains space for each node
> (`hop_payload`) on the route to add data to return to the sender. Ideally
> the contents and size of `hop_payload` are signaled so that future
> extensions don’t require all nodes to upgrade. For now, we’ll assume the
> following 9-byte format:
>
>   `is_final` (1 byte) | `duration` (8 bytes)
>
>   `is_final` indicates whether this node is the failure source. The sender
> uses `is_final` to determine when to stop the decryption/verification
> process.
>
>   `duration` is the time in milliseconds that the node held the htlc. By
> observing the series of reported durations, the sender is able to pinpoint
> a delay down to a pair of nodes: each node's held time includes all
> downstream held time, so durations decrease along the route, and a sharp
> drop between two adjacent nodes localizes an inserted delay to that pair.
>
>   The `hop_payload` is repeated 27 times (the maximum route length).
>
>   Every hop shifts `payloads` 9 bytes to the right and puts its own
> `hop_payload` in the 9 left-most bytes.
>
> * `hmacs` is a fixed-length array where nodes add their hmacs as the
> failure message travels back to the sender.
>
>   To keep things simple, I'll describe the format as if the maximum route
> length were only three hops (instead of 27):
>
>   `hmac_0_2` | `hmac_0_1` | `hmac_0_0` | `hmac_1_1` | `hmac_1_0` | `hmac_2_0`
>
>   In general, this triangular layout holds n(n+1)/2 hmacs for a maximum
> route length of n: 3 + 2 + 1 = 6 here, and 27 * 28 / 2 = 378 for the real
> maximum.
>
>   Because nodes don't know their position in the path, it's unclear to
> them what part of the failure message they are supposed to include in the
> hmac. They can't just include everything, because if part of that data is
> deleted later (to keep the message size fixed) it opens up the possibility
> for nodes to blame others.
>
>   The solution here is to