Using payment correlation attacks, an adversary can try to link the sender and
receiver of a payment by observing traffic between the potential sender and the
potential receiver. Such observations can be made by adversary nodes that sit
on the payment path, or by an adversary able to monitor the network traffic of
the potential sender and receiver. In some circumstances, the adversary can
detect not only its own presence on the payment path, but also whether the
monitored nodes are the actual sender and receiver.
   ___     ____           ____     ___
  |   |   |    |         |    |   |   |
--| S |-->| A1 |--> . -->| A2 |-->| R |--
  |___|   |____|         |____|   |___|
S - potential sender
R - potential receiver
A1, A2 - adversary surveillance nodes
For large, well-distributed networks, these attacks are very costly and can be
applied economically only to a small set of nodes. However, if the network is
centralized, with the majority of traffic passing through a small number of big
nodes, the adversary's job is much easier. The adversary can monitor traffic on
those nodes or, in the case of a state-funded surveillance adversary, obtain a
court order granting complete access to the big nodes' routing information.
The adversary's job is simpler if:
- A1 and A2 are the same node. The sender and receiver are connected through a
single node, the adversary node.
- A2 and R are the same node. The receiver is some form of custodial wallet,
directly controlled by or in collusion with the adversary. The adversary is
aware of all incoming payments. The only thing left is to find out who the
sender is.
- S and A1 are the same node. If the sender is some form of custodial wallet,
directly controlled by or in collusion with the adversary, the sender has no
privacy, so correlation attacks are unnecessary.
- S->A1 is an unpublished channel. An adversary can identify S as the sender
for all payments originating from S and passing through A1.
- A2->R is an unpublished channel. An adversary can identify R as the receiver
for all payments destined for R and passing through A2.
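
The two unpublished-channel cases can be sketched as a simple check against the
announced channel graph. This is only an illustration: the function name, the
node labels, and the representation of the graph as a set of node pairs are
hypothetical, and it assumes that a peer behind an unannounced channel does not
forward payments for anyone else.

```python
def classify_forward(me, in_peer, out_peer, announced):
    """Guess sender/receiver for one forwarded payment seen at node `me`.

    `announced` is a set of frozenset({node_a, node_b}) pairs for every
    publicly announced channel (hypothetical representation).
    Assumption: a peer reachable only over an unannounced channel is an
    endpoint of the payment, not an intermediate hop.
    """
    sender = in_peer if frozenset((in_peer, me)) not in announced else None
    receiver = out_peer if frozenset((me, out_peer)) not in announced else None
    return sender, receiver


# Only A1-B and B-A2 are announced; S-A1 and A2-R are unpublished channels.
announced = {frozenset(("A1", "B")), frozenset(("B", "A2"))}

at_a1 = classify_forward("A1", "S", "B", announced)
at_a2 = classify_forward("A2", "B", "R", announced)
print(at_a1, at_a2)
```
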
The most notable LN payment correlations in order of severity are:
* Hash correlation
* Amount correlation
* CLTV correlation
* Timing correlation
Hash correlation
================
Hash correlation is the most straightforward for surveillance nodes to detect.
If adversary nodes A1 and A2 observe a payment with the same hash, they can
confidently conclude that they are both on the same payment path. On its own,
this does not tell the adversary with enough certainty whether S is the sender
and R the receiver. Combined with other correlation attacks or an examination
of the network topology, however, the adversary can reach that conclusion with
high probability.
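
As a rough sketch of how little work this takes, the following joins the
forwarding logs of two surveillance nodes on the payment hash. The record type
`ObservedHtlc`, its fields, and the sample values are hypothetical and chosen
only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedHtlc:
    """One forwarded HTLC as logged by a surveillance node (hypothetical record)."""
    node: str          # adversary node that saw the HTLC, e.g. "A1"
    payment_hash: str  # hex-encoded payment hash carried by the HTLC
    amount_msat: int   # amount forwarded at this hop


def correlate_by_hash(log_a1, log_a2):
    """Return (A1 observation, A2 observation) pairs sharing a payment hash."""
    by_hash = {}
    for htlc in log_a2:
        by_hash.setdefault(htlc.payment_hash, []).append(htlc)
    return [(a, b) for a in log_a1 for b in by_hash.get(a.payment_hash, [])]


log_a1 = [ObservedHtlc("A1", "ab12", 100_150), ObservedHtlc("A1", "cd34", 50_020)]
log_a2 = [ObservedHtlc("A2", "ab12", 100_050), ObservedHtlc("A2", "ef56", 70_000)]

matches = correlate_by_hash(log_a1, log_a2)
print(matches)  # only the "ab12" payment was seen by both nodes
```

A match only shows that both nodes are on the path of the same payment; it says
nothing yet about which nodes are the endpoints.
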
Fortunately, payment hash correlation is expected to be fixed by point
time-locked contracts (PTLCs)[1]. Each hop of a payment will use a unique lock
point, so no shared value will remain that could link the hops of the same
payment.
Amount correlation
==================
Payment amount correlation is only slightly better than hash correlation in
terms of privacy, because the amount forwarded at each hop includes the fees of
all the downstream nodes and therefore differs from hop to hop. Fees on LN are
just a tiny fraction of the amount, so they are not an obstacle for the
attacker, especially in combination with timing correlation attacks.
Single-path payments are the most vulnerable to amount correlation attacks.
Besides the fact that nodes A1 and A2 will see a payment of roughly the same
amount, node A2 can, depending on the payment amount, conclude that R is the
receiver. For instance, if the receiver is a shop that sells some product for X
satoshis and the attacker sees a payment of around X satoshis, the attacker can
be fairly sure that this payment is going to that shop's node.
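
A minimal sketch of both checks follows: matching the amounts seen at A1 and A2
up to a fee tolerance, and matching an observed amount against a shop's price
list. The function names, the 1% tolerance, and the prices are assumptions made
for illustration only.

```python
def amounts_match(upstream_msat, downstream_msat, max_fee_fraction=0.01):
    """True if the downstream amount is the upstream amount minus plausible fees."""
    fee = upstream_msat - downstream_msat
    return 0 <= fee <= upstream_msat * max_fee_fraction


def guess_product(amount_msat, price_list_msat, max_fee_fraction=0.01):
    """Return products whose price lies within the fee tolerance of the amount."""
    return [
        name
        for name, price in price_list_msat.items()
        if 0 <= amount_msat - price <= price * max_fee_fraction
    ]


# Hypothetical public price list of the shop behind R, in millisatoshis.
prices = {"coffee": 5_000_000, "book": 30_000_000}

same_payment = amounts_match(30_045_000, 30_010_000)  # seen at A1, then at A2
likely_products = guess_product(30_010_000, prices)   # amount seen at A2
print(same_payment, likely_products)
```
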
Multi-path payments have better privacy because the amount is split into
multiple parts, so the attacker cannot easily find out what product the sender
is buying. But there is still a potential correlation factor, depending on how
we split the payment amount.
If we split the payment into equal parts, the attacker can still check whether
a small multiple of a partial payment matches the price of one of the shop's
products. Also, those sub-payment paths will be easily distinguishable by their
amount, just as in the case of a single-path payment.
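
The weakness of equal splits can be sketched as a brute-force search over the
number of parts. The function name, the limit of 16 parts, the 1% fee
tolerance, and the prices are hypothetical values chosen for illustration.

```python
def equal_split_candidates(part_msat, price_list_msat, max_parts=16,
                           max_fee_fraction=0.01):
    """Find (product, n) pairs where n equal parts of this size cover the price."""
    hits = []
    for name, price in price_list_msat.items():
        for n in range(2, max_parts + 1):
            total = part_msat * n
            # The total may exceed the price slightly because of routing fees.
            if 0 <= total - price <= price * max_fee_fraction:
                hits.append((name, n))
    return hits


# Hypothetical public price list of the shop, in millisatoshis.
prices = {"coffee": 5_000_000, "book": 30_000_000}

# The attacker sees a single partial payment of 7,502,000 msat.
hits = equal_split_candidates(7_502_000, prices)
print(hits)
```

Here a single observed part is enough to suggest a four-way split of the
"book" price, even though the attacker never sees the other three parts.
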
So, what can be done to de-correlate sub-path payment amounts?
Rather than splitting the payment amount into equal parts, we split it into
predefined values. For instance: 10k, 20k, 50k, 100k, 200k, 500k, 1000k, ...
satoshis. Just like physical cash. By doing so, every individual payment is
part of a much larger anonymity set consisting of all the payments at that
moment. Using this approach, we can split a payment into as many paths as
needed until we get to the exact number of satoshis. Splitting the payment
amount into enough sub-paths to get