# Re: [Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)

```Good morning ZmnSCPxj,

I'll try to clarify my proposal further, but also have some questions about
yours.```
```
---

> Now, it seems to me what you propose, is to have octrees contain octrees, and
> so on.

There's one global tree, which is the same for all users. Every node in the
tree has a bucket and exactly 4 child nodes, except leaves have no children.
The tree has a max depth, which each client sets itself. The theoretical
maximum being HASHLEN/2 (In practice, we're likely to be concerned about <8).
Note that every parent's bucket contains all of the information of all of its
children's buckets - meaning the root node of the global quadtree is equivalent
to the unfiltered global network. Nodes pick a depth and concern themselves
with the bucket at that depth, unless it overflows, in which case they increase
the depth by 1.

> Now let us return, to my proposal of a distance measurement.
> This effectively maps the LN universe onto a circle.
> Each node commits itself to storing an arc of this circle, and possibly
> various other nodes it happens to be directly connected to that may be far
> from the arc near it.

The quadtree can also be looked at from the perspective of a circle, assuming
there is a reference point P on it. Each bucket is represented by an arc of
size 2pi/(4^depth), and the PKH-prefix represents a displacement of the arc
from P (Eg, the bucket 00b would represent an arc from P to pi/2+P). Bucket 00b
includes all gossip information for the buckets 0000b, 0001b, 0010b and 0011b,
which are sub-arcs of it, and so forth. Spilling a bucket is the same as
narrowing the arc to 1/4 its previous size.

Although different from your arbitrary distance k (arc size 2*k?), they're not
too dissimilar in the kind of distance covered - with the primary distinction
being that mine proposes using set interval ranges (based on powers of 4), and
the node picks the range which it's PKH fits into, rather than the range being
centred on the node.

---

> I can overload those N nodes by generating node addresses that also lie in
> that octant.
> By just iterating over scalars, about 1/8 of the generated node addresses
> will lie in the target octant.

If you target the first layer of the tree only, then once nodes spill to the
second layer, they will filter out any gossip which does not concern them. It
isn't sufficient to just brute force the first 2 bits, but you need to do it
for 2^depth, where depth is the target's maximum they're willing to spill to
(which is not shared).

However, commodity hardware can attempt millions of ec_mult/hash per second, so
getting a node into the bucket you want is trivial anyway for small depth.

> In order to disrupt a particular node, I must place fake nodes near that
> node, in an attempt to force its neighbors to reduce the arc of the circle
> that they can map.
> However, I need to generate fake nodes that are nearer to that node than
> genuine honest nodes.
> This means the probability of me generating such node addresses are much
> lower than 1/8, thus requiring that I devote more energy to generating the
> falsified attack nodes.

I would argue that the above is also true in your approach. It would still be
trivial to brute force the prefixes which minimize the distance between
themselves and the target, even with a network of tens of millions of nodes.
The amount of energy which would need devoting does not seem like it would a
deciding factor for this attack.

> Further, in executing this attack, while I disrupt one node very well, and
> nearby nodes somewhat, my effect will be far less than disrupting 1/8 of the
> network.

Since the attack needs to target the maximum depth that nodes might spill to,
then the amount of the network which could be affected by the attack would be
1/4^depth. I can't imagine it being very different in your distance based
approach, since we're considering the same kind of distances, from a different
perspective.

---

> Once I get a node address in the targeted octant, I can commit a nominal
> amount of Bitcoin into some 2-of-2 with an existing actual node I control and
> the generated node address, then synthesize some node gossip containing those

The way I see it, this potential attack affects the global network generally.
Somebody could try this regardless of our proposals, and try to flood the whole
network with spam.

BOLT#7 only states that a node MAY forward information about nodes for which it
does not know any channels. One could reconsider this approach if constrainted
on resources, such as to say if depth>X, MUST NOT forward any information about
nodes with no known channels, or perhaps have a variation of gossip which
combines node information into channel broadcasts to ensure that such node spam
can't occur. I think the vagueness of the spec on this rule is inidicative that

To try and mount an attack with greater chance of success, the attacker should
need to open many new channels, for which they face obvious constraints on the
bitcoin network, and real costs. And since the attack can only attempt to
degrade LN at best, and offers no guarantee of censoring, it seems unlikely
that an attacker would exhaust significant resources to try and target somebody
with a low chance of success.

The attacker would also likely use nominal amounts for the channels they're
trying to attack with, as they would not want to lock up significant funding.
It may be trivial to filter many tiny capacity channels as they're not likely
to be useful for making payments anyway.

---

> This unbalances the octree such that one octant has a far lerger number of
> nodes inside it than the other octants.
> The N nodes that promised to keep track of the routemap within the octant
> find that they need to take up more and more memory to store the octant data
> because of this targeted attack.
> Eventually, the N nodes start dropping out of this responsibility since they
> run out of resources.

Each node would treat the quadtree as a perfectly balanced one at the depth
they're concerned. The buckets themselves vary in capacity, and if the gossip
reaches capacity at any depth, the bucket spills to the next depth. If an
attacker managed to overflow a leaf bucket, then the potential victim would
need to consider another gossip approach.

On determining whether one is being attacked though, it should be possible to
detect based on information size before and after spilling. If one is being
specifically targeted, then the information before and after spilling will be
almost the same, with little saving on gossip, where normally spilling might
expect a 50%+ saving. A client could set a low threshold on the minimum saving
they expect to get from spilling, and if it is not met, they might determine
that they are being targeted, and take countermeasures.

To me it seems like your distance narrowing suffers the same issue. As the size
of your gossip grows, you will narrow the distance k for which you concern
yourself with gossip - but at some point you must have a minimum k, else it
will be too narrow to have any useful information. What happens if you decrease
k to some minimum, and the amount of spam still overflows the capacity limits
you've set?

> Once the number of nodes that hold that octant drops, I can then switch to
> targeted DDoS attacks on the remaining octant-mapping nodes, especially easy
> to locate them since they openly broadcast the fact that they promise to map
> the octant they are in.
> Now the octant becomes hard to access for 7/8 of the network.
> I have successfully executed a partitioning attack and disrupted the
> operation of an entire octant.

There are no "octant-mapping nodes". Every node knows about every channel from
its own bucket to other buckets. This remains true even after spilling -
although the quantity of information is reduced. Performing a DoS on a node who
stores the whole network information does not affect other users, because
they're not exclusive holder of information. Every single participant of the
network could operate at depth=2. There does not need to exist anybody who has
the full network map.

Also, if most nodes follow a reasonable autopilot strategy, then every node in
a bucket also has at least some channels to other buckets - meaning the target
of a DDoS would essentially be the entire bucket to try and completely isolate
it. Remember that I'm suggesting half (or more) of the open channels in a
bucket are to other buckets.

A node doesn't necessarily need to broadcast the depth at which they gossip,
but could indicate a deeper depth, so as to reduce the amount of queries made
to it, and hide the truth about the full information they carry. They shouldn't
indicate a smaller depth than the one at which they gossip as they'll be unable
to answer to a majority of queries.

If a node is being targeted they could specify a large depth at which it would
be infeasible for an attacker to target, whilst secretly maintaining
information at a smaller depth, but by doing so, they're also stating that
they're not going to be widely useful.

I'm also not suggesting this be the only gossip strategy. At minimum I would
also want friend-of-a-friend-of-a-friend type gossip in addition to this,
because those channels are the ones we'll most likely be routing payments over
anyway, so we should have them cached separately from this quadtree proposal.
Several other gossip approaches could be used too.

---

> Crucially no node admits to how large an arc of this circle they map out,
> only that they are more likely to map points nearer to themselves.

> This is because nodes only commit to *probabilistically* being more likely to
> know nodes near to them, than nodes that are not near to them.

I can see how using a probabilistic approach might mitigate the issue to some
extent, but it doesn't seem like a cure. Since the information nearest to you
is more proabable to be forwarded to you, an attacker could still try to
overwhelm your resources by plaing enough nodes nearby. How is the probability
decided? Does your node probabilistically filter incoming information, or do
you signal to your peers to probabilisitcally filter it?

In the latter case, while not necessarily admitting what you store, it seems
like you'll still leak hints about it. What information are you proposing gets
communicated (if any) between peers to filter gossip being sent?

My approach is based on the idea that the broadcaster will always know whether
or not to forward you information based on the filter you give them, with the
PKH. A node will forward you all information about any channel which has one or
both nodes matching the bit mask, and both of the nodes for every channel.
Everything else gets filtered (unless other filters or strategies are agreed).
A node forwarding you gossip you've specifically requested to be filtered could
be considered malicious.

The filter a node communicates with its peer does not necessarily need to
indicate the amount of the network they actually gossip about. For example, a
node wishing to have all information about bucket 00b might request filters for
buckets 0000b, 0001b, 0010b and 0011b from different peers. If each peer
forwards information for their buckets, then the node will learn of all
information in 00b without ever revealing to anybody, at the cost of some
duplication.

---

> They do not commit, as in your proposal, to absolutely knowing everything
> within their committed area.

Ultimately, my proposal is a convention and not a hard rule. Any commitment
comes with limits and it isn't to be *assumed* that a node will know everything
about their local topology, only that they are very likely to know it under
normal circumstances. It doesn't prevent anyone from using other techniques, or
to try routing through different buckets (which one might want to try for
increasing privacy).

Any feature must be considered the same. Nodes could advertize that they
support a feature and then make no commitment to following it.

> Nodes that openly broadcast that they know the nodes within the same octant
> they are, are nodes that want to be DDoS'ed.
> Thus, my definition of a "special" node is a lot looser than you seem to
> define it.
> Anything that makes it possible to point to a node and say "this is a good
> node to attack, in order to disrupt Lightning" is special.

Acknowledged, but I'm not convinced there is a DDoS motive. Disrupting specific
nodes doesn't necessarily disrupt other users. Obviously, it can disrupt the
friend and friend-of-a-friend of the target, but this will always be true. The
gossip network specifically intends to publish friend-of-a-friend information,
so targeted DDoS attacks on individuals and their friends are not necessarily
avoidable.

---

> * No node shall reveal the extent of its knowledge of the network, since if
> it reveals that it knows the entire network, it may become a target for
> attack in order to degrade the network.

My main question about your proposal is, given that bandwidth is a key resource
constraint, what strategy are you using to prevent the sending/receiving of
unwanted gossip information, which also prevents leaking information about
which gossip you're interested in?

>   * This also implies that if a node does not know the location of some node,
> node that might have better information; otherwise it would be possible to
> profile nodes to determine how much of the network they know.

I'm not sure I agree. I'd prefer to go the opposite way and always fail fast.

It could be specified that if you've indicated that you'll answer queries for a
specific bucket, and somebody requests information about a node or channel
outside of the bucket, and which there exists no direct channel from within
your bucket to that node, then you should automatically fail the query, even if
you know the information.

A query will then either return a very likely yes (on information within the
bucket), or a definite no (anything not in the bucket). Nothing more than which
was previously specified is leaked, even if you keep more gossip information
than you have publicized. If probing can only reveal information that is
already public, there is nothing to be gained.

---

PS, don't take any of this to be a dismissal of your proposal because I fully
I think there are potential useful optimizations in my approach which I'm not
sure how equivalent could be acheived with yours.
But if I am convinced there are unresolvable problems with the idea I'll drop
it and focus on your approach.

Regards,
Mark H
_______________________________________________
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
```