Re: [Lightning-dev] Decoy node_ids and short_channel_ids
> (I'm seeking a clever way that Bob can assign them and trivially tell
> which ID is assigned to which peer, but I can't figure it out, so I
> guess Bob keeps a mapping and restricts each peer to 256 live scids?).

Hi Rusty. Here's a potential way for Alice and Bob to agree on a set of 256 scids without any additional messages or changes to existing messages beyond a feature flag and a flag in open_channel, though it comes with a computational cost.

Alice and Bob agree on a random integer `r`. This could be negotiated in `open_channel`, but we shouldn't need to send additional information because we already have a random integer we can use: the `temporary_channel_id`. This is not known to anybody besides Alice and Bob.

When a channel is locked, Bob computes 256 scids, where the nth scid is something approximating `concat(n, trunc_bytes(sha256(ec_mult(2^n*r, Q)), 7))`, and `Q` is Alice's public key for the channel funding transaction. The chance of an scid collision between channels is 1 in 2^56, which is probably no cause for concern.

Instead of keeping a map of 256 scids for each channel, Bob can use a cuckoo filter for efficiency. The filter can be used for a quick membership test and also as an associative map from scids to channels. It can also support scid deletion in the event of channel closure (at the cost of recomputing the 256 ec_mults).

So when Bob receives a new HTLC to forward, he tests it against his cuckoo filter and retrieves a candidate set of possible channels to which it may refer. For each channel, he takes the most significant byte of the scid as `m`, performs `trunc_bytes(sha256(ec_mult(2^m*r, Q)), 7)`, and tests the least-significant 7 bytes of the scid against the result.

Alice does not need to keep all of the scids she may use for invoices because they can be computed on the fly, but she will need to keep a copy of the `temporary_channel_id`. In the reverse direction, with Alice forwarding HTLCs to Bob, Bob's public key for the funding transaction is used instead.
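To illustrate, here is a rough Python sketch of the derivation and of Bob's lookup check. It uses a minimal pure-Python secp256k1 scalar multiplication, and it hashes the x-coordinate of the resulting point; that serialisation is an assumption, since the mail leaves the point encoding unspecified.

```python
import hashlib

# --- minimal pure-Python secp256k1, enough for the sketch below ---
P = 2**256 - 2**32 - 977
N_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def _add(p1, p2):
    # Affine point addition/doubling; None is the point at infinity.
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def ec_mult(k, point):
    # Double-and-add scalar multiplication.
    k %= N_ORDER
    result, addend = None, point
    while k:
        if k & 1:
            result = _add(result, addend)
        addend = _add(addend, addend)
        k >>= 1
    return result

def decoy_scid(n, r, Q):
    # nth scid = n (1 byte) || low 7 bytes of sha256 of ec_mult(2^n * r, Q).
    # Hashing the x-coordinate is an assumed serialisation.
    x, _ = ec_mult((2**n) * r, Q)
    digest = hashlib.sha256(x.to_bytes(32, "big")).digest()
    return (n << 56) | int.from_bytes(digest[:7], "big")

def scid_matches(scid, r, Q):
    # Bob's check on an incoming HTLC: recompute from the MSB m.
    m = scid >> 56
    return decoy_scid(m, r, Q) == scid
```

Alice can regenerate any of the 256 scids on demand from `r` and `Q`; Bob's per-candidate check is one ec_mult plus one hash, which is where the computational cost mentioned above comes in.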
Regards,
Mark Holden
_______________________________________________
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)
Hi ZmnSCPxj,

> Let me clarify: When you say "node" here, do you mean Lightning Network node?
> Or do you mean instead an in-memory node?

Neither. I meant a node in a tree. I tried to use the term bucket to make the distinction between this and a Lightning node. The tree is not strictly in-memory. There is a purely conceptual global quadtree. Each bucket is essentially a view over the union of its 4 children: the bucket 00b is (0000b U 0001b U 0010b U 0011b), the bucket 0011b is (001100b U 001101b U 001110b U 001111b), etc. Each client chooses a suitable max_depth <= HASHLEN/2, and the leaves of their tree contain a set of all PKHs whose prefix matches the bucket index.

A client will keep, at minimum, a subtree of the global quadtree (and both nodes for all channels where at least one of the nodes is in the same subtree). As you state, you can use any data structure for storing this in memory, but there are obvious benefits to using a quadtree index to mirror the conceptual global one, in terms of efficient querying, filtering, spilling, and aggregating branch sizes.

If many clients follow the convention then the optimisation opportunities arise, because the majority of routes will be discoverable with a single (possibly parallel) query to the network. Gossip size can be reduced and unwanted gossip can be eliminated, which alleviates the most constrained resource, bandwidth. If the conventions are widely followed, the benefits are maximized for everybody. Not following convention does not break things, it just limits the potential wins.

---

> As I understand your proposal, taking over or taking down a node near the
> root of the tree will make it difficult to look up several nodes at once.
> This is because the tree nature makes nodes near the root of the tree (top of
> the hierarchy) much more important than leaf nodes, which only worry about
> themselves.
If a node advertises itself near the top of the tree then it will be a better information provider than others, but this is never at the exclusion of others below it. All of the information a node in the parent bucket holds is held in part by the nodes in the child buckets. The parent does not know anything that other people don't know too. No nodes are "more important", but they might potentially be "more optimal".

If you know about some nodes in bucket 0011b, but none in 0001b where a payment destination is, then you could query any node in 0011b and ask about any nodes in bucket 0001b. Since the buckets are nearby, there is a greater probability that they'll know about more nodes than somebody further away. This would be similar to how your proposal operates.

If somebody did advertise being in bucket 00b, then they're able to find a potentially better (shorter) path to the destination, because they know more information and you don't need to find a path through multiple buckets. If they are under DDoS, it doesn't bring down the network or limit access to the child buckets - it just makes it trivially less optimal to route, because it *might* (low probability) require more queries to reach the destination.

When querying, it is expected that if you know about any nodes in the same bucket as the payment destination, then there's a high probability that they will know a route to the destination. A parent bucket of that bucket is not any more likely to know about the destination; they have the same information. I've shown that at small depth, there's a high probability that you will have knowledge about a reasonable quantity of paths to every other bucket in the global network - so the majority of payments will only need to visit one bucket outside your own. The reason to specifically query parent buckets is limited, and not ultimately necessary.
The only way I can see the problem you raise being a concern is if misconfigured clients have an unreasonably large depth for the amount of information in the network, such that there are few, or no, channels from their own bucket to other buckets. In that case, they might become over-reliant on the parent buckets to receive payments, but there are likely to be numerous parents at every depth of the tree (correctly configured clients), meaning there isn't a clear-cut target that an attacker could DDoS to try to disrupt a bucket.

Nodes do not need to present the real depth at which they keep gossip, but there are potentially greater fee-earning opportunities in publicly advertising their maximum information capability. Nodes could counterbalance potential DDoS risk with higher routing fees, where people might pay more to have a greater chance of finding shorter routes in fewer queries, but a frugal client would simply choose the cheapest routes, and the more expensive routes via parent buckets would be less desirable to them - meaning DDoS on nodes in parent buckets may be wasted effort, because it would drive people towards saving money on fees. Also, since
Re: [Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)
Good morning ZmnSCPxj, I'll try to clarify my proposal further, but also have some questions about yours.

---

> Now, it seems to me what you propose, is to have octrees contain octrees, and
> so on.

There's one global tree, which is the same for all users. Every node in the tree has a bucket and exactly 4 child nodes, except leaves, which have no children. The tree has a max depth, which each client sets itself, the theoretical maximum being HASHLEN/2 (in practice, we're likely to be concerned about <8). Note that every parent's bucket contains all of the information of all of its children's buckets - meaning the root node of the global quadtree is equivalent to the unfiltered global network. Nodes pick a depth and concern themselves with the bucket at that depth, unless it overflows, in which case they increase the depth by 1.

> Now let us return, to my proposal of a distance measurement.
> This effectively maps the LN universe onto a circle.
> Each node commits itself to storing an arc of this circle, and possibly
> various other nodes it happens to be directly connected to that may be far
> from the arc near it.

The quadtree can also be looked at from the perspective of a circle, assuming there is a reference point P on it. Each bucket is represented by an arc of size 2pi/(4^depth), and the PKH prefix represents a displacement of the arc from P (e.g., the bucket 00b would represent an arc from P to P+pi/2). Bucket 00b includes all gossip information for the buckets 0000b, 0001b, 0010b and 0011b, which are sub-arcs of it, and so forth. Spilling a bucket is the same as narrowing the arc to 1/4 its previous size.

Although different from your arbitrary distance k (arc size 2*k?), they're not too dissimilar in the kind of distance covered - the primary distinction being that mine proposes using set interval ranges (based on powers of 4), and the node picks the range which its PKH fits into, rather than the range being centred on the node.
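To make the bucket-to-arc correspondence concrete, here is a tiny sketch (a hypothetical helper, not part of the proposal itself) mapping a bucket's 2*depth-bit PKH prefix to its arc on the circle, with the reference point P at angle 0:

```python
import math

def bucket_arc(prefix, depth):
    # A depth-d bucket, identified by its 2*depth-bit PKH prefix (as an
    # integer), covers an arc of size 2*pi/4^depth, displaced from the
    # reference point P (angle 0) by prefix * arc_size.
    arc_size = 2 * math.pi / (4 ** depth)
    start = prefix * arc_size
    return start, start + arc_size
```

For example, bucket 00b at depth 1 is the arc from P to P+pi/2, and its child 0011b at depth 2 is a sub-arc nested inside it.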
---

> I can overload those N nodes by generating node addresses that also lie in
> that octant.
> By just iterating over scalars, about 1/8 of the generated node addresses
> will lie in the target octant.

If you target the first layer of the tree only, then once nodes spill to the second layer, they will filter out any gossip which does not concern them. It isn't sufficient to just brute force the first 2 bits; you need to do it for 2^depth, where depth is the target's maximum they're willing to spill to (which is not shared). However, commodity hardware can attempt millions of ec_mults/hashes per second, so getting a node into the bucket you want is trivial anyway for small depth.

> In order to disrupt a particular node, I must place fake nodes near that
> node, in an attempt to force its neighbors to reduce the arc of the circle
> that they can map.
> However, I need to generate fake nodes that are nearer to that node than
> genuine honest nodes.
> This means the probability of me generating such node addresses are much
> lower than 1/8, thus requiring that I devote more energy to generating the
> falsified attack nodes.

I would argue that the above is also true in your approach. It would still be trivial to brute force the prefixes which minimize the distance between themselves and the target, even with a network of tens of millions of nodes. The amount of energy which would need devoting does not seem like it would be a deciding factor for this attack.

> Further, in executing this attack, while I disrupt one node very well, and
> nearby nodes somewhat, my effect will be far less than disrupting 1/8 of the
> network.

Since the attack needs to target the maximum depth that nodes might spill to, the amount of the network which could be affected by the attack would be 1/4^depth. I can't imagine it being very different in your distance-based approach, since we're considering the same kind of distances, from a different perspective.
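The "trivial for small depth" claim above can be sketched as follows. This is a stand-in illustration: sha256 of a random key substitutes for the real node-id derivation, which would additionally cost one ec_mult per attempt.

```python
import hashlib
import os

def grind_into_bucket(target_prefix, prefix_bits):
    # Brute-force a key whose "PKH" lands in the target bucket, i.e.
    # whose hash begins with the given prefix bits. Expected attempts:
    # 2**prefix_bits. sha256(key) stands in for the real pubkey
    # derivation plus hash.
    attempts = 0
    while True:
        attempts += 1
        key = os.urandom(32)
        pkh = hashlib.sha256(key).digest()
        if int.from_bytes(pkh[:4], "big") >> (32 - prefix_bits) == target_prefix:
            return key, attempts
```

For a depth-d bucket (a 2*d-bit prefix) the expected work is 4^d attempts, so targeting any bucket at depths like 2-4 is a matter of seconds on commodity hardware.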
---

> Once I get a node address in the targeted octant, I can commit a nominal
> amount of Bitcoin into some 2-of-2 with an existing actual node I control and
> the generated node address, then synthesize some node gossip containing those
> node addresses.

The way I see it, this potential attack affects the global network generally. Somebody could try this regardless of our proposals, and try to flood the whole network with spam. BOLT#7 only states that a node MAY forward information about nodes for which it does not know any channels. One could reconsider this approach if constrained on resources, such as to say: if depth>X, MUST NOT forward any information about nodes with no known channels; or perhaps have a variation of gossip which combines node information into channel broadcasts to ensure that such node spam can't occur. I think the vagueness of the spec on this rule is indicative that it needs addressing. To try and mount an attack with greater chance of success, t
Re: [Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)
Good morning ZmnSCPxj, thanks for the response.

> I would be hesitant to divide the world in such manner.
> I understand that in typical computer science, splitting your objects up into
> smaller parts is a long-accepted method of doing things.
> However, when it comes to finances and political power (the power to censor
> or disrupt), such splitting is highly undesirable.
> Thus I am strongly allergic to things that would "hierarchical"ize or
> "bucket"ize or even just have a separation between "endpoint" and "core"
> network.
> I would rather propose for acceptance into BOLT, such proposals that would
> keep the network homogeneous.

Firstly, I completely agree with you that we should not be splitting up the network in any way in which nodes or channels can be politically targeted, or in which any node is designated special privilege or status. I have the same allergy as you and took this into consideration when proposing this, and some of your suggestions are along a similar line of thinking as mine. I think you might have misunderstood what I was proposing, but it's probably my fault for not expressing it well.

With my suggestion, all nodes continue to be equal participants and there are no special nodes. I used the term "endpoint" previously to mean one of the two nodes which a regular channel belongs to, and not to mean some kind of special node. Any node can open channels to any other node - the "buckets" are merely a strategy for locally organizing gossip so that it can be made more efficient. The term "buckets" is used in the descriptions of other DHTs such as Kademlia too, and they don't refer to splitting up the global network; they merely provide a perspective for looking at subsections of it. I'm certainly interested in any solutions which keep the network permissionless, because we really don't want to end up with DNS 2.0 or similar.

> Various nodes may have different resources (BTC, CPU, memory, bandwidth,
> latency).
> Such inequalities are inevitable in this universe.
> But these nodes should still, as much as we can, remain peers on the network.

My suggestion accounts for the difference in computational requirements, as each node can determine its own depth in the tree based on the approximate quantity of information it wishes to gossip. A node could filter information to whichever depth of the tree they wished, by setting the bucket size at which they spill. This also allows for the size of gossip to dynamically shrink as the network grows, and is similar to garbage collection, in which anything which isn't part of the destination bucket on spilling is purged.

Nodes could also pick (multiple) specific quadtree buckets to communicate all gossip about, through filters they negotiate (the filter being the hash prefix of the desired bucket). It might be desirable to broadcast their filter as part of the gossip itself, so that other nodes can learn who are the better information providers, but again this would be completely optional.

---

> This rule can be probabilistically fulfilled by having each node N know *at
> least* routes to some nodes Ls..., where for all L <- Ls, dist(N, L) < some
> constant.
> The constant here can be selected by each node independently, depending on
> its memory and CPU capacity.
> (the node knowing more routes than that is perfectly fine)

The quadtree proposes a similar idea, but instead of using some constant, N knows about routes to Ls, where each L is within the same bucket. Essentially, i < dist(N, L) <= j, where i=BUCKET_MIN and j=BUCKET_MAX. For example, if using 8-bit ids and the first two bits identify the bucket, then i=0000 and j=0011. I guess the equivalent distance using your constant would be k=0010, to cover the same range but without discriminating the results based on any boundaries like my suggestion does.
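The boundary effect being described can be illustrated with 8-bit ids and two hypothetical helpers (these names are mine, not from either proposal):

```python
def same_bucket(a, b, depth):
    # Bucketed rule: 8-bit ids a and b are "near" iff they share the
    # same 2*depth-bit prefix, i.e. fall in the same fixed interval.
    shift = 8 - 2 * depth
    return (a >> shift) == (b >> shift)

def within_k(a, b, k):
    # Constant-distance rule: "near" iff within distance k, centred on
    # the node's own id (plain absolute difference as a stand-in for
    # the ring distance).
    return abs(a - b) <= k
```

Ids 00111111b and 01000000b are adjacent under the constant-distance rule, but sit on opposite sides of a depth-1 bucket boundary, which is exactly the discrimination the bucketed scheme introduces.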
> Then, we can have the following global rule:
> * Given three nodes X, Y, and Z, and dist(X, Z) < dist(Y, Z), then X SHOULD
> be more likely to know a route from itself to Z, than Y to know a route from
> itself to Z.

This rule is also met for nodes which are in different buckets, and is essentially the same as what I was thinking for the distance between buckets. With the autopilot suggestion of decreasing the number of channels opened as distance increases, the probability of knowing about a neighbouring bucket is increased compared with a distant one.

---

> It looks for any node L <- Ls such that dist(L, D) < dist(N, D).
> In fact, it can just sort those nodes according to dist(L, D) and start with
> the lowest distance, and fail once it reaches a dist(L, D) that exceeds its
> own dist(N, D).
> The above algorithm converges since dist(N, D) for each N that is delegated
> to will progressively get smaller until we reach some N that knows the
> destination D.

This proposal also seems very similar to how existing DHTs work, unless I'm mistaken. My problem with the approach of e
Re: [Lightning-dev] Routemap scaling (was: Just in Time Routing (JIT-Routing) and a channel rebalancing heuristic as an add on for improved routing success in BOLT 1.0)
Hi ZmnSCPxj & René. One way you could have both determinism and encourage a diverse distribution of network maps is to treat it as a spatial indexing problem, where the space we use is the lexicographical space of the node ids (or hashes of them), borrowing some similarities from DHTs.

If, for example, we take a quadtree, you can take the 2 most-significant bits of the public key hash, which would put your node into one of 4 buckets. Nodes could advertise a feature bit indicating that they are committed to keeping the entire routemap of the bucket which matches their own public key hash, which would be all of the nodes in the same bucket - and all of the channels with one or both of their endpoints in the bucket. I'd estimate that with a quadtree, information held by each node could be reduced to about 40% of that of the parent node in the quadtree, although the real amounts would depend on such things as autopilot defaults (i.e., how many channels you open to nodes within your bucket versus channels to nodes in other buckets). Nodes could decide their own bucket capacities at which they wish to spill and reduce the amount of gossip by taking the 2 next most significant bits of the PKH, and could go several layers deep.

A node which needs to make a payment to another node within its bucket should be able to do so without querying (unless there are no routes with the required capacity). If making a payment to another bucket, then there would still exist a decent number of channels in the local routemap to nodes in those other buckets, and these nodes could be queried to find the second half of a route to the destination, or could use JIT routing for the second half, assuming the first half of the route can be selected from the local routemap.

In terms of relating this to "locality" in the geographical sense, one could create a convention where each bucket represents an approximate physical location.
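Going back to the bucketing rule above, the gossip filter it implies can be sketched roughly as follows, with sha256 standing in for whatever PKH is agreed on, and the helper names being mine:

```python
import hashlib

def pkh_prefix(node_id, depth):
    # Top 2*depth bits of the hash of the node id.
    h = hashlib.sha256(node_id).digest()
    return int.from_bytes(h[:8], "big") >> (64 - 2 * depth)

def keep_channel(my_id, node1_id, node2_id, depth):
    # Keep a channel announcement iff at least one endpoint's PKH falls
    # in our own bucket at our chosen depth.
    mine = pkh_prefix(my_id, depth)
    return (pkh_prefix(node1_id, depth) == mine
            or pkh_prefix(node2_id, depth) == mine)
```

Spilling a bucket is then just re-running the filter at depth+1 and purging whatever no longer matches.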
The globe can be spatially-indexed as a quadtree by taking a tetrahedral map projection (e.g., the Lee conformal projection[1]). The 4 equilateral triangles of the tetrahedron can be infinitely split again into 4 smaller equal-sized equilateral triangles, for however many layers deep the quadtree might be. With this, it might be possible to have a convention where there is a relation between the lexicographical space and the geographical space, and wallet software would essentially brute force a private key to put you into the bucket corresponding to your physical location (trivial for the small number of bits we're talking about). Routing would be improved for local trade because you would have the entire local topology stored, and would only need to query when making payments at distance. (This may raise some privacy concerns which would need discussing.)

One issue is that it would result in a very unbalanced tree, given that population is dense in some areas and sparse in others. To overcome this, instead of using a conformal or equal-area projection, we might be able to use an equal-population-per-area projection - a kind of projection I had not heard of before, though there is some research into producing them[2]. Nodes would need to agree on the projection in order for this to work, but the work could be done once and the results open sourced and shared between the implementations.

Autopilot implementations might also need adjusting to consider distance too. As a starting point I would suggest a geometric distribution, where half of opened channels should be within the same bucket, a quarter should be to sibling buckets, an eighth to cousin buckets, etc. This would result in increased probability of routing and reduced querying for local payments - paying your local coffee shop should be query-free - and payments to the other side of the world might require increased querying.
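The geometric heuristic for autopilot could be sketched like this (a toy sampler under my own naming, not a worked-out algorithm): each new channel's peer is drawn at tree-distance 0 (own bucket) with probability 1/2, distance 1 (sibling bucket) with 1/4, distance 2 (cousin bucket) with 1/8, and so on.

```python
import random

def channel_distance(depth, rng=random):
    # Pick how far (in quadtree levels) the next channel's peer should
    # be: P(0) = 1/2, P(1) = 1/4, P(2) = 1/8, ..., with the leftover
    # probability mass assigned to the farthest level.
    for d in range(depth):
        if rng.random() < 0.5:
            return d
    return depth
```

An autopilot would then pick a random candidate node whose PKH prefix diverges from its own at the sampled level.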
There are also privacy considerations if nodes indicate their approximate locations, which would need discussing. What do you think?

Also, this method does not need to be the exclusive way in which gossip is communicated between nodes, and one might also combine it with something like ZmnSCPxj has suggested, for gossiping about the highest capacity nodes. It might also be possible to share information about the highest capacity channels in a bucket too.

[1]: https://en.wikipedia.org/wiki/Lee_conformal_world_in_a_tetrahedron
[2]: https://www.pnas.org/content/101/20/7499.full

(PS, sorry for the separate thread, LML will not let me subscribe to the list)