Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Awesome, thanks Alex. Just one follow up.

> Running the numbers, I currently see 15,761 public nodes on the network and 148,295 half channels. Those each need refreshed gossip every two weeks. By default that would result in 90% channel updates.

And the rationale for each channel needing refreshed gossip every 2 weeks is to inform the network that the channel is still active (i.e. not disabled) and that its parameters haven't changed? (I did look it up in BOLT 7 [0] but it wasn't clear to me that a channel would be assumed to be inactive/disabled if there wasn't a channel_update for 2 weeks.)

That seems like a lot of gossip to me if the recommended behavior of routing nodes is to maintain ~100 percent uptime and to change the channel's parameters only when absolutely necessary. I guess the alternative of significantly fewer gossip messages and a potential uptick in failed routes would be worse though.

[0]: https://github.com/lightning/bolts/blob/master/07-routing-gossip.md#rationale-4

--
Michael Folkson
Email: michaelfolkson at [protonmail.com](http://protonmail.com/)
Keybase: michaelfolkson
PGP: 43ED C999 9F85 1D40 EAF4 9835 92D6 0159 214C FEE3

--- Original Message ---
On Wednesday, June 29th, 2022 at 7:07 PM, Alex Myers wrote:

> Hi Michael,
>
> Thanks for the transcript and the questions, especially those you asked in Gleb's original Erlay presentation.
>
> I tried to cover a lot of ground in only 30 minutes and the finer points may have suffered. The most significant difference in concern between bitcoin transaction relay and lightning gossip may be one of privacy: source nodes of Bitcoin transactions have an interest in privacy (to avoid trivially triangulating the source). Lightning gossip is already signed by and linked to a node ID - the source is completely transparent by nature. The lack of a timing concern would allow for a global sketch where it would have been infeasible for Erlay (among other reasons, such as DoS).
> > Why are hash collisions a concern for Lightning gossip and not for Erlay? Is it not a DoS vector for both?
>
> If lightning gossip were encoded for minisketch entries with the short_channel_id, it would create a unique fingerprint by default thanks to referencing the unique funding transaction on chain - no hashing required. This was Rusty's original concept and what I had been proceeding with. However, given the ongoing privacy discussion and the desire to eventually decouple lightning channels from their layer one funding transaction (gossip v2), I think we should prepare for a future in which channels are not explicitly linked to an SCID. That means hashing just as in Erlay, and the same DoS vector would be present. Salting with a per-peer shared secret works here, but the solution is driven back toward inventory sets.
>
> > It seems you are leaning towards per-peer sketches with inventory sets (like Erlay) rather than global sketches.
>
> Yes. There are pros and cons to each method, but most critically, this would be compatible with eventual removal of the SCID.
>
> > Erlay falls back to flooding if the set reconciliation algorithm doesn't work, which I'm assuming you'll do with Lightning gossip.
>
> Fallback will take some consideration (Erlay's bisect is an elegant feature), but yes, flooding is still the ultimate fallback.
>
> > I was also surprised to hear that channel_update made up 97 percent of gossip messages. Isn't it recommended that you don't make too many changes to your channel as it is likely to result in failed routed payments and being dropped as a routing node for future payments? It seems that this advice isn't being followed if there are so many channel_update messages being sent around. I almost wonder if Lightning implementations should include user prompts like "Are you sure you want to update your channel given this may affect your routing success?"
> > :)
>
> Running the numbers, I currently see 15,761 public nodes on the network and 148,295 half channels. Those each need refreshed gossip every two weeks. By default that would result in 90% channel updates. That we're seeing roughly three times as many channel updates vs node announcements compared to what's strictly required is maybe not that surprising. I agree, there would be a benefit to nodes taking a more active role in tracking calls to broadcast gossip.
>
> Thanks,
> Alex
>
> --- Original Message ---
> On Wednesday, June 29th, 2022 at 6:09 AM, Michael Folkson wrote:
>
> > Thanks for this Alex.
> >
> > Here's a transcript of your recent presentation at Bitcoin++ on Minisketch and Lightning gossip:
> >
> > https://btctranscripts.com/bitcoinplusplus/2022/2022-06-07-alex-myers-minisketch-lightning-gossip/
> >
> > Having followed Gleb's work on using Minisketch for Erlay in Bitcoin Core [0] for a while now I was especially interested in how the challenges of using
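As an aside for readers following the arithmetic in this exchange: the "90% channel updates" and "roughly three times" figures follow directly from the node and half-channel counts quoted above. A quick sanity check (counts taken from the thread; everything else is illustrative):

```python
# Each public node needs a node_announcement and each half channel a
# channel_update refreshed every two weeks (figures quoted in the thread).
nodes = 15_761
half_channels = 148_295

baseline_share = half_channels / (nodes + half_channels)
print(f"channel_update share of required gossip: {baseline_share:.1%}")

# The observed share was ~97 percent; relative to node_announcements
# that is roughly three times the strictly required ratio:
observed_ratio = 97 / 3
required_ratio = half_channels / nodes
print(f"observed/required ratio: {observed_ratio / required_ratio:.1f}x")
```

So the network is not wildly out of line with the two-week refresh baseline; the excess is the roughly threefold factor Alex mentions.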
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Hi Michael,

Thanks for the transcript and the questions, especially those you asked in Gleb's original Erlay presentation.

I tried to cover a lot of ground in only 30 minutes and the finer points may have suffered. The most significant difference in concern between bitcoin transaction relay and lightning gossip may be one of privacy: source nodes of Bitcoin transactions have an interest in privacy (to avoid trivially triangulating the source). Lightning gossip is already signed by and linked to a node ID - the source is completely transparent by nature. The lack of a timing concern would allow for a global sketch where it would have been infeasible for Erlay (among other reasons, such as DoS).

> Why are hash collisions a concern for Lightning gossip and not for Erlay? Is it not a DoS vector for both?

If lightning gossip were encoded for minisketch entries with the short_channel_id, it would create a unique fingerprint by default thanks to referencing the unique funding transaction on chain - no hashing required. This was Rusty's original concept and what I had been proceeding with. However, given the ongoing privacy discussion and the desire to eventually decouple lightning channels from their layer one funding transaction (gossip v2), I think we should prepare for a future in which channels are not explicitly linked to an SCID. That means hashing just as in Erlay, and the same DoS vector would be present. Salting with a per-peer shared secret works here, but the solution is driven back toward inventory sets.

> It seems you are leaning towards per-peer sketches with inventory sets (like Erlay) rather than global sketches.

Yes. There are pros and cons to each method, but most critically, this would be compatible with eventual removal of the SCID.

> Erlay falls back to flooding if the set reconciliation algorithm doesn't work, which I'm assuming you'll do with Lightning gossip.
Fallback will take some consideration (Erlay's bisect is an elegant feature), but yes, flooding is still the ultimate fallback.

> I was also surprised to hear that channel_update made up 97 percent of gossip messages. Isn't it recommended that you don't make too many changes to your channel as it is likely to result in failed routed payments and being dropped as a routing node for future payments? It seems that this advice isn't being followed if there are so many channel_update messages being sent around. I almost wonder if Lightning implementations should include user prompts like "Are you sure you want to update your channel given this may affect your routing success?" :)

Running the numbers, I currently see 15,761 public nodes on the network and 148,295 half channels. Those each need refreshed gossip every two weeks. By default that would result in 90% channel updates. That we're seeing roughly three times as many channel updates vs node announcements compared to what's strictly required is maybe not that surprising. I agree, there would be a benefit to nodes taking a more active role in tracking calls to broadcast gossip.

Thanks,
Alex

--- Original Message ---
On Wednesday, June 29th, 2022 at 6:09 AM, Michael Folkson wrote:

> Thanks for this Alex.
>
> Here's a transcript of your recent presentation at Bitcoin++ on Minisketch and Lightning gossip:
>
> https://btctranscripts.com/bitcoinplusplus/2022/2022-06-07-alex-myers-minisketch-lightning-gossip/
>
> Having followed Gleb's work on using Minisketch for Erlay in Bitcoin Core [0] for a while now I was especially interested in how the challenges of using Minisketch for Lightning gossip (node_announcement, channel_announcement, channel_update messages) would differ from the challenges of using Minisketch for transaction relay on the base layer.
> I guess one of the major differences is full nodes are trying to verify a block every 10 minutes (on average) and so there is a sense of urgency to get the transactions of the next block to be mined. With Lightning gossip, unless you are planning to send a payment (or route a payment) across a certain route, you are less concerned about learning about the current state of the network urgently. If a new channel pops up you might choose not to route through it regardless given its "newness" and its lack of track record of successfully routing payments. There are parts of the network you care less about (if they can't help you get to your regular destinations, say) whereas with transaction relay you have to care about all transactions (paying a sufficient fee rate).
>
> "The problem that Bitcoin faced with transaction relay was pretty similar but there are a few differences. For one, any time you introduce that short hash function that produces a 64 bit fingerprint you have to be concerned with collisions between hash functions. Someone could potentially take advantage of that and grind out a hash that would resolve to the same fingerprint."
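Alex's point above about salting the 64-bit fingerprint with a per-peer shared secret can be sketched concretely. The idea is that an attacker who grinds a message colliding in the 64-bit space against one peer gains nothing against any other peer, because each peer maps messages differently. A minimal illustration (Erlay specifies a keyed short hash for this purpose; truncated SHA-256 stands in here, and all names are hypothetical):

```python
import hashlib

def fingerprint64(gossip_msg: bytes, peer_salt: bytes) -> int:
    """Map a gossip message to a salted 64-bit sketch entry.

    Without a salt, an attacker could grind messages that collide in
    the 64-bit space and poison everyone's sketches at once. With a
    per-peer secret salt, colliding preimages differ for every peer,
    so a single grind only ever targets one link.
    """
    digest = hashlib.sha256(peer_salt + gossip_msg).digest()
    return int.from_bytes(digest[:8], "big")

# The same message yields unrelated fingerprints on different links:
msg = b"channel_update ..."
fp_alice = fingerprint64(msg, b"secret-shared-with-alice")
fp_bob = fingerprint64(msg, b"secret-shared-with-bob")
print(fp_alice != fp_bob)
```

The cost, as Alex notes, is that the mapping is no longer global: each peer pair has its own entry space, which pushes the design back toward per-peer inventory sets rather than one shared sketch.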
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Thanks for this Alex.

Here's a transcript of your recent presentation at Bitcoin++ on Minisketch and Lightning gossip:

https://btctranscripts.com/bitcoinplusplus/2022/2022-06-07-alex-myers-minisketch-lightning-gossip/

Having followed Gleb's work on using Minisketch for Erlay in Bitcoin Core [0] for a while now I was especially interested in how the challenges of using Minisketch for Lightning gossip (node_announcement, channel_announcement, channel_update messages) would differ from the challenges of using Minisketch for transaction relay on the base layer.

I guess one of the major differences is full nodes are trying to verify a block every 10 minutes (on average) and so there is a sense of urgency to get the transactions of the next block to be mined. With Lightning gossip, unless you are planning to send a payment (or route a payment) across a certain route, you are less concerned about learning about the current state of the network urgently. If a new channel pops up you might choose not to route through it regardless given its "newness" and its lack of track record of successfully routing payments. There are parts of the network you care less about (if they can't help you get to your regular destinations, say) whereas with transaction relay you have to care about all transactions (paying a sufficient fee rate).

"The problem that Bitcoin faced with transaction relay was pretty similar but there are a few differences. For one, any time you introduce that short hash function that produces a 64 bit fingerprint you have to be concerned with collisions between hash functions. Someone could potentially take advantage of that and grind out a hash that would resolve to the same fingerprint."

Could you elaborate on this? Why are hash collisions a concern for Lightning gossip and not for Erlay? Is it not a DoS vector for both?

It seems you are leaning towards per-peer sketches with inventory sets (like Erlay) rather than global sketches.
This makes sense to me and seems to be moving in a direction where your peer connections are more stable as you are storing data on what your peer's understanding of the network is. There could even be centralized APIs which allow you to compare your current understanding of the network to the centralized service's understanding. (Of course we don't want to have to rely on centralized services or bake them into the protocol if you don't want to use them.)

Erlay falls back to flooding if the set reconciliation algorithm doesn't work, which I'm assuming you'll do with Lightning gossip.

I was also surprised to hear that channel_update made up 97 percent of gossip messages. Isn't it recommended that you don't make too many changes to your channel as it is likely to result in failed routed payments and being dropped as a routing node for future payments? It seems that this advice isn't being followed if there are so many channel_update messages being sent around. I almost wonder if Lightning implementations should include user prompts like "Are you sure you want to update your channel given this may affect your routing success?" :)

Thanks
Michael

P.S. Are we referring to "routing nodes" as "forwarding nodes" now? I've noticed "forwarding nodes" being used more recently on this list.

[0]: https://github.com/bitcoin/bitcoin/pull/21515

--
Michael Folkson
Email: michaelfolkson at [protonmail.com](http://protonmail.com/)
Keybase: michaelfolkson
PGP: 43ED C999 9F85 1D40 EAF4 9835 92D6 0159 214C FEE3

--- Original Message ---
On Thursday, April 14th, 2022 at 22:00, Alex Myers wrote:

> Hello lightning developers,
>
> I’ve been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation. This builds on some earlier work from Rusty using the minisketch library [1]. The idea is that each node will build a sketch representing its own gossip set.
> Alice’s node will encode and transmit this sketch to Bob’s node, where it will be merged with his own sketch, and the differences produced. These differences should ideally be exactly the latest missing gossip of both nodes. Due to size constraints, the set differences will necessarily be encoded, but Bob’s node will be able to identify which gossip Alice is missing, and may then transmit exactly those messages.
>
> This process is relatively straightforward, with the caveat that the sets must otherwise match very closely (each sketch has a maximum capacity for differences). The difficulty here is that each node and lightning implementation may have its own rules for gossip acceptance and propagation. Depending on their gossip partners, not all gossip may propagate to the entire network.
>
> Core-lightning implements rate limiting for incoming channel updates and node announcements. The default rate limit is 1 per day, with a burst of 4. I analyzed my node’s gossip over a 14 day period, and found that, of all
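The reconciliation step Alex describes can be modelled with plain Python sets. This is only a conceptual stand-in: a real minisketch is a compact BCH-based encoding whose size is proportional to its difference capacity, not to the set size, but what decoding recovers is exactly the symmetric difference shown here.

```python
# 64-bit gossip fingerprints held by each node (values are made up).
alice_gossip = {0xAAA1, 0xAAA2, 0xAAA3, 0xBBB7}
bob_gossip = {0xAAA1, 0xAAA2, 0xAAA3, 0xCCC9}

# Merging Alice's sketch with Bob's and decoding yields the symmetric
# difference - the gossip that exactly one of them is missing:
diff = alice_gossip ^ bob_gossip

# Bob can now tell which fingerprints to request from Alice and which
# full gossip messages to send her:
bob_requests = diff - bob_gossip   # entries only Alice has
bob_sends = diff & bob_gossip      # entries only Bob has
```

The caveat in the thread falls directly out of this model: decoding only succeeds while `len(diff)` stays within the sketch's capacity, which is why divergent rate-limiting policies (inflating the difference set) are the central problem being discussed.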
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 5/26/22 8:59 PM, Alex Myers wrote:

>> Ah, this is an additional proposal on top, and requires a gossip "hard fork", which means your new protocol would only work for taproot channels, and any old/unupgraded channels will have to be propagated via the old mechanism. I'd kinda prefer to be able to rip out the old gossip sync code sooner than a few years from now :(.
>
> I viewed it as a soft fork, where if you want to use set reconciliation, anything added to the set would be subject to a constricted ruleset - in this case the gossip must be accompanied by a blockheight tlv (or otherwise reference a blockheight) and it must not replace a message in the current 100 block range. It doesn't necessarily have to reference blockheight, but that would simplify many edge cases. The key is merely that a node is responsible for limiting its own gossip to a predefined interval, and it must be easily verifiable for any other nodes building and reconciling sketches. Given that we have access to a timechain, this just made the most sense.

Ah, good point, you can just add it as a TLV. It still implies that "old-gossip" can't go away for a long time - ~everyone has to upgrade, so we'll have two parallel systems. Worse, people are relying on the old behavior and some nodes may avoid upgrading to avoid the new rate-limits :(.

>>> If some nodes have 60 and others have 600099 (because you broke the ratelimiting recommendation, and propagated both approx the same time), then the network will split, sure.
>>
>> Right, so what do you do in that case, though? AFAIU, in your proposed sync mechanism if a node does this once, you're stuck with all of your gossip reconciliations with every peer "wasting" one difference "slot" for a day or however long it takes before the peer does a sane update. In my proposed alternative it only appears once and then you move on (or maybe once more on startup, but we can maybe be willing to take on some extra cost there?).
>
> This case may not be all that difficult.
> Easiest answer is you offer a spam proof to your peer. Send both messages, signed by the offending node as proof they violated the set reconciliation rate limit, then remove both from your sketch. You may want to keep the evidence in your data store, at least until it's superseded by the next valid update, but there's no reason it must occupy a slot of the sketch. Meanwhile, feel free to use the message as you wish, just keep both out of the sketch. It's not perfect, but the sketch capacity is not compromised and the second incidence of spam should not propagate at all. (It may be possible to keep one, but this is the simplest answer.)

Right, well if we're gonna start adding "spam-proofs" we shouldn't start talking about complexity of tracking the changed-set :p. Worse, unlike tracking the changed-set as proposed, this protocol is a ton of unused code to handle an edge case we should only rarely hit... in other words code that will almost certainly be buggy, untested, and fail if people start hitting it. In general, I'm not a huge fan of protocols with any more usually-unused code than is strictly necessary.

This also doesn't capture things like channel_update extensions - BOLTs today say a recipient "MAY choose NOT to for messages longer than the minimum expected length" - so now we'd need to remove that (and I guess have a fixed "maximum length" for channel updates that everyone agrees to... basically we have to have exact consensus on valid channel updates across nodes).

>> Heh, I'm surprised you'd complain about this - IIUC your existing gossip storage system keeps this as a side-effect so it'd be a single integer for y'all :p. In any case, if it makes the protocol a chunk more efficient I don't see why it's a big deal to keep track of the set of which invoices have changed recently, you could even make it super efficient by just saying "anything more recent than timestamp X except a few exceptions that we got with some lag against the update timestamp".
> The benefit of a single global sketch is less overhead in adding additional gossip peers, though looking at the numbers, sketch decoding time seems to be the more significant driving factor than rebuilding sketches (when they're incremental). I also like maximizing the utility of the sketch by adding the full gossip store if possible.

Note that the alternative here does not prevent you from having a single global sketch. You can keep a rolling global sketch that you send to all your peers at once, it would just be a bit of a bandwidth burst when they all request a few channel updates/announcements from you.

More generally, I'm somewhat surprised to hear a performance concern here - I can't imagine we'd be including any more entries in such a sketch than Bitcoin Core does transactions to relay to peers, and this is exactly the design direction they went in (because of basically the same concerns).

> I still think getting the rate-limit responsibility to the originating node would be a win in either case.
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
>> The update contains a block number. Let's say we allow an update every 100 blocks. This must be <= current block height (and presumably, newer than height - 2016).
>>
>> If you send an update number 60, and then 600100, it will propagate. 600099 will not.
>
> Ah, this is an additional proposal on top, and requires a gossip "hard fork", which means your new protocol would only work for taproot channels, and any old/unupgraded channels will have to be propagated via the old mechanism. I'd kinda prefer to be able to rip out the old gossip sync code sooner than a few years from now :(.

I viewed it as a soft fork, where if you want to use set reconciliation, anything added to the set would be subject to a constricted ruleset - in this case the gossip must be accompanied by a blockheight tlv (or otherwise reference a blockheight) and it must not replace a message in the current 100 block range. It doesn't necessarily have to reference blockheight, but that would simplify many edge cases. The key is merely that a node is responsible for limiting its own gossip to a predefined interval, and it must be easily verifiable for any other nodes building and reconciling sketches. Given that we have access to a timechain, this just made the most sense.

>> If some nodes have 60 and others have 600099 (because you broke the ratelimiting recommendation, and propagated both approx the same time), then the network will split, sure.
>
> Right, so what do you do in that case, though? AFAIU, in your proposed sync mechanism if a node does this once, you're stuck with all of your gossip reconciliations with every peer "wasting" one difference "slot" for a day or however long it takes before the peer does a sane update. In my proposed alternative it only appears once and then you move on (or maybe once more on startup, but we can maybe be willing to take on some extra cost there?).

This case may not be all that difficult.
Easiest answer is you offer a spam proof to your peer. Send both messages, signed by the offending node as proof they violated the set reconciliation rate limit, then remove both from your sketch. You may want to keep the evidence in your data store, at least until it's superseded by the next valid update, but there's no reason it must occupy a slot of the sketch. Meanwhile, feel free to use the message as you wish, just keep both out of the sketch. It's not perfect, but the sketch capacity is not compromised and the second incidence of spam should not propagate at all. (It may be possible to keep one, but this is the simplest answer.)

> Heh, I'm surprised you'd complain about this - IIUC your existing gossip storage system keeps this as a side-effect so it'd be a single integer for y'all :p. In any case, if it makes the protocol a chunk more efficient I don't see why it's a big deal to keep track of the set of which invoices have changed recently, you could even make it super efficient by just saying "anything more recent than timestamp X except a few exceptions that we got with some lag against the update timestamp".

The benefit of a single global sketch is less overhead in adding additional gossip peers, though looking at the numbers, sketch decoding time seems to be the more significant driving factor than rebuilding sketches (when they're incremental). I also like maximizing the utility of the sketch by adding the full gossip store if possible.

I still think getting the rate-limit responsibility to the originating node would be a win in either case. It will chew into sketch capacity regardless.

-Alex

--- Original Message ---
On Thursday, May 26th, 2022 at 5:19 PM, Matt Corallo wrote:

> On 5/26/22 1:25 PM, Rusty Russell wrote:
>
>> Matt Corallo lf-li...@mattcorallo.com writes:
>>
>>> I agree there should be some rough consensus, but rate-limits are a locally-enforced thing, not a global one.
>>> There will always be races and updates you reject that your peers dont, no matter the rate-limit, and while I agree we should have guidelines, we can't "just make them the same" - it both doesn't solve the problem and means we can't change them in the future.
>>
>> Sure it does! It severely limits the set divergence to race conditions (down to block height divergence, in practice).
>
> Huh? There's always some line you draw, if an update happens right on the line (which they almost certainly often will because people want to update, and they'll update every X hours to whatever the rate limit is), then ~half the network will accept the update and half won't. I don't see how you solve this problem.
>
>> The update contains a block number. Let's say we allow an update every 100 blocks. This must be <= current block height (and presumably, newer than height - 2016).
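Alex's "spam proof" suggestion above is mostly a set-membership rule: given two updates from the same node that violate the agreed interval, a peer drops both from its sketch so neither consumes a difference slot. A minimal sketch of that bookkeeping (all names are hypothetical, and the signature check that would actually bind the proof to the offending node is omitted):

```python
from dataclasses import dataclass

RATE_LIMIT_BLOCKS = 100  # one update allowed per 100-block window (per the thread)

@dataclass(frozen=True)
class ChannelUpdate:
    node_id: str      # stand-in for the signing node's pubkey
    height: int       # blockheight referenced by the update's TLV
    fingerprint: int  # 64-bit sketch entry for this update

def is_spam_proof(a: ChannelUpdate, b: ChannelUpdate) -> bool:
    """Two distinct updates from one node inside the same interval are
    themselves the proof of a rate-limit violation (in a real protocol
    both would carry the offender's signature)."""
    return (a.node_id == b.node_id
            and a.fingerprint != b.fingerprint
            and abs(a.height - b.height) < RATE_LIMIT_BLOCKS)

def reconcile_set(sketch_entries: set, a: ChannelUpdate, b: ChannelUpdate) -> set:
    """Per Alex's suggestion: on proof of violation, keep both messages
    out of the sketch so neither occupies a difference slot."""
    if is_spam_proof(a, b):
        return sketch_entries - {a.fingerprint, b.fingerprint}
    return sketch_entries

# Two updates 99 blocks apart from the same node violate the limit:
u1 = ChannelUpdate("node_a", 600_000, 0x1111)
u2 = ChannelUpdate("node_a", 600_099, 0x2222)
sketch = reconcile_set({0x1111, 0x2222, 0x3333}, u1, u2)
```

Matt's objection in this exchange is orthogonal to the mechanics: however simple the rule, it is rarely-exercised protocol code, and rarely-exercised code tends to be the buggy code.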
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 5/26/22 1:25 PM, Rusty Russell wrote:

>> I agree there should be *some* rough consensus, but rate-limits are a locally-enforced thing, not a global one. There will always be races and updates you reject that your peers dont, no matter the rate-limit, and while I agree we should have guidelines, we can't "just make them the same" - it both doesn't solve the problem and means we can't change them in the future.
>
> Sure it does! It severely limits the set divergence to race conditions (down to block height divergence, in practice).

Huh? There's always some line you draw, if an update happens right on the line (which they almost certainly often will because people want to update, and they'll update every X hours to whatever the rate limit is), then ~half the network will accept the update and half won't. I don't see how you solve this problem.

> The update contains a block number. Let's say we allow an update every 100 blocks. This must be <= current block height (and presumably, newer than height - 2016).
>
> If you send an update number 60, and then 600100, it will propagate. 600099 will not.

Ah, this is an additional proposal on top, and requires a gossip "hard fork", which means your new protocol would only work for taproot channels, and any old/unupgraded channels will have to be propagated via the old mechanism. I'd kinda prefer to be able to rip out the old gossip sync code sooner than a few years from now :(.

> If some nodes have 60 and others have 600099 (because you broke the ratelimiting recommendation, *and* propagated both approx the same time), then the network will split, sure.

Right, so what do you do in that case, though? AFAIU, in your proposed sync mechanism if a node does this once, you're stuck with all of your gossip reconciliations with every peer "wasting" one difference "slot" for a day or however long it takes before the peer does a sane update.
In my proposed alternative it only appears once and then you move on (or maybe once more on startup, but we can maybe be willing to take on some extra cost there?).

> Maybe. What's a "non-update" based sketch? Some huge percentage of gossip is channel_update, so it's kind of the thing we want?

Oops, maybe we're on *very* different pages, here - I mean doing sketches based on "the things that I received since the last sync, ie all the gossip updates from the last hour" vs doing sketches based on "the things I have, ie my full gossip store".

> But that requires state. Full store requires none, keeping it super-simple

Heh, I'm surprised you'd complain about this - IIUC your existing gossip storage system keeps this as a side-effect so it'd be a single integer for y'all :p. In any case, if it makes the protocol a chunk more efficient I don't see why it's a big deal to keep track of the set of which invoices have changed recently, you could even make it super efficient by just saying "anything more recent than timestamp X *except* a few exceptions that we got with some lag against the update timestamp".

Better, the state is global, not per-peer, and a small fraction of the total state of the gossip store anyway, so it's not like it's introducing some new giant or non-constant-factor blowup.

Matt

___
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
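Matt's "anything more recent than timestamp X, plus a few exceptions" bookkeeping is small enough to sketch directly. Assuming (hypothetically) a store of `(fingerprint, timestamp)` pairs, the changed-set is one integer (the window start) plus an exception set for updates that arrived with lag:

```python
class ChangedSet:
    """Sketch of Matt's 'recent changes' state: reconcile 'anything
    newer than timestamp X' plus an explicit exception list for updates
    received after their stated timestamp window had already passed.
    (Class and method names are illustrative, not from any codebase.)"""

    def __init__(self, horizon: int):
        self.horizon = horizon      # timestamp X: start of the current sync window
        self.late_arrivals = set()  # fingerprints that arrived with lag

    def record(self, fingerprint: int, msg_timestamp: int) -> None:
        if msg_timestamp < self.horizon:
            self.late_arrivals.add(fingerprint)  # too old for the window; track explicitly

    def sketch_entries(self, store) -> set:
        """store: iterable of (fingerprint, msg_timestamp) for all gossip held."""
        return {fp for fp, ts in store if ts >= self.horizon} | self.late_arrivals

# One in-window update, one stale one, and one late arrival:
cs = ChangedSet(horizon=1_000)
cs.record(7, 900)  # received now, but timestamped before the window
store = [(1, 1_500), (2, 800), (7, 900)]
entries = cs.sketch_entries(store)
```

This is the per-node "state" Rusty objects to; the counterpoint in the thread is that it is global (not per-peer) and tiny compared to the gossip store itself.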
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Matt Corallo writes:

>>> I agree there should be *some* rough consensus, but rate-limits are a locally-enforced thing, not a global one. There will always be races and updates you reject that your peers dont, no matter the rate-limit, and while I agree we should have guidelines, we can't "just make them the same" - it both doesn't solve the problem and means we can't change them in the future.
>>
>> Sure it does! It severely limits the set divergence to race conditions (down to block height divergence, in practice).
>
> Huh? There's always some line you draw, if an update happens right on the line (which they almost certainly often will because people want to update, and they'll update every X hours to whatever the rate limit is), then ~half the network will accept the update and half won't. I don't see how you solve this problem.

The update contains a block number. Let's say we allow an update every 100 blocks. This must be <= current block height (and presumably, newer than height - 2016).

If you send an update number 60, and then 600100, it will propagate. 600099 will not.

If some nodes have 60 and others have 600099 (because you broke the ratelimiting recommendation, *and* propagated both approx the same time), then the network will split, sure. We could be fascist and penalize nodes which do this, but that's overkill unless it actually happens a lot.

Nodes which want to keep a potential update "up their sleeve" will backdate updates by 101 blocks (everyone should do this, in fact).

As I said, this has a problem with block height differences, but that's explicitly included in the messages so you can ignore and wait if you want. Again, may not be a problem in practice.

>> Maybe. What's a "non-update" based sketch? Some huge percentage of gossip is channel_update, so it's kind of the thing we want?
> Oops, maybe we're on *very* different pages, here - I mean doing sketches based on "the things that I received since the last sync, ie all the gossip updates from the last hour" vs doing sketches based on "the things I have, ie my full gossip store".

But that requires state. Full store requires none, keeping it super-simple.

Though Alex has an idea for an "include even the expired entries" then "regenerate every N blocks" approach which avoids the problem that each change is two deltas (one remove, one add), at cost of some complexity.

Cheers,
Rusty.
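Rusty's 100-block rule can be written out as a local acceptance check, which is what makes it "easily verifiable for any other nodes building and reconciling sketches". The constants come from the thread; the function itself is a hypothetical sketch, not any implementation's actual logic:

```python
UPDATE_INTERVAL = 100  # one channel_update allowed per 100 blocks
MAX_AGE = 2016         # presumably: updates older than ~2 weeks of blocks are stale

def accept_update(update_height, prev_height, chain_tip):
    """Accept a channel_update into the reconciliation set only if its
    referenced blockheight is not in the future, not stale, and does not
    replace an accepted update within the current 100-block interval."""
    if update_height > chain_tip:
        return False  # can't reference a future block
    if update_height <= chain_tip - MAX_AGE:
        return False  # too old to be useful
    if prev_height is not None and update_height < prev_height + UPDATE_INTERVAL:
        return False  # violates the per-100-block limit
    return True

# Rusty's example: after an update at height 60, an update at 600100 is
# accepted; having accepted 600100, one at 600099 is not.
tip = 600_100
print(accept_update(600_100, 60, tip))       # accepted
print(accept_update(600_099, 600_100, tip))  # rejected
```

The split Rusty concedes also falls out of this check: a node that happened to accept 600099 first would symmetrically reject 600100, so two near-simultaneous updates partition the network until the next valid interval.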
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Oops, sorry, I don't really monitor the dev lists but once every few months so this fell off my plate :/

On 4/28/22 6:11 PM, Rusty Russell wrote:

> OK, let's step back. Unlike Bitcoin, we can use a single sketch for *all* peers. This is because we *can* encode enough information that you can get useful info from the 64 bit id, and because it's expensive to create them so you can't spam.

Yep, makes sense.

> The more boutique per-peer handling we need, the further it gets from this ideal.
>
>> The second potential thing I think you might have meant here I don't see as an issue at all? You can simply... let the sketch include one channel update that you ignored? See above discussion of similar rate-limits.
>
> No, you need to get all the ignored ones somehow? There's so much cruft in the sketch you can't decode it. Now you need to remember the ones you ratelimited, and try to match others' ratelimiting.

Right, you'd end up downloading the thing you rate-limited, but only once (possibly per-peer). If you use the total-sync approach you'd download it on every sync, vs an "only updates" approach you'd do it once.

>> I agree there should be *some* rough consensus, but rate-limits are a locally-enforced thing, not a global one. There will always be races and updates you reject that your peers dont, no matter the rate-limit, and while I agree we should have guidelines, we can't "just make them the same" - it both doesn't solve the problem and means we can't change them in the future.
>
> Sure it does! It severely limits the set divergence to race conditions (down to block height divergence, in practice).

Huh? There's always some line you draw, if an update happens right on the line (which they almost certainly often will because people want to update, and they'll update every X hours to whatever the rate limit is), then ~half the network will accept the update and half won't. I don't see how you solve this problem.

> Maybe. What's a "non-update" based sketch?
Some huge percentage of gossip is channel_update, so it's kind of the thing we want? Oops, maybe we're on *very* different pages, here - I mean doing sketches based on "the things that I received since the last sync, ie all the gossip updates from the last hour" vs doing sketches based on "the things I have, ie my full gossip store". Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
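The two sketch bases Matt distinguishes can be made concrete with a toy model (names and structure below are illustrative, not from any implementation): one sketch is built over "the things I received since the last sync", the other over "the things I have".

```python
# Toy model of the two possible reconciliation sets (illustrative only;
# names and structure are not taken from any implementation).
class GossipStore:
    def __init__(self):
        self.updates = []  # list of (received_at, update_id) pairs

    def add(self, update_id, received_at):
        self.updates.append((received_at, update_id))

    def full_store_set(self):
        # "the things I have": build the sketch over the whole gossip store
        return {uid for _, uid in self.updates}

    def since_last_sync_set(self, last_sync):
        # "the things I received since the last sync": only recent updates
        return {uid for t, uid in self.updates if t > last_sync}

store = GossipStore()
store.add("chan1/update", received_at=100)
store.add("chan2/update", received_at=150)
assert store.full_store_set() == {"chan1/update", "chan2/update"}
assert store.since_last_sync_set(last_sync=120) == {"chan2/update"}
```

An update a peer rejected under its rate limit lands in every full-store reconciliation until it ages out, but in at most one since-last-sync round, which is the robustness trade-off being debated above.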
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Matt Corallo writes: > On 4/26/22 11:53 PM, Rusty Russell wrote: >> Matt Corallo writes: This same problem will occur if *anyone* does ratelimiting, unless *everyone* does. And with minisketch, there's a good reason to do so. >>> >>> None of this seems like a good argument for *not* taking the "send updates >>> since the last sync in >>> the minisketch" approach to reduce the damage inconsistent policies >>> cause, though? >> >> You can't do this, with minisketch. You end up having to keep all the >> ratelimited differences you're ignoring *per peer*, and then cancelling >> them out of the minisketch on every receive or send. > > Hmm? I'm a bit confused, let me attempt to restate to make sure we're on the > same page. What I > *think* you said here is: "If you have a node which is rejecting a large > percentage of *channel* > updates (on a per-channel, not per-update basis), and it tries to sync, > you'll end up having to keep > some huge set of 'I don't want any more updates for that channel' on a > per-peer basis"? Or maybe you > might have said "When you rate-limit, you have to tell your peer that you > rate-limited a channel > update and that it shouldn't add that update to its next sketch"? OK, let's step back. Unlike Bitcoin, we can use a single sketch for *all* peers. This is because we *can* encode enough information that you can get useful info from the 64 bit id, and because it's expensive to create them so you can't spam. The more boutique per-peer handling we need, the further it gets from this ideal. > The second potential thing I think you might have meant here I don't see as > an issue at all? You can > simply...let the sketch include one channel update that you ignored? See > above discussion of similar > rate-limits. No, you need to get all the ignored ones somehow? There's so much cruft in the sketch you can't decode it. Now you need to remember the ones you ratelimited, and try to match other's ratelimiting.
>> So you end up doing what LND and core-lightning do, which is "pick 3 >> peers to gossip with" and tell everyone else to shut up. >> >> Yet the point of minisketch is robustness; you can (at cost of 1 message >> per minute) keep in sync with an arbitrary number of peers. >> >> So, we might as well define a preferred ratelimit, so nodes know that >> spamming past a certain point is not going to propagate. At the moment, >> LND has no effective ratelimit at all, so it's a race to the bottom. > > I agree there should be *some* rough consensus, but rate-limits are a > locally-enforced thing, not a > global one. There will always be races and updates you reject that your peers > don't, no matter the > rate-limit, and while I agree we should have guidelines, we can't "just make > them the same" - it > both doesn't solve the problem and means we can't change them in the future. Sure it does! It severely limits the set divergence to race conditions (down to block height divergence, in practice). > Ultimately, an updates-based sync is more robust in such a case - if there's > some race and your peer > accepts something you don't it may mean one more entry in the sketch one > time, but it won't hang > around forever. > >> We need that limit eventually, this just makes it more of a priority. >> >>> I'm not really >>> sure in a world where you do "update-based-sketch" gossip sync you're any >>> worse off than today even >>> with different rate-limit policies, though I obviously agree there are >>> substantial issues with the >>> massively inconsistent rate-limit policies we see today. >> >> You can't really do it, since rate-limited junk overwhelms the sketch >> really fast :( > > How is this any better in a non-update-based-sketch? The only way to address > it is to have a bigger > sketch, which you can do no matter the thing you're building the sketch over.
> > Maybe let's schedule a call to get on the same page, throwing text at each > other will likely not move > very quickly. Maybe. What's a "non-update" based sketch? Some huge percentage of gossip is channel_update, so it's kind of the thing we want? Cheers, Rusty. ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
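Rusty's point above that the 64-bit set element can itself "encode enough information that you can get useful info" might look something like this. The bit layout is a hypothetical illustration, not any implementation's actual encoding: a short_channel_id is block height (24 bits), transaction index (24 bits), and output index (8 bits), leaving room for a direction bit and a coarse update counter.

```python
def sketch_id(block, txindex, output, direction, bucket):
    """Pack a channel update into a 64-bit reconciliation element.
    Hypothetical layout: block:24 | txindex:24 | output:8 | direction:1 | bucket:7
    """
    assert block < (1 << 24) and txindex < (1 << 24) and output < (1 << 8)
    assert direction < 2 and bucket < (1 << 7)
    return (block << 40) | (txindex << 16) | (output << 8) | (direction << 7) | bucket

def unpack(sid):
    # Recover the fields, so a decoded set difference is directly meaningful.
    return (sid >> 40, (sid >> 16) & 0xFFFFFF, (sid >> 8) & 0xFF,
            (sid >> 7) & 1, sid & 0x7F)

sid = sketch_id(700_000, 1234, 0, 1, 42)
assert sid < (1 << 64)
assert unpack(sid) == (700_000, 1234, 0, 1, 42)
```

Because each element identifies a concrete channel and direction, a node can act on a decoded difference (e.g. request the missing update) without any per-peer bookkeeping.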
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 4/26/22 11:53 PM, Rusty Russell wrote: Matt Corallo writes: This same problem will occur if *anyone* does ratelimiting, unless *everyone* does. And with minisketch, there's a good reason to do so. None of this seems like a good argument for *not* taking the "send updates since the last sync in the minisketch" approach to reduce the damage inconsistent policies cause, though? You can't do this, with minisketch. You end up having to keep all the ratelimited differences you're ignoring *per peer*, and then cancelling them out of the minisketch on every receive or send. Hmm? I'm a bit confused, let me attempt to restate to make sure we're on the same page. What I *think* you said here is: "If you have a node which is rejecting a large percentage of *channel* updates (on a per-channel, not per-update basis), and it tries to sync, you'll end up having to keep some huge set of 'I don't want any more updates for that channel' on a per-peer basis"? Or maybe you might have said "When you rate-limit, you have to tell your peer that you rate-limited a channel update and that it shouldn't add that update to its next sketch"? Either way, I don't think it's all that interesting an issue. The first case is definitely an issue, but is an issue in both a new-data-only sketch and all-data sketch world, and is not completely solved with identical rate-limits in any case. It can be largely addressed by sane software defaults and roughly-similar rate-limits, though, and because it's a per-channel, not per-update issue I'm much less concerned. The second potential thing I think you might have meant here I don't see as an issue at all? You can simply...let the sketch include one channel update that you ignored? See above discussion of similar rate-limits. So you end up doing what LND and core-lightning do, which is "pick 3 peers to gossip with" and tell everyone else to shut up.
Yet the point of minisketch is robustness; you can (at cost of 1 message per minute) keep in sync with an arbitrary number of peers. So, we might as well define a preferred ratelimit, so nodes know that spamming past a certain point is not going to propagate. At the moment, LND has no effective ratelimit at all, so it's a race to the bottom. I agree there should be *some* rough consensus, but rate-limits are a locally-enforced thing, not a global one. There will always be races and updates you reject that your peers don't, no matter the rate-limit, and while I agree we should have guidelines, we can't "just make them the same" - it both doesn't solve the problem and means we can't change them in the future. Ultimately, an updates-based sync is more robust in such a case - if there's some race and your peer accepts something you don't it may mean one more entry in the sketch one time, but it won't hang around forever. We need that limit eventually, this just makes it more of a priority. I'm not really sure in a world where you do "update-based-sketch" gossip sync you're any worse off than today even with different rate-limit policies, though I obviously agree there are substantial issues with the massively inconsistent rate-limit policies we see today. You can't really do it, since rate-limited junk overwhelms the sketch really fast :( How is this any better in a non-update-based-sketch? The only way to address it is to have a bigger sketch, which you can do no matter the thing you're building the sketch over. Maybe let's schedule a call to get on the same page, throwing text at each other will likely not move very quickly. Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Matt Corallo writes: >> This same problem will occur if *anyone* does ratelimiting, unless >> *everyone* does. And with minisketch, there's a good reason to do so. > > None of this seems like a good argument for *not* taking the "send updates > since the last sync in > the minisketch" approach to reduce the damage inconsistent policies > cause, though? You can't do this, with minisketch. You end up having to keep all the ratelimited differences you're ignoring *per peer*, and then cancelling them out of the minisketch on every receive or send. So you end up doing what LND and core-lightning do, which is "pick 3 peers to gossip with" and tell everyone else to shut up. Yet the point of minisketch is robustness; you can (at cost of 1 message per minute) keep in sync with an arbitrary number of peers. So, we might as well define a preferred ratelimit, so nodes know that spamming past a certain point is not going to propagate. At the moment, LND has no effective ratelimit at all, so it's a race to the bottom. We need that limit eventually, this just makes it more of a priority. > I'm not really > sure in a world where you do "update-based-sketch" gossip sync you're any > worse off than today even > with different rate-limit policies, though I obviously agree there are > substantial issues with the > massively inconsistent rate-limit policies we see today. You can't really do it, since rate-limited junk overwhelms the sketch really fast :( Note, we *can* actually change the ratelimit in future, either by running two sketches (feature bit!), or by changing the rate slowly enough that they can handle the small differences. Cheers, Rusty. ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
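Rusty's "rate-limited junk overwhelms the sketch" failure mode is easy to model. A minisketch-style sketch only decodes while the symmetric difference between the two sets fits within its capacity; the capacity number and the sets below are made up purely for illustration.

```python
# Toy model of sketch capacity (not minisketch itself): decoding succeeds
# only while the symmetric difference between the two sets fits the sketch.
SKETCH_CAPACITY = 64  # assumed capacity, chosen for illustration

def can_reconcile(a, b):
    return len(a ^ b) <= SKETCH_CAPACITY

# Both nodes receive the same 1000 updates (10 per channel for 100 channels)...
updates = {f"chan{c}/update{u}" for c in range(100) for u in range(10)}
# ...but one node rate-limits to 1 update per channel while the other keeps all.
strict = {u for u in updates if u.endswith("update0")}
lenient = updates

# The 900 rate-limited updates dominate the difference and decoding fails.
assert len(strict ^ lenient) == 900
assert not can_reconcile(strict, lenient)
```

With matched rate limits both sides would keep the same subset, the difference would shrink to genuine races, and the same small sketch would keep decoding.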
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 4/22/22 6:40 PM, Rusty Russell wrote: Matt Corallo writes: Allowing only 1 a day, ended up with 18% of channels hitting the spam limit. We cannot fit that many channel differences inside a set! Perhaps Alex should post his more detailed results, but it's pretty clear that we can't stay in sync with this many differences :( Right, the fact that most nodes don't do any limiting at all and y'all have a *very* aggressive (by comparison) limit is going to be an issue in any context. I'm unable to find the post years ago where I proposed this limit and nobody had major objections. I just volunteered to go first :) I'm not trying to argue the number is good or bad, only that being several orders of magnitude away from everything else is going to lead to rejections. We could set some guidelines and improve things, but luckily regular-update-sync bypasses some of these issues anyway - if we sync once per block and your limit is once per block, getting 1000 updates per block for some channel doesn't result in multiple failures in the sync. Sure, multiple peers sending different updates for that channel can still cause some failures, but its still much better. Nodes will want to aggressively spam as much as they can, so I think we need a widely-agreed limit. I don't really care what it is, but somewhere between per 1 and 1000 blocks makes sense? I don't really disagree, but my point is that we should strive for the sync system to not need to care about this number as much as possible. Because views of the rate limits are a local view, not a global view, you'll always end up with things on the edge getting rejected during sync, and, worse, when we eventually want to change the limit, we'd be hosed. But we might end up with a gossip2 if we want to enable taproot, and use blockheight as timestamps, in which case we could probably just support that one operation (and maybe a direct query op). 
Like eclair, we don’t bother to rate limit and don’t see any issues with it, though we will skip relaying outbound updates if we’re saturating outbound connections. Yeah, we did as a trial, and in some cases it's become limiting. In particular, people restarting their LND nodes once a day resulting in 2 updates per day (which, in 0.11.0, we now allow). What do you mean "its become limiting"? As in you hit some reasonably-low CPU/disk/bandwidth limit in doing this? We have a pretty aggressive bandwidth limit for this kinda stuff (well, indirect bandwidth limit) and it very rarely hits in my experience (unless the peer is very overloaded and not responding to pings, which is a somewhat separate thing...) By rejecting more than 1 per day, some LND nodes had 50% of their channels left disabled :( This same problem will occur if *anyone* does ratelimiting, unless *everyone* does. And with minisketch, there's a good reason to do so. None of this seems like a good argument for *not* taking the "send updates since the last sync in the minisketch" approach to reduce the damage inconsistent policies cause, though? I'm not really sure in a world where you do "update-based-sketch" gossip sync you're any worse off than today even with different rate-limit policies, though I obviously agree there are substantial issues with the massively inconsistent rate-limit policies we see today. Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Matt Corallo writes: >> Allowing only 1 a day, ended up with 18% of channels hitting the spam >> limit. We cannot fit that many channel differences inside a set! >> >> Perhaps Alex should post his more detailed results, but it's pretty >> clear that we can't stay in sync with this many differences :( > > Right, the fact that most nodes don't do any limiting at all and y'all have a > *very* aggressive (by > comparison) limit is going to be an issue in any context. I'm unable to find the post years ago where I proposed this limit and nobody had major objections. I just volunteered to go first :) > We could set some guidelines and improve > things, but luckily regular-update-sync bypasses some of these issues anyway > - if we sync once per > block and your limit is once per block, getting 1000 updates per block for > some channel doesn't > result in multiple failures in the sync. Sure, multiple peers sending > different updates for that > channel can still cause some failures, but its still much better. Nodes will want to aggressively spam as much as they can, so I think we need a widely-agreed limit. I don't really care what it is, but somewhere between per 1 and 1000 blocks makes sense? Normally I'd suggest a burst, but that's bad for consensus: better to say "just create your update N-6 blocks behind so you can always create a new one 6 blocks behind". >>> gossip queries is broken in at least five ways. >> >> Naah, it's perfect if you simply want to ask "give me updates since XXX" >> to get you close enough on reconnect to start using set reconciliation. >> This might allow us to remove some of the other features? > > Sure, but that's *just* the "gossip_timestamp_filter" message, there's > several other messages and a > whole query system that we can throw away if we just want that message :) I agree. 
Removing features would be nice :) >> But we might end up with a gossip2 if we want to enable taproot, and use >> blockheight as timestamps, in which case we could probably just support >> that one operation (and maybe a direct query op). >> >>> Like eclair, we don’t bother to rate limit and don’t see any issues with >>> it, though we will skip relaying outbound updates if we’re saturating >>> outbound connections. >> >> Yeah, we did as a trial, and in some cases it's become limiting. In >> particular, people restarting their LND nodes once a day resulting in 2 >> updates per day (which, in 0.11.0, we now allow). > > What do you mean "its become limiting"? As in you hit some reasonably-low > CPU/disk/bandwidth limit > in doing this? We have a pretty aggressive bandwidth limit for this kinda > stuff (well, indirect > bandwidth limit) and it very rarely hits in my experience (unless the peer is > very overloaded and > not responding to pings, which is a somewhat separate thing...) By rejecting more than 1 per day, some LND nodes had 50% of their channels left disabled :( This same problem will occur if *anyone* does ratelimiting, unless *everyone* does. And with minisketch, there's a good reason to do so. Cheers, Rusty. ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
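Rusty's suggestion above - one update per N blocks, with the update dated "N-6 blocks behind" so a replacement is always possible - could be sketched like this. N and the 6-block slack are placeholders; the thread only bounds N "between 1 and 1000 blocks".

```python
RATE_LIMIT_N = 100   # placeholder; the thread suggests somewhere in 1..1000 blocks
TIP_SLACK = 6        # originator stamps updates ~6 blocks behind the tip

def stamp_height(tip):
    # Originator: date the update slightly in the past so that a fresh
    # replacement "6 blocks behind" never appears future-dated to peers.
    return tip - TIP_SLACK

def accept_update(last_height, update_height, tip):
    if update_height > tip:          # future-dated: reject outright
        return False
    # Accept only if at least RATE_LIMIT_N blocks passed since the last one.
    return update_height >= last_height + RATE_LIMIT_N

assert accept_update(last_height=1000, update_height=1100, tip=1106)
assert not accept_update(last_height=1000, update_height=1050, tip=1106)  # too soon
assert not accept_update(last_height=1000, update_height=1200, tip=1106)  # future
```

Because the check depends only on block heights, all nodes applying it converge on the same accept/reject decision up to block-height divergence, which is the consensus property Rusty argues for.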
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 4/21/22 7:20 PM, Rusty Russell wrote: Matt Corallo writes: Sure, if you’re rejecting a large % of channel updates in total you’re gonna end up hitting degenerate cases, but we can consider tuning the sync frequency if that becomes an issue. Let's be clear: it's a problem. Allowing only 1 a day, ended up with 18% of channels hitting the spam limit. We cannot fit that many channel differences inside a set! Perhaps Alex should post his more detailed results, but it's pretty clear that we can't stay in sync with this many differences :( Right, the fact that most nodes don't do any limiting at all and y'all have a *very* aggressive (by comparison) limit is going to be an issue in any context. We could set some guidelines and improve things, but luckily regular-update-sync bypasses some of these issues anyway - if we sync once per block and your limit is once per block, getting 1000 updates per block for some channel doesn't result in multiple failures in the sync. Sure, multiple peers sending different updates for that channel can still cause some failures, but its still much better. gossip queries is broken in at least five ways. Naah, it's perfect if you simply want to ask "give me updates since XXX" to get you close enough on reconnect to start using set reconciliation. This might allow us to remove some of the other features? Sure, but that's *just* the "gossip_timestamp_filter" message, there's several other messages and a whole query system that we can throw away if we just want that message :) But we might end up with a gossip2 if we want to enable taproot, and use blockheight as timestamps, in which case we could probably just support that one operation (and maybe a direct query op). Like eclair, we don’t bother to rate limit and don’t see any issues with it, though we will skip relaying outbound updates if we’re saturating outbound connections. Yeah, we did as a trial, and in some cases it's become limiting. 
In particular, people restarting their LND nodes once a day resulting in 2 updates per day (which, in 0.11.0, we now allow). What do you mean "its become limiting"? As in you hit some reasonably-low CPU/disk/bandwidth limit in doing this? We have a pretty aggressive bandwidth limit for this kinda stuff (well, indirect bandwidth limit) and it very rarely hits in my experience (unless the peer is very overloaded and not responding to pings, which is a somewhat separate thing...) Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 4/22/22 9:15 AM, Alex Myers wrote: Hi Matt, Appreciate your responses. Hope you'll bear with me as I'm a bit new to this. Instead of trying to make sure everyone’s gossip acceptance matches exactly, which as you point it seems like a quagmire, why not (a) do a sync on startup and (b) do syncs of the *new* things. I'm not opposed to this technique, and maybe it ends up as a better solution. The rationale for not going full Erlay approach was that it's far less overhead to maintain a single sketch than to maintain a per-peer sketch and associated state for every gossip peer. In this way there's very little cost to adding additional gossip peers, which further encourages propagation and convergence of the gossip network. I'm not sure what you mean by per-node state here - I'd think you can implement it with a simple "list of updates that happened since time X" data, instead of having to maintain per-peer state. IIUC Erlay's design was concerned for privacy of originating nodes. Lightning gossip is public by nature, so I'm not sure we should constrain ourselves to the same design route without trying the alternative first. Part of the design of Erlay, especially the insight of syncing updates instead of full mempools, was actually this precise issue - Bitcoin Core nodes differ in policy for a number of reasons (especially across updates), and thus syncing the full mempool will result in degenerate cases of trying over and over and over again to sync stuff your peer is rejecting. At least if I recall correctly. if we're gonna add a minisketch-based sync anyway, please lets also use it for initial sync after restart This was out of the scope of what I had in mind, but I will give this some thought. I could see how a block_height reference coupled with set reconciliation could provide some better options here. This may not be all that difficult to shoe-horn in. 
Regardless of single sketch or per-peer set reconciliation, it should be easier to implement with tighter rules on rate-limiting. (Keep in mind, the node's graph can presumably be updated independently of the gossip it rebroadcasts if desired.) As a thought experiment, if we consider a CLN-LDK set reconciliation, and that each node is gossiping with 5 other peers in an evenly spaced frequency, we would currently see 42.8 commonly accepted channel_updates over an average 60s window along with 11 more updates which LDK accepts and CLN rejects (spam.)[1] Assuming the other 5 peers have shared 5/6ths of this gossip before the CLN/LDK set reconciliation, we're left with CLN seeing 7 updates to reconcile, while LDK sees 18. Already we've lost 60% efficiency due to lack of a common rate-limit heuristic. I do not believe that we will ever form a strong agreement on exactly what the rate-limits should be. And even if we do, we still have the issue of upgrades, where a simple change to the rate-limits causes sync to suddenly blow up and hit degenerate cases all over the place. Unless we can make the sync system relatively robust against slightly different policies, I think we're kinda screwed. Worse, what happens if someone sends updates at exactly the limit of the rate-limiters? Presumably people will do this because "that's what the limit is and I want to send updates as often as I can because...". Now you'll still have similar issues, I believe. I understand gossip traffic is manageable now, but I'm not sure it will be that long before it becomes an issue. Furthermore, any particular set reconciliation technique would benefit from a simple common rate-limit heuristic, not to mention originating nodes, who may not currently realize their channel updates are being rejected by a portion of the network due to differing criteria across implementations.
Yes, I agree there is definitely a concern with differing criteria resulting in nodes not realizing their gossip is not propagating. I agree guidelines would be nice, but guidelines doesn't solve the issue for sync, sadly, I think. Luckily lightning does provide a mechanism to bypass the rejection - send an update back with an HTLC failure. If you're trying to route an HTLC and a node has new parameters for you, it'll helpfully let you know when you try to use the old parameters. Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
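The bypass Matt describes - learning a node's current parameters from the channel_update attached to an HTLC failure even if the gossiped copy was rate-limited - could be sketched as follows. The message shape and field names here are illustrative only and do not match the actual BOLT failure-message format.

```python
# Illustrative only: field names do not match the BOLT 4 failure format.
class Graph:
    def __init__(self):
        self.updates = {}  # (scid, direction) -> latest accepted channel_update

    def apply_update(self, upd, rate_limited=False):
        key = (upd["scid"], upd["direction"])
        prev = self.updates.get(key)
        if prev and upd["timestamp"] <= prev["timestamp"]:
            return False                 # stale
        if rate_limited:
            return False                 # dropped from the gossip path...
        self.updates[key] = upd
        return True

    def on_htlc_failure(self, failure):
        # ...but an update carried in an HTLC failure is applied regardless,
        # since the forwarding node is telling us its current parameters.
        upd = failure.get("channel_update")
        if upd:
            self.apply_update(upd)

g = Graph()
g.apply_update({"scid": "700000x1x0", "direction": 0, "timestamp": 1, "fee": 10})
g.apply_update({"scid": "700000x1x0", "direction": 0, "timestamp": 2, "fee": 50},
               rate_limited=True)  # gossip copy rejected as spam
g.on_htlc_failure({"channel_update":
                   {"scid": "700000x1x0", "direction": 0, "timestamp": 2, "fee": 50}})
assert g.updates[("700000x1x0", 0)]["fee"] == 50
```

So even when gossip propagation silently drops an update, a sender routing through the channel eventually recovers the fresh parameters at the cost of one failed payment attempt.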
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Hi Matt, Appreciate your responses. Hope you'll bear with me as I'm a bit new to this. > Instead of trying to make sure everyone’s gossip acceptance matches exactly, > which as you point it seems like a quagmire, why not (a) do a sync on startup > and (b) do syncs of the *new* things. I'm not opposed to this technique, and maybe it ends up as a better solution. The rationale for not going full Erlay approach was that it's far less overhead to maintain a single sketch than to maintain a per-peer sketch and associated state for every gossip peer. In this way there's very little cost to adding additional gossip peers, which further encourages propagation and convergence of the gossip network. IIUC Erlay's design was concerned for privacy of originating nodes. Lightning gossip is public by nature, so I'm not sure we should constrain ourselves to the same design route without trying the alternative first. > if we're gonna add a minisketch-based sync anyway, please lets also use it > for initial sync after restart This was out of the scope of what I had in mind, but I will give this some thought. I could see how a block_height reference coupled with set reconciliation could provide some better options here. This may not be all that difficult to shoe-horn in. Regardless of single sketch or per-peer set reconciliation, it should be easier to implement with tighter rules on rate-limiting. (Keep in mind, the node's graph can presumably be updated independently of the gossip it rebroadcasts if desired.) 
As a thought experiment, if we consider a CLN-LDK set reconciliation, and that each node is gossiping with 5 other peers in an evenly spaced frequency, we would currently see 42.8 commonly accepted channel_updates over an average 60s window along with 11 more updates which LDK accepts and CLN rejects (spam.)[1] Assuming the other 5 peers have shared 5/6ths of this gossip before the CLN/LDK set reconciliation, we're left with CLN seeing 7 updates to reconcile, while LDK sees 18. Already we've lost 60% efficiency due to lack of a common rate-limit heuristic. I understand gossip traffic is manageable now, but I'm not sure it will be that long before it becomes an issue. Furthermore, any particular set reconciliation technique would benefit from a simple common rate-limit heuristic, not to mention originating nodes, who may not currently realize their channel updates are being rejected by a portion of the network due to differing criteria across implementations. Thanks, Alex [1] https://github.com/endothermicdev/lnspammityspam/blob/main/sampleoutput.txt --- Original Message --- On Thursday, April 21st, 2022 at 3:47 PM, Matt Corallo lf-li...@mattcorallo.com wrote: > On 4/21/22 1:31 PM, Alex Myers wrote: > >> Hello Bastien, >> >> Thank you for your feedback. I hope you don't mind I let it percolate for a >> while. >> >> Eclair doesn't do any rate-limiting. We wanted to "feel the pain" before >> adding >> anything, and to be honest we haven't really felt it yet. >> >> I understand the “feel the pain first” approach, but attempting set >> reconciliation has forced me to >> confront the issue a bit early. >> >> My thoughts on sync were that set-reconciliation would only be used once a >> node had fully synced >> gossip through traditional means (initial_routing_sync / gossip_queries.) >> There should be many >> levers to pull in order to help maintain sync after this. 
I'm going to have >> to experiment with them >> a bit before I can claim they are sufficient, but I'm optimistic. > > Please, no. initial_routing_sync was removed from most implementations (it > sucks) and gossip queries > is broken in at least five ways. Maybe we can recover it by adding yet more > extensions but if we're > gonna add a minisketch-based sync anyway, please let's also use it for initial > sync after restart > (unless you have no channels at all, in which case let's maybe revive > initial_routing_sync...) > > Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
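Alex's thought-experiment numbers in the message above can be reproduced with a little arithmetic, assuming the 11 "spam" updates never reach CLN via its other peers (my reading of the setup, not stated explicitly in the email):

```python
common_per_min = 42.8   # channel_updates per 60s window both implementations accept
spam_per_min = 11       # updates LDK accepts but CLN rejects
shared = 5 / 6          # fraction the other 5 peers delivered before this sync

cln_side = round(common_per_min * (1 - shared))   # 7 updates for CLN to reconcile
ldk_side = cln_side + spam_per_min                # 18 for LDK: spam wasn't pre-shared
efficiency_lost = 1 - cln_side / ldk_side         # ~0.61, the "60%" in the text

assert cln_side == 7 and ldk_side == 18
assert round(efficiency_lost, 2) == 0.61
```

The asymmetry is the point: the 11 rejected updates sit on only one side of the set difference, so they are never amortized across peers the way commonly accepted gossip is.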
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Matt Corallo writes: > Sure, if you’re rejecting a large % of channel updates in total > you’re gonna end up hitting degenerate cases, but we can consider > tuning the sync frequency if that becomes an issue. Let's be clear: it's a problem. Allowing only 1 a day, ended up with 18% of channels hitting the spam limit. We cannot fit that many channel differences inside a set! Perhaps Alex should post his more detailed results, but it's pretty clear that we can't stay in sync with this many differences :( > gossip queries is broken in at least five ways. Naah, it's perfect if you simply want to ask "give me updates since XXX" to get you close enough on reconnect to start using set reconciliation. This might allow us to remove some of the other features? But we might end up with a gossip2 if we want to enable taproot, and use blockheight as timestamps, in which case we could probably just support that one operation (and maybe a direct query op). > Like eclair, we don’t bother to rate limit and don’t see any issues with it, > though we will skip relaying outbound updates if we’re saturating outbound > connections. Yeah, we did as a trial, and in some cases it's become limiting. In particular, people restarting their LND nodes once a day resulting in 2 updates per day (which, in 0.11.0, we now allow). Cheers, Rusty. ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
On 4/21/22 1:31 PM, Alex Myers wrote: Hello Bastien, Thank you for your feedback. I hope you don't mind I let it percolate for a while. Eclair doesn't do any rate-limiting. We wanted to "feel the pain" before adding anything, and to be honest we haven't really felt it yet. I understand the “feel the pain first” approach, but attempting set reconciliation has forced me to confront the issue a bit early. My thoughts on sync were that set-reconciliation would only be used once a node had fully synced gossip through traditional means (initial_routing_sync / gossip_queries.) There should be many levers to pull in order to help maintain sync after this. I'm going to have to experiment with them a bit before I can claim they are sufficient, but I'm optimistic. Please, no. initial_routing_sync was removed from most implementations (it sucks) and gossip queries is broken in at least five ways. Maybe we can recover it by adding yet more extensions but if we're gonna add a minisketch-based sync anyway, please let's also use it for initial sync after restart (unless you have no channels at all, in which case let's maybe revive initial_routing_sync...) Matt ___ Lightning-dev mailing list Lightning-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Hello Bastien, Thank you for your feedback. I hope you don't mind I let it percolate for a while. > Eclair doesn't do any rate-limiting. We wanted to "feel the pain" before > adding > anything, and to be honest we haven't really felt it yet. I understand the “feel the pain first” approach, but attempting set reconciliation has forced me to confront the issue a bit early. My thoughts on sync were that set-reconciliation would only be used once a node had fully synced gossip through traditional means (initial_routing_sync / gossip_queries.) There should be many levers to pull in order to help maintain sync after this. I'm going to have to experiment with them a bit before I can claim they are sufficient, but I'm optimistic. > One thing that may help here from an implementation's point of view is to > avoid > sending a disabled channel update every time a channel goes offline. What > eclair does to avoid spamming is to only send a disabled channel update when > someone actually tries to use that channel. Of course, if people choose this > offline node in their route, you don't have a choice and will need to send a > disabled channel update, but we've observed that many channels come back > online before we actually need to use them, so we're saving two channel > updates > (one to disable the channel and one to re-enable it). I think all > implementations > should do this. Is that the case today? > We could go even further, and when we receive an htlc that should be relayed > to an offline node, wait a bit to give them an opportunity to come online > instead > of failing the htlc and sending a disabled channel update. Eclair currently > doesn't do that, but it would be very easy to add. Core-Lightning also delays sending disabled channel updates in an effort to minimize unnecessary gossip. I hadn’t considered an additional delay before failing an htlc on a disabled channel. That will be interesting to explore in the context of transient disconnects of Tor v3 nodes.
I like the idea of a block_height in the channel_update tlv. That would be sufficient to enable a simple rate-limit heuristic for this application anyway. Allowing leeway for the chain tip is no problem.

I would also expect most implementations to hold a couple updates in reserve, defaulting to predated updates when available. This would allow a "burst" functionality similar to the current LND/CLN rate-limit, but the responsibility is now placed on the originating node to provide that allowance.

Cheers,
Alex

--- Original Message ---
On Friday, April 15th, 2022 at 2:15 AM, Bastien TEINTURIER wrote:

> Good morning Alex,
>
> > I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation.
>
> Cool project, glad to see someone working on it! The main difficulty here will indeed be to ensure that the number of differences between sets is bounded. We will need to maintain a mechanism to sync the whole graph from scratch for new nodes, so the minisketch diff must be efficient enough, otherwise nodes will just fall back to a full sync way too often (which would waste a lot of bandwidth).
>
> > Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections.
>
> One thing that may help here from an implementation's point of view is to avoid sending a disabled channel update every time a channel goes offline. What eclair does to avoid spamming is to only send a disabled channel update when someone actually tries to use that channel. Of course, if people choose this offline node in their route, you don't have a choice and will need to send a disabled channel update, but we've observed that many channels come back online before we actually need to use them, so we're saving two channel updates (one to disable the channel and one to re-enable it). I think all implementations should do this. Is that the case today?
>
> We could go even further, and when we receive an htlc that should be relayed to an offline node, wait a bit to give them an opportunity to come online instead of failing the htlc and sending a disabled channel update. Eclair currently doesn't do that, but it would be very easy to add.
>
> > - A common listing of current default rate limits across lightning network implementations.
>
> Eclair doesn't do any rate-limiting. We wanted to "feel the pain" before adding anything, and to be honest we haven't really felt it yet.
>
> > which will use a common, simple heuristic to accept or reject a gossip message. (Think one channel update per block, or perhaps one per block_height << 5.)
>
> I think it would be easy to come to agreement between implementations and restrict channel updates to at most one every N blocks. We simply need to add the `block_height` in a tlv in `channel_update` and then we'll be able to actually rate-limit based on it.
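For concreteness, the block_height heuristic discussed above might look something like the sketch below. All names, the 32-block window (block_height >> 5), and the two-block leeway are illustrative assumptions for this thread, not any implementation's actual code:

```python
# Illustrative sketch of block_height-based gossip rate-limiting:
# accept at most one channel_update per channel per 32-block window,
# with a small leeway for peers whose chain tip differs from ours.
BLOCK_LEEWAY = 2    # assumed tolerance for differing chain tips
WINDOW_SHIFT = 5    # one update allowed per (block_height >> 5) window

class GossipRateLimiter:
    def __init__(self):
        # short_channel_id -> last window in which an update was accepted
        self.last_window = {}

    def accept(self, scid, update_height, our_tip):
        # Reject updates claiming a height far beyond our own tip.
        if update_height > our_tip + BLOCK_LEEWAY:
            return False
        window = update_height >> WINDOW_SHIFT
        # Accept only if this update falls in a newer window than the last.
        if self.last_window.get(scid, -1) >= window:
            return False
        self.last_window[scid] = window
        return True
```

Because the heuristic keys off the block_height carried in the update itself (rather than local receipt time), an originating node holding predated updates in reserve could still "burst" within the limits, as described above.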
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
I think I mentioned this out of band to Alex, but (b) is what Erlay's proposal is for Bitcoin gossip, so it's worth studying up.

On Thu, Apr 21, 2022 at 9:18 AM Matt Corallo wrote:

> Instead of trying to make sure everyone's gossip acceptance matches exactly, which as you point out seems like a quagmire, why not (a) do a sync on startup and (b) do syncs of the *new* things. This way you aren't stuck staring at the same channels every time you do a sync. Sure, if you're rejecting a large % of channel updates in total you're gonna end up hitting degenerate cases, but we can consider tuning the sync frequency if that becomes an issue.
>
> Like eclair, we don't bother to rate limit and don't see any issues with it, though we will skip relaying outbound updates if we're saturating outbound connections.
>
> > On Apr 14, 2022, at 17:06, Alex Myers wrote:
> >
> > Hello lightning developers,
> >
> > I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation. This builds on some earlier work from Rusty using the minisketch library [1]. The idea is that each node will build a sketch representing its own gossip set. Alice's node will encode and transmit this sketch to Bob's node, where it will be merged with his own sketch, and the differences produced. These differences should ideally be exactly the latest missing gossip of both nodes. Due to size constraints, the set differences will necessarily be encoded, but Bob's node will be able to identify which gossip Alice is missing, and may then transmit exactly those messages.
> >
> > This process is relatively straightforward, with the caveat that the sets must otherwise match very closely (each sketch has a maximum capacity for differences). The difficulty here is that each node and lightning implementation may have its own rules for gossip acceptance and propagation. Depending on their gossip partners, not all gossip may propagate to the entire network.
> >
> > Core-lightning implements rate limiting for incoming channel updates and node announcements. The default rate limit is 1 per day, with a burst of 4. I analyzed my node's gossip over a 14 day period, and found that, of all publicly broadcasting half-channels, 18% of them fell afoul of our spam-limiting rules at least once. [2]
> >
> > Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections. Well connected nodes may be more susceptible to this due to more frequent routing attempts, and failures resulting in a returned channel update (which otherwise might not have been broadcast). A slight relaxation of the rate limit resolves the majority of these cases.
> >
> > A smaller subset of channels broadcast frequent channel updates with minor adjustments to htlc_maximum_msat and fee_proportional_millionths parameters. These nodes appear to be power users, with many channels and large balances. I assume this is automated channel management at work.
> >
> > Core-Lightning has updated rate-limiting in the upcoming release to achieve a higher acceptance of incoming gossip; however, it seems that a broader discussion of rate limits may now be worthwhile. A few immediate ideas:
> > - A common listing of current default rate limits across lightning network implementations.
> > - Internal checks of RPC input to limit or warn of network propagation issues if certain rates are exceeded.
> > - A commonly adopted rate-limit standard.
> >
> > My aim is a set reconciliation gossip type, which will use a common, simple heuristic to accept or reject a gossip message. (Think one channel update per block, or perhaps one per block_height << 5.) See my github for my current draft. [3] This solution allows tighter consensus, yet suffers from the same problem as original anti-spam measures – it remains somewhat arbitrary. I would like to start a conversation regarding gossip propagation, channel_update and node_announcement usage, and perhaps even bandwidth goals for syncing gossip in the future (how about a million channels?) This would aid in the development of gossip set reconciliation, but could also benefit current node connection and routing reliability more generally.
> >
> > Thanks,
> > Alex
> >
> > [1] https://github.com/sipa/minisketch
> > [2] https://github.com/endothermicdev/lnspammityspam/blob/main/sampleoutput.txt
> > [3] https://github.com/endothermicdev/lightning-rfc/blob/gossip-minisketch/07-routing-gossip.md#set-reconciliation

___
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Instead of trying to make sure everyone's gossip acceptance matches exactly, which as you point out seems like a quagmire, why not (a) do a sync on startup and (b) do syncs of the *new* things. This way you aren't stuck staring at the same channels every time you do a sync. Sure, if you're rejecting a large % of channel updates in total you're gonna end up hitting degenerate cases, but we can consider tuning the sync frequency if that becomes an issue.

Like eclair, we don't bother to rate limit and don't see any issues with it, though we will skip relaying outbound updates if we're saturating outbound connections.

> On Apr 14, 2022, at 17:06, Alex Myers wrote:
>
> Hello lightning developers,
>
> I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation. This builds on some earlier work from Rusty using the minisketch library [1]. The idea is that each node will build a sketch representing its own gossip set. Alice's node will encode and transmit this sketch to Bob's node, where it will be merged with his own sketch, and the differences produced. These differences should ideally be exactly the latest missing gossip of both nodes. Due to size constraints, the set differences will necessarily be encoded, but Bob's node will be able to identify which gossip Alice is missing, and may then transmit exactly those messages.
>
> This process is relatively straightforward, with the caveat that the sets must otherwise match very closely (each sketch has a maximum capacity for differences). The difficulty here is that each node and lightning implementation may have its own rules for gossip acceptance and propagation. Depending on their gossip partners, not all gossip may propagate to the entire network.
>
> Core-lightning implements rate limiting for incoming channel updates and node announcements. The default rate limit is 1 per day, with a burst of 4. I analyzed my node's gossip over a 14 day period, and found that, of all publicly broadcasting half-channels, 18% of them fell afoul of our spam-limiting rules at least once. [2]
>
> Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections. Well connected nodes may be more susceptible to this due to more frequent routing attempts, and failures resulting in a returned channel update (which otherwise might not have been broadcast). A slight relaxation of the rate limit resolves the majority of these cases.
>
> A smaller subset of channels broadcast frequent channel updates with minor adjustments to htlc_maximum_msat and fee_proportional_millionths parameters. These nodes appear to be power users, with many channels and large balances. I assume this is automated channel management at work.
>
> Core-Lightning has updated rate-limiting in the upcoming release to achieve a higher acceptance of incoming gossip; however, it seems that a broader discussion of rate limits may now be worthwhile. A few immediate ideas:
> - A common listing of current default rate limits across lightning network implementations.
> - Internal checks of RPC input to limit or warn of network propagation issues if certain rates are exceeded.
> - A commonly adopted rate-limit standard.
>
> My aim is a set reconciliation gossip type, which will use a common, simple heuristic to accept or reject a gossip message. (Think one channel update per block, or perhaps one per block_height << 5.) See my github for my current draft. [3] This solution allows tighter consensus, yet suffers from the same problem as original anti-spam measures – it remains somewhat arbitrary. I would like to start a conversation regarding gossip propagation, channel_update and node_announcement usage, and perhaps even bandwidth goals for syncing gossip in the future (how about a million channels?) This would aid in the development of gossip set reconciliation, but could also benefit current node connection and routing reliability more generally.
>
> Thanks,
> Alex
>
> [1] https://github.com/sipa/minisketch
> [2] https://github.com/endothermicdev/lnspammityspam/blob/main/sampleoutput.txt
> [3] https://github.com/endothermicdev/lightning-rfc/blob/gossip-minisketch/07-routing-gossip.md#set-reconciliation

___
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
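As a rough illustration of the rate limit quoted above ("1 per day, with a burst of 4"), a token bucket along the following lines would produce that behavior. This is a hedged sketch of the general technique, not Core-Lightning's actual implementation; the class name and constants are assumptions:

```python
import time

# Toy token bucket: roughly "one channel_update per day, with a burst
# of 4", in the spirit of the Core-Lightning default described above.
class TokenBucket:
    def __init__(self, rate_per_day=1.0, burst=4, now=None):
        self.capacity = float(burst)
        self.tokens = float(burst)               # start with a full burst
        self.fill_rate = rate_per_day / 86400.0  # tokens per second
        self.timestamp = now if now is not None else time.time()

    def allow(self, now=None):
        now = now if now is not None else time.time()
        # Refill tokens for elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.timestamp) * self.fill_rate)
        self.timestamp = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket like this admits short bursts (up to 4 updates) while enforcing the long-run rate of one per day, which is exactly why flapping Tor channels, which disable and re-enable repeatedly, tend to exhaust it.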
Re: [Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Good morning Alex,

> I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation.

Cool project, glad to see someone working on it! The main difficulty here will indeed be to ensure that the number of differences between sets is bounded. We will need to maintain a mechanism to sync the whole graph from scratch for new nodes, so the minisketch diff must be efficient enough, otherwise nodes will just fall back to a full sync way too often (which would waste a lot of bandwidth).

> Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections.

One thing that may help here from an implementation's point of view is to avoid sending a disabled channel update every time a channel goes offline. What eclair does to avoid spamming is to only send a disabled channel update when someone actually tries to use that channel. Of course, if people choose this offline node in their route, you don't have a choice and will need to send a disabled channel update, but we've observed that many channels come back online before we actually need to use them, so we're saving two channel updates (one to disable the channel and one to re-enable it). I think all implementations should do this. Is that the case today?

We could go even further, and when we receive an htlc that should be relayed to an offline node, wait a bit to give them an opportunity to come online instead of failing the htlc and sending a disabled channel update. Eclair currently doesn't do that, but it would be very easy to add.

> - A common listing of current default rate limits across lightning network implementations.

Eclair doesn't do any rate-limiting. We wanted to "feel the pain" before adding anything, and to be honest we haven't really felt it yet.

> which will use a common, simple heuristic to accept or reject a gossip message. (Think one channel update per block, or perhaps one per block_height << 5.)

I think it would be easy to come to agreement between implementations and restrict channel updates to at most one every N blocks. We simply need to add the `block_height` in a tlv in `channel_update` and then we'll be able to actually rate-limit based on it. Given how much time it takes to upgrade most of the network, it may be a good idea to add the `block_height` tlv now in the spec, and act on it later? Unless your work requires bigger changes to `channel_update`, in which case it will probably be a new message.

Note that it will never be completely accurate though, as different nodes can have different blockchain tips. My nodes may be one or two blocks late compared to the node that emits the channel update. We need to allow a bit of leeway there.

Cheers,
Bastien

On Thu, Apr 14, 2022 at 11:06 PM, Alex Myers wrote:

> Hello lightning developers,
>
> I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation. This builds on some earlier work from Rusty using the minisketch library [1]. The idea is that each node will build a sketch representing its own gossip set. Alice's node will encode and transmit this sketch to Bob's node, where it will be merged with his own sketch, and the differences produced. These differences should ideally be exactly the latest missing gossip of both nodes. Due to size constraints, the set differences will necessarily be encoded, but Bob's node will be able to identify which gossip Alice is missing, and may then transmit exactly those messages.
>
> This process is relatively straightforward, with the caveat that the sets must otherwise match very closely (each sketch has a maximum capacity for differences). The difficulty here is that each node and lightning implementation may have its own rules for gossip acceptance and propagation. Depending on their gossip partners, not all gossip may propagate to the entire network.
>
> Core-lightning implements rate limiting for incoming channel updates and node announcements. The default rate limit is 1 per day, with a burst of 4. I analyzed my node's gossip over a 14 day period, and found that, of all publicly broadcasting half-channels, 18% of them fell afoul of our spam-limiting rules at least once. [2]
>
> Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections. Well connected nodes may be more susceptible to this due to more frequent routing attempts, and failures resulting in a returned channel update (which otherwise might not have been broadcast). A slight relaxation of the rate limit resolves the majority of these cases.
>
> A smaller subset of channels broadcast frequent channel updates with minor adjustments to htlc_maximum_msat and fee_proportional_millionths parameters. These nodes appear to be power users, with many channels and large balances. I assume this is automated channel management at work.
[Lightning-dev] Gossip Propagation, Anti-spam, and Set Reconciliation
Hello lightning developers,

I've been investigating set reconciliation as a means to reduce bandwidth and redundancy of gossip message propagation. This builds on some earlier work from Rusty using the minisketch library [1]. The idea is that each node will build a sketch representing its own gossip set. Alice's node will encode and transmit this sketch to Bob's node, where it will be merged with his own sketch, and the differences produced. These differences should ideally be exactly the latest missing gossip of both nodes. Due to size constraints, the set differences will necessarily be encoded, but Bob's node will be able to identify which gossip Alice is missing, and may then transmit exactly those messages.

This process is relatively straightforward, with the caveat that the sets must otherwise match very closely (each sketch has a maximum capacity for differences). The difficulty here is that each node and lightning implementation may have its own rules for gossip acceptance and propagation. Depending on their gossip partners, not all gossip may propagate to the entire network.

Core-lightning implements rate limiting for incoming channel updates and node announcements. The default rate limit is 1 per day, with a burst of 4. I analyzed my node's gossip over a 14 day period, and found that, of all publicly broadcasting half-channels, 18% of them fell afoul of our spam-limiting rules at least once. [2]

Picking several offending channel ids, and digging further, the majority of these appear to be flapping due to Tor or otherwise intermittent connections. Well connected nodes may be more susceptible to this due to more frequent routing attempts, and failures resulting in a returned channel update (which otherwise might not have been broadcast). A slight relaxation of the rate limit resolves the majority of these cases.

A smaller subset of channels broadcast frequent channel updates with minor adjustments to htlc_maximum_msat and fee_proportional_millionths parameters. These nodes appear to be power users, with many channels and large balances. I assume this is automated channel management at work.

Core-Lightning has updated rate-limiting in the upcoming release to achieve a higher acceptance of incoming gossip; however, it seems that a broader discussion of rate limits may now be worthwhile. A few immediate ideas:
- A common listing of current default rate limits across lightning network implementations.
- Internal checks of RPC input to limit or warn of network propagation issues if certain rates are exceeded.
- A commonly adopted rate-limit standard.

My aim is a set reconciliation gossip type, which will use a common, simple heuristic to accept or reject a gossip message. (Think one channel update per block, or perhaps one per block_height << 5.) See my github for my current draft. [3] This solution allows tighter consensus, yet suffers from the same problem as original anti-spam measures – it remains somewhat arbitrary. I would like to start a conversation regarding gossip propagation, channel_update and node_announcement usage, and perhaps even bandwidth goals for syncing gossip in the future (how about a million channels?) This would aid in the development of gossip set reconciliation, but could also benefit current node connection and routing reliability more generally.

Thanks,
Alex

[1] https://github.com/sipa/minisketch
[2] https://github.com/endothermicdev/lnspammityspam/blob/main/sampleoutput.txt
[3] https://github.com/endothermicdev/lightning-rfc/blob/gossip-minisketch/07-routing-gossip.md#set-reconciliation

___
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
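To make the sketch-merging idea concrete, here is a toy analogue. This is not minisketch (which uses BCH-code algebra to decode many differences from a compact sketch); instead it builds a "capacity 1" sketch as the XOR of 64-bit hashes of each gossip ID, so XORing Alice's and Bob's sketches recovers the symmetric difference only when their sets differ by exactly one item. All names are illustrative:

```python
import hashlib

# Toy sketch-based set reconciliation (NOT minisketch). Each node's
# sketch is the XOR of 64-bit hashes of its gossip message IDs; the
# XOR of two sketches decodes the symmetric difference only when the
# sets differ by exactly one element -- a "capacity 1" sketch.

def h64(item: str) -> int:
    """64-bit hash of a gossip ID (e.g. a short_channel_id string)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")

def sketch(items) -> int:
    s = 0
    for it in items:
        s ^= h64(it)
    return s

def reconcile_one(alice, bob):
    """Return the single item one side is missing, or None on failure."""
    diff = sketch(alice) ^ sketch(bob)
    if diff == 0:
        return None          # sets already match
    # Identify which known item hashes to the sketch difference.
    for it in alice | bob:
        if h64(it) == diff:
            return it
    return None              # more than one difference: over capacity
```

The failure mode mirrors the constraint in the email above: once the sets diverge by more than the sketch's capacity, decoding fails and the nodes must fall back to another sync mechanism. Real sketches (minisketch) generalize this to capacity c at a cost of roughly c hash-sized elements.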