Re: [Lightning-dev] Improving the initial gossip sync
Hi,

On 5 February 2018 at 14:02, Christian Decker wrote:
> Hi everyone
>
> The feature bit is even, meaning that it is required from the peer,
> since we extend the `init` message itself, and a peer that does not
> support this feature would be unable to parse any future extensions to
> the `init` message. Alternatively we could create a new
> `set_gossip_timestamp` message that is only sent if both endpoints
> support this proposal, but that could result in duplicate messages being
> delivered between the `init` and the `set_gossip_timestamp` message and
> it'd require additional messages.

We chose the other approach and propose to use an optional feature.

> The reason I'm using timestamp and not the blockheight in the short
> channel ID is that we already use the timestamp for pruning. In the
> blockheight based timestamp we might ignore channels that were created,
> then not announced or forgotten, and then later came back and are now
> stable.

Just to be clear, you propose to use the timestamp of the most recent channel updates to filter the associated channel announcements?

> I hope this rather simple proposal is sufficient to fix the short-term
> issues we are facing with the initial sync, while we wait for a real
> sync protocol. It is definitely not meant to allow perfect
> synchronization of the topology between peers, but then again I don't
> believe that is strictly necessary to make the routing successful.
>
> Please let me know what you think, and I'd love to discuss Pierre's
> proposal as well.
>
> Cheers,
> Christian

Our idea is to group channel announcements into "buckets", create a filter for each bucket, then exchange these filters and use them to filter out channel announcements.

We would add a new `use_channel_announcement_filters` optional feature bit (7 for example), and a new `channel_announcement_filters` message.
When a node that supports channel announcement filters receives an `init` message with the `use_channel_announcement_filters` bit set, it sends back its channel filters.

When a node that supports channel announcement filters receives a `channel_announcement_filters` message, it uses it to filter channel announcements (and, implicitly, channel updates) before sending them.

The filters we have in mind are simple:
- Sort announcements by short channel id
- Compute a marker height, which is `144 * ((now - 7 * 144) / 144)` (we round to multiples of 144 to make sync easier)
- Group channel announcements that were created before this marker by groups of 144 blocks
- Group channel announcements that were created after this marker by groups of 1 block
- For each group, sort and concatenate all channel announcement short channel ids and hash the result (we could use sha256, or the first 16 bytes of the sha256 hash)

The new `channel_announcement_filters` message would then be a list of (height, hash) pairs ordered by increasing height.

This implies that implementations can easily sort announcements by short channel id, which should not be very difficult.

An additional step could be to send all short channel ids for every group whose hash did not match. Alternatively we could use smarter filters.

The use case we have in mind is mobile nodes, or more generally nodes which are often offline and need to resync very often.

Cheers,

Fabrice
___
Lightning-dev mailing list
Lightning-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
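The bucket construction above can be sketched as follows. This is only an illustration: the big-endian id packing, the 16-byte sha256 truncation, and the exact bucket keys are my assumptions, not a spec.

```python
import hashlib

BUCKET = 144          # bucket size (in blocks) for "old" channels
WINDOW = 7 * 144      # channels younger than ~1 week get one bucket per block

def block_height(short_channel_id):
    # the block height occupies the top bytes of the 8-byte short channel id
    return short_channel_id >> 40

def make_filters(short_channel_ids, current_height):
    # marker height, rounded down to a multiple of 144
    marker = BUCKET * ((current_height - WINDOW) // BUCKET)
    buckets = {}
    for scid in sorted(short_channel_ids):
        h = block_height(scid)
        # before the marker: groups of 144 blocks; after it: groups of 1 block
        key = (h // BUCKET) * BUCKET if h < marker else h
        buckets.setdefault(key, []).append(scid)
    # one (height, hash) pair per bucket, ordered by increasing height
    return [(height,
             hashlib.sha256(b''.join(s.to_bytes(8, 'big')
                                     for s in buckets[height])).digest()[:16])
            for height in sorted(buckets)]
```

Two nodes with identical routing tables produce identical filter lists regardless of the order they learned the channels in; a mismatching pair pinpoints a single bucket to resync.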
Re: [Lightning-dev] Improving the initial gossip sync
Hi,

Suppose you partition nodes into 3 generic roles:
- payers: they mostly send payments, are typically small and operated by end users, and are offline quite a lot
- relayers: they mostly relay payments, and would be online most of the time (if they're too unreliable other nodes will eventually close their channels with them)
- payees: they mostly receive payments; how often they can be online is directly linked to their particular mode of operation (since you need to be online to receive payments)

Of course most nodes would play more or less all roles. However, mobile nodes would probably be mostly "payers", and they have specific properties:
- if they don't relay payments they don't have to be announced. There could be millions of mobile nodes with no impact on the size of the routing table
- it does not impact the network when they're offline
- but they need an accurate routing table. This is very different from nodes which mostly relay or accept payments
- they would be connected to a very small number of nodes
- they would typically be online for just a few hours every day, but could be stopped/paused/restarted many times a day

Laolu wrote:
> So I think the primary distinction between y'alls proposals is that
> cdecker's proposal focuses on eventually synchronizing all the set of
> _updates_, while Fabrice's proposal cares *only* about the newly created
> channels. It only cares about new channels as the rationale is that if one
> tries to route over a channel with a stale channel update for it, then
> you'll get an error with the latest update encapsulated.
If you have one filter per day and they don't match (because your peer has channels that you missed, or channels that have been closed which you were not aware of) then you will receive all channel announcements for this particular day, and the associated updates.

Laolu wrote:
> I think he's actually proposing just a general update horizon in which
> vertexes+edges with a lower time stamp just shouldn't be set at all. In the
> case of an old zombie channel which was resurrected, it would eventually be
> re-propagated as the node on either end of the channel should broadcast a
> fresh update along with the original chan ann.

Yes, but it could take a long time. It may be worse on testnet since it seems that nodes don't change their fees very often. "Payer nodes" need a good routing table (as opposed to "relayers", which could work without one if they never initiate payments).

Laolu wrote:
> This seems to assume that both nodes have a strongly synchronized view of
> the network. Otherwise, they'll fall back to sending everything that went on
> during the entire epoch regularly. It also doesn't address the zombie churn
> issue as they may eventually send you very old channels you'll have to deal
> with (or discard).

Yes, I agree that for nodes which have connections to a lot of peers, strongly synchronized routing tables are harder to achieve, since a small change may invalidate an entire bucket. Real queryable filters would be much better, but the worst-case scenario is that we've sent an additional 30 KB or so of sync messages. (A very naive filter would be sort + pack all short ids, for example.)

But we focus on nodes which are connected to a very small number of peers, and in this particular case it is not an unrealistic expectation. We have built a prototype and on testnet it works fairly well.
I also found nodes which have no direct channel between them but produce the same filters for 75% of the buckets ("produce" here means that I opened a simple gossip connection to them, got their routing table and used it to generate filters).

Laolu wrote:
> How far back would this go? Weeks, months, years?

Since forever :)
One filter per day for all announcements that are older than now - 1 week (modulo 144).
One filter per block for recent announcements.

> FWIW this approach optimizes for just learning of new channels instead of
> learning of the freshest state you haven't yet seen.

I'd say it optimizes the case where you are connected to very few peers, and are online a few times every day (?)

> -- Laolu
Re: [Lightning-dev] Improving the initial gossip sync
On 12 February 2018 at 02:45, Rusty Russell wrote:
> Christian Decker writes:
>> Rusty Russell writes:
>>> Finally catching up. I prefer the simplicity of the timestamp
>>> mechanism, with a more ambitious mechanism TBA.
>>
>> Fabrice and I had a short chat a few days ago and decided that we'll
>> simulate both approaches and see what consumes less bandwidth. With
>> zombie channels and the chances for missing channels during a weak form
>> of synchronization, it's not that clear to us which one has the better
>> tradeoff. With some numbers behind it it may become easier to decide :-)
>
> Maybe; I think we'd be best off with an IBLT-approach similar to
> Fabrice's proposal. An IBLT is better than a simple hash, since if your
> results are similar you can just extract the differences, and they're
> easier to maintain. Even easier if we make the boundaries static rather
> than now-relative. For node_announce and channel_update you'd probably
> want separate IBLTs (perhaps, though not necessarily, as a separate
> RTT).

Yes, real filters would be better, but the 'bucket hash' idea works (from what I've seen on testnet) for our specific target (nodes which are connected to a very small number of peers and go offline very often).

> Note that this approach fits really well as a complement to the
> timestamp approach: you'd use this for older pre-timestamp, where you're
> likely to have a similar idea of channels.

Both approaches may be needed because they may be solutions to different problems (nodes which get disconnected from a small set of peers vs nodes connected to many peers, which remain online but not some of their peers).

>>> Now, as to the proposal specifics.
>>>
>>> I dislike the re-transmission of all old channel_announcement and
>>> node_announcement messages, just because there's been a recent
>>> channel_update. Simpler to just say 'send anything >=
>>> routing_sync_timestamp`.
>>
>> I'm afraid we can't really omit the `channel_announcement` since a
>> `channel_update` that isn't preceded by a `channel_announcement` is
>> invalid and will be dropped by peers (especially because the
>> `channel_update` doesn't contain the necessary information for
>> validation).
>
> OTOH this is a rare corner case which will eventually be fixed by weekly
> channel_announce retransmission. In particular, the receiver should
> have already seen the channel_announce, since it preceded the timestamp
> they asked for.
>
> Presumably IRL you'd ask for a timestamp sometime before you were last
> disconnected, say 30 minutes.
>
> "The perfect is the enemy of the good".

This is precisely what I think would not work very well with the timestamp approach: when you're missing an 'old' channel announcement and only have a few sources for it, it can have a huge impact on terminal nodes, which won't be able to find routes, and waiting for a new channel update would take too long.

Yes, using just a few peers means that you will be limited to the routing table they will give you, but having some kind of filter would let nodes connect to other peers just to retrieve them and check how far off they are from the rest of the network. This would not be possible with a timestamp (you would need to download the entire routing table again, which is what we're trying to avoid).

>>> Background: c-lightning internally keeps a tree of gossip in the order
>>> we received them, keeping a 'current' pointer for each peer. This is
>>> very efficient (though we don't remember if a peer sent us a gossip msg
>>> already, so uses twice the bandwidth it could).

Ok, so a peer would receive an announcement it has sent, but would immediately dismiss it?

>>
>> We can solve that by keeping a filter of the messages we received from
>> the peer; it's more of an optimization than anything, other than the
>> bandwidth cost, it doesn't hurt.
>
> Yes, it's on the TODO somewhere... we should do this!
>
>>> But this isn't *quite* the same as timestamp order, so we can't just set
>>> the 'current' pointer based on the first entry >=
>>> `routing_sync_timestamp`; we need to actively filter. This is still a
>>> simple traverse, however, skipping over any entry less than
>>> routing_sync_timestamp.
>>>
>>> OTOH, if we need to retransmit announcements, when do we stop
>>> retransmitting them? If a new channel_update comes in during this time,
>>> are we still to dump the announcements? Do we have to remember which
>>> ones we've sent to each peer?
>>
>> That's more of an implementation detail. In c-lightning we can just
>> remember the index at which the initial sync started, and send
>> announcements along until the index is larger than the initial sync
>> index.
>
> True. It is an implementation detail which is critical to saving
> bandwidth though.
>
>> A more general approach would be to have 2 timestamps, one highwater and
>> one lowwater mark. Anything in between these marks will be forwarded
>> together
Re: [Lightning-dev] Improving the initial gossip sync (server-side vs client-side filtering for SPV clients)

On 16 February 2018 at 13:34, Fabrice Drouin wrote:
> I like the IBLT idea very much but my understanding of how they work
> is that the tricky part would be first to estimate the number of
> differences between "our" and "their" routing tables.
> So when we open a connection we would first exchange messages to
> estimate how far off we are (by sending a sample of short ids and
> extrapolating?) then send filters which would be specific to each peer.
> This may become an issue if a node is connected to many peers, and is
> similar to the server-side vs client-side filtering issue for SPV
> clients.
> Basically, I fear that it would take some time before it is agreed
> upon and available, hindering the development of mobile nodes.
>
> The bucket hash idea is naive but is very simple to implement and
> could buy us enough time to implement IBLT filters properly. Imho the
> timestamp idea does not work for the mobile phone use case (but is
> probably simpler and better than bucket hashes for "centre" nodes
> which are never completely off the grid)
>
> On 14 February 2018 at 01:24, Rusty Russell wrote:
>> Fabrice Drouin writes:
>>> Yes, real filters would be better, but the 'bucket hash' idea works
>>> (from what I've seen on testnet) for our specific target (nodes which
>>> are connected to very small number of peers and go offline very
>>> often)
>>
>> What if we also add an 'announce_query' message: if you see a
>> 'channel_update' which you discard because you don't know the channel,
>> 'announce_query' asks them to send the 'channel_announce' for that
>> 'short_channel_id' followed by re-sending the 'channel_update'(s)?
>> (Immediately, rather than waiting for next gossip batch).
>>
>> I think we would want this for IBLT, too, since you'd want this to query
>> any short-channel-id you extract from that which you don't know about.
>
> Yes, unless it is part of the initial sync (compare filters,
> then send what they're missing)

>> I see. (BTW, your formatting makes your post sound very Zen!).

> Sorry about that, I've disabled the haiku mode :)

>> Yes, we can probably use difference encoding and use 1 bit for output
>> index (with anything which is greater appended) and get down to 1 byte
>> per channel_id at scale.
>>
>> But my rule-of-thumb for scaling today is 1M - 10M channels, and that
>> starts to get a little excessive. Hence my interest in IBLTs, which are
>> still pretty trivial to implement.

> Yes, sending all short ids would also have been a temporary measure
> while a more sophisticated approach is being designed.

>> Cheers,
>> Rusty.
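The IBLT-based reconciliation Rusty keeps coming back to in this thread can be sketched as follows. Everything concrete here is my own illustration: the cell count, the number of hash functions, and the per-key checksum scheme are assumed parameters, not anything agreed in the thread. Each side builds a sketch over its set of short channel ids; subtracting the two sketches and "peeling" pure cells recovers the symmetric difference without transferring the full sets.

```python
import hashlib

NUM_HASHES = 3   # cells touched per key
NUM_CELLS = 64   # must comfortably exceed the expected set difference

def cells_for(key):
    # derive a consistent set of cell indices for an 8-byte id
    return {int.from_bytes(hashlib.sha256(bytes([i]) + key.to_bytes(8, 'big'))
                           .digest()[:4], 'big') % NUM_CELLS
            for i in range(NUM_HASHES)}

def check(key):
    # per-key checksum, used to recognise "pure" cells while peeling
    return int.from_bytes(hashlib.sha256(b'c' + key.to_bytes(8, 'big'))
                          .digest()[:4], 'big')

class IBLT:
    def __init__(self):
        self.count = [0] * NUM_CELLS
        self.key_sum = [0] * NUM_CELLS
        self.chk_sum = [0] * NUM_CELLS

    def _update(self, key, sign):
        for c in cells_for(key):
            self.count[c] += sign
            self.key_sum[c] ^= key
            self.chk_sum[c] ^= check(key)

    def insert(self, key):
        self._update(key, 1)

    def subtract(self, other):
        # cell-wise difference of two sketches; common keys cancel out
        diff = IBLT()
        for i in range(NUM_CELLS):
            diff.count[i] = self.count[i] - other.count[i]
            diff.key_sum[i] = self.key_sum[i] ^ other.key_sum[i]
            diff.chk_sum[i] = self.chk_sum[i] ^ other.chk_sum[i]
        return diff

    def decode(self):
        # repeatedly peel cells holding exactly one key of the difference
        only_mine, only_theirs = set(), set()
        progress = True
        while progress:
            progress = False
            for i in range(NUM_CELLS):
                if self.count[i] in (1, -1) and check(self.key_sum[i]) == self.chk_sum[i]:
                    key, sign = self.key_sum[i], self.count[i]
                    (only_mine if sign == 1 else only_theirs).add(key)
                    self._update(key, -sign)
                    progress = True
        return only_mine, only_theirs
```

The sketch size depends only on the number of differences, not on the size of the routing tables, which is exactly why estimating that number up front (as discussed above) is the hard part.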
Re: [Lightning-dev] Improving the initial gossip sync
On 20 February 2018 at 02:08, Rusty Russell wrote:
> Hi all,
>
> This consumed much of our lightning dev interop call today! But
> I think we have a way forward, which is in three parts, gated by a new
> feature bitpair:

We've built a prototype with a new feature bit `channel_range_queries` and the following logic:

When you receive their init message and check their local features:
- if they set `initial_routing_sync` and `channel_range_queries` then do nothing (they will send you a `query_channel_range`)
- if they set `initial_routing_sync` and not `channel_range_queries` then send your routing table (as before)
- if you support `channel_range_queries` then send a `query_channel_range` message

This way new and old nodes should be able to understand each other.

> 1. query_short_channel_id
> =========================
>
> 1. type: 260 (`query_short_channel_id`)
> 2. data:
>    * [`32`:`chain_hash`]
>    * [`8`:`short_channel_id`]

We could add a `data` field which contains zipped ids like in `reply_channel_range` so we can query several items with a single message?

> 1. type: 262 (`reply_channel_range`)
> 2. data:
>    * [`32`:`chain_hash`]
>    * [`4`:`first_blocknum`]
>    * [`4`:`number_of_blocks`]
>    * [`2`:`len`]
>    * [`len`:`data`]

We could add an additional `encoding_type` field before `data` (or it could be the first byte of `data`).

> Appendix A: Encoding Sizes
> ==========================
>
> I tried various obvious compression schemes, in increasing complexity
> order (see source below, which takes stdin and spits out stdout):
>
> Raw = raw 8-byte stream of ordered channels.
> gzip -9: gzip -9 of raw.
> splitgz: all blocknums first, then all txnums, then all outnums, then
> gzip -9
> delta: CVarInt encoding:
> blocknum_delta,num,num*txnum_delta,num*outnum.
> deltagz: delta, with gzip -9
>
> Corpus 1: LN mainnet dump, 1830 channels.[1]
>
> Raw:       14640 bytes
> gzip -9:    6717 bytes
> splitgz:    6464 bytes
> delta:      6624 bytes
> deltagz:    4171 bytes
>
> Corpus 2: All P2SH outputs between blocks 508000-508999 incl, 790844
> channels.[2]
>
> Raw:     6326752 bytes
> gzip -9: 1861710 bytes
> splitgz:  964332 bytes
> delta:   1655255 bytes
> deltagz:  595469 bytes
>
> [1] http://ozlabs.org/~rusty/short_channels-mainnet.xz
> [2] http://ozlabs.org/~rusty/short_channels-all-p2sh-508000-509000.xz

Impressive!
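The "delta" scheme above (per block: blocknum delta, count, then per-channel txnum deltas and output indices) can be sketched like this. The varint below is a simple LEB128-style encoding chosen for illustration; Rusty's figures use CVarInt, so exact sizes will differ.

```python
from itertools import groupby

def varint(n):
    # minimal LEB128-style varint (an assumption; the measurements used CVarInt)
    out = bytearray()
    while True:
        byte = n & 0x7f
        n >>= 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def encode_delta(scids):
    """scids: iterable of (blocknum, txnum, outnum) tuples.
    Layout per block: blocknum_delta, num, num*txnum_delta, num*outnum."""
    out = bytearray()
    prev_block = 0
    for block, group in groupby(sorted(scids), key=lambda s: s[0]):
        group = list(group)
        out += varint(block - prev_block)   # delta to the previous block
        out += varint(len(group))           # how many channels in this block
        prev_tx = 0
        for _, tx, _o in group:             # tx indices as deltas within the block
            out += varint(tx - prev_tx)
            prev_tx = tx
        for _b, _t, outidx in group:        # then the raw output indices
            out += varint(outidx)
        prev_block = block
    return bytes(out)
```

Since consecutive channels tend to share blocks and have small index gaps, most values fit in one or two varint bytes, which is where the savings over the raw 8-byte ids come from.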
[Lightning-dev] Proposal for syncing channel updates
Hi,

This is a proposal for an extension of our current "channel queries" that should allow nodes to properly sync their outdated channel updates. I already opened an issue on the RFC's github repo (https://github.com/lightningnetwork/lightning-rfc/issues/480) but decided to post here too, to have a less "constrained" discussion. And it looks like a fairly standard synchronisation problem, so maybe someone will think of other similar schemes that have been used in a different context.

Thanks,

Fabrice

Background: Routing Table Sync

(If you're familiar with LN you can just skip this section)

LN is a p2p network of nodes, which can be represented as a graph where nodes are vertices and channels are edges, and where you can pay any node you can find a route to:
- each node maintains a routing table, i.e. a full view of the LN graph
- to send a payment, nodes use their local routing table to compute a route to the destination, and send an onion-like message to the first node on that route, which will forward it to the next node and so on until it reaches its destination

The routing table includes:
- "static" information: channel announcements
- "dynamic" information: channel updates (relay fees)

(It also includes node announcements, which are not needed for route computation.)

Using our graph analogy, channel updates would be edge parameters (cost, max capacity, min payment amount, ...). They can change often, usually when nodes decide to change their relay fee policy, but also to signify that a channel is temporarily unusable. A new channel update replaces the previous one.

Channels are identified with an 8-byte "short channel id": we use the blockchain coordinates of the funding tx: block height (3 bytes) + tx index (3 bytes) + output index (2 bytes).

Channel updates include a channel id, a direction (for a channel between Alice and Bob there are 2 channel updates: one for Alice->Bob and one for Bob->Alice), fee parameters, and a 4-byte timestamp.
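For reference, BOLT 7 packs these coordinates into the 8-byte id as 3 bytes of block height, 3 bytes of transaction index and 2 bytes of output index. A quick sketch of the packing:

```python
def scid_pack(block_height, tx_index, output_index):
    # BOLT 7 layout: 3 bytes block height | 3 bytes tx index | 2 bytes output index
    assert block_height < 2**24 and tx_index < 2**24 and output_index < 2**16
    return (block_height << 40) | (tx_index << 16) | output_index

def scid_unpack(short_channel_id):
    return (short_channel_id >> 40,          # block height
            (short_channel_id >> 16) & 0xFFFFFF,  # tx index within the block
            short_channel_id & 0xFFFF)       # output index of the funding tx
```

Because the block height sits in the most significant bytes, sorting ids numerically also sorts channels by the block they were confirmed in, which is what the block-range queries below rely on.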
To compute routes, nodes need a way to keep their routing table up-to-date: we call it "routing table sync" or "routing sync".

There is something else to consider: route finding is only needed when you're *sending* payments, not when you're relaying them or receiving them. A node that sits in the "middle" of the LN network and just keeps relaying payments would work even if it has no routing information at all. Likewise, a node that just creates payment requests and receives payments does not need a routing table.

On the other end of the spectrum, a LN "wallet" that is mostly used to send payments will not work very well if its routing table is missing info or contains outdated info, so routing sync is a very important issue for LN wallets, which are also typically offline more often than other nodes. If your wallet is missing channel announcements it may not be able to find a route, and if its channel updates are outdated it may compute a route that includes channels that are temporarily disabled, or use fee rates that are too old and will be refused by relaying nodes. In this case nodes can return errors that include their most recent channel update, so that the sender can try again, but this will only work well if just a few channel updates are outdated.

So far, these are the "routing table sync" schemes that have been specified and implemented:

Step #1: just send everything

The first routing sync scheme was very simple: nodes would request that peers they connect to send them a complete "dump" of their entire routing table. It worked well at the beginning but was expensive for both peers and quickly became impractical.

Step #2: synchronise channel announcements

New query messages were added to the LN protocol to improve routing table sync: nodes can ask their peers for all their channel ids in a given block range, compare that list to their own channel ids, and query the ones they're missing (as well as related channel updates).
Nodes can also send a timestamp-based filter to their peers ("only send me channel updates that match this timestamp filter").

It's a nice improvement, but there are still issues with nodes that are offline very often: they will be able to sync their channel announcements, but not their channel updates.

Suppose that at T0 a node has 1000 channel updates that are outdated. It comes back online, starts syncing its routing table, and goes offline after a few minutes. It now has 900 channel updates that are outdated. At T1 = T0 + 8 hours it comes back online again. If it uses T0 to filter out channel updates, it will never receive the info it is missing for its 900 outdated channel updates. Using our "last time I was online" timestamp as a gossip filter does not work here.

=> Proposed solution: timestamp-based channel updates sync

We need a better method for syncing channel updates. And it is not really a set reconciliation problem (like syncing channel announcements, for example): we're not missing items,
Re: [Lightning-dev] Proposal for syncing channel updates
Hi Zmn,

> It may be reduced to a set reconciliation problem if we consider the
> timestamp and enable/disable state of channel updates as part of an item,
> i.e. a channel update of 111:1:1 at 2018-10-04 state=enabled is not the same
> as a channel update of 111:1:1 at 2018-10-05 state=disabled.
>
> Then both sides can use standard set reconciliation algorithms, and for
> channel updates of the same short channel ID, we simply drop all items except
> the one with the latest timestamp.
>
> The above idea might be less efficient than your proposed extension.

Yes, there may be some way to use the structure of channel_update messages to transpose this into a set reconciliation problem, and use smarter tools like IBLTs. But we would need to have a good estimate for the number of differences between the local and remote sets. This would be the really hard part I think, and probably even harder to get right with channel_updates than with channel_announcements.

I had a quick look at how this type of sync issue is handled in different contexts, and my impression is that exchanging and comparing timestamps would be the most natural solution?

But mostly my point is that I think we missed something with the current channel queries, so first I would like to know if other people agree with that :) and propose something that is close to what we have today, should be easy to implement if you already support channel queries, and should fix the issue that I think we have.

Thanks,

Fabrice
Re: [Lightning-dev] Commitment Transaction Format Update Proposals?
Hello,

> 1. Rather than trying to agree on what fees will be in the future, we
> should use an OP_TRUE-style output to allow CPFP (Roasbeef)

We could also use SIGHASH_ANYONECANPAY|SIGHASH_SINGLE for HTLC txs, without adding the "OP_TRUE" output to the commitment transaction. We would still need the update_fee message to manage onchain fees for the commit tx (but not the HTLC txs), but there would no longer be any reason to refuse fee rates that are too high, and channels would no longer get closed when there's a spike in onchain fees.

Cheers,

Fabrice
Re: [Lightning-dev] Wireshark plug-in for Lightning Network(BOLT) protocol
Nice work, thank you!

On Fri, 26 Oct 2018 at 17:37, wrote:
>
> Hello lightning network developers.
> Nayuta team is developing a Wireshark plug-in for the Lightning Network (BOLT)
> protocol.
> https://github.com/nayutaco/lightning-dissector
>
> It's an alpha version, but it can decode some BOLT messages.
> Currently, this software works for Nayuta's implementation (ptarmigan) and
> Éclair.
> When ptarmigan is compiled with some options, it writes out a key information
> file. This Wireshark plug-in decodes packets using that file.
> When you use Éclair, this software parses the log file.
>
> Through our development experience, interoperability testing is a time-consuming
> task. If people can see communication logs of BOLT messages in the same
> format (.pcap), it will be useful for interoperability testing.
>
> Our proposal:
> Every implementation has a compile option which enables output of the key
> information file.
>
> We are glad if this project is useful for the lightning network eco-system.
[Lightning-dev] Improving payment UX with low-latency route probing
Context
=======

Sent payments that remain pending, i.e. payments which have not yet been failed or fulfilled, are currently a major UX challenge for LN and a common source of complaints from end-users. Why payments are not fulfilled quickly is not always easy to investigate, but we've seen problems caused by intermediate nodes which were stuck waiting for a revocation, and recipients who could take a very long time to reply with a payment preimage.

It is already possible to partially mitigate this by disconnecting from a node that is taking too long to send a revocation (after 30 seconds for example) and reconnecting immediately to the same node. This way pending downstream HTLCs can be forgotten and the corresponding upstream HTLCs failed.

Proposed changes
================

It should be possible to provide a faster "proceed/try another route" answer to the sending node using probing with short timeout requirements: before sending the actual payment, it would first send a "blank" probe request along the same route. This request would be similar to a payment request, with the same onion packet formatting and processing, with the additional requirement that if the next node in the route has not replied within the timeout period (typically a few hundred milliseconds) then the current node will immediately send back an error message.

There could be several options for the probe request:
- include the same amounts and fee constraints as the actual payment request
- include no amount information, in which case we're just trying to "ping" every node on the route

Implementation
==============

I would like to discuss the possibility of implementing this with a "0 satoshi" payment request that the receiving node would generate along with the real one. The sender would first try to "pay" the "0 satoshi" request using the route it computed with the actual payment parameters. I think that it would not require many changes to the existing protocol and implementations.
Not using the actual amount and fees means that the actual payment could fail because of capacity issues, but as long as this happens quickly (and it should, since we checked first that all nodes on the route are alive and responsive) it is still much better than "stuck" payments. And it would not help if a node decides to misbehave, but it would not make things worse than they are now (?)

Cheers,

Fabrice
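The probe-then-pay flow described above can be simulated with a toy model. Everything concrete here is illustrative only (the latency model, the helper names, the timeout value); it just shows the decision logic: probe first, and only pay along a route whose hops all answered within the timeout.

```python
def probe_route(route, node_latency, timeout=0.3):
    """Walk the route; a hop whose reply would exceed `timeout` makes its
    predecessor fail the probe immediately instead of letting it hang."""
    for hop, node in enumerate(route):
        if node_latency.get(node, 0.0) > timeout:
            return False, hop   # probe failed at this hop
    return True, None

def send_payment(route, node_latency, do_pay, find_another_route):
    # probe the route with a "blank" request before committing the real HTLC
    ok, failed_at = probe_route(route, node_latency)
    if ok:
        return do_pay(route)
    # fast negative answer: retry immediately, avoiding the unresponsive node
    alt = find_another_route(exclude=route[failed_at])
    return do_pay(alt) if alt else 'no route'
```

The point of the design is that the sender learns about an unresponsive hop within a few hundred milliseconds, rather than discovering it only after its real HTLC is stuck in flight.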
[Lightning-dev] Quick analysis of channel_update data
Hello All, and Happy New Year!

To understand why there is a steady stream of channel updates, even when fee parameters don't seem to actually change, I made hourly backups of the routing table of one of our nodes, and compared these routing tables to see what exactly was being modified. It turns out that:
- there are a lot of disable/enable/disable etc. updates which are just sent when a channel is disabled then enabled again (when nodes go offline, for example?); this can happen quite often
- there are also a lot of updates that don't change anything (just a new timestamp and signatures but otherwise the same info), up to several times a day for the same channel id

In both cases we end up syncing info that we already have. I don't know yet how best to use this when syncing routing tables, but I thought it was worth sharing anyway. A basic checksum that does not cover all fields, but only fees and HTLC min/max values, could probably be used to improve routing table sync?

Cheers,

Fabrice
Re: [Lightning-dev] Quick analysis of channel_update data
On Wed, 2 Jan 2019 at 18:26, Christian Decker wrote:
>
> For the ones that flap with a period that is long enough for the
> disabling and enabling updates being flushed, we are presented with a
> tradeoff. IIRC we (c-lightning) currently hold back disabling
> `channel_update`s until someone actually attempts to use the channel at
> which point we fail the HTLC and send out the stashed `channel_update`
> thus reducing the publicly visible flapping. For the enabling we can't
> do that, but we could think about a local policy on how much to delay a
> `channel_update` depending on the past stability of that peer. Again
> this is local policy and doesn't warrant a spec change.
>
> I think we should probably try out some policies related to when to send
> `channel_update`s and how to hide redundant updates, and then we can see
> which ones work best :-)

Yes, I haven't looked at how to handle this with local policies. My hypothesis is that when you're syncing a routing table that is, say, one day old, you end up querying and downloading a lot of information that you already have, and that adding a basic checksum to our channel queries may greatly improve this. Of course this would be much more actionable with stats and hard numbers, which I'll provide ASAP.

Cheers,

Fabrice
Re: [Lightning-dev] Quick analysis of channel_update data
Follow-up: here's more detailed info on the data I collected and potential savings we could achieve: I made hourly routing table backups for 12 days, and collected routing information for 17 000 channel ids. There are 130 000 different channel updates: on average each channel has been updated 8 times. Here, "different" means that at least the timestamp has changed, and a node would have queried this channel update during its syncing process. But only 18 000 pairs of channel updates carry an actual fee and/or HTLC value change. 85% of the time, we just queried information that we already had! Adding a basic checksum (4 bytes for example) that covers fees and HTLC min/max values to our channel range queries would be a significant improvement and I will add this to the open BOLT 1.1 proposal to extend queries with timestamps. I also think that such a checksum could be used - in "inventory" based gossip messages - in set reconciliation schemes: we could reconcile [channel id | timestamp | checksum] first Cheers, Fabrice
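The 4-byte checksum idea above could look like the following sketch; the serialization and the exact field list are my assumptions, not part of any proposal yet:

```python
import struct
import zlib

# Sketch of a 4-byte checksum covering only fees and HTLC min/max values.
# The big-endian packing and the field set are illustrative assumptions;
# a spec change would have to pin them down precisely.

def update_checksum(fee_base_msat, fee_proportional_millionths,
                    htlc_minimum_msat, htlc_maximum_msat):
    payload = struct.pack(">IIQQ",
                          fee_base_msat,
                          fee_proportional_millionths,
                          htlc_minimum_msat,
                          htlc_maximum_msat)
    return zlib.crc32(payload)  # fits in 4 bytes

# Two updates that differ only by timestamp/signature give the same checksum,
# so a syncing node can skip re-downloading them:
a = update_checksum(1000, 100, 1000, 4_000_000_000)
b = update_checksum(1000, 100, 1000, 4_000_000_000)
assert a == b
```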
Re: [Lightning-dev] Quick analysis of channel_update data
On Fri, 4 Jan 2019 at 04:43, ZmnSCPxj wrote: > > - in set reconciliation schemes: we could reconcile [channel id | > > timestamp | checksum] first > > Perhaps I misunderstand how set reconciliation works, but --- if timestamp is > changed while checksum is not, then it would still be seen as a set > difference and still require further communication rounds to discover that > the channel parameters have not actually changed. > > Perhaps it is better to reconcile [channel_id | checksum] instead, and if > there is a different set of channel parameters, share the set difference and > sort out which timestamp is later at that point. Ah yes of course, the `timestamp` should not be included. Cheers, Fabrice
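A toy illustration of reconciling [channel_id | checksum] pairs with a plain set difference, as suggested above with the timestamp left out (a real scheme would use a sketch-based protocol such as minisketch rather than exchanging full sets; the values here are made up):

```python
# Toy reconciliation of (channel_id, checksum) pairs, timestamps excluded.
# Channel ids and checksums are illustrative values.
local = {(0x1A2B, 0xDEADBEEF), (0x3C4D, 0x01020304)}
remote = {(0x1A2B, 0xDEADBEEF), (0x3C4D, 0xCAFEBABE)}

# Only channels whose parameters actually differ show up in the difference;
# updates that changed nothing but timestamp/signature reconcile to nothing,
# avoiding the extra round trips ZmnSCPxj points out.
to_query = {channel_id for channel_id, _ in remote - local}
assert to_query == {0x3C4D}
```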
Re: [Lightning-dev] Lite client considerations for Lightning Implementations
Hi Chris, What we've learned building a lite bitcoin/LN wallet is that there are different things it must implement: - a bitcoin wallet. We started with bitcoinj, but there are known issues with Bloom Filters, which is one of the reasons why we ended up building our own wallet that connects to Electrum Servers (and it seems we're not the only ones). I'm not sure that a "better" implementation of BIP37 is actually needed, if that's what you mean by "traditional SPV". Client-side filters are a nice improvement, and we have a basic Neutrino prototype that is up to date with the BIPs but not used in our mobile app. We could collaborate on this ? - the "monitoring your channels" part: detect that your peer is trying to cheat by publishing an old commit tx, and publish a penalty tx. This is fairly easy (the "detecting" part at least :)) - validating channels: you receive gossip messages, and check that channels actually exist, detect when they've been closed and remove them from your routing table. This is much harder. Electrum servers now have a method for retrieving a tx from its coordinates (height + position), but as the number of channels grows it may become impractical to watch every channel. With Bloom Filters and client-side filters you probably end up having to download all blocks (but not necessarily store them all). I also think that it's very important that the lite wallet supports mobile platforms, android in your case, and since it's basically stuck at Java 7 you may want to consider using plain Java (or Kotlin) instead of Scala as much as possible. Cheers, Fabrice On Sun, 6 Jan 2019 at 15:58, Chris Stewart wrote: > > Hi all, > > Hope your 2019 is off to a fantastic start. I'm really excited for Lightning > in 2019. > > We are currently reviving a lite client project in bitcoin-s > (https://github.com/bitcoin-s/bitcoin-s-core/pull/280). The goal is to have a > modern replacement for bitcoinj that also can be used for L2 applications > like lightning.
We also are planning on supporting multiple coins, hsms etc. > > The current plan is to implement traditional SPV, and then implement neutrino > when development is picking back up on that in bitcoin core. If that takes > too long, we will consider implementing neutrino against btcd. > > What I wanted to ask of the mailing list is to give us "things to consider" > when developing this lite client from a usability perspective for lightning > devs. How can we make your lives easier? > > One thing that seems logical is to adhere to the bitcoin core api when > possible, this means you can use bitcoin-s as a drop in lite client > replacement for bitcoin core. > > Thoughts? > > -Chris
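To make the channel-validation point above concrete: a BOLT 7 short channel id encodes the funding transaction's coordinates, which map directly onto the Electrum method mentioned. A sketch follows; the actual server call is shown only as a comment since it needs a live connection, and the example id is made up:

```python
def decode_short_channel_id(scid: int):
    """Split a BOLT 7 short_channel_id into its three coordinates:
    block height (3 bytes), tx index in block (3 bytes), output index (2 bytes)."""
    block_height = scid >> 40
    tx_index = (scid >> 16) & 0xFFFFFF
    output_index = scid & 0xFFFF
    return block_height, tx_index, output_index

# Illustrative short channel id: block 551129, tx 1212, output 1.
height, tx_index, output_index = decode_short_channel_id(
    (551129 << 40) | (1212 << 16) | 1)
assert (height, tx_index, output_index) == (551129, 1212, 1)

# An Electrum client could then validate the channel with something like:
#   txid = server.blockchain.transaction.id_from_pos(height, tx_index)
# and check that output `output_index` of that transaction is the (unspent)
# funding output committed to in the channel_announcement.
```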
Re: [Lightning-dev] Quick analysis of channel_update data
On Tue, 8 Jan 2019 at 17:11, Christian Decker wrote: > > Rusty Russell writes: > > Fortunately, this seems fairly easy to handle: discard the newer > > duplicate (unless > 1 week old). For future more advanced > > reconstruction schemes (eg. INV or minisketch), we could remember the > > latest timestamp of the duplicate, so we can avoid requesting it again. > > Unfortunately this assumes that you have a single update partner, and > still results in flaps, and might even result in a stuck state for some > channels. > > Assume that we have a network in which a node D receives the updates > from a node A through two or more separate paths: > > A --- B --- D > \--- C ---/ > > And let's assume that some channel of A (c_A) is flapping (not the ones > to B and C). A will send out two updates, one disables and the other one > re-enables c_A, otherwise they are identical (timestamp and signature > are different as well of course). The flush interval in B is sufficient > to see both updates before flushing, hence both updates get dropped and > nothing apparently changed (D doesn't get told about anything from > B). The flush interval of C triggers after getting the re-enable, and D > gets the disabling update, followed by the enabling update once C's > flush interval triggers again. Worse if the connection A-C gets severed > between the updates, now C and D learned that the channel is disabled > and will not get the re-enabling update since B has dropped that one > altogether. If B now gets told by D about the disable, it'll also go > "ok, I'll disable it as well", leaving the entire network believing that > the channel is disabled. > > This is really hard to debug, since A has sent a re-enabling > channel_update, but everybody is stuck in the old state. 
I think there may even be a simpler case where not replacing updates will result in nodes not knowing that a channel has been re-enabled: suppose you got 3 updates U1, U2, U3 for the same channel, U2 disables it, U3 enables it again and is the same as U1. If you discard it and just keep U1, and your peer has U2, how will you tell them that the channel has been enabled again ? Unless "discard" here means keep the update but don't broadcast it ? > At least locally updating timestamp and signature for identical updates > and then not broadcasting if they were the only changes would at least > prevent the last issue of overriding a dropped state with an earlier > one, but it'd still leave C and D in an inconsistent state until we have > some sort of passive sync that compares routing tables and fixes these > issues. But then there's a risk that nodes would discard channels as stale because they don't get new updates when they reconnect. > I think all the bolted on things are pretty much overkill at this point, > it is unlikely that we will get any consistency in our views of the > routing table, but that's actually not needed to route, and we should > consider this a best effort gossip protocol anyway. If the routing > protocol is too chatty, we should make efforts towards local policies at > the senders of the update to reduce the number of flapping updates, not > build in-network deduplications. Maybe something like "eager-disable" > and "lazy-enable" is what we should go for, in which disables are sent > right away, and enables are put on an exponential backoff timeout (after > all what use are flappy nodes for routing?). Yes there are probably heuristics that would help reduce gossip traffic, and I see your point but I was thinking about doing the opposite: "eager-enable" and "lazy-disable", because from a sender's p.o.v. trying to use a disabled channel is better than ignoring an enabled channel.
Cheers, Fabrice
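A toy model of the U1/U2/U3 scenario above, assuming peers keep the update with the highest timestamp; the representation is illustrative only:

```python
# Toy model: keeping only the older identical update (U1) leaves a peer
# stuck on the disabling update U2, while keeping U3 lets it converge.

def best(updates):
    """A peer believes the update with the highest timestamp."""
    return max(updates, key=lambda u: u["timestamp"])

U1 = {"timestamp": 100, "disabled": False}
U2 = {"timestamp": 200, "disabled": True}
U3 = {"timestamp": 300, "disabled": False}  # same as U1 except timestamp

# If we discard U3 as a duplicate of U1, our best update (U1, ts=100) loses
# against the peer's U2 (ts=200): the peer keeps believing the channel is off.
assert best([U1, U2])["disabled"] is True

# Keeping U3, despite it being otherwise redundant, re-enables the channel.
assert best([U1, U2, U3])["disabled"] is False
```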
Re: [Lightning-dev] Quick analysis of channel_update data
Additional info on channel_update traffic: Comparing daily backups of routing tables over the last 2 weeks shows that nearly all channels get at least a new update every day. This means that channel_update traffic is not primarily caused by nodes publishing new updates when channels are about to become stale: otherwise we would see 1/14th of our channels getting a new update on the first day, then another 1/14th on the second day and so on. This is confirmed by comparing routing table backups over a single day: nearly all channels were updated, on average once, with an update that almost always does not include new information. It could be caused by "flapping" channels, probably because the hosts that are hosting them are not reliable (as in: often offline). Heuristics can be used to improve traffic but it's orthogonal to the problem of improving our current sync protocol. Also, these heuristics would probably be used to close channels to unreliable nodes instead of filtering/delaying publishing updates for them. Finally, this is not just obsessing over bandwidth (though bandwidth is a real issue for most mobile users). I'm also obsessing over startup time and payment UX :), because they do matter a lot for mobile users, and I would like to push the current gossip design as far as it can go. I also think that we'll face the same issue when designing inventory messages for channel_update messages. Cheers, Fabrice On Wed, 9 Jan 2019 at 00:44, Rusty Russell wrote: > > Fabrice Drouin writes: > > I think there may even be a simpler case where not replacing updates > > will result in nodes not knowing that a channel has been re-enabled: > > suppose you got 3 updates U1, U2, U3 for the same channel, U2 disables > > it, U3 enables it again and is the same as U1. If you discard it and > > just keep U1, and your peer has U2, how will you tell them that the > > channel has been enabled again ? Unless "discard" here means keep the > > update but don't broadcast it ?
> > This can only happen if you happen to lose connection to the peer(s) > which sent U2 before it sends U3. > > Again, this corner case penalizes flapping channels. If we also > ratelimit our own enables to 1 per 120 seconds, you won't hit this case? > > > But then there's a risk that nodes would discard channels as stale > > because they don't get new updates when they reconnect. > > You need to accept redundant updates after 1 week, I think. > > Cheers, > Rusty.
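The staleness argument two messages up, as arithmetic: if updates were only refreshes against the roughly 2-week staleness limit, spread uniformly, only a small fraction of channels should be refreshed on any given day:

```python
channels = 17_000          # channel ids from the earlier measurements
refresh_period_days = 14   # updates older than ~2 weeks are considered stale

# Expected refreshes per day if traffic were staleness refreshes only:
expected_daily = channels / refresh_period_days
assert round(expected_daily) == 1214   # ~7% of channels per day

# Observing nearly all 17 000 channels updated every day rules out
# staleness refreshes as the main source of channel_update traffic.
```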
Re: [Lightning-dev] Unification of feature bits?
On Mon, 21 Jan 2019 at 08:05, Rusty Russell wrote: > > Hi all, > > I have a concrete proposal for feature bits. > > 1. Rename 'local features' to 'peer features'. > 2. Rename 'global features' to 'routing features'. > 3. Have them share a number space (ie. peer and routing features don't > overlap). > 4. Put both in `features` in node announcements, but never use even bits > for peer features. > > This means we can both use node_announcement as "connect to a peer which > supports feature X" and "can I route through this node?". Unification of feature bits makes sense but I don't really understand the concept of `routing features` as opposed to `node features`. What would prevent us from routing payments through a node ? (AMP ? changes to the onion packet ?) I find it easier to reason in terms of `node features`, which are advertised in node announcements, and `peer/connection features`, which are a subset of `node features` applied to a specific connection. Node features would be all the features that we have today (option_data_loss_protect, initial_routing_sync, option_upfront_shutdown_script, gossip_queries), since it makes sense to advertise them except maybe for initial_routing_sync, with the addition of wumbo which could only be optional. What is the rationale for not allowing even bits in peer features ? It makes sense for node features, but there are cases where you may require specific features for a specific connection (option_data_loss_protect for example, or option_upfront_shutdown_script). Cheers, Fabrice > Similarly, (future) DNS seed filtering might support filtering only by > pairs of bits (ie. give me peers which support X, even or odd). > > Cheers, > Rusty.
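For reference, the even/odd convention behind this discussion ("it's OK to be odd": an unknown even bit is required and must cause a failure, an unknown odd bit is optional and can be ignored) can be sketched as follows; the bit assignments here are illustrative, not the actual BOLT 9 table:

```python
# Hypothetical set of feature bits this node understands.
KNOWN_FEATURES = {0, 1, 3, 5, 7}

def check_features(features: int) -> bool:
    """Return True if we can proceed with a peer advertising `features`:
    unknown even bits are compulsory and force a failure, unknown odd
    bits are optional and are simply ignored."""
    bit = 0
    while features >> bit:
        if features & (1 << bit) and bit not in KNOWN_FEATURES:
            if bit % 2 == 0:        # unknown *even* bit: required, must fail
                return False
        bit += 1
    return True

assert check_features(1 << 9)       # unknown odd bit: ignore and proceed
assert not check_features(1 << 8)   # unknown even bit: must disconnect
```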
Re: [Lightning-dev] Quick analysis of channel_update data
I'll start collecting and checking data again, but from what I see now using our checksum extension still significantly reduces gossip traffic. I'm not saying that heuristics to reduce the number of updates cannot help, but I just don't think it should be our primary way of handling such traffic. If you've opened channels to nodes that are unreliable then you should eventually close these channels, but delaying how you publish updates that disable/enable them has an impact on everyone, especially if they mostly send payments (as opposed to relaying or receiving them). Cheers, Fabrice On Mon, 18 Feb 2019 at 13:10, Rusty Russell wrote: > > BTW, I took a snapshot of our gossip store from two weeks back, which > simply stores all gossip in order (compacting every week or so). > > channel_updates which updated existing channels: 17766 > ... which changed *only* the timestamps: 12644 > ... which were a week since the last: 7233 > ... which only changed the disable/enable: 4839 > > So there are about 5100 timestamp-only updates less than a week apart > (about 2000 are 1036 seconds apart, who is this?). > > 1. I'll look at getting even more conservative with flapping (120second >delay if we've just sent an update) but that doesn't seem to be the >majority of traffic. > 2. I'll also slow down refreshes to every 12 days, rather than 7, but >again it's only a marginal change. > > But basically, the majority of updates I saw two weeks ago are actually > refreshes, not spam. > > Hope that adds something? > Rusty. > > Fabrice Drouin writes: > > Additional info on channel_update traffic: > > > > Comparing daily backups of routing tables over the last 2 weeks shows > > that nearly all channels get at least a new update every day. 
This > > means that channel_update traffic is not primarily cause by nodes > > publishing new updates when channel are about to become stale: > > otherwise we would see 1/14th of our channels getting a new update on > > the first day, then another 1/14th on the second day and so on.This is > > confirmed by comparing routing table backups over a single day: nearly > > all channels were updated, one average once, with an update that > > almost always does not include new information. > > > > It could be caused by "flapping" channels, probably because the hosts > > that are hosting them are not reliable (as in is often offline). > > > > Heuristics can be used to improve traffic but it's orhtogonal to the > > problem of improving our current sync protocol. > > Also, these heuristics would probaly be used to close channels to > > unreliable nodes instead of filtering/delaying publishing updates for > > them. > > > > Finally, this is not just obsessing over bandwidth (though bandwidth > > is a real issue for most mobile users). I'm also over obsessing over > > startup time and payment UX :), because they do matter a lot for > > mobile users, and would like to push the current gossip design as far > > as it can go. I also think that we'll face the same issue when > > designing inventory messages for channel_update messages. > > > > Cheers, > > > > Fabrice > > > > > > > > On Wed, 9 Jan 2019 at 00:44, Rusty Russell wrote: > >> > >> Fabrice Drouin writes: > >> > I think there may even be a simpler case where not replacing updates > >> > will result in nodes not knowing that a channel has been re-enabled: > >> > suppose you got 3 updates U1, U2, U3 for the same channel, U2 disables > >> > it, U3 enables it again and is the same as U1. If you discard it and > >> > just keep U1, and your peer has U2, how will you tell them that the > >> > channel has been enabled again ? Unless "discard" here means keep the > >> > update but don't broadcast it ? 
> >> > >> This can only happen if you happen to lose connection to the peer(s) > >> which sent U2 before it sends U3. > >> > >> Again, this corner case penalizes flapping channels. If we also > >> ratelimit our own enables to 1 per 120 seconds, you won't hit this case? > >> > >> > But then there's a risk that nodes would discard channels as stale > >> > because they don't get new updates when they reconnect. > >> > >> You need to accept redundant updates after 1 week, I think. > >> > >> Cheers, > >> Rusty.
[Lightning-dev] Removing lnd's source code from the Lightning specs repository
Hello, When you navigate to https://github.com/lightningnetwork/ you find - the Lightning Network white paper - the Lightning Network specifications - and ... the source code for lnd! This has been an anomaly for years, which has created some confusion between Lightning the open-source protocol and Lightning Labs, one of the companies specifying and implementing this protocol, but we didn't do anything about it. I believe that was a mistake: a few days ago, Arcane Research published a fairly detailed report on the state of the Lightning Network: https://twitter.com/ArcaneResearch/status/1445442967582302213. They obviously did some real work there, and seem to imply that their report was vetted by Open Node and Lightning Labs. Yet in the first version that they published you'll find this: "Lightning Labs, founded in 2016, has developed the reference client for the Lightning Network called Lightning Network Daemon (LND) They also maintain the network standards documents (BOLTs) repository." They changed it because we told them that it was wrong, but the fact that in 2021 people who took the time to do proper research, interviews, ... can still misunderstand so badly how the Lightning developer community works means that we ourselves badly underestimated how confusing mixing the open-source specs for Lightning with the source code for one of its implementations can be. To be clear, I'm not blaming Arcane Research that much for thinking that an implementation of an open-source protocol that is hosted with the white paper and specs for that protocol is a "reference" implementation, and thinking that since Lightning Labs maintains lnd then they probably maintain the other stuff too. The problem is how that information is published. So I'm proposing that lnd's source code be removed from https://github.com/lightningnetwork/ (and moved to https://github.com/lightninglabs for example, with the rest of their Lightning tools, but it's up to Lightning Labs).
Thanks, Fabrice
Re: [Lightning-dev] Removing lnd's source code from the Lightning specs repository
On Tue, 12 Oct 2021 at 01:14, Martin Habovštiak wrote: > > I can confirm I moved a repository few months ago and all links kept working > fine. > Yes, github makes it really easy, and you keep your issues, PRs, stars... Depending on your dev/packaging you may need to rename packages (something java/scala/... devs have to do from time to time) but it's also very simple. The issue here is not technical. Fabrice
Re: [Lightning-dev] Removing lnd's source code from the Lightning specs repository
On Tue, 12 Oct 2021 at 21:57, Olaoluwa Osuntokun wrote: > Also note that lnd has _never_ referred to itself as the "reference" > implementation. A few years ago some other implementations adopted that > title themselves, but have since adopted softer language. I don't remember that but if you're referring to c-lightning it was the first lightning implementation, and the only one for a while, so in a way it was a "reference" at the time ? Or it could have been a reference to their policy of "implementing the spec, all the spec and nothing but the spec" ? > I think it's worth briefly revisiting a bit of history here w.r.t the github > org in question. In the beginning, the lightningnetwork github org was > created by Joseph, and the lightningnetwork/paper repo was added, the > manuscript that kicked off this entire thing. Later lightningnetwork/lnd was > created where we started to work on an initial implementation (before the > BOLTs in their current form existed), and we were added as owners. > Eventually we (devs of current impls) all met up in Milan and decided to > converge on a single specification, thus we added the BOLTs to the same > repo, despite it being used for lnd and knowingly so. Yes, work on c-lightning then eclair then lnd all began a long time before the BOLTs process was implemented, and we all set up repos, accounts... I agree that we all inherited things from the "pre-BOLTS" era and changing them will create some friction but I still believe it should be done. You also mentioned potential admin rights issues on the current specs repos which would be solved by moving them to a new clean repo. > As it seems the primary grievance here is collocating an implementation of > Lightning along with the _specification_ of the protocol, and given that the > spec was added last, how about we move the spec to an independent repo owned > by the community? 
I currently have github.com/lightning, and would be happy > to donate it to the community, or we could create a new org like > "lightning-specs" or something similar. Sounds great! github.com/lightning is nice (and I like Damian's idea of using github.com/lightning/bolts) and seems to please everyone, so it looks like we have a plan! Fabrice