Re: [Lightning-dev] Scaling Lightning With Simple Covenants
Hi aj,

A few more thoughts on this trust/safety vs. capital-efficiency tradeoff:

> Optimising that formula by making LA [the channel's active lifetime]
> as large as possible doesn't necessarily work -- if a casual user
> spends all their funds and disappears prior to the active lifetime
> running out, then those funds can't be easily spent by B until the
> total lifetime runs out, so depending on how persistent your casual
> users are, I think that's another way of ending up with your capital
> locked up unproductively.

The risk of the casual user spending all of their funds can be addressed by having the casual user prepay the cost-of-capital fees for the dedicated user's funds for the entire lifetime of the channel. Then, whenever the dedicated user's funds increase or decrease (due to a send or receive by the casual user), a corresponding prepayment adjustment is included in the new balances defined by the send or receive HTLC. With prepayments, the dedicated user can safely agree to a long active lifetime for the channel.

In the paper, I assumed an active lifetime of 110,000 blocks (about 2.1 years), but allowed casual users to obtain a new channel every 10,000 blocks (about 2.5 months) by staggering their timeout-trees ([1], Secs. 4.8 and 5). The paper includes a rollover period (which covers the casual user's unavailability for up to 2.5 months) in addition to the timeout-tree's active lifetime (25 months) and inactive lifetime (1 week for putting leaves onchain, which definitely introduces risk).

Here are some rough calculations if one wants to eliminate that risk by making the inactive lifetime long enough to put all leaves of all timeout-trees onchain before the timeout-trees' expiries.

TIME BOUND ON NUMBER OF LEAVES:

There are approximately 52,500 blocks per year, each with at most 4M vbytes, for a total of approximately 210B = 2.1 * 10^11 vbytes per year.
If each leaf requires an average of 2,100 vbytes (when all leaves in a tree are put onchain), then 2.1 * 10^11 / 2,100 = 10^8 = 100M leaves can be put onchain in 1 year with full block capacity devoted to leaves, and in 1/x years with a fraction x of capacity devoted to leaves. Therefore, at x = 0.5 capacity:

* 50M leaves can be put onchain per year
* 100M leaves can be put onchain in 2 years
* 1B leaves can be put onchain in 20 years
* 10B leaves can be put onchain in 200 years
* 100B leaves can be put onchain in 2,000 years

Assuming an active lifetime of 2.1 years, adding an inactive period of 2 years may be plausible, depending on the cost of capital. Therefore, scaling to 100M or so leaves (across all timeout-trees) while maintaining the ability to put all leaves onchain may be doable. On the other hand, an inactive period of 20 years seems unreasonable. As a result, scaling to billions of leaves probably requires trading off safety vs. capital efficiency (as you noted).

FEERATE BOUND ON NUMBER OF LEAVES:

If each leaf requires a maximum of 10,500 vbytes (when only that leaf is put onchain) and the feerate is at least 2 satoshis/vbyte, then each leaf must be worth at least 21,000 satoshis (or else the dedicated user may not have an incentive to be honest, as the casual user would lose funds by putting their leaf onchain). There are at most 2.1 * 10^15 satoshis in existence, so there can be at most 2.1 * 10^15 / 21,000 = 10^11 = 100B leaves.

I wrote a small python3 program for analyzing scalability given the requirement that all timeout-tree leaves can be put onchain. The trickiest part was figuring out how to quantify the onchain fees caused by increasing the fraction of each block that's devoted to casual users putting their leaves onchain. I wanted a function that multiplies the base feerate by a factor of 1 when no leaves are put onchain and by a factor approaching infinity when nearly all of the block space is devoted to leaves.
I started with the function Fe/(1-x), where Fe is the base feerate (without leaves put onchain) and x is the fraction of block space devoted to putting leaves onchain. This function has the desired behavior when x is near 0 or 1, and it doubles the base feerate when half the block space is devoted to leaves. In reality, the feerate probably increases faster than that, so I added an exponent to capture how quickly the feerate grows:

feerate = Fe/(1-x)^Ex

where Ex is an arbitrary exponent.

The program has the following tunable parameters:

* Ac: length (in blocks) of active lifetime of each TT (timeout-tree)
* Ro: length (in blocks) of rollover period of each TT (provides for casual user's unavailability)
* AS: average size (in vbytes) of transactions required to put one TT leaf onchain when all leaves in TT are put onchain
* MS: maximum size (in vbytes) of transactions required to put one TT leaf onchain when only one leaf in TT is put onchain
* Fe: feerate (in sats/vbyte) assuming 0% of block space is devoted to putting leaves onchain
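The bounds and the feerate model above can be sketched as follows. This is a minimal reconstruction, not the actual python3 program from the post; the parameter defaults are the figures used in the text.

```python
# Minimal sketch of the scalability analysis described above
# (a reconstruction, not the author's actual program).

VBYTES_PER_YEAR = 52_500 * 4_000_000   # ~2.1 * 10^11 vbytes/year
TOTAL_SATS = 2_100_000_000_000_000     # 2.1 * 10^15 satoshis

def leaves_per_year(x, AS=2_100):
    """TIME BOUND: leaves that can be put onchain per year when a
    fraction x of block space is devoted to leaves (AS vbytes/leaf)."""
    return x * VBYTES_PER_YEAR / AS

def max_leaves(Fe=2, MS=10_500):
    """FEERATE BOUND: each leaf must be worth at least MS * Fe satoshis,
    capping the total number of leaves that can exist."""
    return TOTAL_SATS / (MS * Fe)

def feerate(Fe, x, Ex=1):
    """Feerate model: base feerate Fe scaled by 1/(1-x)^Ex as a
    fraction x of block space is consumed by leaves."""
    assert 0 <= x < 1
    return Fe / (1 - x) ** Ex

print(leaves_per_year(0.5))   # 5e7: 50M leaves/year at half capacity
print(max_leaves())           # 1e11: 100B leaves
print(feerate(2, 0.5))        # 4.0: half the block space doubles the feerate
print(feerate(2, 0.5, 2))     # 8.0: Ex = 2 makes the growth steeper
```

The Ac and Ro parameters would then determine how many staggered timeout-trees are draining leaves onchain at once, which is what drives x.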
Re: [Lightning-dev] Scaling Lightning With Simple Covenants
Hi aj,

I completely agree with your observation that there's an important trust/safety vs. capital-efficiency tradeoff, and I almost completely agree with your analysis.

> (There are probably ways around this with additional complexity: eg,
> you could peer with a dedicated node, and have the timeout path be
> "you+them+timeout", so that while you could steal from casual users who
> don't rollover, you can't steal from your dedicated peer, so that $4.5B
> could be rolled into a channel with them, and used for routing)

Yes, that would work, but I think it's better to have dedicated user B pair with another dedicated user C such that each leaf of the timeout-tree funds a hierarchical channel [1] of the form (A_i, (B, C)), where A_i is a casual user. If A_i performs an active rollover, all funds not owned by A_i can *always* be used by B and C to route payments that are unrelated to the casual users in the timeout-tree (including both before and after A_i's funds are drained). This idea was described in the "Improving Capital Efficiency" section of the post.

Passive rollovers complicate this, as A_i's funds are neither definitely in the old timeout-tree nor definitely in the new timeout-tree during the rollover. However, if one is willing to take on the complexity, it's possible to use *your* (very cool!) idea of funding an HTLC from one of two possible sources, where one of those sources is guaranteed to eventually be available (but the offerer and offeree of the HTLC don't know which one will be available to them) [2][3]. In this case, B and C could use the funds from the old and the new timeout-trees that are not owned by A_i to route payments. If A_i puts the leaf in the old timeout-tree onchain, B and C use funds from the new timeout-tree to fund their HTLC (and vice-versa).

Even if hierarchical channels are used to improve capital efficiency, I think the "thundering herd" problem is a big concern.
This could play out very poorly in practice: casual users would gain experience with ever larger timeout-trees without having any problems. Then, suddenly, a large number of dedicated users could collude by failing to roll over their timeout-trees at the same time, creating enough congestion on the blockchain that they're able to steal a large fraction of the casual users' funds.

I have a proposed change to the Bitcoin consensus rules that I think could address this problem. Basically, rather than having timeout-trees expire at a given block height, they should expire only after a sufficient number of low-fee blocks have been added to the blockchain after some given block height. As a result, if dedicated users colluded and tried to steal funds by not rolling over a group of timeout-trees, the thundering herd of transactions from casual users would push up the fees enough to prevent the timeout-trees from expiring, thus safeguarding the casual users' funds. In fact, the impact on the dedicated users (in addition to their loss of reputation) would be that their capital would be unavailable to them for a longer period of time. Thus, this should be effective in deterring dedicated users from attempting such a theft. On the other hand, when the dedicated users do roll over funds correctly, there is no delay in the old timeout-trees' expiries, and thus better capital efficiency.

There are lots of details to the idea, and I'm currently in the process of writing a paper and post describing it.
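Since the paper is still being written, here is only a toy model of the counting rule, to make the mechanism concrete; all names and values below are invented for illustration.

```python
# Toy model of "expire only after enough low-fee blocks": a timeout-tree
# expires only once N low-fee blocks have appeared at or after height H,
# rather than at H itself. Names and values are illustrative only.

def expiry_height(fee_per_block, start_height, low_fee_threshold,
                  low_blocks_needed):
    """Return the height at which expiry triggers, or None if the
    required number of low-fee blocks never accumulates."""
    seen = 0
    for height, fee in enumerate(fee_per_block):
        if height >= start_height and fee <= low_fee_threshold:
            seen += 1
            if seen == low_blocks_needed:
                return height
    return None

# A thundering herd of exit transactions pushes fees above the
# threshold (blocks 2-4 here), delaying expiry until fees fall again:
fees = [1, 1, 9, 9, 9, 1, 1]          # median feerate per block
print(expiry_height(fees, 2, 2, 2))   # 6: blocks 5 and 6 are low-fee
```

The low-fee-window refinement described below would replace the per-block count with a count of fixed-size windows whose blocks are all (or mostly) low-fee, which is what makes fake low-fee blocks from colluding miners much more expensive.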
A couple more quick details:

* Rather than counting low-fee blocks, time is measured in low-fee windows, where the size of the window is programmable. This makes it much harder for dishonest miners to collude with the dedicated users by creating enough fake low-fee blocks (not containing the casual users' higher-fee timeout-tree transactions) to enable the theft; it also reduces the compute cost for counting the low-fee windows.
* The threshold for a "low-fee" block is programmable.
* There is a bound on how long one keeps waiting for low-fee windows (in order to bound the storage and compute overheads).
* A similar technique supports relative, rather than absolute, delays.

I think such a mechanism is likely to be useful in many areas, including HTLCs, but that timeout-trees really highlight the need for something like this.

Regards,
John

[1] Law, "Resizing Lightning Channels Off-Chain With Hierarchical Channels", https://github.com/JohnLaw2/ln-hierarchical-channels
[2] Towns, "Re: Resizing Lightning Channels Off-Chain With Hierarchical Channels", https://lists.linuxfoundation.org/pipermail/lightning-dev/2023-April/003913.html
[3] Law, "Re: Resizing Lightning Channels Off-Chain With Hierarchical Channels", https://lists.linuxfoundation.org/pipermail/lightning-dev/2023-April/003917.html
Re: [Lightning-dev] Scaling Lightning With Simple Covenants
On Fri, Sep 08, 2023 at 06:54:46PM +0000, jlspc via Lightning-dev wrote:

> TL;DR
> =====

I haven't really digested this, but I think there's a trust vs capital-efficiency tradeoff here that's worth extracting.

Suppose you have a single UTXO, that's claimable by "B" at time T+L, but at time T that UTXO holds funds belonging not only to B, but also millions of casual users, C_1..C_100. If B cheats (eg by not signing any further lightning updates between now and time T+L), then each casual user needs to drop their channel to the chain, or else lose all their funds. (Passive rollovers don't change this -- they just move the responsibility for dropping the channel to the chain to some other participant.)

That then faces the "thundering herd" problem -- instead of the single one-in/one-out tx that we expected when B is doing the right thing, we're instead seeing between 1M and 2M on-chain txs as everyone recovers their funds (the number of casual users multiplied by some factor that depends on how many outputs each internal tx has).

But whether an additional couple of million txs is a problem depends on how long a timeframe they're spread over -- if it's a day or two, then it might simply be impossible; if it's over a year or more, it may not even be noticeable; if it's somewhere in between, it might just mean you're paying a modest amount more in fees than you'd normally have expected.

Suppose that casual users have a factor in mind, eg "If worst comes to worst, and everyone decides to exit at the same time I do, I want to be sure that only generates 100 extra transactions per block if everyone wants to recover their funds prior to B being able to steal everything". In that case, they can calculate along the following lines: 1M users with 2 outputs per internal tx means 2M transactions; divide that by 100 gives 20k blocks; at 144 blocks per day, that's about 5 months.
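That back-of-the-envelope calculation, written out (figures are the ones in the paragraph above):

```python
# Worst-case exit arithmetic from the example above.
users = 1_000_000
outputs_per_internal_tx = 2           # roughly doubles the on-chain tx count
tolerated_extra_txs_per_block = 100   # the casual user's chosen factor
blocks_per_day = 144

total_txs = users * outputs_per_internal_tx                 # 2M transactions
blocks_needed = total_txs // tolerated_extra_txs_per_block  # 20,000 blocks
months = blocks_needed / blocks_per_day / 30                # ~4.6, i.e. ~5 months
print(blocks_needed, round(months, 1))
```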
Therefore, I'm going to ensure all my funds are rolled over to a new utxo while there's at least 5 months left on the timeout.

That lowers B's capital efficiency -- if all the casual users follow that policy, then B is going to own all the funds in Fx for five whole months before it can access them.

So each utxo here has its total lifetime (L) actually split into two phases: an active lifetime LA of some period, and an inactive lifetime of LI=5 months, which would have been used by everyone to recover their funds if B had attempted to block normal rollover. The capital efficiency is then reduced by a factor of 1/(1+LI/LA). (LI is dependent on the number of users, their willingness to pay high fees to recover their funds, and global blockchain capacity; LA is L-LI; L is your choice.)

Note that casual users can't easily reduce their LI timeout just by having the provider split them into different utxos -- if the provider cheats/fails, that's almost certainly correlated across all their utxos, and all the participants across each of those utxos will need to drop to the chain to preserve their funds, each competing with the others for confirmation.

Also, if different providers collude, they can cause problems: if you expected 2M transactions over five months due to one provider failing, that's one thing; but if a dozen providers fail simultaneously, then that balloons up to perhaps 24M txs over the same five months, or perhaps 25% of every block, which may be quite a different matter.

Ignoring that caveat, what do the numbers here look like?
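As a quick check of that efficiency factor from first principles: B's funds in a utxo are locked for the total lifetime L = LA + LI but usable only during the LA active blocks, so the usable fraction is LA/(LA+LI), i.e. 1/(1 + LI/LA). The 16-week / 5-month values below are taken from the example that follows.

```python
# Usable fraction of B's capital: locked for L = LA + LI blocks,
# available for routing only during the LA active blocks.
def capital_efficiency(LA, LI):
    return LA / (LA + LI)

# e.g. LA = 16 weeks active, LI ~ 21.7 weeks (5 months) inactive:
print(round(capital_efficiency(16, 21.7), 2))   # 0.42
```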
If you're a provider who issues a new utxo every week (so new customers can join without too much delay), have a million casual users as customers, and target LA=16 weeks (~3.5 months) so users don't need to roll over too frequently, and each user has a balanced channel with $2000 of their own funds and $2000 of your funds (so they can both pay and be paid), then your utxos might look like:

* active_1 through active_16: 62,500 users each; $250M balance each
* inactive_17 through inactive_35: $250M balance each, all your funds, waiting for timeout to be usable

That's:

* $2B of user funds
* $2B of your funds in active channels
* $4.5B of your funds locked up, waiting for timeout

In that case, only 30% of the $6.5B worth of working capital that you've dedicated to lightning is actually available for routing.

Optimising that formula by making LA as large as possible doesn't necessarily work -- if a casual user spends all their funds and disappears prior to the active lifetime running out, then those funds can't be easily spent by B until the total lifetime runs out, so depending on how persistent your casual users are, I think that's another way of ending up with your capital locked up unproductively.

(There are probably ways around this with additional complexity: eg, you could peer with a dedicated node, and have the timeout path be "you+them+timeout", so that while you could steal from casual users who don't rollover, you can't steal from your dedicated peer, so that $4.5B could be rolled into a channel with them, and used for routing)
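The capital figures in that scenario can be checked mechanically (taking the email's figure of $250M per inactive utxo, and 18 inactive weeks, since 18 * $250M = $4.5B):

```python
# Sanity check of the provider-capital scenario above.
users_per_utxo = 62_500
user_funds = 2_000            # dollars of the casual user's own funds
provider_funds = 2_000        # dollars of the provider's funds per user
active_utxos = 16             # LA = 16 weeks, one new utxo per week
inactive_utxos = 18           # ~5 months of weekly utxos awaiting timeout

balance_per_utxo = users_per_utxo * (user_funds + provider_funds)  # $250M
user_total = active_utxos * users_per_utxo * user_funds            # $2.0B
provider_active = active_utxos * users_per_utxo * provider_funds   # $2.0B
provider_waiting = inactive_utxos * balance_per_utxo               # $4.5B
available = provider_active / (provider_active + provider_waiting)
print(round(available, 2))   # 0.31: ~30% of dedicated capital can route
```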