Re: [Lightning-dev] Superbolt Proposal - a professionally run LN subset delivering superior UX

2020-03-02 Thread ZmnSCPxj via Lightning-dev
Good morning Robert,

Unfortunately, this proposal is basically a proposal to split the network into 
a central network of specially-chosen nodes surrounded by second-class and 
third-class nodes that are utterly dependent on the central network, which I 
personally find disturbing.

Of course, it may be that this is already where the network is heading, which 
is sad.
Gravity exists and is difficult to resist; yet as long as I remain 
standing on my own two legs (since I am human, I possess two legs), I resist 
gravity.

In any case, other than that, here are some more thoughts:

> it may be beneficial to have a maximum node capitalization limit.

This is trivially worked around by running multiple virtual nodes, say on the 
same machine but behind different Tor .onion addresses.
Then any benefit you get would be a mirage.
If you go through with this, I suggest that such limits not be imposed anyway, 
as it is trivial to get around them.

Disallowing Tor .onion addresses would be bad as well: it should be allowed 
that some high-liquidity nodes have their privacy protected if needed.

> 5.  Attestation: Any LN node which claims to meet the requirements to be 
> included in SBN would be rated by a randomized subset of the SBN network and 
> the inquiring node would receive cryptographically signed attestation that 
> the node is either valid or invalid.

How would you bootstrap this SBN?
Who are the first members of the SBN, and why should they let, say, furries 
join the SBN by attesting to them?
If the first members all hate furries (and everybody hates furries, after all, 
why do you think the Cats movie bombed?) then even a random subset of those 
first SBN members will not attest to any furries, because furries are ew.

Note that we already have attestation to the liquidity of a node, by having the 
node publish its channels: channels are attested on the blockchain, whose 
blocks are attested economically (i.e. a sufficiently rich furry can always 
create channels on the blockchain, because the blockchain is censorship-resistant).
What is missing is a censorship-resistant attestation of the ***uptime*** of a 
node.

Now of course, a furry might manage to get through by at least first hiding the 
fact that it is a furry, but once discovered, a "randomly-selected" subset of 
the SBN would then counter-attest that the furry is actually only 98.9% up, 
revoking its membership from the SBN.
This gets worse if the furry was using its open public IP rather than sensibly 
using a Tor .onion address (which means that, for the protection of furry and 
non-furry alike, we must support Tor .onion addresses for SBN members).

Which brings up the next topic: how does the "random selection" work?
It might be doable to use entropy from the onchain block IDs, assuming miners 
are not willing to increase the difficulty of their work further by biasing 
their blocks against those furries (which all miners also hate, because 
everybody hates furries).
But that means there has to be some authoritative set of SBN members (from 
which a deterministic algorithm would choose a subset), and there would need to 
be consensus on what that set of SBN members ***is*** as well, and how to keep 
around this data of who all the SBN members are, and so on.
This starts to look like a tiny federated / proof-of-stake blockchain, which we 
would like to avoid because blockchains have bad scaling, though I suppose this 
might be acceptable if the only transactions are removal and adding of SBN 
members.
What would the block rate be, and who are the genesis SBN members (and are any 
of them furries)?
How bad will this get a decade from now, and how many will be using SPV for 
this set-of-SBN-members federated blockchain?
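For concreteness, the deterministic-subset step itself is simple once (and only
once) there is consensus on the member set; a minimal sketch, with all names
illustrative and the hard problems (agreeing on the set, miners grinding block
hashes to bias selection) deliberately left unsolved:

```python
import hashlib

def select_attesters(members, block_hash, k):
    """Deterministically pick k attesters from an agreed member set,
    using a recent block hash as the (miner-biasable) entropy source.

    members: list of node pubkeys as hex strings; MUST be the consensus set.
    block_hash: hex string of a recent block id.
    k: number of attesters to select.
    """
    # Rank each member by sha256(block_hash || member id); the ordering is
    # deterministic, so every honest node derives the same subset.
    def score(member):
        return hashlib.sha256(
            bytes.fromhex(block_hash) + bytes.fromhex(member)
        ).digest()
    return sorted(members, key=score)[:k]
```

Every node that agrees on the member set and the block derives the same subset;
nothing here prevents the selection from being biased by whoever controls the
entropy, which is exactly the concern above.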

> 2.  Ignoring the 48% of unreachable nodes, payment success rate is 66% on the 
> first payment attempt. With multiple retries for the payment, success rates 
> reach about 80%. This means that even for nodes which are available and 
> reachable, 20% of payments are not able to complete. This is not good.

I note that, as I understood the presentation, the data-gathering model was 
that every node had an equal chance of being the payee.

However, we should note that not every public node expects to be a payee at all 
times, and that for nodes with a brisk business on Lightning, their success 
chances are, as I understand it, higher.
Thus the actual experience is better than the dire numbers suggested in the 
presentation.
Of course, I have no numbers or data to present here.

In general, if you are expecting a payment over Lightning, you will generally 
arrange to make this as smooth as possible: you will ensure you have incoming 
liquidity, that your node was actually up during the time you expect a payment, 
and so on (you have a strong incentive to do so, because you like money); 
whereas the model used was that everybody gets a payment (that they cannot 
claim, because the payment hash was a random number) and not everyo

[Lightning-dev] Superbolt Proposal - a professionally run LN subset delivering superior UX

2020-03-02 Thread Robert Allen
Superbolt Proposal

*Introduction*

Currently, the LN user experience is far from retail ready.
Inbound/outbound channel liquidity issues and node dropouts mean that many
payment attempts will not succeed.

I have spent some time thinking through these issues and believe a BOLT
specification which would enforce a stricter set of rules for nodes to
follow and which would ensure sufficient liquidity, uptime and channel
rebalancing automation would move the needle greatly in the direction of a
UX which could go mainstream. If LN is currently resulting in many “gutter
balls,” Superbolt would be like bowling with bumpers. This BOLT would be
optional for LN nodes to use or not depending on whether they wish to
participate in the Superbolt network directly.

*The Problem*

In Christian Decker’s talk at The Lightning Conference (Berlin, October 2019), 
he presented some frustrating statistics from a study he conducted to test 
payment routing success/failure on the Lightning Network (LN) using payment 
probes. Some of the salient points:


   1. 48% of payment probes failed to find a payment path to the targeted
      node. This is likely because either the node itself was offline or a
      connecting node along the path was offline.

   2. Ignoring the 48% of unreachable nodes, the payment success rate is 66%
      on the first payment attempt. With multiple retries, success rates
      reach about 80%. This means that even for nodes which are available
      and reachable, 20% of payments are not able to complete. This is not
      good.

   3. Stuck payments (initiated but not completed because a node died along
      the path) occurred at a rate of approximately 0.19%.


It should go without saying that a payment network which works less than
50% of the time presents a user experience which will never catch on with
the vast majority of the total addressable market. If you are flipping a
coin every time you attempt to use a payment method, you will quickly
abandon this method for one which works reliably.
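A rough back-of-envelope check, assuming reachability and per-payment success
are independent, shows why "less than 50%" is a fair characterization:

```python
# Probabilities taken from the probe study cited above.
p_reachable = 1 - 0.48          # 52% of probed nodes had a usable path at all
p_success_with_retries = 0.80   # success rate to reachable nodes, with retries

# End-to-end success for a payment to a uniformly chosen node:
p_end_to_end = p_reachable * p_success_with_retries
print(f"{p_end_to_end:.1%}")    # → 41.6%
```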

To make matters worse, I believe Christian was attempting to route
micropayments for his testing of the network, so the above numbers may
actually be optimistic when you factor in attempting to route larger
payments (even just a few hundred dollars worth of BTC). For example, a
route may be found for the desired payment but if there is insufficient
liquidity on one of those hops (either due to insufficient channel capacity
or because of inbound/outbound liquidity issues), then the payment will
fail.

In summary, there are two fundamental problems with LN as it is currently
functioning:

   1. Connectivity: Node uptime and connectivity to the broader network are
      both insufficient to guarantee payment success.

   2. Throughput: Node channel capacity is frequently insufficient due to
      low total capacity and/or inbound/outbound liquidity snags.


*Proposal*

I am proposing an LN BOLT, called Superbolt Network (SBN). Conceptually,
this might be analogous to an “electrical grid” for LN. SBN would enforce
and/or automate the following:


   1. Liquidity: Distinct and uniform LN node classes with commensurate total
      node and per-channel liquidity requirements. To begin, two node classes
      are proposed:

      1. Routing Node (RN) - 4 BTC total node capacity; 4 x 1 BTC channels
         (0.5 BTC per side) to other RNs; 8 x 0.5 BTC channels (0.25 BTC per
         side) to ANs. 3 of the 4 RN connections should be with shared peers
         (i.e. A => B => C => A) while the 4th connection should be with an
         RN without shared peers, to ensure the network is sufficiently
         connected.

      2. Access Node (AN) - 1 BTC total node capacity; 2 x 0.5 BTC channels
         (0.25 BTC per side) to RNs; 10 x 0.1 BTC channels (0.05 BTC per
         side) to regular LN wallets/individual users/etc. RNs should be
         peers to allow off-chain rebalancing via circular payments.

      3. Please note: Additional node classes (larger or smaller) may be
         beneficial to network performance. However, to maintain sufficient
         decentralization, it may be beneficial to have a maximum node
         capitalization limit.

   2. Uptime: Nodes would be required to maintain uptime to the network of
      at least 99% availability. Nodes which fall below this requirement for
      a determined period of time would be ostracised by the rest of the
      network and perhaps eventually excised completely from SBN. I believe
      we could use chanfitness from lnd v0.9.0-beta and add some logic to
      check for fitness, then some scripting to automatically route around
      bad nodes.

   3. Channel balancing: To ensure that channels do not become stuck from
      inbound/outbound liquidity snags, the protocol would include some
      scripting to 
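The class requirements in (1) could be expressed as a small machine-checkable
table; a sketch using only the figures proposed above, with all class and
function names illustrative:

```python
from dataclasses import dataclass

@dataclass
class NodeClass:
    name: str
    total_capacity_btc: float   # total node capitalization
    channels: dict              # peer class -> (required count, channel size in BTC)

# Figures are exactly those proposed for the two initial classes.
ROUTING_NODE = NodeClass("RN", 4.0, {"RN": (4, 1.0), "AN": (8, 0.5)})
ACCESS_NODE  = NodeClass("AN", 1.0, {"RN": (2, 0.5), "wallet": (10, 0.1)})

def meets_class(node_channels, cls):
    """node_channels: dict of peer class -> list of channel sizes in BTC.
    Returns True if the node has at least the required number of
    sufficiently sized channels for each peer class."""
    for peer_cls, (count, size) in cls.channels.items():
        big_enough = [c for c in node_channels.get(peer_cls, []) if c >= size]
        if len(big_enough) < count:
            return False
    return True
```

An attester (or the node itself) could run such a check against the published
channel set before signing anything; the per-side balance requirements would
additionally need off-chain balance proofs, which this sketch does not cover.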

Re: [Lightning-dev] Sphinx Rendezvous Update

2020-03-02 Thread Christian Decker
Hi Bastien,

thanks for verifying my proposal, and I do share your concerns regarding
privacy leaks (how many hops are encoded in the onion) and success ratio
if a payment is based on a fixed (partial) path.

> I believe this makes it quite usable in Bolt 11 invoices, without blowing up
> the size of the QR code (but more experimentation is needed on that).

It becomes a tradeoff between how small you want your onion to be and how
many hops the partial onion can have. For longer partial onions we're
getting close to the current full onion size, but I expect most partial
onions to be close to the network diameter of ~6 (excluding degenerate
chains). So the example below with 5 hops seemed realistic, and by dropping
the legacy format in favor of TLVs we can get a couple of bytes back as
well.
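The size figures are easy to check against the BOLT 4 constants (a full onion
is 1 version byte + 33-byte ephemeral pubkey + 1300 bytes of hop payloads +
32-byte HMAC = 1366 bytes):

```python
FULL_ONION = 1 + 33 + 1300 + 32   # BOLT 4 onion packet: 1366 bytes

def partial_onion_size(n_hops, hop_payload=65):
    """Size of a compressed partial onion with n legacy hops (65 bytes
    each), keeping the 66-byte version + ephemeral key + HMAC overhead."""
    return n_hops * hop_payload + 66

size  = partial_onion_size(5)     # 325 + 66 = 391 bytes
saved = FULL_ONION - size         # 975 bytes, matching the quoted example
```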

>> As an example such an onion, with 5 legacy hops (65 byte each) results
>> in a 325 + 66 bytes onion, and we save 975 bytes.
>
> While having flexibility when choosing the length of the prefill
> stream feels nice, wouldn't it be safer to impose a fixed size to
> avoid any kind of heuristic at `RV` to try to guess how many hops
> there are between him and the recipient?

I'm currently just using the maximum size, which is an obvious privacy
leak, but I'm also planning on exposing, at generation time, the size to be
prefilled (and hence cropped out when compressing). Ideally we'd have a
couple of presets, i.e., 1/4, 2/4, 3/4, and adhere to them, randomizing
which one we pick.

Having smaller partial onions would enable my stretch goal of being able
to chain multiple partial onions, though that might be a useless
achievement to unlock xD

>> Compute a shared secret using a random ephemeral private key and
>> `RV`s public key, and then generate a prefill-key
>
>
> While implementing, I felt that the part about the shared secret used
> to generate the prefill stream is a bit blurry (your proposal on
> Github doesn't phrase it the same way). I think it's important to
> stress that this secret is derived from both `ephkey` and `RV`'s
> private key, so that `RV+1` can't compute the same stream.

I noticed the same while implementing the decompress stage, which
requires the node ID from `RV` during generation, and performs ECDH +
HKDF with the `RV` node private key and the ephemeral key in the *next*
onion, i.e., the one extracted from the payload itself. This is
necessary since the ephemeral key on the incoming onion, which delivered
the partial onion in its payload, is not controlled by the partial onion
creator, while the one in the partial onion is.

This means that the ephemeral key in the partial onion is used twice:

 - Once by `RV` to generate the obfuscation stream to fill in the gap
 - As part of the reconstructed onion, processed by `RV+1` to decode the
   onion.

I'm convinced this is secure and doesn't leak information, since
otherwise transporting the ephemeral key publicly would be insecure
(`RV+1` can't generate the obfuscation secret used to fill in the gap
without access to `RV`'s private key), and the ephemeral key is only
transmitted in cleartext once (from `RV` to `RV+1`); otherwise it is
hidden in the outer onion.
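For readers implementing this: Sphinx derives its per-purpose keys by
HMAC-SHA256 over the ECDH shared secret, keyed by a key-type label (as with
BOLT 4's `rho` and `mu` keys), and the prefill-key would presumably follow the
same pattern. A sketch, taking the 32-byte ECDH output as given and using an
assumed `b"prefill"` label:

```python
import hashlib
import hmac

def derive_key(key_type: bytes, shared_secret: bytes) -> bytes:
    """BOLT 4 style key derivation: HMAC-SHA256 keyed by the key-type
    label.  key_type: e.g. b"rho", b"mu", or (assumed here) b"prefill".
    shared_secret: 32-byte output of ECDH(ephemeral key, RV node key)."""
    return hmac.new(key_type, shared_secret, hashlib.sha256).digest()

# Both the partial-onion creator (who knows the ephemeral secret key) and
# RV (who knows its node secret key) can compute the same ECDH point and
# hence the same prefill key; RV+1 can compute neither, which is the
# property discussed above.
prefill_key = derive_key(b"prefill", bytes(32))  # placeholder shared secret
```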

> Another thing that may be worth mentioning is error forwarding. Since
> the recipient generated the onion, `RV` won't have the shared secrets
> (that's by design). So it's expected that payment errors won't be
> readable by `RV`, but it's probably a good idea if `RV` returns an
> indication to the sender that the payment failed *after* the
> rendezvous point.

Indeed, this is pretty much by design, since otherwise the sender could
provoke errors, e.g., consuming all of `RV`'s outgoing capacity with
probes to get back temporary channel failure errors for the channel that
was encoded in the partial onion, and then do that iteratively until it
has identified the real destination, which we weren't supposed to learn.

So any error beyond `RV` should be treated by the sender as "rendez-vous
failed, discard partial onion".

> An important side-note is that your proposal is really quick and
> simple to implement from the existing Sphinx code. I have made ASCII
> diagrams of the scheme (see [1]).  This may help readers visualize it
> more easily.

I quickly skimmed the drawings and they're very nice for understanding how
regions overlap; that was my main problem with the whole sphinx
construction, so thanks for taking the time :+1:

> It still has the issue that each hop's amount/cltv is fixed at invoice
> generation time by the recipient. That means MPP cannot be used, and
> if any channel along the path updates their fee the partial onion
> becomes invalid (unless you overpay the fees).
>
> Trampoline should be able to address that since it provides more
> freedom to each trampoline node to find an efficient way to forward to
> the next trampoline.  It's not yet obvious to me how I can mix these
> two proposals to make it work though.  I'll spend more time
> experimenting with that.

True, I think rendez-vous routing have