Re: [bitcoin-dev] [Lightning-dev] On the scalability issues of onboarding millions of LN mobile clients

2020-05-05 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Antoine,

> Even with cheaper, more efficient protocols like BIP 157, you may have a
> huge discrepancy between what is asked and what is offered. Assuming 10M
> light clients [0] each of them consuming ~100MB/month for filters/headers,
> that means you're asking 1PB/month of traffic to the backbone network. If
> you assume 10K public nodes, like today, assuming _all_ of them opt-in to
> signal BIP 157, that's an increase of 100GB/month for each, which is
> substantial compared to the estimated cost of 350GB/month for running
> an actual public node
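
(For scale, the quoted figures work out as in the quick sketch below; the
numbers are Antoine's assumptions and the code is just illustrative
arithmetic.)

    package main

    import "fmt"

    func main() {
        const (
            lightClients     = 10_000_000 // assumed light clients
            mbPerClientMonth = 100.0      // MB/month of filters+headers each
            publicNodes      = 10_000     // assumed BIP 157-serving nodes
        )

        totalMB := lightClients * mbPerClientMonth // aggregate demand, MB/month
        perNodeGB := totalMB / publicNodes / 1e3   // 1 GB = 1e3 MB

        fmt.Printf("aggregate: %.0f PB/month\n", totalMB/1e9) // ~1 PB/month
        fmt.Printf("per node:  %.0f GB/month\n", perNodeGB)   // ~100 GB/month
    }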

One really dope thing about BIP 157+158 is that the protocol makes serving
light clients _stateless_, since the full node doesn't need to perform any
unique work for a given client. As a result, the entire protocol could be
served over something like HTTP, taking advantage of all the established CDNs
and anycast serving infrastructure. This can reduce syncing time (less
latency to fetch data) and also more widely distribute the load of light
clients onto the existing web infrastructure. Going further, with HTTP/2's
server-push capabilities, those serving this data can still push out
notifications for new headers, etc.
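
As a rough sketch of what that could look like (purely illustrative: the
route, query parameter, and fetchFilter helper below are hypothetical and not
part of any BIP or existing implementation), a stateless filter endpoint is
just a lookup by block hash:

    package main

    import (
        "log"
        "net/http"
    )

    // fetchFilter is a hypothetical lookup into pre-generated BIP 158 filter
    // data keyed by block hash. In practice this could be flat files behind a
    // CDN, since the response is identical for every client.
    func fetchFilter(blockHash string) ([]byte, bool) {
        return nil, false
    }

    func main() {
        http.HandleFunc("/filter", func(w http.ResponseWriter, r *http.Request) {
            filter, ok := fetchFilter(r.URL.Query().Get("block"))
            if !ok {
                http.Error(w, "unknown block", http.StatusNotFound)
                return
            }
            // Filters for confirmed blocks never change, so any CDN in front
            // of this server can cache the response aggressively.
            w.Header().Set("Cache-Control", "public, max-age=31536000, immutable")
            w.Write(filter)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }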

> Therefore, you may want to introduce monetary compensation in exchange of
> servicing filters. Light client not dedicating resources to maintain the
> network but free-riding on it, you may use their micro-payment
> capabilities to price chain access resources [3]

Piggybacking off the above idea, if the data starts being widely served over
HTTP, then LSATs [1][2] can be used to add a lightweight payment mechanism by
inserting a new proxy server in front of the filter/header infrastructure.
The minted tokens themselves may allow a user to purchase access to a single
header/filter, a range of them in the past, or N headers past the known chain
tip, etc.
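
Roughly (and glossing over macaroon minting and invoice generation, which
would be handled by something like lnd; the helpers and header formatting
below are approximations, not the reference LSAT implementation), such a
proxy is a thin 402 middleware:

    package lsatproxy

    import (
        "fmt"
        "net/http"
    )

    // validLSAT and mintChallenge are hypothetical stand-ins for real
    // macaroon verification and invoice creation.
    func validLSAT(authHeader string) bool { return false }
    func mintChallenge(r *http.Request) (macaroon, invoice string) {
        return "<macaroon>", "lnbc..."
    }

    // lsatPaywall challenges unauthenticated requests with a macaroon plus a
    // Lightning invoice, and proxies through requests carrying a valid token.
    func lsatPaywall(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            if !validLSAT(r.Header.Get("Authorization")) {
                mac, inv := mintChallenge(r)
                w.Header().Set("WWW-Authenticate",
                    fmt.Sprintf("LSAT macaroon=%q, invoice=%q", mac, inv))
                http.Error(w, "payment required", http.StatusPaymentRequired)
                return
            }
            next.ServeHTTP(w, r)
        })
    }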

-- Laolu

[1]: https://lsat.tech/
[2]: https://lightning.engineering/posts/2020-03-30-lsat/


On Tue, May 5, 2020 at 3:17 AM Antoine Riard 
wrote:

> Hi,
>
> (cross-posting as it's really both layers concerned)
>
> Ongoing advancement of BIP 157 implementation in Core may be the
> opportunity to reflect on the future of light client protocols and use this
> knowledge to make better-informed decisions about what kind of
> infrastructure is needed to support mobile clients at large scale.
>
> Trust-minimization of Bitcoin's security model has always relied first and
> foremost on running a full-node. This current paradigm may be shifted by LN
> where fast, affordable, confidential, censorship-resistant payment services
> may attract a lot of adoption without users running a full-node. Assuming a
> user adoption path where a full-node is required to benefit from LN may
> deprive a lot of users, especially those who are already denied access to
> real financial infrastructure. It doesn't mean we shouldn't foster node
> adoption when people are able to do so, and having an LN wallet may even be
> a first step toward it.
>
> Designing a mobile-first LN experience opens its own set of challenges,
> especially in terms of security and privacy. The problem can be scoped as:
> how do we build a scalable, secure, private chain-access backend for
> millions of LN clients?
>
> Light client protocols for LN exist (either BIP157 or Electrum are used),
> although their privacy and security guarantees with regards to
> implementation on the client-side may still be an object of concern
> (aggressive tx-rebroadcast, sybillable outbound peer selection, trusted fee
> estimation). That said, one of the bottlenecks is likely the number of
> full-nodes willing to dedicate resources to serve those clients.
> It's not about _which_ protocol is deployed but more about _incentives_ for
> node operators to dedicate long-term resources to clients they otherwise
> have little reason to care about.
>
> Even with cheaper, more efficient protocols like BIP 157, you may have a
> huge discrepancy between what is asked and what is offered. Assuming 10M
> light clients [0] each of them consuming ~100MB/month for filters/headers,
> that means you're asking 1PB/month of traffic to the backbone network. If
> you assume 10K public nodes, like today, assuming _all_ of them opt-in to
> signal BIP 157, that's an increase of 100GB/month for each, which is
> substantial compared to the estimated cost of 350GB/month for running an
> actual public node. Widening full-node adoption, especially in terms of
> geographic distribution, means doing as much as we can to bound its
> operational cost.
>
> Obviously, deployment of a more efficient tx-relay protocol like Erlay will
> free up some resources, but it may be wiser to dedicate them to increasing
> the health and security of the backbone network, like deploying more
> outbound connections.
>
> Unless your light client protocol is so ridiculously cheap that it can rely
> on the niceness of a subset of node operators offering free resources, it
> won't scale. And it's likely you will always have a ratio 

Re: [bitcoin-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-22 Thread Olaoluwa Osuntokun via bitcoin-dev
> Indeed, that is what I’m suggesting

Gotcha, if this is indeed what you're suggesting (all HTLC spends are now
2-of-2 multi-sig), then I think the modifications to the state machine I
sketched out in an earlier email are required. An exact construction that
achieves the requirement of "you can't broadcast until you have a secret
which I can obtain from the HTLC sig for your commitment transaction, and my
secret is revealed with another swap" appears to be an open problem, atm.

Even if they're restricted in this fashion (must be 1-input-1-output,
SIGHASH_ALL, fees pre-agreed upon), they can still spend that with a CPFP
(while still unconfirmed in the mempool) and create another heavy tree, which
puts us right back at the same bidding war scenario?

> There are a bunch of ways of doing pinning - just opting into RBF isn’t
> even close to enough.

Mhmm, there are other ways of doing pinning. But with anchors as defined in
that spec PR, they're forced to spend with an RBF-replaceable transaction,
which means the party wishing to time things out can enter into a bidding
war. If the party trying to impede things participates in this progressive
absolute fee increase, it's likely that the war terminates with _one_ of them
getting into the block, which seems to resolve everything?
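
For reference, the replacement rules that turn this into a strictly
escalating auction look roughly like the following (a simplification of BIP
125 rules 3, 4, and 6, ignoring package/descendant considerations; the
helper is just illustrative):

    package rbf

    // replacementOK sketches the two BIP 125 conditions that matter for the
    // bidding war: the replacement must have a higher fee rate AND pay
    // strictly more absolute fee, enough to cover its own relay bandwidth.
    // Fees are in sats, sizes in vbytes.
    func replacementOK(oldFee, oldVSize, newFee, newVSize int64) bool {
        const incrementalRelayFee = 1 // sat/vbyte, bitcoind's default policy

        higherFeeRate := newFee*oldVSize > oldFee*newVSize // cross-multiplied fee rates
        paysForRelay := newFee >= oldFee+incrementalRelayFee*newVSize
        return higherFeeRate && paysForRelay
    }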

-- Laolu


On Wed, Apr 22, 2020 at 4:20 PM Matt Corallo 
wrote:

>
>
> On Apr 22, 2020, at 16:13, Olaoluwa Osuntokun  wrote:
>
>
> > Hmm, maybe the proposal wasn't clear. The idea isn't to add signatures to
> > broadcasted transactions, but instead to CPFP a maybe-broadcasted
> > transaction by sending a transaction which spends it and seeing if it is
> > accepted
>
> Sorry I still don't follow. By "we clearly need to go the other direction -
> all HTLC output spends need to be pre-signed.", you don't mean that the
> HTLC
> spends of the non-broadcaster also need to be an off-chain 2-of-2 multi-sig
> covenant? If the other party isn't restricted w.r.t _how_ they can spend
> the
> output (non-RBF'd, etc), then I don't see how that addresses anything.
>
>
> Indeed, that is what I’m suggesting. Anchor output and all. One thing we
> could think about is only turning it on over a certain threshold, and
> having a separate “only-kinda-enforceable-on-chain-HTLC-in-flight” limit.
>
> Also see my mail elsewhere in the thread that the other party is actually
> forced to spend their HTLC output using an RBF-replaceable transaction.
> With
> that, I think we're all good here? In the end both sides have the ability
> to
> raise the fee rate of their spending transactions with the highest winning.
> As long as one of them confirms within the CLTV-delta, then everyone is
> made whole.
>
>
> It does seem like my cached recollection of RBF opt-in was incorrect but
> please re-read the intro email. There are a bunch of ways of doing pinning
> - just opting into RBF isn’t even close to enough.
>
> [1]: https://github.com/bitcoin/bitcoin/pull/18191
>
>
> On Wed, Apr 22, 2020 at 9:50 AM Matt Corallo 
> wrote:
>
>> A few replies inline.
>>
>> On 4/22/20 12:13 AM, Olaoluwa Osuntokun wrote:
>> > Hi Matt,
>> >
>> >
>> >> While this is somewhat unintuitive, there are any number of good
>> anti-DoS
>> >> reasons for this, eg:
>> >
>> > None of these really strikes me as "good" reasons for this limitation,
>> which
>> > is at the root of this issue, and will also plague any more complex
>> Bitcoin
>> > contracts which rely on nested trees of transactions to confirm (CTV,
>> Duplex,
>> > channel factories, etc). Regarding the various (seemingly arbitrary)
>> package
>> > limits it's likely the case that any issues w.r.t computational
>> complexity
>> > that may arise when trying to calculate evictions can be ameliorated
>> with
>> > better choice of internal data structures.
>> >
>> > In the end, the simplest heuristic (accept the higher fee rate package)
>> side
>> > steps all these issues and is also the most economically rational from
>> a
>> > miner's perspective. Why would one prefer a higher absolute fee package
>> > (which could be very large) over another package with a higher total
>> _fee
>> > rate_?
>>
>> This seems like a somewhat unnecessary drive-by insult of a project you
>> don't contribute to, but feel free to start with
>> a concrete suggestion here :).
>>
>> >> You'll note that B would be just fine if they had a way to safely
>> monitor the
>> >> global mempool, and while this seems like a prudent mitigation for
>> >> lightning implementations to deploy today, it is itself a quagmire of
>> >> complexity
>> >
>> > Is it really all that complex? Assuming we're talking about just
>> watching
>> > for a certain script template (the HTLC script) in the mempool to be
>> able to
>> > pull a pre-image as soon as possible. Early versions of lnd used the
>> mempool
>> > for commitment broadcast detection (which turned out to be a bad idea
>> so we
>> > removed it), but at a glance I don't see why watching the mempool is so
>> > complex.
>>
>> 

Re: [bitcoin-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-22 Thread Olaoluwa Osuntokun via bitcoin-dev
> This seems like a somewhat unnecessary drive-by insult of a project you
> don't contribute to, but feel free to start with a concrete suggestion
> here :).

This wasn't intended as an insult at all. I'm simply saying if there's
concern about worst case eviction/replacement, optimizations likely exist.
Other developers that are interested in more complex multi-transaction
contracts have realized this as well, and there're various open PRs that
attempt to propose such optimizations [1].

> Hmm, maybe the proposal wasn't clear. The idea isn't to add signatures to
> broadcasted transactions, but instead to CPFP a maybe-broadcasted
> transaction by sending a transaction which spends it and seeing if it is
> accepted

Sorry I still don't follow. By "we clearly need to go the other direction -
all HTLC output spends need to be pre-signed.", you don't mean that the HTLC
spends of the non-broadcaster also need to be an off-chain 2-of-2 multi-sig
covenant? If the other party isn't restricted w.r.t _how_ they can spend the
output (non-RBF'd, etc), then I don't see how that addresses anything.

Also see my mail elsewhere in the thread that the other party is actually
forced to spend their HTLC output using an RBF-replaceable transaction. With
that, I think we're all good here? In the end both sides have the ability to
raise the fee rate of their spending transactions, with the highest bidder
winning.
As long as one of them confirms within the CLTV-delta, then everyone is
made whole.


[1]: https://github.com/bitcoin/bitcoin/pull/18191


On Wed, Apr 22, 2020 at 9:50 AM Matt Corallo 
wrote:

> A few replies inline.
>
> On 4/22/20 12:13 AM, Olaoluwa Osuntokun wrote:
> > Hi Matt,
> >
> >
> >> While this is somewhat unintuitive, there are any number of good
> anti-DoS
> >> reasons for this, eg:
> >
> > None of these really strikes me as "good" reasons for this limitation,
> which
> > is at the root of this issue, and will also plague any more complex
> Bitcoin
> > contracts which rely on nested trees of transactions to confirm (CTV,
> Duplex,
> > channel factories, etc). Regarding the various (seemingly arbitrary)
> package
> > limits it's likely the case that any issues w.r.t computational
> complexity
> > that may arise when trying to calculate evictions can be ameliorated with
> > better choice of internal data structures.
> >
> > In the end, the simplest heuristic (accept the higher fee rate package)
> side
> > steps all these issues and is also the most economically rational from a
> > miner's perspective. Why would one prefer a higher absolute fee package
> > (which could be very large) over another package with a higher total _fee
> > rate_?
>
> This seems like a somewhat unnecessary drive-by insult of a project you
> don't contribute to, but feel free to start with
> a concrete suggestion here :).
>
> >> You'll note that B would be just fine if they had a way to safely
> monitor the
> >> global mempool, and while this seems like a prudent mitigation for
> >> lightning implementations to deploy today, it is itself a quagmire of
> >> complexity
> >
> > Is it really all that complex? Assuming we're talking about just watching
> > for a certain script template (the HTLC script) in the mempool to be able
> to
> > pull a pre-image as soon as possible. Early versions of lnd used the
> mempool
> > for commitment broadcast detection (which turned out to be a bad idea so
> we
> > removed it), but at a glance I don't see why watching the mempool is so
> > complex.
>
> Because watching your own mempool is not guaranteed to work, and during
> upgrade cycles that include changes to the
> policy rules an attacker could exploit your upgraded/non-upgraded status
> to perform the same attack.
>
> >> Further, this is a really obnoxious assumption to hoist onto lightning
> >> nodes - having an active full node with an in-sync mempool is a lot more
> >> CPU, bandwidth, and complexity than most lightning users were expecting
> to
> >> face.
> >
> > This would only be a requirement for Lightning nodes that seek to be a
> part
> > of the public routing network with a desire to _forward_ HTLCs. This
> > doesn't affect laptops or mobile phones which likely mostly have private
> > channels and don't participate in HTLC forwarding. I think it's pretty
> > reasonable to expect a "proper" routing node on the network to be backed
> by
> > a full-node. The bandwidth concern is valid, but we'd need concrete
> numbers
> > that compare the bandwidth over head of mempool awareness (assuming the
> > latest and greatest mempool syncing) compared with the overhead of the
> > channel update gossip and gossip queries over head which LN nodes face
> today
> > as is to see how much worse off they really would be.
>
> If mempool-watching were practical, maybe, though there are a number of
> folks who are talking about designing
> partially-offline local lightning hubs which would be rendered impractical.
>
> > As detailed a bit below, if nodes watch the 

Re: [bitcoin-dev] [Lightning-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-22 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi z,

Actually, the current anchors proposal already does this, since it enforces a
CSV of 1 block before the HTLCs can be spent (the block after confirmation).
So we already get this property, meaning the malicious node is already forced
to use an RBF-replaceable transaction.
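
A small sketch of why that CSV of 1 is sufficient, per my reading of BIP
68/112/125 (the helpers below are purely illustrative):

    package csvrbf

    // signalsRBF is BIP 125's opt-in test: any input sequence below
    // 0xfffffffe marks the transaction as replaceable.
    func signalsRBF(sequence uint32) bool {
        return sequence < 0xfffffffe
    }

    // satisfiesCSV1 checks the constraints a "1 OP_CHECKSEQUENCEVERIFY" path
    // places on the spending input's nSequence (BIP 68/112): the disable bit
    // must be clear, the lock must be block-based, and the value >= 1.
    func satisfiesCSV1(sequence uint32) bool {
        const (
            disableFlag = uint32(1) << 31
            typeFlag    = uint32(1) << 22 // set => time-based, clear => blocks
            valueMask   = uint32(0x0000ffff)
        )
        return sequence&disableFlag == 0 &&
            sequence&typeFlag == 0 &&
            sequence&valueMask >= 1
    }

    // Any sequence that satisfies satisfiesCSV1 has bit 31 clear, so it is
    // well below 0xfffffffe and therefore also satisfies signalsRBF.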

-- Laolu


On Wed, Apr 22, 2020 at 4:05 PM Olaoluwa Osuntokun 
wrote:

> Hi Z,
>
> > It seems to me that, if my cached understanding that `<0>
> > OP_CHECKSEQUENCEVERIFY` is sufficient to require RBF-flagging, then
> adding
> > that to the hashlock branch (2 witness bytes, 0.5 weight) would be a
> pretty
> > low-weight mitigation against this attack.
>
> I think this works...so they're forced to spend the output with a non-final
> sequence number, meaning it *must* signal RBF. In this case, now it's the
> timeout-er vs the success-er racing based on fee rate. If the honest party
> (the
> one trying to time out the HTLC) bids a fee rate higher (need to also
> account
> for the whole absolute fee replacement thing), then things should generally
> work out in their favor.
>
> -- Laolu
>
>
> On Tue, Apr 21, 2020 at 11:08 PM ZmnSCPxj  wrote:
>
>> Good morning Laolu, Matt, and list,
>>
>>
>> > >  * With `SIGHASH_NOINPUT` we can make the C-side signature
>> > >  `SIGHASH_NOINPUT|SIGHASH_SINGLE` and allow B to re-sign the B-side
>> > >  signature for a higher-fee version of HTLC-Timeout (assuming my
>> cached
>> > >  understanding of `SIGHASH_NOINPUT` still holds).
>> >
>> > no_input isn't needed. With simply single+anyone can pay, then B can
>> attach
>> > a new input+output pair to increase the fees on their HTLC redemption
>> > transaction. As you mention, they now enter into a race against this
>> > malicious node to bump up their fees in order to win over the other
>> party.
>>
>> Right, right, that works as well.
>>
>> >
>> > If the malicious node uses a non-RBF signalled transaction to sweep
>> their
>> > HTLC, then we enter into another level of race, but this time on the
>> mempool
>> > propagation level. However, if there exists a relay path to a miner
>> running
>> > full RBF, then B's higher fee rate spend will win over.
>>
>> Hmm.
>>
>> So basically:
>>
>> * B has no mempool, because it wants to reduce its costs and etc.
>> * C broadcasts a non-RBF claim tx with low fee before A->B locktime (L+1).
>> * B does not notice this tx because:
>>   1.  The tx is too low fee to be put in a block.
>>   2.  B has no mempool so it cannot see the tx being propagated over the
>> P2P network.
>> * B tries to broadcast higher-fee HTLC-timeout, but fails because it
>> cannot replace a non-RBF tx.
>> * After L+1, C contacts the miners off-band and offers fee payment by
>> other means.
>>
>> It seems to me that, if my cached understanding that `<0>
>> OP_CHECKSEQUENCEVERIFY` is sufficient to require RBF-flagging, then adding
>> that to the hashlock branch (2 witness bytes, 0.5 weight) would be a pretty
>> low-weight mitigation against this attack.
>>
>> So I think the combination below gives us good size:
>>
>> * The HTLC-Timeout signature from C is flagged with
>> `OP_SINGLE|OP_ANYONECANPAY`.
>>   * Normally, the HTLC-Timeout still deducts the fee from the value of
>> the UTXO being spent.
>>   * However, if B notices that the L+1 timeout is approaching, it can
>> fee-bump HTLC-Timeout with some onchain funds, recreating its own signature
>> but reusing the (still valid) C signature.
>> * The hashlock branch in this case includes `<0> OP_CHECKSEQUENCEVERIFY`,
>> preventing C from broadcasting a low-fee claim tx.
>>
>> This has the advantages:
>>
>> * B does not need a mempool still and can run in `blocksonly`.
>> * The normal path is still the same as current behavior, we "only" add a
>> new path where if the L+1 timeout is approaching we fee-bump the
>> HTLC-Timeout.
>> * Costs are pretty low:
>>   * No need for extra RBF carve-out txo.
>>   * Just two additional witness bytes in the hashlock branch.
>> * No mempool rule changes needed, can be done with the P2P network of
>> today.
>>   * Probably still resilient even with future changes in mempool rules,
>> as long as typical RBF behaviors still remain.
>>
>> Is my understanding correct?
>>
>> Regards,
>> ZmnSCPxj
>>
>> >
>> > -- Laolu
>> >
>> > On Tue, Apr 21, 2020 at 9:13 PM ZmnSCPxj via bitcoin-dev <
>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>> >
>> > > Good morning Matt, and list,
>> > >
>> > > > RBF Pinning HTLC Transactions (aka "Oh, wait, I can steal
>> funds, how, now?")
>> > > > =
>> > > >
>> > > > You'll note that in the discussion of RBF pinning we were
>> pretty broad, and that that discussion seems to in fact cover
>> > > > our HTLC outputs, at least when spent via (3) or (4). It does,
>> and in fact this is a pretty severe issue in today's
>> > > > lightning protocol [2]. A lightning counterparty (C, who
>> received the HTLC from B, who received it from A) today could,
>> > > > if B broadcasts the commitment transaction, 

Re: [bitcoin-dev] [Lightning-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-22 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Z,

> It seems to me that, if my cached understanding that `<0>
> OP_CHECKSEQUENCEVERIFY` is sufficient to require RBF-flagging, then adding
> that to the hashlock branch (2 witness bytes, 0.5 weight) would be a
> pretty low-weight mitigation against this attack.

I think this works... so they're forced to spend the output with a non-final
sequence number, meaning it *must* signal RBF. In this case, now it's the
timeout-er vs the success-er racing based on fee rate. If the honest party
(the one trying to time out the HTLC) bids a higher fee rate (also accounting
for the whole absolute-fee replacement requirement), then things should
generally work out in their favor.

-- Laolu


On Tue, Apr 21, 2020 at 11:08 PM ZmnSCPxj  wrote:

> Good morning Laolu, Matt, and list,
>
>
> > >  * With `SIGHASH_NOINPUT` we can make the C-side signature
> > >  `SIGHASH_NOINPUT|SIGHASH_SINGLE` and allow B to re-sign the B-side
> > >  signature for a higher-fee version of HTLC-Timeout (assuming my cached
> > >  understanding of `SIGHASH_NOINPUT` still holds).
> >
> > no_input isn't needed. With simply single+anyone can pay, then B can
> attach
> > a new input+output pair to increase the fees on their HTLC redemption
> > transaction. As you mention, they now enter into a race against this
> > malicious node to bump up their fees in order to win over the other
> party.
>
> Right, right, that works as well.
>
> >
> > If the malicious node uses a non-RBF signalled transaction to sweep their
> > HTLC, then we enter into another level of race, but this time on the
> mempool
> > propagation level. However, if there exists a relay path to a miner
> running
> > full RBF, then B's higher fee rate spend will win over.
>
> Hmm.
>
> So basically:
>
> * B has no mempool, because it wants to reduce its costs and etc.
> * C broadcasts a non-RBF claim tx with low fee before A->B locktime (L+1).
> * B does not notice this tx because:
>   1.  The tx is too low fee to be put in a block.
>   2.  B has no mempool so it cannot see the tx being propagated over the
> P2P network.
> * B tries to broadcast higher-fee HTLC-timeout, but fails because it
> cannot replace a non-RBF tx.
> * After L+1, C contacts the miners off-band and offers fee payment by
> other means.
>
> It seems to me that, if my cached understanding that `<0>
> OP_CHECKSEQUENCEVERIFY` is sufficient to require RBF-flagging, then adding
> that to the hashlock branch (2 witness bytes, 0.5 weight) would be a pretty
> low-weight mitigation against this attack.
>
> So I think the combination below gives us good size:
>
> * The HTLC-Timeout signature from C is flagged with
> `OP_SINGLE|OP_ANYONECANPAY`.
>   * Normally, the HTLC-Timeout still deducts the fee from the value of the
> UTXO being spent.
>   * However, if B notices that the L+1 timeout is approaching, it can
> fee-bump HTLC-Timeout with some onchain funds, recreating its own signature
> but reusing the (still valid) C signature.
> * The hashlock branch in this case includes `<0> OP_CHECKSEQUENCEVERIFY`,
> preventing C from broadcasting a low-fee claim tx.
>
> This has the advantages:
>
> * B does not need a mempool still and can run in `blocksonly`.
> * The normal path is still the same as current behavior, we "only" add a
> new path where if the L+1 timeout is approaching we fee-bump the
> HTLC-Timeout.
> * Costs are pretty low:
>   * No need for extra RBF carve-out txo.
>   * Just two additional witness bytes in the hashlock branch.
> * No mempool rule changes needed, can be done with the P2P network of
> today.
>   * Probably still resilient even with future changes in mempool rules, as
> long as typical RBF behaviors still remain.
>
> Is my understanding correct?
>
> Regards,
> ZmnSCPxj
>
> >
> > -- Laolu
> >
> > On Tue, Apr 21, 2020 at 9:13 PM ZmnSCPxj via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
> >
> > > Good morning Matt, and list,
> > >
> > > > RBF Pinning HTLC Transactions (aka "Oh, wait, I can steal funds,
> how, now?")
> > > > =
> > > >
> > > > You'll note that in the discussion of RBF pinning we were pretty
> broad, and that that discussion seems to in fact cover
> > > > our HTLC outputs, at least when spent via (3) or (4). It does,
> and in fact this is a pretty severe issue in today's
> > > > lightning protocol [2]. A lightning counterparty (C, who
> received the HTLC from B, who received it from A) today could,
> > > > if B broadcasts the commitment transaction, spend an HTLC using
> the preimage with a low-fee, RBF-disabled transaction.
> > > > After a few blocks, A could claim the HTLC from B via the
> timeout mechanism, and then after a few days, C could get the
> > > > HTLC-claiming transaction mined via some out-of-band agreement
> with a small miner. This leaves B short the HTLC value.
> > >
> > > My (cached) understanding is that, since RBF is signalled using
> `nSequence`, any `OP_CHECKSEQUENCEVERIFY` also automatically 

Re: [bitcoin-dev] [Lightning-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-21 Thread Olaoluwa Osuntokun via bitcoin-dev
> So what is needed is to allow B to add fees to HTLC-Timeout:

Indeed, anchors as defined in #lightning-rfc/688 allows this.

>  * With `SIGHASH_NOINPUT` we can make the C-side signature
>  `SIGHASH_NOINPUT|SIGHASH_SINGLE` and allow B to re-sign the B-side
>  signature for a higher-fee version of HTLC-Timeout (assuming my cached
>  understanding of `SIGHASH_NOINPUT` still holds).

no_input isn't needed. With simply SIGHASH_SINGLE|SIGHASH_ANYONECANPAY, B can
attach a new input+output pair to increase the fees on their HTLC redemption
transaction. As you mention, they now enter into a race against this
malicious node to bump up their fees in order to win over the other party.

If the malicious node uses a non-RBF signalled transaction to sweep their
HTLC, then we enter into another level of race, but this time on the mempool
propagation level. However, if there exists a relay path to a miner running
full RBF, then B's higher fee rate spend will win over.
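
To make the single+anyonecanpay point concrete, here is an illustrative
sketch (using deliberately minimal made-up types, not lnd or btcd wire
structures): because SINGLE|ANYONECANPAY only commits to C's own input and
the output at the same index, B can append a wallet input and change output
to bump the fee without invalidating C's pre-signed signature.

    package feebump

    // Minimal illustrative types.
    type txIn struct {
        prevOut   string
        signature []byte // empty until B signs the added input
    }

    type txOut struct {
        valueSats int64
        pkScript  []byte
    }

    type tx struct {
        inputs  []txIn
        outputs []txOut
    }

    // bumpFee appends a wallet-controlled input and change output to a
    // redemption transaction whose existing signature at index 0 was made
    // with SIGHASH_SINGLE|SIGHASH_ANYONECANPAY. That sighash covers only
    // input 0 and output 0, so the appended pair leaves it valid; B signs
    // the new input itself (e.g. with SIGHASH_ALL) at whatever fee it needs.
    func bumpFee(redeem *tx, walletIn txIn, walletAmt, fee int64, change []byte) {
        redeem.inputs = append(redeem.inputs, walletIn)
        redeem.outputs = append(redeem.outputs, txOut{
            valueSats: walletAmt - fee, // added fee comes from the wallet input
            pkScript:  change,
        })
    }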

-- Laolu

On Tue, Apr 21, 2020 at 9:13 PM ZmnSCPxj via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Good morning Matt, and list,
>
>
>
> > RBF Pinning HTLC Transactions (aka "Oh, wait, I can steal funds,
> how, now?")
> > =
> >
> > You'll note that in the discussion of RBF pinning we were pretty
> broad, and that that discussion seems to in fact cover
> > our HTLC outputs, at least when spent via (3) or (4). It does, and
> in fact this is a pretty severe issue in today's
> > lightning protocol [2]. A lightning counterparty (C, who received
> the HTLC from B, who received it from A) today could,
> > if B broadcasts the commitment transaction, spend an HTLC using the
> preimage with a low-fee, RBF-disabled transaction.
> > After a few blocks, A could claim the HTLC from B via the timeout
> mechanism, and then after a few days, C could get the
> > HTLC-claiming transaction mined via some out-of-band agreement with
> a small miner. This leaves B short the HTLC value.
>
> My (cached) understanding is that, since RBF is signalled using
> `nSequence`, any `OP_CHECKSEQUENCEVERIFY` also automatically imposes the
> requirement "must be RBF-enabled", including `<0> OP_CHECKSEQUENCEVERIFY`.
> Adding that clause (2 bytes in witness if my math is correct) to the
> hashlock branch may be sufficient to prevent C from making an RBF-disabled
> transaction.
>
> But then you mention out-of-band agreements with miners, which basically
> means the transaction might not be in the mempool at all, in which case the
> vulnerability is not really about RBF or relay, but sheer economics.
>
> The payment is A->B->C, and the HTLC A->B must have a larger timeout (L +
> 1) than the HTLC B->C (L), in abstract non-block units.
> The vulnerability you are describing means that the current time must now
> be L + 1 or greater ("A could claim the HTLC from B via the timeout
> mechanism", meaning the A->B HTLC has timed out already).
>
> If so, then the B->C transaction has already timed out in the past and can
> be claimed in two ways, either via B timeout branch or C hashlock branch.
> This sets up a game where B and C bid to miners to get their version of
> reality committed onchain.
> (We can neglect out-of-band agreements here; miners have the incentive to
> publicly leak such agreements so that other potential bidders can offer
> even higher fees for their versions of that transaction.)
>
> Before L+1, C has no incentive to bid, since placing any bid at all will
> leak the preimage, which B can then turn around and use to spend from A,
> and A and C cannot steal from B.
>
> Thus, B should ensure that *before* L+1, the HTLC-Timeout has been
> committed onchain, which outright prevents this bidding war from even
> starting.
>
> The issue then is that B is using a pre-signed HTLC-timeout, which is
> needed since it is its commitment tx that was broadcast.
> This prevents B from RBF-ing the HTLC-Timeout transaction.
>
> So what is needed is to allow B to add fees to HTLC-Timeout:
>
> * We can add an RBF carve-out output to HTLC-Timeout, at the cost of more
> blockspace.
> * With `SIGHASH_NOINPUT` we can make the C-side signature
> `SIGHASH_NOINPUT|SIGHASH_SINGLE` and allow B to re-sign the B-side
> signature for a higher-fee version of HTLC-Timeout (assuming my cached
> understanding of `SIGHASH_NOINPUT` still holds).
>
> With this, B can exponentially increase the fee as L+1 approaches.
> If B can get HTLC-Timeout confirmed before L+1, then C cannot steal the
> HTLC value at all, since the UTXO it could steal from has already been
> spent.
>
> In particular, it does not seem to me that it is necessary to change the
> hashlock-branch transaction of C at all, since this mechanism is enough to
> sidestep the issue (as I understand it).
> But it does point to a need to make HTLC-Timeout (and possibly
> symmetrically, HTLC-Success) also fee-bumpable.
>
> Note as well that this does not require a mempool: B can run in
> 

Re: [bitcoin-dev] RBF Pinning with Counterparties and Competing Interest

2020-04-21 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Matt,


> While this is somewhat unintuitive, there are any number of good anti-DoS
> reasons for this, eg:

None of these really strikes me as "good" reasons for this limitation, which
is at the root of this issue, and which will also plague any more complex
Bitcoin contracts that rely on nested trees of transactions to confirm (CTV,
Duplex, channel factories, etc). Regarding the various (seemingly arbitrary)
package limits, it's likely the case that any issues w.r.t computational
complexity that may arise when trying to calculate evictions can be
ameliorated with a better choice of internal data structures.

In the end, the simplest heuristic (accept the higher fee rate package) side
steps all these issues and is also the most economically rational from a
miner's perspective. Why would one prefer a higher absolute fee package
(which could be very large) over another package with a higher total _fee
rate_?

> You'll note that B would be just fine if they had a way to safely monitor
> the
> global mempool, and while this seems like a prudent mitigation for
> lightning implementations to deploy today, it is itself a quagmire of
> complexity

Is it really all that complex? Assuming we're talking about just watching
for a certain script template (the HTLC script) in the mempool to be able to
pull a pre-image as soon as possible. Early versions of lnd used the mempool
for commitment broadcast detection (which turned out to be a bad idea so we
removed it), but at a glance I don't see why watching the mempool is so
complex.

> Further, this is a really obnoxious assumption to hoist onto lightning
> nodes - having an active full node with an in-sync mempool is a lot more
> CPU, bandwidth, and complexity than most lightning users were expecting to
> face.

This would only be a requirement for Lightning nodes that seek to be a part
of the public routing network with a desire to _forward_ HTLCs. This doesn't
affect laptops or mobile phones, which likely mostly have private channels
and don't participate in HTLC forwarding. I think it's pretty reasonable to
expect a "proper" routing node on the network to be backed by a full-node.
The bandwidth concern is valid, but we'd need concrete numbers that compare
the bandwidth overhead of mempool awareness (assuming the latest and greatest
mempool syncing) with the channel update gossip and gossip query overhead
which LN nodes face today as is, to see how much worse off they really would
be.

As detailed a bit below, if nodes watch the mempool, then this class of
attack, assuming the anchor output format as described in the open
lightning-rfc PR, is mitigated. At a glance, watching the mempool seems like
a far less involved process compared to modifying the state machine as it's
defined today. By watching the mempool and implementing the changes in
#lightning-rfc/688, this issue can be mitigated _today_. lnd 0.10 doesn't yet
watch the mempool (but does include anchors [1]), but unless I'm missing
something it should be pretty straightforward to add, which more or less
resolves this issue altogether.

> not fixing this issue seems to render the whole exercise somewhat useless

Depends on if one considers watching the mempool a fix. But even with that a
base version of anchors still resolves a number of issues including:
eliminating the commitment fee guessing game, allowing users to pay less on
force close, being able to coalesce 2nd level HTLC transactions with the
same CLTV expiry, and actually being able to reliably enforce multi-hop HTLC
resolution.

> Instead of making the HTLC output spending more free-form with
> SIGHASH_ANYONECAN_PAY|SIGHASH_SINGLE, we clearly need to go the other
> direction - all HTLC output spends need to be pre-signed.

I'm not sure this is actually immediately workable (need to think about it
more). To see why, remember that the commit_sig message includes HTLC
signatures for the _remote_ party's commitment transaction, so they can
spend the HTLCs if they broadcast their version of the commitment (force
close). If we don't somehow also _gain_ signatures (our new HTLC signatures)
allowing us to spend HTLCs on _their_ version of the commitment, then if
they broadcast that commitment (without revoking), then we're unable to
redeem any of those HTLCs at all, possibly losing money.

In an attempt to counteract this, we might say ok, the revoke message also
now includes HTLC signatures for their new commitment allowing us to spend
our HTLCs. This resolves things in a weaker security model, but doesn't
address the issue generally, as after they receive the commit_sig, they can
broadcast immediately, again leaving us without a way to redeem our HTLCs.

I'd need to think about it more, but it seems that following this path would
require an overhaul in the channel state machine to make presenting a new
commitment actually take at least _two phases_ (at least a full round trip).
The first phase would tender the commitment, but 

Re: [bitcoin-dev] Interrogating a BIP157 server, BIP158 change proposal

2019-02-06 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Tamas,

> The only advantage I see in the current design choice is filter size, but
> even that is less impressive in recent history and going forward, as
> address re-use is much less frequent nowadays than it was in Bitcoin’s
> early days.

Gains aren't only had with address re-use, it's also the case that if an
input is spent in the same block as it was created, then only a single item
is inserted into the filter. Filters spanning across several blocks would
also see savings due to the usage of input scripts.

Another advantage of using input scripts is that it allows rescans where all
keys are known ahead of time to proceed in parallel, which can serve to
greatly speed up rescans in bitcoind. Additionally, it allows light clients
to participate in protocols like atomic swaps using the input scripts as
triggers for state transitions. If outpoints were used, then the party that
initiated the swap would need to send the cooperating party all possible
txids that may be generated due to fee bumps (RBF or sighash single tricks).
Using the script, the light client simply waits for it to be revealed in a
block (P2WSH) and then it can carry on the protocol.

> Clear advantages of moving to spent outpoint + output script filter:

> 1. Filter correctness can be proven by downloading the block in question
> only.

Yep, as is they can verify half the filter. With auxiliary data, they can
verify the entire thing. Once committed, they don't need to verify at all.
We're repeating a discussion that played out 7 months ago with no new
information or context.

> 2. Calculation of the filter on server side does not need UTXO.

This is incorrect. Filter calculation can use the spentness journal (or undo
blocks) that many full node implementations utilize.

> This certainly improves with a commitment, but that is not even on the
> roadmap yet, or is it?

I don't really know of any sort of roadmaps in Bitcoin development. However,
I think there's relatively strong support for adding a commitment once the
current protocol gets more usage in the wild, which it is already seeing
today on mainnet.

> Should a filter be committed that contains spent outpoints, then such
> filter would be even more useful

Indeed, this can be added as a new filter type, optionally adding created
outpoints as you referenced in your prior email.

> Since Bitcoin Core is not yet serving any filters, I do not think this
> discussion is too late.

See my reply to Matt on the current state of deployment. It's also the case
that bitcoind isn't the only full node implementation used in the wild.
Further changes would also serve to delay inclusion into bitcoind. The
individuals proposing these PRs to bitcoind participated in this discussion 7
months ago (along with many of the contributors to this project). Based on
that conversation, it's my understanding that all parties are aware of the
options and tradeoffs to be had.

-- Laolu


On Tue, Feb 5, 2019 at 12:10 PM Tamas Blummer 
wrote:

> Hi Laolu,
>
> The only advantage I see in the current design choice is filter size, but
> even that is less
> impressive in recent history and going forward, as address re-use is much
> less frequent nowadays
> than it was Bitcoin’s early days.
>
> I calculated total filter sizes since block 500,000:
>
> input script + output script (current BIP): 1.09 GB
> spent outpoint + output script: 1.26 GB
>
> Both filters are equally useful for a wallet to discover relevant
> transactions, but the current design
> choice seriously limits, practically disables a light client, to prove
> that the filter is correct.
>
> Clear advantages of moving to spent outpoint + output script filter:
>
> 1. Filter correctness can be proven by downloading the block in question
> only.
> 2. Calculation of the filter on server side does not need UTXO.
> 3. Spent outpoints in the filter enable light clients to do further
> probabilistic checks and even more if committed.
>
> The current design choice offers lower security than now attainable. This
> certainly improves with
> a commitment, but that is not even on the roadmap yet, or is it?
>
> Should a filter be committed that contains spent outpoints, then such
> filter would be even more useful:
> A client could decide on availability of spent coins of a transaction
> without maintaining the UTXO set, by
> checking the filters if the coin was spent after its origin proven in an
> SPV manner, possibly eliminating false positives
> with a block download. This would be slower than having UTXO but require
> only immutable store, no unwinds and
> only download of a few blocks.
>
> Since Bitcoin Core is not yet serving any filters, I do not think this
> discussion is too late.
>
> Tamas Blummer
>
>
> > On Feb 5, 2019, at 02:42, Olaoluwa Osuntokun  wrote:
> >
> > Hi Tamas,
> >
> > This is how the filter worked before the switch over to optimize for a
> > filter containing the minimal items needed for a regular wallet to
> function.
> > When this was 

Re: [bitcoin-dev] Interrogating a BIP157 server, BIP158 change proposal

2019-02-06 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Matt,

> In (the realistic) threat model where an attacker is trying to blind you
> from some output, they can simply give you "undo data" where scriptPubKeys
> are OP_TRUE instead of the real script and you'd be none the wiser.

It depends on the input. If I'm trying to verify an input that's P2WSH,
since the witness script is included in the witness (the last element), I
can easily verify that the pkScript given is the proper witness program.
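
Concretely, a minimal sketch of that check using only the standard library
(assuming a v0 P2WSH input whose witness script is the final witness
element):

    package p2wsh

    import (
        "bytes"
        "crypto/sha256"
    )

    // verifyP2WSHPrevOut checks that a claimed previous output script is the
    // v0 P2WSH program (OP_0 PUSH32 sha256(witnessScript)) matching the
    // witness script revealed as the last element of the spending witness.
    func verifyP2WSHPrevOut(claimedPkScript []byte, witness [][]byte) bool {
        if len(witness) == 0 {
            return false
        }
        witnessScript := witness[len(witness)-1]
        h := sha256.Sum256(witnessScript)

        // 0x00 = OP_0 (witness v0), 0x20 = push of 32 bytes.
        expected := append([]byte{0x00, 0x20}, h[:]...)
        return bytes.Equal(claimedPkScript, expected)
    }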

> Huh? I don't think we should seriously consider
> only-one-codebase-has-deployed-anything-with-very-limited-in-the-wild-use
> as "too late into the current deployment"?

I'd wager that most developers reading this email right now are familiar
with neutrino as a project. Many even routinely use "neutrino" to refer to
BIP 157+158. There are several projects in the wild that have already
deployed applications built on lnd+neutrino live on mainnet. lnd+neutrino is
also the only project (as far as I'm aware) that has fully integrated the
p2p BIP 157+158 into a wallet, and also uses the filters for higher level
applications.

I'm no stranger to this argument, as I made the exact same one 7 months ago
when the change was originally discussed. Since then I realized that using
input scripts can be even _more_ flexible, as light clients can use them as
setup or triggers for multi-party protocols such as atomic swaps. Using
scripts also allows for faster rescans if one knows all their keys ahead of
time, as the checks can be parallelized. Additionally, the current filter
also lends itself better to an eventual commitment, as you literally can't
remove anything from it and still have it be useful for the traditional
wallet use case.

As I mentioned in my last email, this can be added as an additional filter
type, leaving it up to the full node implementations that have deployed the
base protocol to integrate it or not.

-- Laolu


On Tue, Feb 5, 2019 at 4:21 AM Matt Corallo 
wrote:

>
> On 2/4/19 8:18 PM, Jim Posen via bitcoin-dev wrote:
> - snip -
>  > 1) Introduce a new P2P message to retrieve all prev-outputs for a given
>  > block (essentially the undo data in Core), and verify the scripts
>  > against the block by executing them. While this permits some forms of
>  > input script malleability (and thus cannot discriminate between all
>  > valid and invalid filters), it restricts what an attacker can do. This
>  > was proposed by Laolu AFAIK, and I believe this is how btcd is
> proceeding.
>
> I'm somewhat confused by this - how does the undo data help you without
> seeing the full (mistate compressed) transaction? In (the realistic)
> threat model where an attacker is trying to blind you from some output,
> they can simply give you "undo data" where scriptPubKeys are OP_TRUE
> instead of the real script and you'd be none the wiser.
>
> On 2/5/19 1:42 AM, Olaoluwa Osuntokun via bitcoin-dev wrote:
> - snip -
> > I think it's too late into the current deployment of the BIPs to change
> > things around yet again. Instead, the BIP already has measures in place
> for
> > adding _new_ filter types in the future. This along with a few other
> filter
> > types may be worthwhile additions as new filter types.
> - snip -
>
> Huh? I don't think we should seriously consider
> only-one-codebase-has-deployed-anything-with-very-limited-in-the-wild-use
> as "too late into the current deployment"?
>
> Matt
>


Re: [bitcoin-dev] Interrogating a BIP157 server, BIP158 change proposal

2019-02-04 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Tamas,

This is how the filter worked before the switch over to optimize for a
filter containing the minimal items needed for a regular wallet to function.
When this was proposed, I had already implemented the entire proposal from
wallet to full-node. At that point, we all more or less decided that the
space savings (along with intra-block compression) were worthwhile, we
weren't cutting off any anticipated application level use cases (at that
point we had already comprehensively integrated both filters into lnd), and
that once committed the security loss would disappear.

I think it's too late into the current deployment of the BIPs to change
things around yet again. Instead, the BIP already has measures in place for
adding _new_ filter types in the future. This along with a few other filter
types may be worthwhile additions as new filter types.

-- Laolu

On Mon, Feb 4, 2019 at 12:59 PM Tamas Blummer 
wrote:

> I participated in that discussion in 2018, but have not had the insight
> gathered by now through writing both client and server implementations of
> BIP157/158
>
> Pieter Wuille considered the design choice I am now suggesting here as
> alternative (a) in:
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-June/016064.html
> In his evaluation he recognized that a filter having spent output and
> output scripts would allow decision on filter correctness by knowing the
> block only.
> He did not evaluate the usefulness in the context of checkpoints, which I
> think are an important shortcut here.
>
> Yes, a filter that is collecting input and output scripts is shorter if
> script re-use is frequent, but I showed back in 2018 in the same thread
> that this saving is not that significant in recent history as address reuse
> is no longer that frequent.
>
> A filter on spent outpoint is just as useful for wallets as is one on
> spent script, since they naturally scan the blockchain forward and thereby
> learn about their coins by the output script before they need to check
> spends of those outpoints.
>
> It seems to me that implementing an interrogation by possibly downloading
> blocks at checkpoints is much simpler than following multiple possible
> filter paths.
>
> A spent outpoint filter allows us to decide on coin availability based on
> immutable store, without updated and eventually rolled back UTXO store. The
> availability could be decided by following the filter path from current tip
> to genesis and
> checking if the outpoint was spent earlier. False positives can be sorted out
> with a block download. Murmel implements this if running in server mode,
> where blocks are already there.
>
> Therefore I ask for a BIP change based on better insight gained through
> implementation.
>
> Tamas Blummer
>
> On Feb 4, 2019, at 21:18, Jim Posen  wrote:
>
> Please see the thread "BIP 158 Flexibility and Filter Size" from 2018
> regarding the decision to remove outpoints from the filter [1].
>
> Thanks for bringing this up though, because more discussion is needed on
> the client protocol given that clients cannot reliably determine the
> integrity of a block filter in a bandwidth-efficient manner (due to the
> inclusion of input scripts).
>
> I see three possibilities:
> 1) Introduce a new P2P message to retrieve all prev-outputs for a given
> block (essentially the undo data in Core), and verify the scripts against
> the block by executing them. While this permits some forms of input script
> malleability (and thus cannot discriminate between all valid and invalid
> filters), it restricts what an attacker can do. This was proposed by Laolu
> AFAIK, and I believe this is how btcd is proceeding.
> 2) Clients track multiple possible filter header chains and essentially
> consider the union of their matches. So if any filter received for a
> particular block header matches, the client downloads the block. The client
> can ban a peer if they 1) ever return a filter omitting some data that is
> observed in the downloaded block, 2) repeatedly serve filters that trigger
> false positive block downloads where such a number of false positives is
> statistically unlikely, or 3) repeatedly serves filters that are
> significantly larger than the expected size (essentially padding the actual
> filters with garbage to waste bandwidth). I have not done the analysis yet,
> but we should be able to come up with some fairly simple banning heuristics
> using Chernoff bounds. The main downside is that the client logic to track
> multiple possible filter chains and filters per block is more complex and
> bandwidth increases if connected to a malicious server. I first heard about
> this idea from David Harding.
> 3) Rush straight to committing the filters into the chain (via witness
> reserved value or coinbase OP_RETURN) and give up on the pre-softfork BIP
> 157 P2P mode.
>
> I'm in favor of option #2 despite the downsides since it requires the
> smallest number of changes and is supported by the BIP 157 P2P protocol 

Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-12 Thread Olaoluwa Osuntokun via bitcoin-dev
> An example of that cost is you arguing against specifying and supporting
> the
> design that is closer to one that would be softforked, which increases the
> time until we can make these filters secure because it
> slows convergence on the design of what would get committed

Agreed, since the commitment is just flat out better, and also less code to
validate compared to the cross-p2p validation, the filter should be as close
as possible to the committed version. This way, wallets and other apps don't
need to modify their logic in X months when the commitment is rolled out.

> Great point, but it should probably exclude coinbase OP_RETURN output.
> This would exclude the current BIP141 style commitment and likely any
> other.

Definitely. I chatted offline with sipa recently, and he suggested this as
well. The upside is that the filters will get even smaller, and also the
first filter type becomes even more of a "barebones" wallet filter. If folks
really want to also search OP_RETURN in the filter (as no widely deployed
applications I know of really use it), then an additional filter type can be
added in the future. It would need to be special cased to filter out the
commitment itself.

Alright, color me convinced! I'll further edit my open BIP 158 PR to:

  * exclude all OP_RETURN
  * switch to prev scripts instead of outpoints
  * update the test vectors to include the prev scripts from blocks in
addition to the block itself

-- Laolu


On Sat, Jun 9, 2018 at 8:45 AM Gregory Maxwell  wrote:

> > So what's the cost in using
> > the current filter (as it lets the client verify the filter if they want
> to,
>
> An example of that cost is you arguing against specifying and
> supporting the design that is closer to one that would be softforked,
> which increases the time until we can make these filters secure
> because it slows convergence on the design of what would get
> committed.
>
> >> I don't agree at all, and I can't see why you say so.
> >
> > Sure it doesn't _have_ to, but from my PoV as "adding more commitments"
> is
> > on the top of every developers wish list for additions to Bitcoin, it
> would
> > make sense to coordinate on an "ultimate" extensible commitment once,
> rather
> > than special case a bunch of distinct commitments. I can see arguments
> for
> > either really.
>
> We have an extensible commitment style via BIP141 already. I don't see
> why this in particular demands a new one.
>
> >   1. The current filter format (even moving to prevouts) cannot be
> committed
> >  in this fashion as it indexes each of the coinbase output scripts.
> This
> >  creates a circular dependency: the commitment is modified by the
> >  filter,
>
> Great point, but it should probably exclude coinbase OP_RETURN output.
> This would exclude the current BIP141 style commitment and likely any
> other.
>
> Should I start a new thread on excluding all OP_RETURN outputs from
> BIP-158 filters for all transactions? -- they can't be spent, so
> including them just pollutes the filters.
>
> >   2. Since the coinbase transaction is the first in a block, it has the
> >  longest merkle proof path. As a result, it may be several hundred
> bytes
> >  (and grows with future capacity increases) to present a proof to the
>
> If 384 bytes is a concern, isn't 3840 bytes (the filter size
> difference is in this ballpark) _much_ more of a concern?  Path to the
> coinbase transaction increases only logarithmically so further
> capacity increases are unlikely to matter much, but the filter size
> increases linearly and so it should be much more of a concern.
>
> > In regards to the second item above, what do you think of the old Tier
> Nolan
> > proposal [1] to create a "constant" sized proof for future commitments by
> > constraining the size of the block and placing the commitments within the
> > last few transactions in the block?
>
> I think it's a fairly ugly hack. esp since it requires that mining
> template code be able to stuff the block if they just don't know
> enough actual transactions-- which means having a pool of spendable
> outputs in order to mine, managing private keys, etc... it also
> requires downstream software not tinker with the transaction count
> (which I wish it didn't but as of today it does). A factor of two
> difference in capacity-- if you constrain to get the smallest possible
> proof-- is pretty stark, optimal txn selection with this cardinality
> constraint would be pretty weird. etc.
>
> If the community considers tree depth for proofs like that to be such
> a concern to take on technical debt for that structure, we should
> probably be thinking about more drastic (incompatible) changes... but
> I don't think it's actually that interesting.
>
> > I don't think its fair to compare those that wish to implement this
> proposal
> > (and actually do the validation) to the legacy SPV software that to my
> > knowledge is all but abandoned. The project I work on that seeks to
> deploy
>
> 

Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-12 Thread Olaoluwa Osuntokun via bitcoin-dev
> Doesn't the current BIP157 protocol have each filter commit to the filter
> for the previous block?

Yep!

> If that's the case, shouldn't validating the commitment at the tip of the
> chain (or buried back whatever number of blocks that the SPV client trusts)
> obviate the need to validate the commitments for any preceding blocks in
> the SPV trust model?

Yeah, just that there'll be a gap between the p2p version, and when it's
ultimately committed.

> It seems like you're claiming better security here without providing any
> evidence for it.

What I mean is that one allows you to fully verify the filter, while the
other allows you to only validate a portion of the filter and requires other
added heuristics.

> In the case of prevout+output filters, when a client receives
> advertisements for different filters from different peers, it:

Alternatively, they can decompress the filter and at least verify that
proper _output scripts_ have been included. Maybe this is "good enough"
until it's committed. If a command is added to fetch all the prev outs along
w/ a block (which would let you do other things like verify fees), then
they'd be able to fully validate the filter as well.
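
In code terms, that partial check is roughly the following (illustrative
only: the types and filterMatches helper are hypothetical stand-ins for real
block parsing and a BIP 158 GCS membership test):

    package filtercheck

    // Minimal illustrative types; not btcd's wire types.
    type output struct{ pkScript []byte }
    type transaction struct{ outputs []output }

    // filterMatches is a hypothetical stand-in for a real BIP 158 GCS
    // membership test, keyed per BIP 158.
    func filterMatches(filter []byte, key [16]byte, item []byte) bool { return false }

    // checkOutputSide verifies the half of the filter a client can check
    // with just the block in hand: every non-empty output script in the
    // block must be matched by the peer's advertised filter.
    func checkOutputSide(filter []byte, key [16]byte, block []transaction) bool {
        for _, tx := range block {
            for _, out := range tx.outputs {
                if len(out.pkScript) == 0 {
                    continue
                }
                if !filterMatches(filter, key, out.pkScript) {
                    return false // peer served a bogus filter
                }
            }
        }
        return true
    }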

-- Laolu


On Sat, Jun 9, 2018 at 3:35 AM David A. Harding  wrote:

> On Fri, Jun 08, 2018 at 04:35:29PM -0700, Olaoluwa Osuntokun via
> bitcoin-dev wrote:
> >   2. Since the coinbase transaction is the first in a block, it has the
> >  longest merkle proof path. As a result, it may be several hundred
> bytes
> >  (and grows with future capacity increases) to present a proof to the
> >  client.
>
> I'm not sure why commitment proof size is a significant issue.  Doesn't
> the current BIP157 protocol have each filter commit to the filter for
> the previous block?  If that's the case, shouldn't validating the
> commitment at the tip of the chain (or buried back whatever number of
> blocks that the SPV client trusts) obviate the need to validate the
> commitments for any preceding blocks in the SPV trust model?
>
> > Depending on the composition of blocks, this may outweigh the gains
> > had from taking advantage of the additional compression the prev outs
> > allow.
>
> I think those are unrelated points.  The gain from using a more
> efficient filter is saved bytes.  The gain from using block commitments
> is SPV-level security---that attacks have a definite cost in terms of
> generating proof of work instead of the variable cost of network
> compromise (which is effectively free in many situations).
>
> Comparing the extra bytes used by block commitments to the reduced bytes
> saved by prevout+output filters is like comparing the extra bytes used
> to download all blocks for full validation to the reduced bytes saved by
> only checking headers and merkle inclusion proofs in simplified
> validation.  Yes, one uses more bytes than the other, but they're
> completely different security models and so there's no normative way for
> one to "outweigh the gains" from the other.
>
> > So should we optimize for the ability to validate in a particular
> > model (better security), or lower bandwidth in this case?
>
> It seems like you're claiming better security here without providing any
> evidence for it.  The security model is "at least one of my peers is
> honest."  In the case of outpoint+output filters, when a client receives
> advertisements for different filters from different peers, it:
>
> 1. Downloads the corresponding block
> 2. Locally generates the filter for that block
> 3. Kicks any peers that advertised a different filter than what it
>    generated locally
>
> This ensures that as long as the client has at least one honest peer, it
> will see every transaction affecting its wallet.  In the case of
> prevout+output filters, when a client receives advertisements for
> different filters from different peers, it:
>
> 1. Downloads the corresponding block and checks it for wallet
>    transactions as if there had been a filter match
>
> This also ensures that as long as the client has at least one honest
> peer, it will see every transaction affecting its wallet.  This is
> equivalent security.
>
> In the second case, it's possible for the client to eventually
> probabilistically determine which peer(s) are dishonest and kick them.
> The most space efficient of these protocols may disclose some bits of
> evidence for what output scripts the client is looking for, but a
> slightly less space-efficient protocol simply uses randomly-selected
> outputs saved from previous blocks to make the probabilistic
> determination (rather than the client's own outputs) and so I think
> should be quite private.  Neither

Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
dating, why would they
> implement a considerable amount of logic for this?).

I don't think it's fair to compare those that wish to implement this proposal
(and actually do the validation) to the legacy SPV software that to my
knowledge is all but abandoned. The project I work on that seeks to deploy
this proposal (already has, but mainnet support is behind a flag as I
anticipated further modifications) indeed has implemented the "considerable"
amount of logic to check for discrepancies and ban peers trying to bamboozle
the light clients. I'm confident that the other projects seeking to
implement this (rust-bitcoin-spv, NBitcoin, bcoin, maybe missing a few too)
won't find it too difficult to implement "full" validation, as they're
bitcoin developers with quite a bit of experience.

I think we've all learned from the defects of past light clients, and don't
seek to repeat history by purposefully implementing as little validation as
possible. With these new projects by new authors, I think we have an
opportunity to implement light clients "correctly" this time around.

[1]:
https://github.com/TierNolan/bips/blob/00a8d3e1ac066ce3728658c6c40240e1c2ab859e/bip-aux-header.mediawiki

-- Laolu


On Fri, Jun 8, 2018 at 9:14 AM Gregory Maxwell  wrote:

> On Fri, Jun 8, 2018 at 5:03 AM, Olaoluwa Osuntokun via bitcoin-dev
>  wrote:
> > As someone who's written and reviews code integrating the proposal all
> > the way up the stack (from node to wallet, to application), IMO, there's
> > no immediate cost to deferring the inclusion/creation of a filter that
> > includes prev scripts (b) instead of the outpoint as the "regular" filter
> > does now. Switching to prev script in the _short term_ would be costly
> > for the set of applications already deployed (or deployed in a minimal or
> > flag flip gated fashion) as the move from prev script to outpoint is a
> > cascading one that impacts wallet operation, rescans, HD seed imports,
> > etc.
>
> It seems to me that you're making the argument against your own case
> here: I'm reading this as a "it's hard to switch so it should be done
> the inferior way".  That in argument against adopting the inferior
> version, as that will contribute more momentum to doing it in a way
> that doesn't make sense long term.
>
> > Such a proposal would need to be generalized enough to allow several
> > components to be committed,
>
> I don't agree at all, and I can't see why you say so.
>
> > likely have versioning,
>
> This is inherent in how e.g. the segwit commitment is encoded, the
> initial bytes are an identifying cookies. Different commitments would
> have different cookies.
>
> > and also provide the necessary extensibility to allow additional items
> > to be committed in the future
>
> What was previously proposed is that the commitment be required to be
> consistent if present but not be required to be present.  This would
> allow changing what's used by simply abandoning the old one.  Sparsity
> in an optional commitment can be addressed when there is less than
> 100% participation by having each block that includes a commitment
> commit to the missing filters from their immediate ancestors.
>
> Additional optionality can be provided by the other well known
> mechanisms,  e.g. have the soft fork expire at a block 5 years out
> past deployment, and continue to soft-fork it in for a longer term so
> long as its in use (or eventually without expiration if its clear that
> it's not going away).
>
> > wallets which wish to primarily use the filters for rescan purposes can't
> > just construct them locally for this particular use case independent of
> > what's currently deployed on the p2p network.
>
> Absolutely, but given the failure of BIP37 on the network-- and the
> apparent strong preference of end users for alternatives that don't
> scan (e.g. electrum and web wallets)-- supporting making this
> available via P2P was already only interesting to many as a nearly
> free side effect of having filters for local scanning.  If it's a
> different filter, it's no longer attractive.
>
> It seems to me that some people have forgotten that this whole idea
> was originally proposed to be committed data-- but with an added
> advantage of permitting experimentation ahead of the commitment.
>
> > Maintaining the outpoint also allows us to rely on a "single honest
> peer"security model in the short term.
>
> You can still scan blocks directly when peers disagree on the filter
> content, regardless of how the filter is constructed-- yes, it uses
> more bandwidth if you're attacked, but it makes the attack ineffective
> and using outpoi

Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-07 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi sipa,

> The advantage of (a) is that it can be verified against a full block
> without access to the outputs being spent by it.
>
> The advantage of (b) is that it is more compact (script reuse, and outputs
> spent within the same block as they are created).

Thanks for this breakdown. I think you've accurately summarized the sole
remaining discussion point in this thread.
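
In other words, the two candidates differ only in which input-side element
gets hashed into the filter; as a tiny sketch (field names are mine, purely
illustrative, not from the BIP):

    // Illustrative only: the two candidate contents for the "basic" filter.
    type FilterSpec struct {
        OutputScripts      bool // scriptPubKeys created by the block
        InputPrevOutpoints bool // (a) outpoints spent by the block's inputs
        InputPrevScripts   bool // (b) scriptPubKeys of the outputs being spent
    }

    var (
        optionA = FilterSpec{OutputScripts: true, InputPrevOutpoints: true}
        optionB = FilterSpec{OutputScripts: true, InputPrevScripts: true}
    )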

As someone who's written and reviews code integrating the proposal all the
way up the stack (from node to wallet, to application), IMO, there's no
immediate cost to deferring the inclusion/creation of a filter that includes
prev scripts (b) instead of the outpoint as the "regular" filter does now.
Switching to prev script in the _short term_ would be costly for the set of
applications already deployed (or deployed in a minimal or flag flip gated
fashion) as the move from prev script to outpoint is a cascading one that
impacts wallet operation, rescans, HD seed imports, etc.

Maintaining the outpoint also allows us to rely on a "single honest peer"
security model in the short term. In the long term the main barrier to
committing the filters isn't choosing what to place in the filters (as once
you have the gcs code, adding/removing elements is a minor change), but the
actual proposal to add new consensus enforced commitments to Bitcoin in the
first place. Such a proposal would need to be generalized enough to allow
several components to be committed, likely have versioning, and also provide
the necessary extensibility to allow additional items to be committed in the
future. To my knowledge no such soft-fork has yet been proposed in a serious
manner, although we have years of brainstorming on the topic. The timeline
of the drafting, design, review, and deployment of such a change would
likely be measured in years, compared to the immediate deployment of the
current p2p filter model proposed in the BIP.

As a result, I see no reason to delay the p2p filter deployment (with the
outpoint) in the short term, as the long lead time of a soft-fork to add
extensible commitments to Bitcoin would give application+wallet authors
ample time to switch to the new model. Also there's no reason that full-node
wallets which wish to primarily use the filters for rescan purposes can't
just construct them locally for this particular use case independent of
what's currently deployed on the p2p network.

Finally, I've addressed the remaining comments on my PR modifying the BIP
from my last message.

-- Laolu

On Sat, Jun 2, 2018 at 11:12 PM Pieter Wuille via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Sat, Jun 2, 2018, 22:56 Tamas Blummer via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> Lighter but SPV secure nodes (filter committed) would help the network
>> (esp. Layer 2) to grow mesh-like, but add more users that blindly follow PoW.
>>
>> In the longer term most users' security will be determined by either
>> trusted hubs or PoW.
>> I do not know which is worse, but we should at least offer the choice to
>> the user, therefore commit filters.
>>
>
> I don't think that's the point of discussion here. Of course, in order to
> have filters that verifiably don't lie by omission, the filters need to be
> committed to by blocks.
>
> The question is what data that filter should contain.
>
> There are two suggestions:
> (a) The scriptPubKeys of the block's outputs, and prevouts of the block's
> inputs.
> (b) The scriptPubKeys of the block's outputs, and scriptPubKeys of outputs
> being spent by the block's inputs.
>
> The advantage of (a) is that it can be verified against a full block
> without access to the outputs being spent by it. This allows light clients
> to ban nodes that give them incorrect filters, but they do need to actually
> see the blocks (partially defeating the purpose of having filters in the
> first place).
>
> The advantage of (b) is that it is more compact (script reuse, and outputs
> spent within the same block as they are created). It also has the advantage
> of being more easily usable for scanning of a wallet's transactions. Using
> (a) for that in some cases may need to restart and refetch when an output
> is discovered, to go test for its spending (whose outpoint is not known
> ahead of time). Especially when fetching multiple filters at a time this
> may be an issue.
>
> I think both of these are potentially good arguments. However, once a
> committed filter exists, the advantage of (a) goes away completely -
> validation of committed filters is trivial and can be done without needing
> the full blocks in the first place.
>
> So I think the question is do we aim for an uncommitted (a) first and a
> committed (b) later, or go for (b) immediately?
>
> Cheers,
>
> --
> Pieter
>
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___

Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-05 Thread Olaoluwa Osuntokun via bitcoin-dev
It isn't being discussed atm (but was discussed 1 year ago when the BIP
draft was originally published), as we're in the process of removing items
or filters that aren't absolutely necessary. We're now at the point where
there're no longer any items we can remove w/o making the filters less
generally useful, which signals a stopping point, so we can begin widespread
deployment.

In terms of a future extension, BIP 158 already defines custom filter types,
and BIP 157 allows filters to be fetched in batch based on the block height
and numerical range. The latter feature can later be modified to return a
single composite filter rather than several individual filters.

-- Laolu


On Mon, Jun 4, 2018 at 7:28 AM Riccardo Casatta via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> I was wondering why this multi-layer multi-block filter proposal isn't
> getting any comment,
> is it because not asking all filters is leaking information?
>
> Thanks
>
> Il giorno ven 18 mag 2018 alle ore 08:29 Karl-Johan Alm via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> ha scritto:
>
>> On Fri, May 18, 2018 at 12:25 AM, Matt Corallo via bitcoin-dev
>>  wrote:
>> > In general, I'm concerned about the size of the filters making existing
>> > SPV clients less willing to adopt BIP 158 instead of the existing bloom
>> > filter garbage and would like to see a further exploration of ways to
>> > split out filters to make them less bandwidth intensive. Some further
>> > ideas we should probably play with before finalizing moving forward is
>> > providing filters for certain script templates, eg being able to only
>> > get outputs that are segwit version X or other similar ideas.
>>
>> There is also the idea of multi-block filters. The idea is that light
>> clients would download a pair of filters for blocks X..X+255 and
>> X+256..X+511, check if they have any matches and then grab pairs for
>> any that matched, e.g. X..X+127 & X+128..X+255 if left matched, and
>> iterate down until it ran out of hits-in-a-row or it got down to
>> single-block level.
>>
>> This has an added benefit where you can accept a slightly higher false
>> positive rate for bigger ranges, because the probability of a specific
>> entry having a false positive in each filter is (empirically speaking)
>> independent. I.e. a FP probability of 1% in the 256-block range filter
>> and a FP probability of 0.1% in the 128-block range filter would mean
>> the combined probability is actually 0.001%.
>>
>> Wrote about this here: https://bc-2.jp/bfd-profile.pdf (but the filter
>> type is different in my experiments)
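
A minimal sketch (helper names are hypothetical, not from any BIP) of the
range-halving lookup described in the quote above: probe the filter for a
block range, and only recurse into halves that still match, bottoming out at
individual blocks:

    // matchRange stands in for fetching the filter covering [start, end]
    // and testing the client's watch list against it.
    func findCandidateBlocks(start, end uint32, matchRange func(start, end uint32) bool) []uint32 {
        if !matchRange(start, end) {
            return nil // nothing of ours in this range (barring false positives)
        }
        if start == end {
            return []uint32{start} // single-block filter matched: fetch this block
        }
        mid := start + (end-start)/2
        return append(findCandidateBlocks(start, mid, matchRange),
            findCandidateBlocks(mid+1, end, matchRange)...)
    }
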
>> ___
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>
>
> --
> Riccardo Casatta - @RCasatta 
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-06-01 Thread Olaoluwa Osuntokun via bitcoin-dev
> A typical network attacker (e.g. someone on your lan or wifi segment, or
> someone who has compromised or operates an upstream router) can be all of
> your peers.

This is true, but it cannot make us accept any invalid filters unless the
attacker is also creating invalid blocks w/ valid PoW.

> The original proposal for using these kinds of maps was that their digests
> could eventually be committed and then checked against the commitment,
> matching the same general security model used otherwise in SPV.

Indeed, but no such proposal for committing the filters has emerged yet.
Slinging filters with new p2p messages requires much less coordination than
adding a new committed structure to Bitcoin. One could imagine that if
consensus exists to add new committed structures, then there may also be
initiatives to start to commit sig-ops, block weight, utxo's etc. As a
result one could imagine a much longer deployment cycle compared to a pure
p2p roll out in the near term, and many applications are looking for a
viable alternative to BIP 37.

> Unfortunately, using the scripts instead of the outpoints takes us further
> away from a design that is optimized for committing (or, for that matter,
> use purely locally by a wallet)...

I agree that using the prev input scripts would indeed be optimal from a
size perspective when the filters are to be committed. The current proposal
makes way for future filter types, and it's likely the case that only the
most optimal filters should be committed (while other, more niche filters
perhaps remain only at the p2p level).

-- Laolu


On Thu, May 31, 2018 at 9:14 PM Gregory Maxwell  wrote:

> On Fri, Jun 1, 2018 at 2:52 AM, Olaoluwa Osuntokun via bitcoin-dev
>  wrote:
> > One notable thing that I left off is the proposed change to use the
> > previous output script rather than the outpoint. Modifying the filters in
> > this fashion would be a downgrade in the security model for light
> > clients, as it
>
> Only if you make a very strong assumption about the integrity of the
> nodes the client is talking to. A typical network attacker (e.g.
> someone on your lan or wifi segment, or someone who has compromised or
> operates an upstream router) can be all of your peers.
>
> The original proposal for using these kinds of maps was that their
> digests could eventually be committed and then checked against the
> commitment, matching the same general security model used otherwise in
> SPV.
>
> Unfortunately, using the scripts instead of the outpoints takes us
> further away from a design that is optimized for committing (or, for
> that matter, use purely locally by a wallet)...
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-31 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi y'all,

I've made a PR to the BIP repo to modify BIP 158 based on this thread, and
other recent threads giving feedback on the current version of the BIP:

  * https://github.com/bitcoin/bips/pull/687

I've also updated the test vectors based on the current parameters (and
filter format), and also the code used to generate the test vectors. Due to
the change in parametrization, the test vectors now target (P=19 M=784931),
and there're no longer any cases related to extended filters.

One notable thing that I left off is the proposed change to use the previous
output script rather than the outpoint. Modifying the filters in this
fashion would be a downgrade in the security model for light clients, as it
would allow full nodes to lie by omission, just as they can with BIP 37. As
is now, if nodes present conflicting information, then the light client can
download the target block, fully reconstruct the filter itself, then ban any
nodes which advertised the incorrect filter. The inclusion of the filter
header checkpoints makes it rather straightforward for light clients to
bisect the state to find the conflicting advertisement, and it's strongly
recommended that they do so.
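
As a rough sketch (not the actual lnd/btcd code) of that reconstruct-and-ban
step, assuming a hypothetical buildBasicFilter helper that derives the
serialized BIP 158 basic filter from a raw block:

    // Rebuild the filter locally from the downloaded block and ban every
    // peer whose advertised filter for that block differs.
    func auditPeers(rawBlock []byte, advertised map[string][]byte,
        buildBasicFilter func([]byte) []byte) (banned []string) {

        want := buildBasicFilter(rawBlock)
        for peer, got := range advertised {
            if string(got) != string(want) {
                banned = append(banned, peer)
            }
        }
        return banned
    }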

To get a feel for the level of impact these changes would have on existing
applications that depend on the txid being included in the filter, I've
implemented these changes across btcutil, btcd, btcwallet, and lnd (which
previously relied on the txid for confirmation notifications). For lnd at
least, the code impact was rather minimal, as we use the pkScript for
matching a block, but then still scan the block manually to find the precise
transaction (by txid) that we were interested in (if it's there).

-- Laolu


On Mon, May 28, 2018 at 9:01 PM Olaoluwa Osuntokun 
wrote:

> > The additional benefit of the input script/outpoint filter is to watch
> > for unexpected spends (coins getting stolen or spent from another wallet)
> > or transactions without a unique change or output address. I think this
> > is a reasonable implementation, and it would be nice to be able to
> > download that filter without any input elements.
>
> As someone who's implemented a complete integration of the filtering
> technique into an existing wallet, and a higher-level application, I disagree.
> There's not much gain to be had in splitting up the filters: it'll result
> in
> additional round trips (to fetch these distinct filter) during normal
> operation, complicate routine seed rescanning logic, and also is
> detrimental
> to privacy if one is fetching blocks from the same peer as they've
> downloaded the filters from.
>
> However, I'm now convinced that the savings had by including the prev
> output
> script (addr re-use and outputs spent in the same block as they're created)
> outweigh the additional bookkeeping required in an implementation (when
> extracting the precise tx that matched) compared to using regular outpoint
> as we do currently. Combined with the recently proposed re-parametrization
> of the gcs parameters[1], the filter size should shrink by quite a bit!
>
> I'm very happy with the review the BIPs has been receiving as of late. It
> would've been nice to have this 1+ year ago when the draft was initially
> proposed, but better late than never!
>
> Based on this thread, [1], and discussions on various IRC channels, I plan
> to make the following modifications to the BIP:
>
>   1. use P=2^19 and M=784931 as gcs parameters, and also bind these to the
>  filter instance, so future filter types may use distinct parameters
>   2. use the prev output script rather than the prev input script in the
>  regular filter
>   3. remove the txid from the regular filter (as with some extra
> book-keeping
>  the output script is enough)
>   4. do away with the extended filter altogether, as our original use
> case
>  for it has been nerfed as the filter size grew too large when doing
>  recursive parsing. instead we watch for the outpoint being spent and
>  extract the pre-image from it if it matches now
>
> The resulting changes should slash the size of the filters, yet still
> ensure
> that they're useful enough for our target use case.
>
> [1]:
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-May/016029.html
>
> -- Laolu
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-28 Thread Olaoluwa Osuntokun via bitcoin-dev
> The additional benefit of the input script/outpoint filter is to watch for
> unexpected spends (coins getting stolen or spent from another wallet) or
> transactions without a unique change or output address. I think this is a
> reasonable implementation, and it would be nice to be able to download
> that filter without any input elements.

As someone who's implemented a complete integration of the filtering
technique into an existing wallet, and a higher-level application, I disagree.
There's not much gain to be had in splitting up the filters: it'll result in
additional round trips (to fetch these distinct filters) during normal
operation, complicate routine seed rescanning logic, and also is detrimental
to privacy if one is fetching blocks from the same peer as they've
downloaded the filters from.

However, I'm now convinced that the savings had by including the prev output
script (addr re-use and outputs spent in the same block as they're created)
outweigh the additional bookkeeping required in an implementation (when
extracting the precise tx that matched) compared to using regular outpoint
as we do currently. Combined with the recently proposed re-parametrization
of the gcs parameters[1], the filter size should shrink by quite a bit!

I'm very happy with the review the BIPs has been receiving as of late. It
would've been nice to have this 1+ year ago when the draft was initially
proposed, but better late than never!

Based on this thread, [1], and discussions on various IRC channels, I plan
to make the following modifications to the BIP:

  1. use P=2^19 and M=784931 as gcs parameters, and also bind these to the
 filter instance, so future filter types may use distinct parameters
  2. use the prev output script rather than the prev input script in the
 regular filter
  3. remove the txid from the regular filter (as with some extra book-keeping
 the output script is enough)
  4. do away with the extended filter altogether, as our original use case
 for it has been nerfed as the filter size grew too large when doing
 recursive parsing. instead we watch for the outpoint being spent and
 extract the pre-image from it if it matches now

The resulting changes should slash the size of the filters, yet still ensure
that they're useful enough for our target use case.

[1]:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-May/016029.html

-- Laolu
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-21 Thread Olaoluwa Osuntokun via bitcoin-dev
> What if instead of trying to decide up front which subset of elements will
> be most useful to include in the filters, and the size tradeoff, we let
the
> full-node decide which subsets of elements it serves filters for?

This is already the case. The current "track" is to add new service bits
(while we're in the uncommitted phase) to introduce new filter types. Light
clients can then filter out nodes before even connecting to them.
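
Purely as an illustration (the bit position below is an assumption for this
sketch, not an assigned value), gating outbound connections on such a
service bit is a one-line check:

    const NodeCompactFilters uint64 = 1 << 6 // assumed bit, for illustration

    func servesFilters(services uint64) bool {
        return services&NodeCompactFilters != 0
    }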

-- Laolu

On Mon, May 21, 2018 at 1:35 AM Johan Torås Halseth <joha...@gmail.com>
wrote:

> Hi all,
>
> Most light wallets will want to download the minimum amount of data
> required to operate, which means they would ideally download the smallest
> possible filters containing the subset of elements they need.
>
> What if instead of trying to decide up front which subset of elements will
> be most useful to include in the filters, and the size tradeoff, we let the
> full-node decide which subsets of elements it serves filters for?
>
> For instance, a full node would advertise that it could serve filters for
> the subsets 110 (txid+script+outpoint), 100 (txid only), 011 (script+outpoint)
> etc. A light client could then choose to download the minimal filter type
> covering its needs.
>
> The obvious benefit of this would be minimal bandwidth usage for the light
> client, but there are also some less obvious ones. We wouldn’t have to
> decide up front what each filter type should contain, only the possible
> elements a filter can contain (more can be added later without breaking
> existing clients). This, I think, would let the most served filter types
> grow organically, with full-node implementations coming with sane defaults
> for served filter types (maybe even all possible types as long as the
> number of elements is small), letting their operator add/remove types at
> will.
>
> The main disadvantage of this as I see it, is that there’s an exponential
> blowup in the number of possible filter types in the number of element
> types. However, this would let us start out small with only the elements we
> need, and in the worst case the node operators just choose to serve the
> subsets corresponding to what now is called “regular” + “extended” filters
> anyway, requiring no more resources.
>
> This would also give us some data on what is the most widely used filter
> types, which could be useful in making the decision on what should be part
> of filters to eventually commit to in blocks.
>
> - Johan
> On Sat, May 19, 2018 at 5:12, Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
> On Thu, May 17, 2018 at 2:44 PM Jim Posen via bitcoin-dev 
>>> Monitoring inputs by scriptPubkey vs input-txid also has a massive
>>> advantage for parallel filtering: You can usually know your pubkeys
>>> well in advance, but if you have to change what you're watching block
>>> N+1 for based on the txids that paid you in N you can't filter them
>>> in parallel.
>>>
>>
>> Yes, I'll grant that this is a benefit of your suggestion.
>>
>
> Yeah parallel filtering would be pretty nice. We've implemented serial
> filtering for btcwallet [1] for the use-case of rescanning after a seed
> phrase import. Parallel filtering would help here, but also we don't yet
> take advantage of batch querying for the filters themselves. This would
> speed up the scanning by quite a bit.
>
> I really like the filtering model though, it really simplifies the code,
> and we can leverage identical logic for btcd (which has RPCs to fetch the
> filters) as well.
>
> [1]:
> https://github.com/Roasbeef/btcwallet/blob/master/chain/neutrino.go#L180
>
> ___ bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-21 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Y'all,

The script finished a few days ago with the following results:

reg-filter-prev-script total size:   161236078  bytes
reg-filter-prev-script avg:          16123.6078 bytes
reg-filter-prev-script median:       16584      bytes
reg-filter-prev-script max:          59480      bytes

Compared to the original median size of the same block range, but with the
current filter (has both txid, prev outpoint, output scripts), we see a
roughly 34% reduction in filter size (current median is 22258 bytes).
Compared to the suggested modified filter (no txid, prev outpoint, output
scripts), we see a 15% reduction in size (median of that was 19198 bytes).
This shows that script re-use is still pretty prevalent in the chain as of
recent.

One thing that occurred to me is that on the application level, switching
to the input prev output script can make things a bit awkward. Observe that
when looking for matches in the filter, upon a match, one would need access
to an additional (outpoint -> script) map in order to locate _which_
particular transaction matched w/o access to an up-to-date UTXO set. In
contrast, as is atm, one can locate the matching transaction with no
additional information (as we're matching on the outpoint).
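
A small sketch of the extra bookkeeping this implies (types and helpers here
are hypothetical): to recover _which_ transaction spent a watched output,
the wallet consults its own outpoint -> script map while walking the block's
inputs:

    type OutPoint struct {
        TxID  [32]byte
        Index uint32
    }

    // Returns the index of the first tx in the block whose inputs spend
    // an outpoint whose previous output script we're watching, or -1 if
    // none does (likely a filter false positive).
    func findSpendingTx(blockInputs [][]OutPoint, scriptByOutpoint map[OutPoint][]byte,
        watched func(script []byte) bool) int {

        for txIdx, ins := range blockInputs {
            for _, op := range ins {
                if script, ok := scriptByOutpoint[op]; ok && watched(script) {
                    return txIdx
                }
            }
        }
        return -1
    }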

At this point, if we feel filter sizes need to drop further, then we may
need to consider raising the false positive rate.

Does anyone have any estimates or direct measures w.r.t how much bandwidth
current BIP 37 light clients consume? It would be nice to have a direct
comparison. We'd need to consider the size of their base bloom filter, the
accumulated bandwidth as a result of repeated filterload commands (to adjust
the fp rate), and also the overhead of receiving the merkle branch and
transactions in distinct messages (both due to matches and false positives).

Finally, I'd be open to removing the current "extended" filter from the BIP
altogether for now. If a compelling use case for being able to
filter the sigScript/witness arises, then we can examine re-adding it with a
distinct service bit. After all it would be harder to phase out the filter
once wider deployment was already reached. Similarly, if the 16% savings
achieved by removing the txid is attractive, then we can create an
additional filter just for the txids to allow those applications which need
the information to seek out that extra filter.

-- Laolu


On Fri, May 18, 2018 at 8:06 PM Pieter Wuille <pieter.wui...@gmail.com>
wrote:

> On Fri, May 18, 2018, 19:57 Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> Greg wrote:
>> > What about also making input prevouts filter based on the scriptpubkey
>> > being _spent_?  Layering wise in the processing it's a bit ugly, but if you
>> > validated the block you have the data needed.
>>
>> AFAICT, this would mean that in order for a new node to catch up the
>> filter
>> index (index all historical blocks), they'd either need to: build up a
>> utxo-set in memory during indexing, or would require a txindex in order to
>> look up the prev out's script. The first option increases the memory load
>> during indexing, and the second requires nodes to have a transaction index
>> (and would also add considerable I/O load). When proceeding from tip, this
>> doesn't add any additional load assuming that you synchronously index the
>> block as you validate it, otherwise the utxo set will already have been
>> updated (the spent scripts removed).
>>
>
> I was wondering about that too, but it turns out that isn't necessary. At
> least in Bitcoin Core, all the data needed for such a filter is in the
> block + undo files (the latter contain the scriptPubKeys of the outputs
> being spent).
>
> I have a script running to compare the filter sizes assuming the regular
>> filter switches to include the prev out's script rather than the prev
>> outpoint itself. The script hasn't yet finished (due to the increased I/O
>> load to look up the scripts when indexing), but I'll report back once it's
>> finished.
>>
>
> That's very helpful, thank you.
>
> Cheers,
>
> --
> Pieter
>
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-18 Thread Olaoluwa Osuntokun via bitcoin-dev
On Thu, May 17, 2018 at 2:44 PM Jim Posen via bitcoin-dev wrote:

>> Monitoring inputs by scriptPubkey vs input-txid also has a massive
>> advantage for parallel filtering:  You can usually know your pubkeys
>> well in advance, but if you have to change what you're watching block
>>  N+1 for based on the txids that paid you in N you can't filter them
>> in parallel.
>>
>
> Yes, I'll grant that this is a benefit of your suggestion.
>

Yeah parallel filtering would be pretty nice. We've implemented serial
filtering for btcwallet [1] for the use-case of rescanning after a seed
phrase import. Parallel filtering would help here, but also we don't yet
take advantage of batch querying for the filters themselves. This would
speed up the scanning by quite a bit.
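
A rough sketch (helpers and the batch size are assumptions here, not
finalized protocol values) of what batched filter fetching during a rescan
could look like:

    // Fetch filters for a rescan in fixed-size windows rather than one at
    // a time. fetchFilters stands in for a batched network request over a
    // height range; process handles each returned filter.
    func rescanFilters(startHeight, tipHeight uint32,
        fetchFilters func(start, stop uint32) [][]byte,
        process func(height uint32, filter []byte)) {

        const batchSize = 1000 // assumed per-request cap
        for h := startHeight; h <= tipHeight; h += batchSize {
            stop := h + batchSize - 1
            if stop > tipHeight {
                stop = tipHeight
            }
            for i, f := range fetchFilters(h, stop) {
                process(h+uint32(i), f)
            }
        }
    }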

I really like the filtering model though, it really simplifies the code,
and we can leverage identical logic for btcd (which has RPCs to fetch the
filters) as well.

[1]:
https://github.com/Roasbeef/btcwallet/blob/master/chain/neutrino.go#L180
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-18 Thread Olaoluwa Osuntokun via bitcoin-dev
Riccardo wrote:
> The BIP recalls some go code for how the parameter has been selected which
> I can hardly understand and run

The code you're linking to is for generating test vectors (to allow
implementations to check the correctness of their gcs filters). The name of
the file is 'gentestvectors.go'. It produces CSV files which contain test
vectors for various testnet blocks at various false positive rates.

> it's totally my fault but if possible I would really like more details on
> the process, like charts and explanations

When we published the BIP draft last year (wow, time flies!), we put up code
(as well as an interactive website) showing the process we used to arrive at
the current false positive rate. The aim was to minimize the bandwidth
required to download each filter plus the expected bandwidth from
downloading "large-ish" full segwit blocks. The code simulated a few wallet
types (in terms of number of addrs, etc) focusing on a "mid-sized" wallet.
One could also model the selection as a Bernoulli process where we attempt
to compute the probability that after k queries (let's say you have k
addresses) we have k "successes". A success would mean the queried item
wasn't found in the filter, while a failure is a filter match (false
positive or not). A failure in the process requires fetching the entire
block.
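
As a back-of-the-envelope illustration (the helper and numbers below are
mine, not from the BIP), the expected block-download overhead from false
positives can be estimated as:

    // With a per-item false positive rate of 1/m and k watched items, a
    // block's filter matches spuriously with probability 1-(1-1/m)^k, and
    // each spurious match costs one full block download. Uses math.Pow.
    func expectedWastedBytes(m float64, k, numBlocks int, avgBlockBytes float64) float64 {
        pFalseMatch := 1 - math.Pow(1-1/m, float64(k))
        return pFalseMatch * float64(numBlocks) * avgBlockBytes
    }

    // e.g. m=784931, k=100 addresses, over 1000 blocks of ~1.2MB each,
    // works out to roughly 150KB of expected wasted block downloads.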

-- Laolu

On Fri, May 18, 2018 at 5:35 AM Riccardo Casatta via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Another parameter which heavily affects filter size is the false positive
> rate which is empirically set to 2^-20.
> The BIP recalls some go code for how the parameter has been selected
> which I can hardly understand and
> run, it's totally my fault but if possible I would really like more details
> on the process, like charts and explanations (for example, which is the
> number of elements to search for which the filter has been optimized for?)
>
> Instinctively I feel 2^-20 is super low and choosing a lot higher alpha
> will shrink the total filter size by gigabytes at the cost of having to
> wastefully download just some megabytes of blocks.
>
>
> 2018-05-17 18:36 GMT+02:00 Gregory Maxwell via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org>:
>
>> On Thu, May 17, 2018 at 3:25 PM, Matt Corallo via bitcoin-dev
>>  wrote:
>> > I believe (1) could be skipped entirely - there is almost no reason why
>> > you'd not be able to filter for, eg, the set of output scripts in a
>> > transaction you know about
>>
>> I think this is convincing for the txids themselves.
>>
>> What about also making input prevouts filter based on the scriptpubkey
>> being _spent_?  Layering wise in the processing it's a bit ugly, but
>> if you validated the block you have the data needed.
>>
>> This would eliminate the multiple data type mixing entirely.
>> ___
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>
>
>
> --
> Riccardo Casatta - @RCasatta 
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-18 Thread Olaoluwa Osuntokun via bitcoin-dev
Greg wrote:
> What about also making input prevouts filter based on the scriptpubkey
> being _spent_?  Layering wise in the processing it's a bit ugly, but if you
> validated the block you have the data needed.

AFAICT, this would mean that in order for a new node to catch up the filter
index (index all historical blocks), they'd either need to: build up a
utxo-set in memory during indexing, or would require a txindex in order to
look up the prev out's script. The first option increases the memory load
during indexing, and the second requires nodes to have a transaction index
(and would also add considerable I/O load). When proceeding from tip, this
doesn't add any additional load assuming that you synchronously index the
block as you validate it, otherwise the utxo set will already have been
updated (the spent scripts removed).

I have a script running to compare the filter sizes assuming the regular
filter switches to include the prev out's script rather than the prev
outpoint itself. The script hasn't yet finished (due to the increased I/O
load to look up the scripts when indexing), but I'll report back once it's
finished.

-- Laolu


On Thu, May 17, 2018 at 9:37 AM Gregory Maxwell via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Thu, May 17, 2018 at 3:25 PM, Matt Corallo via bitcoin-dev
>  wrote:
> > I believe (1) could be skipped entirely - there is almost no reason why
> > you'd not be able to filter for, eg, the set of output scripts in a
> > transaction you know about
>
> I think this is convincing for the txids themselves.
>
> What about also making input prevouts filter based on the scriptpubkey
> being _spent_?  Layering wise in the processing it's a bit ugly, but
> if you validated the block you have the data needed.
>
> This would eliminate the multiple data type mixing entirely.
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-18 Thread Olaoluwa Osuntokun via bitcoin-dev
Matt wrote:
> I believe (1) could be skipped entirely - there is almost no reason why
> you'd not be able to filter for, eg, the set of output scripts in a
> transaction you know about

Depending on the use-case, the txid is more precise than searching for the
output script as it doesn't need to deal with duplicated output scripts. To
my knowledge, lnd is the only major project that currently utilizes BIP
157+158. At this point, we use the txid in the regular filter for
confirmations (channel confirmed, sweep tx confirmed, cltv confirmed, etc).
Switching to use output scripts instead wouldn't be _too_ invasive w.r.t
changes required in the codebase, only the need to deal with output script
duplication could be annoying.

> (2) and (3) may want to be split out - many wallets may wish to just find
> transactions paying to them, as transactions spending from their outputs
> should generally be things they've created.

FWIW, in the "rescan after importing by seed phrase" both are needed in
order to ensure the wallet ends up with the proper output set after the
scan. In lnd we actively use both (2) to detect deposits to the internal
wallet, and (3) to be notified when our channel outputs are spent on-chain
(and also generally when any of our special scripts are spent).

> In general, I'm concerned about the size of the filters making existing SPV
> clients less willing to adopt BIP 158 instead of the existing bloom filter
> garbage and would like to see a further exploration of ways to split out
> filters to make them less bandwidth intensive.

Agreed that the current filter size may prevent adoption amongst wallets.
However, the other factor that will likely prevent adoption amongst current
BIP-37 mobile wallets is the lack of support for notifying _unconfirmed_
transactions. When we drafted up the protocol last year and asked around,
this was one of the major points of contention amongst existing mobile
wallets that utilize BIP 37.

On the other hand, the two "popular" BIP 37 wallets I'm aware of
(Breadwallet, and Andreas Schildbach's Bitcoin Wallet) have lagged massively
behind the existing set of wallet related protocol upgrades. For example,
neither of them have released versions of their applications that take
advantage of segwit in any manner. Breadwallet has more or less "pivoted"
(they did an ICO and have a token) and instead is prioritizing things like
adding random ICO tokens over catching up with the latest protocol updates.
Based on this behavior, even if the filter sizes were even _more_ bandwidth
efficient that BIP 37, I don't think they'd adopt the protocol.

> Some further ideas we should probably play with before finalizing moving
> forward is providing filters for certain script templates, eg being able
> to only get outputs that are segwit version X or other similar ideas.

Why should this block active deployment of BIP 157+158 as is now? As
defined, the protocol already allows future updates to add additional filter
types. Before the filters are committed, each filter type requires a new
filter header. We could move to a single filter header that commits to the
hashes of _all_ filters, but that would mean that a node couldn't serve the
headers unless they had all currently defined features, defeating the
optionality offered.

Additionally, more filters entails more disk utilization for nodes serving
these filters. Nodes have the option to instead create the filters at "query
time", but then this counters the benefit of simply slinging the filters
from disk (or a memory map or w/e). IMO, it's a desirable feature that
serving light clients no longer requires active CPU+I/O and instead just
passive I/O (nodes could even write the filters to disk in protocol msg
format).

To get a feel for the current filter sizes, a txid-only filter size, and a
regular filter w/o txid's, I ran some stats on the last 10k blocks:

regular size:        217107653  bytes
regular avg:         21710.7653 bytes
regular median:      22332      bytes
regular max:         61901      bytes

txid-only size:      34518463   bytes
txid-only avg:       3451.8463  bytes
txid-only median:    3258       bytes
txid-only max:       10193      bytes

reg-no-txid size:    182663961  bytes
reg-no-txid avg:     18266.3961 bytes
reg-no-txid median:  19198      bytes
reg-no-txid max:     60172      bytes

So the median regular filter size over the past 10k blocks is ~22KB. If we
extract the txid from the regular filter and add a txid-only filter, the
median size of that is 3.2KB. Finally, the median size of a modified regular
filter (no txid) is 19KB.

-- Laolu


On Thu, May 17, 2018 at 8:33 AM Matt Corallo via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> BIP 158 currently includes the following in the "basic" filter: 1)
> txids, 2) output scripts, 3) input prevouts.
>
> I believe (1) could be skipped entirely - there is almost no reason why
> you'd not be able to filter for, eg, the set of output scripts in a
> transaction you know about and (2) 

Re: [bitcoin-dev] BIP sighash_noinput

2018-05-09 Thread Olaoluwa Osuntokun via bitcoin-dev
> The current proposal kind-of limits the potential damage by still
> committing to the prevout amount, but it still seems a big risk for all
> the people that reuse addresses, which seems to be just about everyone.

The typical address re-use concern doesn't apply here, as this is a sighash
flag that would only really be used for various contracts on Bitcoin. I don't
see any reason why "regular" wallets would update to use this sighash flag.
We've also seen first hand with segwit that wallet authors are slow to pull
in the latest and greatest features available, even if they solve nuisance
issues like malleability and can result in lower fees.

IMO, sighash_none is an even bigger footgun that already exists in the
protocol today.

-- Laolu


On Tue, May 8, 2018 at 7:41 AM Anthony Towns via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Mon, May 07, 2018 at 09:40:46PM +0200, Christian Decker via bitcoin-dev
> wrote:
> > Given the general enthusiasm, and lack of major criticism, for the
> > `SIGHASH_NOINPUT` proposal, [...]
>
> So first, I'm not sure if I'm actually criticising or playing devil's
> advocate here, but either way I think criticism always helps produce
> the best proposal, so
>
> The big concern I have with _NOINPUT is that it has a huge failure
> case: if you use the same key for multiple inputs and sign one of them
> with _NOINPUT, you've spent all of them. The current proposal kind-of
> limits the potential damage by still committing to the prevout amount,
> but it still seems a big risk for all the people that reuse addresses,
> which seems to be just about everyone.
>
> I wonder if it wouldn't be ... I'm not sure better is the right word,
> but perhaps "more realistic" to have _NOINPUT be a flag to a signature
> for a hypothetical "OP_CHECK_SIG_FOR_SINGLE_USE_KEY" opcode instead,
> so that it's fundamentally not possible to trick someone who regularly
> reuses keys to sign something for one input that accidently authorises
> spends of other inputs as well.
>
> Is there any reason why an OP_CHECKSIG_1USE (or OP_CHECKMULTISIG_1USE)
> wouldn't be equally effective for the forseeable usecases? That would
> ensure that a _NOINPUT signature is only ever valid for keys deliberately
> intended to be single use, rather than potentially valid for every key.
>
> It would be ~34 witness bytes worse than being able to spend a Schnorr
> aggregate key directly, I guess; but that's not worse than the normal
> taproot tradeoff: you spend the aggregate key directly in the normal,
> cooperative case; and reserve the more expensive/NOINPUT case for the
> unusual, uncooperative cases. I believe that works fine for eltoo: in
> the cooperative case you just do a SIGHASH_ALL spend of the original
> transaction, and _NOINPUT isn't needed.
>
> Maybe a different opcode maybe makes sense at a "philosophical" level:
> normal signatures are signing a spend of a particular "coin" (in the
> UTXO sense), while _NOINPUT signatures are in some sense signing a spend
> of an entire "wallet" (all the coins spendable by a particular key, or
> more accurately for the current proposal, all the coins of a particular
> value spendable by a particular key). Those are different intentions,
> so maybe it's reasonable to encode them in different addresses, which
> in turn could be done by having a new opcode for _NOINPUT.
>
> A new opcode has the theoretical advantage that it could be deployed
> into the existing segwit v0 address space, rather than waiting for segwit
> v1. Not sure that's really meaningful, though.
>
> Cheers,
> aj
>
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Making OP_TRUE standard?

2018-05-09 Thread Olaoluwa Osuntokun via bitcoin-dev
> Instead, would you consider to use ANYONECANPAY to sign the tx, so it is
> possible to add more inputs for fees? The total tx size is bigger than the
> OP_TRUE approach, but you don’t need to ask for any protocol change.

If one has a "root" commitment with other nested descendent
multi-transaction contracts, then changing the txid of the root commitment
will invalidated all the nested multi tx contracts. In our specific case, we
have pre-signed 2-stage HTLC transaction which rely on a stable txid. As a
result, we can't use the ANYONECANPAY approach atm.

> In long-term, I think the right way is to have a more flexible SIGHASH
> system to allow people to add more inputs and outputs easily.

Agreed, see the recent proposal to introduce SIGHASH_NOINPUT as a new
sighash type. IMO it presents an opportunity to introduce more flexible,
fine-grained sighash inclusion control.

-- Laolu


On Wed, May 9, 2018 at 11:12 AM Johnson Lau via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> You should make a “0 fee tx with exactly one OP_TRUE output” standard, but
> nothing else. This makes sure CPFP will always be needed, so the OP_TRUE
> output won’t pollute the UTXO set
>
> Instead, would you consider to use ANYONECANPAY to sign the tx, so it is
> possible to add more inputs for fees? The total tx size is bigger than the
> OP_TRUE approach, but you don’t need to ask for any protocol change.
>
> In long-term, I think the right way is to have a more flexible SIGHASH
> system to allow people to add more inputs and outputs easily.
>
>
>
> > On 9 May 2018, at 7:57 AM, Rusty Russell via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
> >
> > Hi all,
> >
> >The largest problem we are having today with the lightning
> > protocol is trying to predict future fees.  Eltoo solves this elegantly,
> > but meanwhile we would like to include a 546 satoshi OP_TRUE output in
> > commitment transactions so that we use minimal fees and then use CPFP
> > (which can't be done at the moment due to CSV delays on outputs).
> >
> > Unfortunately, we'd have to P2SH it at the moment as a raw 'OP_TRUE' is
> > non-standard.  Are there any reasons not to suggest such a policy
> > change?
> >
> > Thanks!
> > Rusty.
> > ___
> > bitcoin-dev mailing list
> > bitcoin-dev@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Making OP_TRUE standard?

2018-05-08 Thread Olaoluwa Osuntokun via bitcoin-dev
What are the downsides of just using p2wsh? This route can be rolled out
immediately, while policy changes are pretty "fuzzy" and would require a
near uniform rollout in order to ensure wide propagation of the commitment
transactions.

On Tue, May 8, 2018, 4:58 PM Rusty Russell via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi all,
>
> The largest problem we are having today with the lightning
> protocol is trying to predict future fees.  Eltoo solves this elegantly,
> but meanwhile we would like to include a 546 satoshi OP_TRUE output in
> commitment transactions so that we use minimal fees and then use CPFP
> (which can't be done at the moment due to CSV delays on outputs).
>
> Unfortunately, we'd have to P2SH it at the moment as a raw 'OP_TRUE' is
> non-standard.  Are there any reasons not to suggest such a policy
> change?
>
> Thanks!
> Rusty.
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP sighash_noinput

2018-05-07 Thread Olaoluwa Osuntokun via bitcoin-dev
Super stoked to see that no_input has been resurrected!!! I actually
implemented a variant back in 2015 when Tadge first described the approach
to me for both btcd [1] and bitcoind [2]. The version being proposed is
_slightly_ different though, as the initial version I implemented still
committed to the script being spent, while this new version just relies on
witness validity instead. This approach is even more flexible as the script
attached to the output being spent can change without rendering the spending
transaction invalid, as long as the witness still satisfies a branch in the
output's predicate.

Given that this would introduce a _new_ sighash flag, perhaps we should also
attempt to bundle additional, more flexible sighash flags concurrently as
well? This would require a larger overhaul w.r.t. how sighash flags are
interpreted, so in this case, we may need to introduce a new CHECKSIG
operator (let's call it CHECKSIG_X for now), which would consume an
available noop opcode. As a template for more fine-grained sighashing
control, I'll refer to jl2012's BIP-0YYY [3] (particularly the "New
nHashType definitions" section). This was originally proposed in the context
of his merklized script work as it more or less opened up a new opportunity
to further modify script within the context of merklized script executions.
The approach reads in the sighash flags as a bit vector, and allows
developers to express things like: "don't sign the input value, nor the
sequence, but sign the output of this input, and ONLY the script of this
output". This approach is _extremely_ powerful, and one would be able to
express the equivalent of no_input by setting the appropriate bits in the
sighash.
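
Purely as an illustration of the bit-vector idea (the constant names below
are invented for this sketch, not BIP-0YYY's actual definitions), one could
imagine flags along these lines, where a no_input-style signature simply
leaves the prevout bits unset:

    const (
        HashPrevout  uint32 = 1 << iota // commit to the outpoint being spent
        HashScript                      // commit to the prev output's script
        HashValue                       // commit to the prev output's value
        HashSequence                    // commit to this input's sequence
        HashOutputs                     // commit to the transaction outputs
    )

    func commitsToPrevout(flags uint32) bool {
        return flags&HashPrevout != 0
    }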

Looking forward in hearing y'alls thoughts on this approach, thanks.

[1]: https://github.com/Roasbeef/btcd/commits/SIGHASH_NOINPUT
[2]: https://github.com/Roasbeef/bitcoin/commits/SIGHASH_NOINPUT
[3]:
https://github.com/jl2012/bips/blob/vault/bip-0YYY.mediawiki#new-nhashtype-definitions

-- Laolu

On Mon, Apr 30, 2018 at 10:30 AM Christian Decker via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi all,
>
> I'd like to pick up the discussion from a few months ago, and propose a new
> sighash flag, `SIGHASH_NOINPUT`, that removes the commitment to the
> previous
> output. This was previously mentioned on the list by Joseph Poon [1], but
> was
> never formally proposed, so I wrote a proposal [2].
>
> We have long known that `SIGHASH_NOINPUT` would be a great fit for
> Lightning.
> They enable simple watch-towers, i.e., outsource the need to watch the
> blockchain for channel closures, and react appropriately if our
> counterparty
> misbehaves. In addition to this we just released the eltoo [3,4] paper
> which
> describes a simplified update mechanism that can be used in Lightning, and
> other
> off-chain contracts, with any number of participants.
>
> By not committing to the previous output being spent by the transaction,
> we can
> rebind an input to point to any outpoint with a matching output script and
> value. The binding therefore is no longer explicit through a reference, but
> through script compatibility, and the transaction ID reference in the
> input is a
> hint to validators. The sighash flag is meant to enable some off-chain
> use-cases
> and should not be used unless the tradeoffs are well-known. In particular
> we
> suggest using contract specific key-pairs, in order to avoid having any
> unwanted
> rebinding opportunities.
>
> The proposal is very minimalistic, and simple. However, there are a few
> things
> where we'd like to hear the input of the wider community with regards to
> the
> implementation details though. We had some discussions internally on
> whether to
> use a separate opcode or a sighash flag, some feeling that the sighash flag
> could lead to some confusion with existing wallets, but given that we have
> `SIGHASH_NONE`, and that existing wallets will not sign things with unknown
> flags, we decided to go the sighash way. Another thing is that we still
> commit
> to the amount of the outpoint being spent. The rationale behind this is
> that,
> while rebinding to outpoints with the same value maintains the value
> relationship between input and output, we will probably not want to bind to
> something with a different value and suddenly pay a gigantic fee.
>
> The deployment part of the proposal is left vague on purpose in order not
> to
> collide with any other proposals. It should be possible to introduce it by
> bumping the segwit script version and adding the new behavior.
>
> I hope the proposal is well received, and I'm looking forward to discussing
> variants and tradeoffs here. I think the applications we proposed so far
> are
> quite interesting, and I'm sure there are many more we can enable with this
> change.
>
> Cheers,
> Christian
>
> [1]
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-February/012460.html
> [2] 

Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
> Correct me if I'm wrong, but from my interpretation we can't use that
> method as described as we need to output 64-bit integers rather than
> 32-bit integers.

Had a chat with gmax off-list and came to the realization that the method
_should_ indeed generalize to our case of outputting 64-bit integers.
We'll need to do a bit of bit twiddling to make it work properly. I'll
modify our implementation and report back with some basic benchmarks.
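
For reference, here's a minimal sketch (assuming Go and the standard
math/bits package) of what the 64-bit generalization might look like: take
the high 64 bits of the full 128-bit product, mirroring the 32-bit
multiply-and-shift trick from the linked post.

    package sketch

    import "math/bits"

    // fastReduce maps a 64-bit hash into [0, modulus) using only a multiply
    // and a shift: the high 64 bits of the 128-bit product x * modulus. This
    // sketches the generalization discussed above, not a normative mapping.
    func fastReduce(x, modulus uint64) uint64 {
        hi, _ := bits.Mul64(x, modulus)
        return hi
    }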

-- Laolu


On Thu, Jun 8, 2017 at 8:42 PM Olaoluwa Osuntokun <laol...@gmail.com> wrote:

> Gregory wrote:
> > I see the inner loop of construction and lookup are free of
> > non-constant divmod. This will result in implementations being
> > needlessly slow
>
> Ahh, sipa brought this up other day, but I thought he was referring to the
> coding loop (which uses a power of 2 divisor/modulus), not the
> siphash-then-reduce loop.
>
> > I believe this can be fixed by using this approach
> >
> http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
> > which has the same non-uniformity as mod but needs only a multiply and
> > shift.
>
> Very cool, I wasn't aware of the existence of such a mapping.
>
> Correct me if I'm wrong, but from my interpretation we can't use that
> method as described, as we need to output 64-bit integers rather than
> 32-bit integers. A range of 32 bits would constrain the number of items
> we could encode to be ~4096 to ensure that we don't overflow with fp
> values such as 20 (which we currently use in our code).
>
> If filter commitments are to be considered for a soft-fork in the future,
> then we should definitely optimize the construction of the filters as much
> as possible! I'll look into that paper you referenced to get a feel for
> just how complex the optimization would be.
>
> > Shouldn't all cases in your spec where you have N=transactions be
> > n=indexed-outputs? Otherwise, I think your golomb parameter and false
> > positive rate are wrong.
>
> Yep! Nice catch. Our code is correct, but the mistake in the spec was an
> oversight on my part. I've pushed a commit[1] to the bip repo referenced
> in the OP to fix this error.
>
> I've also pushed another commit to explicitly take advantage of the fact
> that P is a power-of-two within the coding loop [2].
>
> -- Laolu
>
> [1]:
> https://github.com/Roasbeef/bips/commit/bc5c6d6797f3df1c4a44213963ba12e72122163d
> [2]:
> https://github.com/Roasbeef/bips/commit/578a4e3aa8ec04524c83bfc5d14be1b2660e7f7a
>
>
> On Wed, Jun 7, 2017 at 2:41 PM Gregory Maxwell <g...@xiph.org> wrote:
>
>> On Thu, Jun 1, 2017 at 7:01 PM, Olaoluwa Osuntokun via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>> > Hi y'all,
>> >
>> > Alex Akselrod and I would like to propose a new light client BIP for
>> > consideration:
>> >*
>> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>>
>> I see the inner loop of construction and lookup are free of
>> non-constant divmod. This will result in implementations being
>> needlessly slow (especially on arm, but even on modern x86_64 a
>> division is a 90 cycle-ish affair.)
>>
>> I believe this can be fixed by using this approach
>>
>> http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
>>which has the same non-uniformity as mod but needs only a multiply
>> and shift.
>>
>> Otherwise fast implementations will have to implement the code to
>> compute bit twiddling hack exact division code, which is kind of
>> complicated. (e.g. via the technique in "{N}-bit Unsigned Division via
>> {N}-bit Multiply-Add" by Arch D. Robison).
>>
>> Shouldn't all cases in your spec where you have N=transactions be
>> n=indexed-outputs? Otherwise, I think your golomb parameter and false
>> positive rate are wrong.
>>
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi y'all,

Thanks for all the comments so far!

I've pushed a series of updates to the text of the BIP repo linked in the
OP.
The fixes include: typos, components of the specification which were
incorrect
(N is the total number of items, NOT the number of txns in the block), and a
few sections have been clarified.

The latest version also includes a set of test vectors (as CSV files), which
for a series of fp rates (1/2 to 1/2^32) includes (for 6 testnet blocks,
one of
which generates a "null" filter):

   * The block height
   * The block hash
   * The raw block itself
   * The previous basic+extended filter header
   * The basic+extended filter header for the block
   * The basic+extended filter for the block

The size of the test vectors was too large to include in-line within the
document, so we put them temporarily in a distinct folder [1]. The code
used to
generate the test vectors has also been included.

-- Laolu

[1]: https://github.com/Roasbeef/bips/tree/master/gcs_light_client


On Thu, Jun 1, 2017 at 9:49 PM Olaoluwa Osuntokun  wrote:

> > In order to consider the average+median filter sizes in a world with
> > larger blocks, I also ran the index for testnet:
> >
> > * total size:  2753238530
> > * total avg:  5918.95736054141
> > * total median:  60202
> > * total max:  74983
> > * regular size:  1165148878
> > * regular avg:  2504.856172982827
> > * regular median:  24812
> > * regular max:  64554
> > * extended size:  1588089652
> > * extended avg:  3414.1011875585823
> > * extended median:  35260
> > * extended max:  41731
> >
>
> Oops, realized I made a mistake. These are the stats for Feb 2016 until
> about a
> month ago (since height 400k iirc).
>
> -- Laolu
>
>
> On Thu, Jun 1, 2017 at 12:01 PM Olaoluwa Osuntokun 
> wrote:
>
>> Hi y'all,
>>
>> Alex Akselrod and I would like to propose a new light client BIP for
>> consideration:
>>*
>> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>>
>> This BIP proposal describes a concrete specification (along with a
>> reference implementations[1][2][3]) for the much discussed client-side
>> filtering reversal of BIP-37. The precise details are described in the
>> BIP, but as a summary: we've implemented a new light-client mode that uses
>> client-side filtering based off of Golomb-Rice coded sets. Full-nodes
>> maintain an additional index of the chain, and serve this compact filter
>> (the index) to light clients which request them. Light clients then fetch
>> these filters, query them locally and _maybe_ fetch the block if a relevant
>> item matches. The cool part is that blocks can be fetched from _any_
>> source, once the light client deems it necessary. Our primary motivation
>> for this work was enabling a light client mode for lnd[4] in order to
>> support a more light-weight back end paving the way for the usage of
>> Lightning on mobile phones and other devices. We've integrated neutrino
>> as a back end for lnd, and will be making the updated code public very
>> soon.
>>
>> One specific area we'd like feedback on is the parameter selection. Unlike
>> BIP-37 which allows clients to dynamically tune their false positive rate,
>> our proposal uses a _fixed_ false-positive. Within the document, it's
>> currently specified as P = 1/2^20. We've done a bit of analysis and
>> optimization attempting to optimize the following sum:
>> filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
>> has made a JS calculator that allows y'all to explore the effect of
>> tweaking the false positive rate in addition to the following variables:
>> the number of items the wallet is scanning for, the size of the blocks,
>> number of blocks fetched, and the size of the filters themselves. The
>> calculator calculates the expected bandwidth utilization using the CDF of
>> the Geometric Distribution. The calculator can be found here:
>> https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
>> script he's been running on actual data, and the results seem to match up
>> rather nicely.
>>
>> We were excited to see that Karl Johan Alm (kallewoof) has done some
>> (rather extensive!) analysis of his own, focusing on a distinct encoding
>> type [5]. I haven't had the time to dig into his report yet, but I
>> think I've read enough to extract the key difference in our encodings: his
>> filters use a binomial encoding _directly_ on the filter contents, while we
>> instead create a Golomb-Coded set with the contents being _hashes_ (we use
>> siphash) of the filter items.
>>
>> Using a fixed fp=20, I have some stats detailing the total index size, as
>> well as averages for both mainnet and testnet. For mainnet, using the
>> filter contents as currently described in the BIP (basic + extended), the
>> total size of the index comes out to 6.9GB. The break down is as follows:
>>
>> * total size:  6976047156
>> * 

Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
Tomas wrote:
> A rough estimate would indicate this to be about 2-2.5x as big per block
> as your proposal, but comes with rather different security
> characteristics, and would not require download since genesis.

Our proposal _doesn't_ require downloading from genesis, if by
"downloading" you mean downloading all the blocks. Clients only need to
sync the block+filter headers, then (if they don't care about historical
blocks), will download filters from their "birthday" onwards.

> The client could verify the TXIDs against the merkle root with a much
> stronger (PoW) guarantee compared to the guarantee based on the assumption
> of peers being distinct, which your proposal seems to make

Our proposal only makes a "one honest peer" assumption, which is the same
as any other operating mode. Also, as clients still download all the
headers, they're able to verify PoW as normal.
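
For concreteness, here's a minimal Go sketch of the kind of filter-header
check being discussed: recompute a header from a peer-served filter and
compare it against the already-synced header chain, banning any peer that
doesn't match. The exact hash and ordering below are assumptions for
illustration, not the normative construction.

    package sketch

    import "crypto/sha256"

    // doubleSHA256 is the usual Bitcoin hash-of-a-hash.
    func doubleSHA256(b []byte) [32]byte {
        first := sha256.Sum256(b)
        return sha256.Sum256(first[:])
    }

    // filterHeader commits to a filter by hashing its digest together with
    // the previous header, forming a chain that clients can sync up front.
    func filterHeader(prevHeader [32]byte, filter []byte) [32]byte {
        filterHash := doubleSHA256(filter)
        return doubleSHA256(append(filterHash[:], prevHeader[:]...))
    }

    // checkPeerFilter recomputes the header for a filter served by a peer
    // and compares it against the header chain already synced, allowing a
    // client to detect (and ban) a lying peer given one honest peer.
    func checkPeerFilter(prevHeader, expected [32]byte, filter []byte) bool {
        return filterHeader(prevHeader, filter) == expected
    }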

> I don't completely understand the benefit of making the outpoints and
> pubkey hashes (weakly) verifiable. These only serve as notifications and
> therefore do not seem to introduce an attack vector.

Not sure what you mean by this. Care to elaborate?

> I think client-side filtering is definitely an important route to take,
> but is it worth compressing away the information to verify the merkle
> root?

That's not the case with our proposal. Clients get the _entire_ block (if
they need it), so they can verify the merkle root as normal. Unless one of
us is misinterpreting the other here.

-- Laolu


On Thu, Jun 8, 2017 at 6:34 AM Tomas via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Thu, Jun 1, 2017, at 21:01, Olaoluwa Osuntokun via bitcoin-dev wrote:
>
> Hi y'all,
>
> Alex Akselrod and I would like to propose a new light client BIP for
> consideration:
>*
> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>
>
>
> Very interesting.
>
> I would like to consider how this compares to another light client type
> with rather different security characteristics where each client would
> receive for each transaction in each block,
>
> * The TXID (uncompressed)
> * The spent outpoints (with TXIDs compressed)
> * The pubkey hash (compressed to reasonable amount of false positives)
>
> A rough estimate would indicate this to be about 2-2.5x as big per block
> as your proposal, but comes with rather different security characteristics,
> and would not require download since genesis.
>
> The client could verify the TXIDs against the merkle root with a much
> stronger (PoW) guarantee compared to the guarantee based on the assumption
> of peers being distinct, which your proposal seems to make. Like your
> proposal this removes the privacy and processing  issues from server-side
> filtering, but unlike your proposal retrieval of all txids in each block
> can also serve for a basis of fraud proofs and (disprovable) fraud hints,
> without resorting to full block downloads.
>
> I don't completely understand the benefit of making the outpoints and
> pubkey hashes (weakly) verifiable. These only serve as notifications and
> therefore do not seem to introduce an attack vector. Omitting data is
> always possible, so receiving data is a prerequisite for verification, not
> an assumption that can be made.  How could an attacker benefit from "hiding
> notifications"?
>
> I think client-side filtering is definitely an important route to take,
> but is it worth compressing away the information to verify the merkle root?
>
> Regards,
> Tomas van der Wansem
> bitcrust
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
Gregory wrote:
> I see the inner loop of construction and lookup are free of
> non-constant divmod. This will result in implementations being
> needlessly slow

Ahh, sipa brought this up other day, but I thought he was referring to the
coding loop (which uses a power of 2 divisor/modulus), not the
siphash-then-reduce loop.

> I believe this can be fixed by using this approach
>
http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
> which has the same non-uniformity as mod but needs only a multiply and
> shift.

Very cool, I wasn't aware of the existence of such a mapping.

Correct me if I'm wrong, but from my interpretation we can't use that
method as described, as we need to output 64-bit integers rather than
32-bit integers. A range of 32 bits would constrain the number of items
we could encode to be ~4096 to ensure that we don't overflow with fp
values such as 20 (which we currently use in our code).

If filter commitments are to be considered for a soft-fork in the future,
then we should definitely optimize the construction of the filters as much
as possible! I'll look into that paper you referenced to get a feel for
just how complex the optimization would be.

> Shouldn't all cases in your spec where you have N=transactions be
> n=indexed-outputs? Otherwise, I think your golomb parameter and false
> positive rate are wrong.

Yep! Nice catch. Our code is correct, but the mistake in the spec was an
oversight on my part. I've pushed a commit[1] to the bip repo referenced
in the OP to fix this error.

I've also pushed another commit to explicitly take advantage of the fact
that P is a power-of-two within the coding loop [2].

-- Laolu

[1]:
https://github.com/Roasbeef/bips/commit/bc5c6d6797f3df1c4a44213963ba12e72122163d
[2]:
https://github.com/Roasbeef/bips/commit/578a4e3aa8ec04524c83bfc5d14be1b2660e7f7a


On Wed, Jun 7, 2017 at 2:41 PM Gregory Maxwell <g...@xiph.org> wrote:

> On Thu, Jun 1, 2017 at 7:01 PM, Olaoluwa Osuntokun via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org> wrote:
> > Hi y'all,
> >
> > Alex Akselrod and I would like to propose a new light client BIP for
> > consideration:
> >*
> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>
> I see the inner loop of construction and lookup are free of
> non-constant divmod. This will result in implementations being
> needlessly slow (especially on arm, but even on modern x86_64 a
> division is a 90 cycle-ish affair.)
>
> I believe this can be fixed by using this approach
>
> http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
>which has the same non-uniformity as mod but needs only a multiply
> and shift.
>
> Otherwise fast implementations will have to implement the code to
> compute bit twiddling hack exact division code, which is kind of
> complicated. (e.g. via the technique in "{N}-bit Unsigned Division via
> {N}-bit Multiply-Add" by Arch D. Robison).
>
> Shouldn't all cases in your spec where you have N=transactions be
> n=indexed-outputs? Otherwise, I think your golomb parameter and false
> positive rate are wrong.
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-08 Thread Olaoluwa Osuntokun via bitcoin-dev
Karl wrote:

> I am also curious if you have considered digests containing multiple
> blocks. Retaining a permanent binsearchable record of the entire chain is
> obviously too space costly, but keeping the last X blocks as binsearchable
> could speed up syncing for clients tremendously, I feel.

Originally we hadn't considered such an idea. Grasping the concept a bit
better, I can see how that may result in considerable bandwidth savings
(for purely negative queries) for clients doing a historical sync, or
catching up to the chain after being inactive for months/weeks.

If we were to pursue tacking this approach onto the current BIP proposal,
we could do it in the following way:

   * The `getcfilter` message gains an additional "Level" field. Using
 this field, the range of blocks to be included in the returned filter
  would be 2^Level. So a level of 0 is just the single filter, and a level
  of 3 covers the 8 blocks past the block hash, etc.

   * Similarly, the `getcfheaders` message would also gain a similar field
 with identical semantics. In this case each "level" would have a
 distinct header chain for clients to verify.
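
A rough sketch of what the extended request might look like (field names
and layout below are assumptions for illustration, not a wire-format
proposal):

    // getCFilterMsg sketches the extended `getcfilter` request described
    // above: Level selects how many blocks the returned filter covers,
    // ending at BlockHash. Names and layout are illustrative only.
    type getCFilterMsg struct {
        BlockHash [32]byte
        Extended  bool
        Level     uint8
    }

    // blockSpan is the number of blocks covered by a filter at this level
    // (2^Level, per the description above).
    func (m *getCFilterMsg) blockSpan() uint64 {
        return 1 << m.Level
    }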

> How fast are these to create? Would it make sense to provide digests on
> demand in some cases, rather than keeping them around indefinitely?

For larger blocks (like the one referenced at the end of this mail) full
construction of the regular filter takes ~10-20ms (most of this spent
extracting the data pushes). With smaller blocks, it quickly dips down to
the nano to micro second range.

Whether to keep _all_ the filters on disk, or to dynamically re-generate a
particular range (possibly most of the historical data) is an
implementation detail. Nodes that already do block pruning could discard
very old filters once the header chain is constructed allowing them to
save additional space, as it's unlikely most clients would care about the
first 300k or so blocks.

> Ahh, so you actually make a separate digest chain with prev hashes and
> everything. Once/if committed digests are soft forked in, it seems a bit
> overkill but maybe it's worth it.

Yep, this is only a hold-over until when/if a commitment to the filter is
soft-forked in. In that case, there could be some extension message to
fetch the filter hash for a particular block, along with a merkle proof of
the coinbase transaction to the merkle root in the header.
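
For reference, that coinbase inclusion check boils down to the standard
merkle-branch verification; a minimal Go sketch follows (how the filter
hash itself would be committed inside the coinbase is left as an assumption
here):

    package sketch

    import "crypto/sha256"

    // verifyMerkleBranch folds a leaf (e.g. the coinbase txid) up the merkle
    // branch, double-SHA256 at each level, and compares the result against
    // the block header's merkle root. This is the standard Bitcoin merkle
    // rule; the commitment format for the filter hash is not specified here.
    func verifyMerkleBranch(leaf [32]byte, branch [][32]byte, index uint32, root [32]byte) bool {
        h := leaf
        for _, sibling := range branch {
            var buf [64]byte
            if index&1 == 0 {
                copy(buf[:32], h[:])
                copy(buf[32:], sibling[:])
            } else {
                copy(buf[:32], sibling[:])
                copy(buf[32:], h[:])
            }
            first := sha256.Sum256(buf[:])
            h = sha256.Sum256(first[:])
            index >>= 1
        }
        return h == root
    }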

> I created digests for all blocks up until block #469805 and actually ended
> up with 5.8 GB, which is 1.1 GB lower than what you have, but may be worse
> perf-wise on false positive rates and such.

Interesting, are you creating the equivalent of both our "regular" and
"extended" filters? Each of the filter types consume about ~3.5GB in
isolation, with the extended filter type on average consuming more bytes
due to the fact that it includes sigScript/witness data as well.

It's worth noting that those numbers includes the fixed 4-byte value for
"N" that's prepended to each filter once it's serialized (though that
doesn't add a considerable amount of overhead).  Alex and I were
considering instead using Bitcoin's var-int encoding for that number
instead. This would result in using a single byte for empty filters, 1
byte for most filters (< 2^16 items), and 3 bytes for the remainder of the
cases.

> For comparison, creating the digests above (469805 of them) took
> roughly 30 mins on my end, but using the kstats format so probably
> higher on an actual node (should get around to profiling that...).

Does that include the time required to read the blocks from disk? Or just
the CPU computation of constructing the filters? I haven't yet kicked off
a full re-index of the filters, but for reference this block[1] on testnet
takes ~18ms for the _full_ indexing routine with our current code+spec.

[1]: 052184fbe86eff349e31703e4f109b52c7e6fa105cd1588ab6aa

-- Laolu


On Sun, Jun 4, 2017 at 7:18 PM Karl Johan Alm via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Sat, Jun 3, 2017 at 2:55 AM, Alex Akselrod via bitcoin-dev
>  wrote:
> > Without a soft fork, this is the only way for light clients to verify
> that
> > peers aren't lying to them. Clients can request headers (just hashes of
> the
> > filters and the previous headers, creating a chain) and look for
> conflicts
> > between peers. If a conflict is found at a certain block, the client can
> > download the block, generate a filter, calculate the header by hashing
> > together the previous header and the generated filter, and banning any
> peers
> > that don't match. A full node could prune old filters if you wanted and
> > recalculate them as necessary if you just keep the filter header chain
> info
> > as really old filters are unlikely to be requested by correctly written
> > software but you can't guarantee every client will follow best practices
> > either.
>
> Ahh, so you actually make a separate digest chain with prev hashes and
> everything. 

Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-01 Thread Olaoluwa Osuntokun via bitcoin-dev
> In order to consider the average+median filter sizes in a world with
> larger blocks, I also ran the index for testnet:
>
> * total size:  2753238530
> * total avg:  5918.95736054141
> * total median:  60202
> * total max:  74983
> * regular size:  1165148878
> * regular avg:  2504.856172982827
> * regular median:  24812
> * regular max:  64554
> * extended size:  1588089652
> * extended avg:  3414.1011875585823
> * extended median:  35260
> * extended max:  41731
>

Oops, realized I made a mistake. These are the stats for Feb 2016 until
about a
month ago (since height 400k iirc).

-- Laolu


On Thu, Jun 1, 2017 at 12:01 PM Olaoluwa Osuntokun 
wrote:

> Hi y'all,
>
> Alex Akselrod and I would like to propose a new light client BIP for
> consideration:
>*
> https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
>
> This BIP proposal describes a concrete specification (along with a
> reference implementations[1][2][3]) for the much discussed client-side
> filtering reversal of BIP-37. The precise details are described in the
> BIP, but as a summary: we've implemented a new light-client mode that uses
> client-side filtering based off of Golomb-Rice coded sets. Full-nodes
> maintain an additional index of the chain, and serve this compact filter
> (the index) to light clients which request them. Light clients then fetch
> these filters, query them locally and _maybe_ fetch the block if a relevant
> item matches. The cool part is that blocks can be fetched from _any_
> source, once the light client deems it necessary. Our primary motivation
> for this work was enabling a light client mode for lnd[4] in order to
> support a more light-weight back end paving the way for the usage of
> Lightning on mobile phones and other devices. We've integrated neutrino
> as a back end for lnd, and will be making the updated code public very
> soon.
>
> One specific area we'd like feedback on is the parameter selection. Unlike
> BIP-37 which allows clients to dynamically tune their false positive rate,
> our proposal uses a _fixed_ false-positive. Within the document, it's
> currently specified as P = 1/2^20. We've done a bit of analysis and
> optimization attempting to optimize the following sum:
> filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
> has made a JS calculator that allows y'all to explore the effect of
> tweaking the false positive rate in addition to the following variables:
> the number of items the wallet is scanning for, the size of the blocks,
> number of blocks fetched, and the size of the filters themselves. The
> calculator calculates the expected bandwidth utilization using the CDF of
> the Geometric Distribution. The calculator can be found here:
> https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
> script he's been running on actual data, and the results seem to match up
> rather nicely.
>
> We were excited to see that Karl Johan Alm (kallewoof) has done some
> (rather extensive!) analysis of his own, focusing on a distinct encoding
> type [5]. I haven't had the time to dig into his report yet, but I
> think I've read enough to extract the key difference in our encodings: his
> filters use a binomial encoding _directly_ on the filter contents, while we
> instead create a Golomb-Coded set with the contents being _hashes_ (we use
> siphash) of the filter items.
>
> Using a fixed fp=20, I have some stats detailing the total index size, as
> well as averages for both mainnet and testnet. For mainnet, using the
> filter contents as currently described in the BIP (basic + extended), the
> total size of the index comes out to 6.9GB. The break down is as follows:
>
> * total size:  6976047156
> * total avg:  14997.220622758816
> * total median:  3801
> * total max:  79155
> * regular size:  3117183743
> * regular avg:  6701.372750217131
> * regular median:  1734
> * regular max:  67533
> * extended size:  3858863413
> * extended avg:  8295.847872541684
> * extended median:  2041
> * extended max:  52508
>
> In order to consider the average+median filter sizes in a world with
> larger blocks, I also ran the index for testnet:
>
> * total size:  2753238530
> * total avg:  5918.95736054141
> * total median:  60202
> * total max:  74983
> * regular size:  1165148878
> * regular avg:  2504.856172982827
> * regular median:  24812
> * regular max:  64554
> * extended size:  1588089652
> * extended avg:  3414.1011875585823
> * extended median:  35260
> * extended max:  41731
>
> Finally, here are the testnet stats which take into account the increase
> in the maximum filter size due to segwit's block-size increase. The max
> filter sizes are a bit larger due to some of the habitual blocks I
> created last year when testing segwit (transactions with 30k inputs, 30k
> 

Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-01 Thread Olaoluwa Osuntokun via bitcoin-dev
Eric wrote:
> Thanks for sending this proposal! I look forward to having a great
> discussion around this.

Thanks Eric! We really appreciated the early feedback you gave on the
initial design.

One aspect which isn't in this BIP draft is direct support for unconfirmed
transactions. I consider such a feature an important UX win for mobile
phones, and something which I've personally seen matter when on-boarding
new users to Bitcoin. This was brought up
in the original "bfd" mailing list chain [1]. Possible solutions are: a
new beefier INV message which contains enough information to be able to
identify relevant outputs created in a transaction, or a "streaming" p2p
extension that allows light clients to receive notifications of mempool
inclusion based on only (pkScript, amount) pairs.

Matt wrote:
> looks like you have no way to match the input prevouts being spent, which
> is rather nice from a "watch for this output being spent" pov.

Perhaps we didn't make this clear enough, but it _is_ indeed possible to
watch an output for spentness. Or maybe you mean matching on the
_script_ being spent?

From the BIP draft:
> for each transaction, normal filters contain:
>  * The outpoints of each input within a transaction.
>  ...

Within the integration for lnd, we specifically use this feature to be
able to watch for when channels have been closed within the network graph,
or channels _directly_ under our control have been spent (either
unilateral channel closure, or a revocation breach).
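
As a rough sketch of that usage (the types and helper signature below are
assumptions, not APIs from the BIP or lnd), a client watching a set of
channel outpoints would test their serializations against each block's
filter and only fetch the full block on a match:

    package sketch

    import "encoding/binary"

    // outPoint is a minimal stand-in for a transaction outpoint.
    type outPoint struct {
        txid  [32]byte
        index uint32
    }

    // serializeOutPoint mirrors the usual txid || little-endian index
    // layout; whether the filter items use exactly this serialization is an
    // assumption made here for illustration.
    func serializeOutPoint(op outPoint) []byte {
        buf := make([]byte, 36)
        copy(buf[:32], op.txid[:])
        binary.LittleEndian.PutUint32(buf[32:], op.index)
        return buf
    }

    // spendDetected reports whether any watched channel outpoint matches the
    // block's filter, in which case the full block would be fetched and
    // scanned. matchAny is assumed to be a GCS membership test (the decoder
    // counterpart of the construction sketched later in this document).
    func spendDetected(filter []byte, blockHash [32]byte, watched []outPoint,
        matchAny func(filter []byte, key [32]byte, items [][]byte) bool) bool {

        items := make([][]byte, 0, len(watched))
        for _, op := range watched {
            items = append(items, serializeOutPoint(op))
        }
        return matchAny(filter, blockHash, items)
    }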

-- Laolu

[1]:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-January/013397.html


On Thu, Jun 1, 2017 at 2:33 PM Matt Corallo <lf-li...@mattcorallo.com>
wrote:

> Quick comment before I finish reading it completely, looks like you have
> no way to match the input prevouts being spent, which is rather nice from a
> "watch for this output being spent" pov.
>
> On June 1, 2017 3:01:14 PM EDT, Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
> >Hi y'all,
> >
> >Alex Akselrod and I would like to propose a new light client BIP for
> >consideration:
> >*
> >https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki
> >
> >This BIP proposal describes a concrete specification (along with a
> >reference implementations[1][2][3]) for the much discussed client-side
> >filtering reversal of BIP-37. The precise details are described in the
> >BIP, but as a summary: we've implemented a new light-client mode that
> >uses
> >client-side filtering based off of Golomb-Rice coded sets. Full-nodes
> >maintain an additional index of the chain, and serve this compact
> >filter
> >(the index) to light clients which request them. Light clients then
> >fetch
> >these filters, query them locally and _maybe_ fetch the block if a
> >relevant
> >item matches. The cool part is that blocks can be fetched from _any_
> >source, once the light client deems it necessary. Our primary
> >motivation
> >for this work was enabling a light client mode for lnd[4] in order to
> >support a more light-weight back end paving the way for the usage of
> >Lightning on mobile phones and other devices. We've integrated neutrino
> >as a back end for lnd, and will be making the updated code public very
> >soon.
> >
> >One specific area we'd like feedback on is the parameter selection.
> >Unlike
> >BIP-37 which allows clients to dynamically tune their false positive
> >rate,
> >our proposal uses a _fixed_ false-positive. Within the document, it's
> >currently specified as P = 1/2^20. We've done a bit of analysis and
> >optimization attempting to optimize the following sum:
> >filter_download_bandwidth + expected_block_false_positive_bandwidth.
> >Alex
> >has made a JS calculator that allows y'all to explore the effect of
> >tweaking the false positive rate in addition to the following
> >variables:
> >the number of items the wallet is scanning for, the size of the blocks,
> >number of blocks fetched, and the size of the filters themselves. The
> >calculator calculates the expected bandwidth utilization using the CDF
> >of
> >the Geometric Distribution. The calculator can be found here:
> >https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
> >script he's been running on actual data, and the results seem to match
> >up
> >rather nicely.
> >
> >We were excited to see that Karl Johan Alm (kallewoof) has done some
> >(rather extensive!) analysis of his own, focusing on a distinct
> >encoding
> >type [5]. I haven't had the time to dig into his report yet, but I
>

[bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients

2017-06-01 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi y'all,

Alex Akselrod and I would like to propose a new light client BIP for
consideration:
   * https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki

This BIP proposal describes a concrete specification (along with a
reference implementations[1][2][3]) for the much discussed client-side
filtering reversal of BIP-37. The precise details are described in the
BIP, but as a summary: we've implemented a new light-client mode that uses
client-side filtering based off of Golomb-Rice coded sets. Full-nodes
maintain an additional index of the chain, and serve this compact filter
(the index) to light clients which request them. Light clients then fetch
these filters, query them locally and _maybe_ fetch the block if a relevant
item matches. The cool part is that blocks can be fetched from _any_
source, once the light client deems it necessary. Our primary motivation
for this work was enabling a light client mode for lnd[4] in order to
support a more light-weight back end paving the way for the usage of
Lightning on mobile phones and other devices. We've integrated neutrino
as a back end for lnd, and will be making the updated code public very
soon.

One specific area we'd like feedback on is the parameter selection. Unlike
BIP-37 which allows clients to dynamically tune their false positive rate,
our proposal uses a _fixed_ false-positive. Within the document, it's
currently specified as P = 1/2^20. We've done a bit of analysis and
optimization attempting to optimize the following sum:
filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
has made a JS calculator that allows y'all to explore the effect of
tweaking the false positive rate in addition to the following variables:
the number of items the wallet is scanning for, the size of the blocks,
number of blocks fetched, and the size of the filters themselves. The
calculator calculates the expected bandwidth utilization using the CDF of
the Geometric Distribution. The calculator can be found here:
https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
script he's been running on actual data, and the results seem to match up
rather nicely.
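
For readers who want the sum spelled out: since each of the m watched items
can independently produce a false positive, a block is fetched with
probability roughly 1 - (1-P)^m, and the expected bandwidth is the filter
traffic plus that fraction of full-block downloads. A minimal Go sketch
(all inputs below are illustrative, not measurements):

    package main

    import (
        "fmt"
        "math"
    )

    // expectedBandwidth models the sum being optimized above: every filter
    // is downloaded, and a full block is additionally fetched whenever at
    // least one of the m watched items yields a false positive.
    func expectedBandwidth(numBlocks int, filterBytes, blockBytes float64,
        m int, fpRate float64) float64 {

        pFetchBlock := 1 - math.Pow(1-fpRate, float64(m))
        return float64(numBlocks) * (filterBytes + pFetchBlock*blockBytes)
    }

    func main() {
        // e.g. 1000 blocks, 15 KB filters, 1 MB blocks, 100 watched items,
        // and P = 2^-20 (all hypothetical numbers).
        fmt.Printf("%.0f bytes\n",
            expectedBandwidth(1000, 15e3, 1e6, 100, math.Exp2(-20)))
    }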

We were excited to see that Karl Johan Alm (kallewoof) has done some
(rather extensive!) analysis of his own, focusing on a distinct encoding
type [5]. I haven't had the time to dig into his report yet, but I
think I've read enough to extract the key difference in our encodings: his
filters use a binomial encoding _directly_ on the filter contents, while we
instead create a Golomb-Coded set with the contents being _hashes_ (we use
siphash) of the filter items.
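
To make the encoding pipeline concrete, here's a minimal Go sketch of the
construction as described here: siphash each item, map the 64-bit results
into [0, N*2^P), sort, and Golomb-Rice code the deltas. The key derivation,
reduction step, and bit ordering below are assumptions for illustration;
the BIP text is the normative description.

    package sketch

    import (
        "encoding/binary"
        "math/bits"
        "sort"

        "github.com/dchest/siphash" // assumed third-party SipHash-2-4 library
    )

    // bitWriter is a tiny MSB-first bit stream used for the Golomb-Rice code.
    type bitWriter struct {
        out   []byte
        nBits uint
    }

    func (w *bitWriter) writeBit(b uint64) {
        if w.nBits%8 == 0 {
            w.out = append(w.out, 0)
        }
        if b != 0 {
            w.out[len(w.out)-1] |= 1 << (7 - w.nBits%8)
        }
        w.nBits++
    }

    func (w *bitWriter) writeBits(v uint64, n uint) {
        for i := int(n) - 1; i >= 0; i-- {
            w.writeBit((v >> uint(i)) & 1)
        }
    }

    // buildGCS hashes every item with SipHash (keyed here by the block hash,
    // an illustrative assumption), maps the 64-bit results into [0, N*2^P)
    // with a multiply-and-shift reduction, sorts them, and Golomb-Rice
    // encodes the deltas using P-bit remainders.
    func buildGCS(items [][]byte, blockHash [32]byte, P uint) []byte {
        n := uint64(len(items))
        modulus := n << P

        k0 := binary.LittleEndian.Uint64(blockHash[0:8])
        k1 := binary.LittleEndian.Uint64(blockHash[8:16])

        values := make([]uint64, 0, n)
        for _, item := range items {
            v := siphash.Hash(k0, k1, item)
            hi, _ := bits.Mul64(v, modulus) // reduction into [0, modulus)
            values = append(values, hi)
        }
        sort.Slice(values, func(i, j int) bool { return values[i] < values[j] })

        w := &bitWriter{}
        var last uint64
        for _, v := range values {
            delta := v - last
            last = v
            for q := delta >> P; q > 0; q-- { // quotient in unary
                w.writeBit(1)
            }
            w.writeBit(0)
            w.writeBits(delta&((1<<P)-1), P) // P-bit remainder
        }
        return w.out
    }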

Using a fixed fp=20, I have some stats detailing the total index size, as
well as averages for both mainnet and testnet. For mainnet, using the
filter contents as currently described in the BIP (basic + extended), the
total size of the index comes out to 6.9GB. The break down is as follows:

* total size:  6976047156
* total avg:  14997.220622758816
* total median:  3801
* total max:  79155
* regular size:  3117183743
* regular avg:  6701.372750217131
* regular median:  1734
* regular max:  67533
* extended size:  3858863413
* extended avg:  8295.847872541684
* extended median:  2041
* extended max:  52508

In order to consider the average+median filter sizes in a world with
larger blocks, I also ran the index for testnet:

* total size:  2753238530
* total avg:  5918.95736054141
* total median:  60202
* total max:  74983
* regular size:  1165148878
* regular avg:  2504.856172982827
* regular median:  24812
* regular max:  64554
* extended size:  1588089652
* extended avg:  3414.1011875585823
* extended median:  35260
* extended max:  41731

Finally, here are the testnet stats which take into account the increase
in the maximum filter size due to segwit's block-size increase. The max
filter sizes are a bit larger due to some of the habitual blocks I
created last year when testing segwit (transactions with 30k inputs, 30k
outputs, etc).

 * total size:  585087597
 * total avg:  520.8839608674402
 * total median:  20
 * total max:  164598
 * regular size:  299325029
 * regular avg:  266.4790836307566
 * regular median:  13
 * regular max:  164583
 * extended size:  285762568
 * extended avg:  254.4048772366836
 * extended median:  7
 * extended max:  127631

For those that are interested in the raw data, I've uploaded a CSV file
of raw data for each block (mainnet + testnet), which can be found here:
 * mainnet: (14MB):
https://www.dropbox.com/s/4yk2u8dj06njbuv/mainnet-gcs-stats.csv?dl=0
 * testnet: (25MB):
https://www.dropbox.com/s/w7dmmcbocnmjfbo/gcs-stats-testnet.csv?dl=0


We look forward to getting feedback from all of y'all!

-- Laolu


[1]: https://github.com/lightninglabs/neutrino
[2]: https://github.com/Roasbeef/btcd/tree/segwit-cbf
[3]: 

Re: [bitcoin-dev] Extension block proposal by Jeffrey et al

2017-04-05 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Y'all,

Thanks to luke-jr and jl2012 for publishing your analysis of the
xblocks proposal. I'd like to also present some analysis but instead focus
on the professed LN safety enhancing scheme in the proposal. It's a bit
underspecified, so I've taken the liberty of extrapolating a bit to fill
in the gaps to the point that I can analyze it.

TLDR; The xblock proposal includes a sub-proposal for LN which is
essentially a block-size decrease for each open channel within the network.
This decrease reserves space in blocks to allow honest parties guaranteed
space in the blocks to punish dishonest channel counter parties. As a result
the block size is permanently decreased for each channel open. Some may
consider this cost prohibitively high.

>> If the second highest transaction version bit (30th bit) is set to `1`
>> within an extension block transaction, an extra 700-bytes is reserved on
>> the transaction space used up in the block.

> Why wouldn't users set this on all transactions?

As the proposal stands now, it seems that users _are_ able to unilaterally
use this for all their Bitcoin transactions, as there's no additional cost
to using the smart-contract safety feature outlined in the proposal.

The new safety measures proposed near the end of this xblock proposal
could themselves consume a dedicated document outlining the prior background,
context, and implications of this new safety feature. Throughout the rest
of this post, I'll be referring to the scheme as a Pre-Allocated
Smart-contract Dispute arena (PASDA, chosen because it sounds kinda like
"pasta", which brings me many keks). It's rather insufficiently described
and
under specified as it stands in the proposal. As a result, if one doesn't
have the necessary prior context, it might've been skipped over entirely
as it's difficult to extract the sub-proposal from the greater proposal. I
think I possess the necessary prior context required to required to
properly analyze the sub-proposal. As a result, I would like to illuminate
the readers of the ML so y'all may also be able to evaluate this
sub-proposal independently.


## Background

First, some necessary background. Within LN as it exists today there is
one particularly nasty systematic risk related to blockchain availability
in the case of a channel dispute. This risk is clearly outlined in the
original white paper, and in my opinion a satisfactory solution to the
risks which safe guard the use of very high-value channels has yet to be
presented.


### Chain Spam/Censorship Attack Vector

The attack vector mentioned in the original paper is a recurring one
in systems of this nature: DoS attacks. As it stands today, if a channel
counterparty is able to (solely, or in collaboration with other attackers)
prevent one from committing a transaction to the chain, they're able to
steal money from the honest participant in the channel. The attack
proceeds something like this:

   * Mallory opens a very large channel with me.
   * We transfer money back and forth in the channel as normal. The nature
 of these transfers isn't very important. The commitment balances may
 be modified due to Mallory making multi-hop payments through my
 channel, or possibly from Mallory directly purchasing some goods I
 offer, paying via the channel.
   * Let's call the current commitment number state S_i. In the lifetime
  of the channel there may exist some state S_j (i < j) s.t. Mallory's
  balance in S_i is larger than her balance in S_j.
   * At this point, depending on the value of the channel's time-based
 security parameter (T) it may be possible for Mallory to broadcast
 state S_i (which has been revoked), and prevent me being able to
  include my punishment transaction (PTX) within the blockchain.
   * If Mallory is able to incapacitate me for a period of time T, or
 censor my transactions from the chain (either selectively or via a
 spam attack), then at time K (K > T + B, where B is the time the
  commitment transaction was stamped in the chain), she'll be free
 to walk away with her settled balance at state S_i. For the sake of
 simplicity, we're ignoring HTLC's.
   * Mallory's gain is the difference between the balance at state S_i and
  S_j. Depending on the gap between the states, my settled balance at
  state S_i and her balance delta, she may be able to fully recoup
  the funds she initially placed in the channel.


### The Role of Channel Reserves as Partial Mitigation

A minor mitigation to this attack that's purely commitment transaction
policy is to mandate that Mallory's balance in the channel never dips
below some reserve value R. Otherwise, if at state S_j, Mallory has a
settled balance of 0 within the channel (all the money is on my side), then
the attack outlined above can, under certain conditions, be _costless_ from
her PoV. Replicate this simultaneously across the network in a synchronized
manner (possibly getting some help from your 

Re: [bitcoin-dev] Idea: Efficient bitcoin block propagation

2015-08-06 Thread Olaoluwa Osuntokun via bitcoin-dev
Other than the source code, the best documentation I've come across is a few
lines on IRC explaining the high-level design of the protocol:
https://botbot.me/freenode/bitcoin-wizards/2015-07-10/?msg=44146764page=2

On Thu, Aug 6, 2015 at 10:18 AM Sergio Demian Lerner via bitcoin-dev 
bitcoin-dev@lists.linuxfoundation.org wrote:

 Is there any up to date documentation about TheBlueMatt relay network
 including what kind of block compression it is currently doing? (apart from
 the source code)

 Regards, Sergio.

 On Wed, Aug 5, 2015 at 7:14 PM, Gregory Maxwell via bitcoin-dev 
 bitcoin-dev@lists.linuxfoundation.org wrote:

 On Wed, Aug 5, 2015 at 9:19 PM, Arnoud Kouwenhoven - Pukaki Corp
 arn...@pukaki.bz wrote:
  Thanks for this (direct) feedback. It would make sense that if blocks
 can be
  submitted using ~5kb packets, that no further optimizations would be
 needed
  at this point. I will look into the relay network transmission protocol
 to
  understand how it works!
 
  I hear that you are saying that this network solves speed of
 transmission
  and thereby (technical) block size issues. Presumably it would solve
 speed
  of block validation too by prevalidating transactions.


 Correct. Bitcoin Core has cached validation for many years now... if
 not for that and other optimizations, things would be really broken
 right now. :)

  Assuming this is all
  true, and I have no reason to doubt that at this point, I do not
 understand
  why there is any discussion at all about the (technical) impact of large
  blocks, why there are large numbers of miners building on invalid blocks
  (SPV mining, https://bitcoin.org/en/alert/2015-07-04-spv-mining), or
 why
  there is any discussion about the speed of block validation (cpu
 processing
  time to verify blocks and transactions in blocks being a limitation).

 I'm also mystified by a lot of the large block discussion, much of it
 is completely divorced from the technology as deployed; much less what
 we-- in industry-- know to be possible. I don't blame you or anyone in
 particular on this; it's a new area and we don't yet know what we need
 to know to know what we need to know; or to the extent that we do it
 hasn't had time to get effectively communicated.

 The technical/security implications of larger blocks are related to
 other things than propagation time, if you assume people are using the
 available efficient relay protocol (or better).

 SPV mining is a bit of a misnomer (If I coined the term, I'm sorry).
 What these parties are actually doing is blinding mining on top of
 other pools' stratum work. You can think of it as sub-pooling with
 hopping onto whatever pool has the highest block (I'll call it VFSSP
 in this post-- validation free stratum subpooling).  It's very easy to
 implement, and there are other considerations.

 It was initially deployed at a time when a single pool in Europe had
 amassed more than half of the hashrate. This pool had propagation
 problems and a very high orphan rate, it may have (perhaps
 unintentionally) been performing a selfish mining attack; mining off
 their stratum work was an easy fix which massively cut down the orphan
 rates for anyone who did it.  This was before the relay network
 protocol existed (the fact that all the hashpower was consolidating on
 a single pool was a major motivation for creating it).

 VFSSP also cuts through a number of practical issues miners have had:
 Miners that run their own bitcoin nodes in far away colocation
 (100ms) due to local bandwidth or connectivity issues (censored
 internet); relay network hubs not being anywhere near by due to
 strange internet routing (e.g. japan to china going via the US for ...
 reasons...); the CreateNewBlock() function being very slow and
 unoptimized, etc.   There are many other things like this-- and VFSSP
 avoids them causing delays even when you don't understand them or know
 about them. So even when they're easily fixed the VFSSP is a more
 general workaround.

 Mining operations are also usually operated in a largely fire and
 forget manner. There is a long history in (esp pooled) mining where
 someone sets up an operation and then hardly maintains it after the
 fact... so some of the use of VFSSP appears to just be inertia-- we
 have better solutions now, but they take work to deploy and changing
 things involves risk (which is heightened by a lack of good
 monitoring-- participants learn they are too latent by observing
 orphaned blocks at a cost of 25 BTC each).

 One of the frustrating things about incentives in this space is that
 bad outcomes are possible even when they're not necessary. E.g. if a
 miner can lower their orphan rate by deploying a new protocol (or
 simply fixing some faulty hardware in their infrastructure, like
 Bitcoin nodes running on cheap VPSes with remote storage)  OR they can
 lower their orphan rate by pointing their hashpower at a free
 centeralized pool, they're likely to do the latter because it takes
 less