Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-21 Thread Olaoluwa Osuntokun via bitcoin-dev
> What if instead of trying to decide up front which subset of elements will
> be most useful to include in the filters, and the size tradeoff, we let the
> full-node decide which subsets of elements it serves filters for?

This is already the case. The current "track" is to add new service bits
(while we're in the uncommitted phase) to introduce new filter types. Light
clients can then filter out nodes before even connecting to them.
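To make the peer-selection step concrete, here's a minimal sketch of a light client filtering candidate peers by advertised service bits before connecting. The bit constants and peer addresses below are hypothetical stand-ins, not actual assignments from the BIP or the service-bit registry:

```python
# Hypothetical service-bit constants for two filter types (illustrative only;
# real bit assignments would come from the BIP / service-bit registry).
NODE_CFILTER_BASIC = 1 << 6   # assumed: serves basic (script+outpoint) filters
NODE_CFILTER_TXID = 1 << 7    # assumed: serves txid-only filters

def peer_serves(peer_services: int, wanted: int) -> bool:
    """Return True if the peer advertises every wanted filter-type bit."""
    return peer_services & wanted == wanted

# A light client can drop unsuitable peers before even connecting:
peers = {"a.example": NODE_CFILTER_BASIC,
         "b.example": NODE_CFILTER_BASIC | NODE_CFILTER_TXID}
usable = [addr for addr, svc in peers.items()
          if peer_serves(svc, NODE_CFILTER_BASIC)]
```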

-- Laolu

On Mon, May 21, 2018 at 1:35 AM Johan Torås Halseth wrote:

> Hi all,
>
> Most light wallets will want to download the minimum amount of data
> required to operate, which means they would ideally download the smallest
> possible filters containing the subset of elements they need.
>
> What if instead of trying to decide up front which subset of elements will
> be most useful to include in the filters, and the size tradeoff, we let the
> full-node decide which subsets of elements it serves filters for?
>
> For instance, a full node would advertise that it could serve filters for
> the subsets 110 (txid+script+outpoint), 100 (txid only), 011 (script+outpoint)
> etc. A light client could then choose to download the minimal filter type
> covering its needs.
>
> The obvious benefit of this would be minimal bandwidth usage for the light
> client, but there are also some less obvious ones. We wouldn’t have to
> decide up front what each filter type should contain, only the possible
> elements a filter can contain (more can be added later without breaking
> existing clients). This, I think, would let the most served filter types
> grow organically, with full-node implementations coming with sane defaults
> for served filter types (maybe even all possible types as long as the
> number of elements is small), letting their operator add/remove types at
> will.
>
> The main disadvantage of this as I see it, is that there’s an exponential
> blowup in the number of possible filter types in the number of element
> types. However, this would let us start out small with only the elements we
> need, and in the worst case the node operators just choose to serve the
> subsets corresponding to what now is called “regular” + “extended” filters
> anyway, requiring no more resources.
>
> This would also give us some data on what is the most widely used filter
> types, which could be useful in making the decision on what should be part
> of filters to eventually commit to in blocks.
>
> - Johan
> On Sat, May 19, 2018 at 5:12, Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
> On Thu, May 17, 2018 at 2:44 PM Jim Posen via bitcoin-dev wrote:
>> Monitoring inputs by scriptPubkey vs input-txid also has a massive
>>> advantage for parallel filtering: You can usually know your pubkeys
>>> well in advance, but if you have to change what you're watching block
>>> N+1 for based on the txids that paid you in N you can't filter them
>>> in parallel.
>>>
>>
>> Yes, I'll grant that this is a benefit of your suggestion.
>>
>
> Yeah parallel filtering would be pretty nice. We've implemented a serial
> filtering for btcwallet [1] for the use-case of rescanning after a seed
> phrase import. Parallel filtering would help here, but also we don't yet
> take advantage of batch querying for the filters themselves. This would
> speed up the scanning by quite a bit.
>
> I really like the filtering model though, it really simplifies the code,
> and we can leverage identical logic for btcd (which has RPCs to fetch the
> filters) as well.
>
> [1]:
> https://github.com/Roasbeef/btcwallet/blob/master/chain/neutrino.go#L180
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size

2018-05-21 Thread Olaoluwa Osuntokun via bitcoin-dev
Hi Y'all,

The script finished a few days ago with the following results:

reg-filter-prev-script total size:  161236078  bytes
reg-filter-prev-script avg: 16123.6078 bytes
reg-filter-prev-script median:  16584  bytes
reg-filter-prev-script max: 59480  bytes

Compared to the median size over the same block range with the current filter
(which contains txid, prev outpoint, and output scripts), we see roughly a 34%
reduction in filter size (the current median is 22258 bytes). Compared to the
suggested modified filter (no txid; prev outpoint and output scripts), we see
a 15% reduction in size (its median was 19198 bytes).
This shows that script re-use is still quite prevalent in the recent chain.

One thing that occurred to me is that on the application level, switching
to the input's prev output script can make things a bit awkward. Observe that
when looking for matches in the filter, upon a match, one would need access
to an additional (outpoint -> script) map in order to locate _which_
particular transaction matched w/o access to an up-to-date UTXO set. In
contrast, as is atm, one can locate the matching transaction with no
additional information (as we're matching on the outpoint).
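To illustrate the extra lookup step: a sketch, with hypothetical data structures (`find_matching_txs`, the dict-based block/tx shapes, and the wallet-maintained outpoint map are all stand-ins, not a real BIP 158 client):

```python
def find_matching_txs(block, watched_scripts, outpoint_to_script):
    """With a prev-script filter, a match only says some watched script was
    spent in this block; to find *which* transaction matched, each input's
    prevout must be mapped back to the script it spends."""
    matches = []
    for tx in block["txs"]:
        for prevout in tx["inputs"]:
            # The extra (outpoint -> script) map is needed here, since the
            # spent script does not appear in the spending transaction itself.
            script = outpoint_to_script.get(prevout)
            if script in watched_scripts:
                matches.append(tx["txid"])
                break
    return matches
```

With outpoint-based matching, by contrast, the prevout in the spending transaction can be compared against the watched set directly, with no auxiliary map.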

At this point, if we feel filter sizes need to drop further, then we may
need to consider raising the false positive rate.
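As a back-of-envelope for what raising the false positive rate buys: a Golomb-Rice coded set with fp rate around 2^-P costs very roughly P + 1.5 bits per element (an approximation, not an exact figure), so each 1-bit drop in P shaves about one bit per element:

```python
def gcs_size_bytes(n_elements: int, p: int) -> float:
    """Rough size of a Golomb-Rice coded set with false-positive rate ~2**-p,
    assuming ~p + 1.5 bits per element (back-of-envelope only)."""
    return n_elements * (p + 1.5) / 8

# e.g. a block with ~2000 filter elements, at a few candidate fp rates:
for p in (20, 19, 18):
    print(p, gcs_size_bytes(2000, p))
```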

Does anyone have any estimates or direct measures w.r.t how much bandwidth
current BIP 37 light clients consume? It would be nice to have a direct
comparison. We'd need to consider the size of their base bloom filter, the
accumulated bandwidth as a result of repeated filterload commands (to adjust
the fp rate), and also the overhead of receiving the merkle branch and
transactions in distinct messages (both due to matches and false positives).
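For a sense of the shape such a comparison would take, here's a rough model of BIP 37 client bandwidth. Every number below is an assumption plugged in for illustration, not a measurement:

```python
# All figures are assumed values for illustration, not measured data.
bloom_filter_bytes = 10_000         # assumed size of one filterload payload
filter_reloads = 50                 # assumed reloads to re-tune the fp rate
blocks_scanned = 2_000
matches_per_block = 2               # true matches + false positives, assumed
merkle_branch_bytes = 32 * 12 + 80  # ~12-deep branch plus header, assumed
avg_tx_bytes = 250                  # assumed average transaction size

total = (bloom_filter_bytes * filter_reloads
         + blocks_scanned * matches_per_block
           * (merkle_branch_bytes + avg_tx_bytes))
print(total // 1024, "KiB")
```

A real comparison would of course replace these guesses with measured filterload sizes and observed match rates.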

Finally, I'd be open to removing the current "extended" filter from the BIP
altogether for now. If a compelling use case for being able to filter the
sigScript/witness arises, we can examine re-adding it with a distinct service
bit. After all, it would be harder to phase out the filter once wider
deployment was already reached. Similarly, if the 16% savings achieved by
removing the txid is attractive, then we can create an additional filter just
for the txids, to allow those applications which need the information to seek
out that extra filter.

-- Laolu


On Fri, May 18, 2018 at 8:06 PM Pieter Wuille wrote:

> On Fri, May 18, 2018, 19:57 Olaoluwa Osuntokun via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> Greg wrote:
>> > What about also making input prevouts filter based on the scriptpubkey
>> being
>> > _spent_?  Layering wise in the processing it's a bit ugly, but if you
>> > validated the block you have the data needed.
>>
>> AFAICT, this would mean that in order for a new node to catch up the
>> filter
>> index (index all historical blocks), they'd either need to: build up a
>> utxo-set in memory during indexing, or would require a txindex in order to
>> look up the prev out's script. The first option increases the memory load
>> during indexing, and the second requires nodes to have a transaction index
>> (and would also add considerable I/O load). When proceeding from tip, this
doesn't add any additional load assuming that you synchronously index the
>> block as you validate it, otherwise the utxo set will already have been
>> updated (the spent scripts removed).
>>
>
> I was wondering about that too, but it turns out that isn't necessary. At
> least in Bitcoin Core, all the data needed for such a filter is in the
> block + undo files (the latter contain the scriptPubKeys of the outputs
> being spent).
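A sketch of that observation: with block plus undo data, the per-block filter elements can be collected without a UTXO set or txindex. The accessors and dict shapes below are hypothetical stand-ins (in Bitcoin Core the undo data lives in the rev*.dat files), not actual Core APIs:

```python
def filter_elements_for(height, read_block, read_undo):
    """Collect prev-script filter elements for one block, given hypothetical
    accessors: read_block yields the block's output scripts, and read_undo
    yields the scriptPubKeys of the outputs the block spends."""
    block = read_block(height)   # txs with their output scripts
    undo = read_undo(height)     # spent prevout scripts, from undo data
    elements = set()
    for tx in block["txs"]:
        elements.update(tx["output_scripts"])  # scripts being paid
    elements.update(undo["spent_scripts"])     # scripts being spent
    return elements
```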
>
> I have a script running to compare the filter sizes assuming the regular
>> filter switches to include the prev out's script rather than the prev
>> outpoint itself. The script hasn't yet finished (due to the increased I/O
>> load to look up the scripts when indexing), but I'll report back once it's
>> finished.
>>
>
> That's very helpful, thank you.
>
> Cheers,
>
> --
> Pieter
>
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] [bitcoin-discuss] Checkpoints in the Blockchain.

2018-05-21 Thread Dave Scotese via bitcoin-dev
 Our wetware memory is faulty at details, but a rendering that provides
features at which it isn't faulty makes it a decent backup in situations
where technology has been used to hide important differences from us. Some
of us may recall being in a situation where something seems off, and we
start to investigate, and then we discover that something was off.  April
Fools jokes are good examples, as well as satire and news reports from The
Onion.

The point of storing the entire blockchain history is to prevent any of
that history being changed in a way that illicitly alters the UTXO Set.
Whenever a memorable enough rendering of the UTXO Set is produced (which
has happened exactly once - when Bitcoin started, it was empty, and after
that, it just looks like a bunch of random computer data), the difficulty
of undetectably altering the history before that rendering goes up, even
for someone with the computing power to make subsequent block headers
follow all the rules (and even to successfully execute a 51% attack!). In
this announcement, the first item under "new features" has the following,
which follows the same principle as my idea:

Introduce experimental SSH Fingerprint ASCII Visualisation to ssh(1) and
> ssh-keygen(1). Visual fingerprint display is controlled by a new
> ssh_config(5) option "VisualHostKey". The intent is to render SSH host keys
> in a visual form that is amenable to easy recall and rejection of changed
> host keys. This technique inspired by the graphical hash visualisation
> schemes known as "random art[*]", and by Dan Kaminsky's musings at 23C3 in
> Berlin.
>



On Sat, May 19, 2018 at 10:12 PM, Damian Williamson wrote:

> I do understand your point, however, 'something like stuxnet' cannot be
> used to create valid data without re-doing all the PoW. Provided some valid
> copies of the blockchain continue to exist, the network can re-synchronise.
>
>
> Unrelated, it would seem useful to have some kind of deep blockchain
> corruption recovery mechanism if it does not exist; where blocks are
> altered at a depth exceeding the re-scan on init, efficient recovery is
> possible *on detection*. Presumably, it would be possible for some
> stuxnet like thing to modify blocks by modifying the table data making
> blocks invalid but without causing a table corruption. I would also suppose
> that if the node is delving deep into the blockchain for transaction data,
> that is would also validate the block at least that it has a valid hash
> (apart from Merkle tree validation for the individual transaction?) and
> that the hash of its immediate ancestor is also valid.
>
>
> Regards,
>
> Damian Williamson
>
>
> --
> *From:* bitcoin-discuss-boun...@lists.linuxfoundation.org <
> bitcoin-discuss-boun...@lists.linuxfoundation.org> on behalf of Dave
> Scotese via bitcoin-discuss 
> *Sent:* Sunday, 20 May 2018 11:58 AM
> *To:* Scott Roberts
> *Cc:* Bitcoin Discuss
> *Subject:* Re: [bitcoin-discuss] Checkpoints in the Blockchain.
>
> I wouldn't limit my rendering to words, but that is a decent starting
> point.  The richer the rendering, the harder it will be to forget, but it
> needn't all be developed at once. My goal here is to inspire the creation
> of art that is, at the same time, highly useful and based on randomness.
>
> Anyway, I asked what "premise that this is needed" you meant and I still
> don't know the answer.
>
> "The archive is a shared memory" - yes, a shared *computer* memory, and
> growing larger (ie more cumbersome) every day. If something like stuxnet is
> used to change a lot of the copies of it at some point, it seems likely
> that people will notice a change, but which history is correct won't be so
> obvious because for the *humans* whose memories are not so easily
> overwritten, computer data is remarkably non-memorable in its normal form
> (0-9,a-f, whatever).  If we ever want to abandon the historical transaction
> data, having a shared memory of the state of a recent UTXO Set will help to
> obviate the need for it.  Yes, of course the blockchain is the perfect
> solution, as long as there is *exactly one* and everyone can see that
> it's the same one that everyone else sees.  Any other number of archives
> presents a great difficulty.
>
> In that scenario, there's no other way to prove that the starting point is
> valid.  Bitcoin has included a hardcoded checkpoint in the code which
> served the same purpose, but this ignores the possibility that two versions
> of the code could be created, one with a fake checkpoint that is useful to
> a powerful attacker.  If the checkpoint were rendered into something
> memorable at the first opportunity, there would be little question about
> which one is correct when the difference is discovered.
>
> On Sat, May 19, 2018 at 5:22 PM, Scott Roberts wrote:
>
> I just don't see the point of 

Re: [bitcoin-dev] Making OP_TRUE standard?

2018-05-21 Thread Russell O'Connor via bitcoin-dev
In the thread "Revisting BIP 125 RBF policy" @
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-February/015717.html
and
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-March/015797.html
I propose replacing rule 3 with a rule that instead demands that the
replacement package fee rate exceed the package fee rate of the original
transactions, and that there be an absolute fee bump of the particular
transaction being replaced that covers the min-fee rate times the data size
of the mempool churn.

Perhaps this would address your issue too Rusty.

On Sun, May 20, 2018 at 11:44 PM, Rusty Russell via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Jim Posen  writes:
> > I believe OP_CSV with a relative locktime of 0 could be used to enforce
> RBF
> > on the spending tx?
>
> Marco points out that if the parent is RBF, this child inherits it, so
> we're actually good here.
>
> However, Matt Corallo points out that you can block RBF with a
> large-but-lowball tx, as BIP 125 notes:
>
>will be replaced by a new transaction...:
>
>3. The replacement transaction pays an absolute fee of at least the sum
>   paid by the original transactions.
>
> I understand implementing a single mempool requires these kind of
> up-front decisions on which tx is "better", but I wonder about the
> consequences of dropping this heuristic?  Peter?
>
> Thanks!
> Rusty.
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

