Re: [bitcoin-dev] "Compressed" headers stream

2017-12-12 Thread Gregory Maxwell via bitcoin-dev
On Tue, Dec 12, 2017 at 9:07 PM, Suhas Daftuar  wrote:
> I agree with this.  Specifically the way I envisioned this working is that
> we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
> syncing using this new message type, while leaving the existing
> 'headers'/'getheaders' messages unchanged.  So when communicating with
> upgraded peers, we'd never use 'getheaders' messages, and we'd only use
> 'headers' messages for potentially announcing new blocks.

The question then becomes: how should it work?

In an ideal world, we'd decide what peers to fetch headers from based
on a compact proof of the total work in their chains... but we cannot
construct such proofs in Bitcoin today.

I think a weaker heuristic instead is to fetch first from the peers whose
tip has the highest difficulty, then work backwards.

See: https://en.bitcoin.it/wiki/User:Gmaxwell/Reverse_header-fetching_sync

That page was the inspiration for the current headers-first sync, but
without the reverse part, because the protocol didn't permit it: you
can't request a linear run of headers that come before a given header.
This is the thing I was mostly thinking of when I mentioned that we may
want to change the interface.
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-12 Thread Gregory Maxwell via bitcoin-dev
On Tue, Dec 12, 2017 at 9:07 PM, Suhas Daftuar  wrote:
> But I think we should be able to get nearly all the benefit just by
> including nBits in any messages where the value is ambiguous; i.e., we include
> it with the first header in a message, and whenever it changes from the
> previous header's nBits.

Yes, that is what I was thinking last time we discussed it: with each
header, include a one-byte flag that lets you express:

bit: meaning
(0) whether nbits is the same as the last header's,
(1) whether the timestamp is a full field or a small offset (e.g. two bytes,
defined as an unsigned offset from the previous time minus 7200),
(2,3,4) whether the version is the same as one of the last through 7th-last
distinct values, or a new distinct 32-bit value,
(5,6) whether prev is entirely present, entirely omitted, or given as its
first 4 or first 8 bytes (if provided, they override the computed values).

That would be 7 bits in total; the 8th could be reserved, or used to
signal "more headers follow" to make the encoding self-delimiting.
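
A rough sketch of how such a flag byte might be packed. The field names, bit assignments, and the helper below are illustrative assumptions, not a finalized wire format:

```python
import struct

# Illustrative bit layout, following the list above (assumed, not a spec):
FLAG_SAME_NBITS = 1 << 0   # bit 0: nBits unchanged, field omitted
FLAG_SMALL_TIME = 1 << 1   # bit 1: 2-byte offset from (prev_time - 7200)
VERSION_SHIFT = 2          # bits 2-4: 0..6 = index of recent distinct value
VERSION_NEW = 7            #          7 = fresh 4-byte version follows
# bits 5-6 (how much of prev is sent) and bit 7 (more-headers) omitted here.

def encode_header(hdr, prev, recent_versions):
    """Encode one header relative to its predecessor; prev_hash is dropped
    entirely, since the receiver recomputes it while checking PoW."""
    flags, out = 0, b""
    if hdr["version"] in recent_versions:
        flags |= recent_versions.index(hdr["version"]) << VERSION_SHIFT
    else:
        flags |= VERSION_NEW << VERSION_SHIFT
        out += struct.pack("<i", hdr["version"])
    off = hdr["time"] - (prev["time"] - 7200)
    if 0 <= off < 0x10000:
        flags |= FLAG_SMALL_TIME
        out += struct.pack("<H", off)          # 2-byte timestamp offset
    else:
        out += struct.pack("<I", hdr["time"])  # full 4-byte timestamp
    if hdr["nbits"] == prev["nbits"]:
        flags |= FLAG_SAME_NBITS               # nBits omitted entirely
    else:
        out += struct.pack("<I", hdr["nbits"])
    out += hdr["merkle_root"] + struct.pack("<I", hdr["nonce"])
    return bytes([flags]) + out
```

In the common case (version reused, small time delta, unchanged nBits) this comes to 1 + 2 + 32 + 4 = 39 bytes instead of 80.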

The downside of using "nbits same as last" as the optimization is that
if we ever change the consensus rules to ones where difficulty management
works differently, nbits may change every block.

Alternatively, nbits could get a differential encoding that could be
opted into for small differences -- though I haven't thought much about
whether a one-byte difference would be that useful (e.g. can BCH's
per-block differences usually be expressed in one byte?)

I'm kind of dubious of the consensus-layer anti-DoS separation: the nbits
minimum is so low compared to the speed of a mining device that virtually
any attack you might do with invalid headers could still be done with
headers at the minimum difficulty. But I'm fully willing to accept that
simpler is better...



>> I would rather not change the serialization of existing messages,
>> nodes are going to have to support speaking both messages for a long
>> time, and I think we already want a different protocol flow for
>> headers fetching in any case.
>
>
> I agree with this.  Specifically the way I envisioned this working is that
> we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
> syncing using this new message type, while leaving the existing
> 'headers'/'getheaders' messages unchanged.  So when communicating with
> upgraded peers, we'd never use 'getheaders' messages, and we'd only use
> 'headers' messages for potentially announcing new blocks.
>
> Of course, we'll have to support the existing protocol for a long time.  But
> one downside I've discovered from deploying BIP 130, which introduced block
> announcements via headers messages, is that overloading a 'headers' message
> to be either a block announcement or a response to a 'getheaders' message
> results in p2p-handling logic which is more complicated than it needs to be.
> So splitting off the headers chain-sync functionality to a new message pair
> seems like a nice side-effect benefit, in the long run.
>


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-12 Thread Suhas Daftuar via bitcoin-dev
Hi,

First, thanks for resurrecting this, I agree this is worth pursuing.

On Mon, Dec 11, 2017 at 4:04 PM, Gregory Maxwell via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

>
> Nbits _never_ needs to be sent even with other consensus rules because
> it's more or less necessarily a strict function of the prior headers.
> This still holds in every clone of Bitcoin I'm aware of; sending it
> with the first header in a group probably makes sense so it can be
> checked independently.
>

I think it would be nice, though, to not require the consensus-correct
calculation of nBits in order to process p2p messages.  For instance, I
think there's a use for nBits at the p2p layer for calculating the work on
a chain, which can be used as an anti-DoS measure, even without verifying
that the difficulty adjustments are following the consensus rules.
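
The anti-DoS use described here needs only the standard target/work arithmetic, not the retarget rules. A minimal sketch (standard compact nBits decoding; no consensus validation of the adjustments):

```python
def nbits_to_target(nbits):
    """Decode the compact nBits representation into the 256-bit target."""
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF      # top mantissa bit is a sign flag
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def block_work(nbits):
    # Expected number of hashes needed to produce a block meeting this target.
    return (1 << 256) // (nbits_to_target(nbits) + 1)

def chain_work(nbits_list):
    """Total work claimed by a header chain, from nBits values alone --
    usable as an anti-DoS ranking before full consensus validation."""
    return sum(block_work(n) for n in nbits_list)
```

For the genesis difficulty 0x1d00ffff this yields the familiar per-block work of 0x100010001.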

Moreover I think it's a bit messy if the p2p layer depends on intricate
consensus rules in order to reconstruct a message -- either we'd need to
interact with existing consensus logic in a potentially new way, or we'd
reimplement the same logic in the p2p layer, neither of which is very
desirable imo.

But I think we should be able to get nearly all the benefit just by
including nBits in any messages where the value is ambiguous; i.e., we include
it with the first header in a message, and whenever it changes from the
previous header's nBits.

I would rather not change the serialization of existing messages,
> nodes are going to have to support speaking both messages for a long
> time, and I think we already want a different protocol flow for
> headers fetching in any case.
>

I agree with this.  Specifically the way I envisioned this working is that
we could introduce a new 'cmpctheaders'/'getcmpcthdrs' message pair for
syncing using this new message type, while leaving the existing
'headers'/'getheaders' messages unchanged.  So when communicating with
upgraded peers, we'd never use 'getheaders' messages, and we'd only use
'headers' messages for potentially announcing new blocks.

Of course, we'll have to support the existing protocol for a long time.
But one downside I've discovered from deploying BIP 130, which introduced
block announcements via headers messages, is that overloading a 'headers'
message to be either a block announcement or a response to a 'getheaders'
message results in p2p-handling logic which is more complicated than it
needs to be.  So splitting off the headers chain-sync functionality to a
new message pair seems like a nice side-effect benefit, in the long run.


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-11 Thread Gregory Maxwell via bitcoin-dev
On Mon, Dec 11, 2017 at 10:41 PM, Tier Nolan via bitcoin-dev
 wrote:
> There is a method called "high hash highway" that allows compact proofs of
> total POW.

That provides no security without additional consensus enforced
commitments, so I think pretty off-topic for this discussion.


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-11 Thread Tier Nolan via bitcoin-dev
On Mon, Dec 11, 2017 at 9:56 PM, Jim Posen via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Omitting nBits entirely seems reasonable; I wrote up a possible
> implementation here.
> The downside is that it is more complex because it leaks into the
> validation code. The extra 4 byte savings is certainly nice though.
>

A compromise would be to have 1 byte indicating the difference since the
last header.

Since the exponent doesn't use the full range, you could steal bits from
there to indicate a mode:

- no change
- mantissa offset (for small changes)
- full difficulty

This would support any nBits rule, and you save 3 of the 4 bytes.
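
One way the mode tag might be carried in the exponent byte's unused high bits. The mode values and delta range below are assumptions for illustration, not a spec:

```python
import struct

MODE_SAME = 0    # 1 byte on the wire: nBits unchanged
MODE_DELTA = 1   # 2 bytes: signed one-byte mantissa offset, same exponent
MODE_FULL = 2    # 4 bytes: exponent packed next to the tag, plus mantissa

def encode_nbits(nbits, prev_nbits):
    exp, mant = nbits >> 24, nbits & 0xFFFFFF
    if nbits == prev_nbits:
        return bytes([MODE_SAME << 6])
    delta = mant - (prev_nbits & 0xFFFFFF)
    if exp == prev_nbits >> 24 and -128 <= delta <= 127:
        return bytes([MODE_DELTA << 6]) + struct.pack("b", delta)
    assert exp < 0x40              # exponents never need 6 bits in practice
    return bytes([(MODE_FULL << 6) | exp]) + mant.to_bytes(3, "little")

def decode_nbits(data, prev_nbits):
    """Returns (nbits, bytes_consumed)."""
    mode = data[0] >> 6
    if mode == MODE_SAME:
        return prev_nbits, 1
    if mode == MODE_DELTA:
        delta = struct.unpack("b", data[1:2])[0]
        return (prev_nbits & 0xFF000000) | ((prev_nbits & 0xFFFFFF) + delta), 2
    return ((data[0] & 0x3F) << 24) | int.from_bytes(data[1:4], "little"), 4
```

The no-change case sends 1 byte instead of 4, matching the "save 3 of the 4 bytes" figure; even a full retransmission stays at 4 bytes because the exponent shares the tag byte.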


> Can you elaborate on how parallel header fetching might work? getheaders
> requests could probably already be pipelined, where the node requests the
> next 2,000 headers before processing the current batch (though would make
> sense to check that they are all above min difficulty first).
>

I suggest adding a message where you can ask for the lowest N hashes
between 2 heights on the main chain.

The reply is an array of {height, header} pairs for the N headers with the
lowest hash in the specified range.

All peers should agree on which headers are in the array.  If there is
disagreement, you can at least narrow down which segment it is in.

It works kind of like a cut-and-choose: you recursively pick one of the
segments the peer gave you.
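
An honest responder's side of that query might look like the sketch below. Treating `headers` as an in-memory list of 80-byte serialized headers indexed by height is an assumption for illustration:

```python
import hashlib

def dsha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def lowest_n_headers(headers, start, end, n):
    """Return (height, header) pairs for the n lowest block hashes in
    heights [start, end].  Any two honest peers on the same chain return
    the same set, so a mismatch pinpoints a segment to recurse into."""
    scored = [(dsha256(headers[h])[::-1], h) for h in range(start, end + 1)]
    scored.sort()                       # compare as big-endian, lowest first
    return [(h, headers[h]) for _, h in scored[:n]]
```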

You can ask a peer for a proof for a segment between 2 headers of the form:

- first header + coinbase with merkle branch
- all headers in the segment

This proves the segment has the correct height and that all the headers
link up.

There is a method called "high hash highway" that allows compact proofs of
total POW.


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-11 Thread Jim Posen via bitcoin-dev
On Mon, Dec 11, 2017 at 1:04 PM, Gregory Maxwell  wrote:

> On Mon, Dec 11, 2017 at 8:40 PM, Jim Posen via bitcoin-dev
>  wrote:
> > Firstly, I don't like the idea of making the net header encoding
> dependent
> > on the specific header validation rules that Bitcoin uses (eg. the fact
> that
> > difficulty is only recalculated every 2016 blocks). This would be
> coupling
>
> In the last proposal I recall writing up, there was a one byte flag on
> each header to indicate what was included.
>
>
Is there a link somewhere to that proposal? The only thing I could find was
your forwarded email on this thread.


> Nbits _never_ needs to be sent even with other consensus rules because
> it's more or less necessarily a strict function of the prior headers.
> This still holds in every clone of Bitcoin I'm aware of; sending it
> with the first header in a group probably makes sense so it can be
> checked independently.
>
> > with insufficient benefit.
>
> Another >18% reduction in size, beyond the removal of prev, is not
> insubstantial by any means; I don't think it should be lightly ignored.
>
>
Omitting nBits entirely seems reasonable; I wrote up a possible
implementation here.
The downside is that it is more complex because it leaks into the
validation code. The extra 4-byte savings is certainly nice, though.


> Prev omission itself is not, sadly, magically compatible:  I am quite
> confident that if there is a bitcoin hardfork it would recover the
> nbits/4-guaranteed always-zero bits of prev to use as extra nonce for
> miners. This has been proposed many times, implemented at least once,
> and the current requirement for mining infrastructure to reach inside
> the coinbase txn to increment a nonce has been a reliable source of
> failures.  So I think we'd want to have the encoding able to encode
> leading prev bits.
>
> Many altcoins also change the header structures. If the better thing
> is altcoin incompatible, we should still do it. Doing otherwise would
> competitively hobble Bitcoin especially considering the frequent
> recklessly incompetent moves made by various altcoins and the near
> total lack of useful novel development we've seen come out of the
> clones.
>
> Probably the most important change in a new header message wouldn't be
> the encoding, but it would be changing the fetching mechanism so that
> header sets could be pulled in parallel, etc.
>
> I would rather not change the serialization of existing messages,
> nodes are going to have to support speaking both messages for a long
> time, and I think we already want a different protocol flow for
> headers fetching in any case.
>

Can you elaborate on how parallel header fetching might work? getheaders
requests could probably already be pipelined, where the node requests the
next 2,000 headers before processing the current batch (though would make
sense to check that they are all above min difficulty first).

I'm open to more ideas on how to optimize the header download or design the
serialization format to be more flexible, but I'm concerned that we forgo a
40-45% bandwidth savings on the current protocol for a long time because
something better might be possible later on or there might be a hard fork
that at some point requires another upgrade. I do recognize that supporting
multiple serialization formats simultaneously adds code complexity, but in
this case the change seems simple enough to me that the tradeoff is worth
it.


Re: [bitcoin-dev] "Compressed" headers stream

2017-12-11 Thread Gregory Maxwell via bitcoin-dev
On Mon, Dec 11, 2017 at 8:40 PM, Jim Posen via bitcoin-dev
 wrote:
> Firstly, I don't like the idea of making the net header encoding dependent
> on the specific header validation rules that Bitcoin uses (eg. the fact that
> difficulty is only recalculated every 2016 blocks). This would be coupling

In the last proposal I recall writing up, there was a one byte flag on
each header to indicate what was included.

Nbits _never_ needs to be sent even with other consensus rules because
it's more or less necessarily a strict function of the prior headers.
This still holds in every clone of Bitcoin I'm aware of; sending it
with the first header in a group probably makes sense so it can be
checked independently.

> with insufficient benefit.

Another >18% reduction in size, beyond the removal of prev, is not
insubstantial by any means; I don't think it should be lightly ignored.

Prev omission itself is not, sadly, magically compatible:  I am quite
confident that if there is a bitcoin hardfork it would recover the
nbits/4-guaranteed always-zero bits of prev to use as extra nonce for
miners. This has been proposed many times, implemented at least once,
and the current requirement for mining infrastructure to reach inside
the coinbase txn to increment a nonce has been a reliable source of
failures.  So I think we'd want to have the encoding able to encode
leading prev bits.

Many altcoins also change the header structures. If the better thing
is altcoin incompatible, we should still do it. Doing otherwise would
competitively hobble Bitcoin especially considering the frequent
recklessly incompetent moves made by various altcoins and the near
total lack of useful novel development we've seen come out of the
clones.

Probably the most important change in a new header message wouldn't be
the encoding, but it would be changing the fetching mechanism so that
header sets could be pulled in parallel, etc.

I would rather not change the serialization of existing messages,
nodes are going to have to support speaking both messages for a long
time, and I think we already want a different protocol flow for
headers fetching in any case.


[bitcoin-dev] "Compressed" headers stream

2017-12-11 Thread Jim Posen via bitcoin-dev
I want to resurrect this thread from August/September because it seems like
a significant improvement for light clients at very little cost. From the
mailing list, it seems like this got stalled in determining how many more
bytes could be saved in addition to the prev_block.

The ideas I've gathered from Greg Maxwell's forwarded email are:

1. Omit nBits altogether and have the receiving node determine it from
chain context.
2. Include nBits only on headers with a height that is a multiple of 2016
since it does not change in between.
3. Compress nTime to two bytes by using the bounds on allowed values from
the consensus rules.

I propose just moving ahead with only the exclusion of the prev_block, as
IMO the other savings are not worth the added complexity.

Firstly, I don't like the idea of making the net header encoding dependent
on the specific header validation rules that Bitcoin uses (eg. the fact
that difficulty is only recalculated every 2016 blocks). This would be
coupling together the two layers, breaking net compatibility for some alts,
and possibly making consensus rule changes even more difficult for a
savings with insufficient benefit. So if you buy that argument, I'm not in
favor of #2 or #3.

Option 1 is still viable, though it has some downsides. The implementation
leaks into the validation code, whereas calculating prev_block can occur
just at the net layer (see implementation below). Also, nodes would now be
*required* to sync the header chain from the genesis block, whereas they
had the option of starting from some checkpoint before.

So switching gears, I'd like to ask what the best way to actually implement
this change is. Solutions I can think of are:

1. New headers command name like "cmpctheaders" or "headersv2".
2. Change serialization of existing headers message in a new protocol
version.
3. Change serialization of existing headers message with new service bit.

I wrote up some proof-of-concept implementations in Core: a) just omitting
prev_block, and b) omitting nBits as well.
If people think a) is reasonable, I'll write up a BIP.


> Hi everyone,
>
> the Bitcoin headers are probably the most condensed and important piece of
> data in the world, and demand for them is expected to grow.
>
> When sending a stream of continuous block headers, a common case in IBD and
> in disconnected clients, I think there is a possible optimization of the
> transmitted data: the headers after the first could avoid transmitting the
> previous hash, because the receiver can compute it by double hashing the
> previous header (an operation it needs to do anyway to verify PoW).
>
> In a long stream, for example 2016 headers, the savings in bandwidth are
> about 32/80 ~= 40%:
> without compressed headers: 2016*80 = 161280 bytes
> with compressed headers: 80 + 2015*48 = 96800 bytes
>
> What do you think?
>
> In OpenTimestamps calendars we are going to use this compression to give
> lite-clients reasonably secure proofs (a full node gives higher security,
> but isn't feasible in all situations, for example for in-browser
> verification). To speed up the sync of a new client, Electrum starts with
> the download of a file (~36MB) containing the first 477637 headers. For
> this kind of client, a common HTTP API with fixed-position chunks could be
> useful to leverage HTTP caching. For example, /headers/2016/0 returns the
> headers from the genesis to the 2015th header included, while
> /headers/2016/1 gives the headers from the 2016th to the 4031st. Other
> endpoints could have chunks of 20160 or 201600 blocks, so that with about
> 10 HTTP requests a client could fast-sync the headers.


Re: [bitcoin-dev] "Compressed" headers stream

2017-09-04 Thread Peter Todd via bitcoin-dev
On Mon, Aug 28, 2017 at 12:26:48PM -0400, Greg Sanders via bitcoin-dev wrote:
> Well, if anything my question may bolster your use-case. If there's a
> heavier chain that is invalid, I kind of doubt it matters for timestamping
> reasons.

Timestamping can easily be *more* vulnerable to malicious miners than financial
applications for a number of reasons, including the fact that there's no
financial feedback loop of miners destroying the value of the coins they
produce - timestamping is a non-financial piggy-back application that doesn't
directly interact with the Bitcoin economy, beyond a trivial number of timestamp
transactions.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org




Re: [bitcoin-dev] "Compressed" headers stream

2017-08-28 Thread Greg Sanders via bitcoin-dev
Well, if anything my question may bolster your use-case. If there's a
heavier chain that is invalid, I kind of doubt it matters for timestamping
reasons.

/digression

On Mon, Aug 28, 2017 at 12:25 PM, Riccardo Casatta <
riccardo.casa...@gmail.com> wrote:

>
> 2017-08-28 18:13 GMT+02:00 Greg Sanders :
>
>> Is there any reason to believe that you need Bitcoin "full security" at
>> all for timestamping?
>>
>
> This is a bit outside the main topic of the email, which is the savings in
> bandwidth when transmitting headers; any comment about that?
>
>
> P.S. In my personal experience, timestamping is nowadays used to prove the
> date and integrity of private databases containing a lot of value, so yes,
> in those cases I would go with Bitcoin "full security"
>
>
>>
>> On Mon, Aug 28, 2017 at 11:50 AM, Riccardo Casatta via bitcoin-dev <
>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>>> Hi everyone,
>>>
>>> the Bitcoin headers are probably the most condensed and important piece
>>> of data in the world, and demand for them is expected to grow.
>>>
>>> When sending a stream of continuous block headers, a common case in IBD
>>> and in disconnected clients, I think there is a possible optimization of
>>> the transmitted data:
>>> The headers after the first could avoid transmitting the previous hash,
>>> because the receiver can compute it by double hashing the previous header
>>> (an operation it needs to do anyway to verify PoW).
>>> In a long stream, for example 2016 headers, the savings in bandwidth are
>>> about 32/80 ~= 40%
>>> without compressed headers 2016*80=161280 bytes
>>> with compressed headers 80+2015*48=96800 bytes
>>>
>>> What do you think?
>>>
>>>
>>> In OpenTimestamps calendars we are going to use this compression to give
>>> lite-clients reasonably secure proofs (a full node gives higher security,
>>> but isn't feasible in all situations, for example for in-browser
>>> verification).
>>> To speed up the sync of a new client, Electrum starts with the download
>>> of a file (~36MB) containing the first 477637 headers.
>>> For this kind of client, a common HTTP API with fixed-position chunks
>>> could be useful to leverage HTTP caching. For example, /headers/2016/0
>>> returns the headers from the genesis to the 2015th header included, while
>>> /headers/2016/1 gives the headers from the 2016th to the 4031st.
>>> Other endpoints could have chunks of 20160 or 201600 blocks, so that with
>>> about 10 HTTP requests a client could fast-sync the headers.
>>>
>>>
>>> --
>>> Riccardo Casatta - @RCasatta 
>>>
>>> ___
>>> bitcoin-dev mailing list
>>> bitcoin-dev@lists.linuxfoundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>>
>>>
>>
>
>
> --
> Riccardo Casatta - @RCasatta 
>


Re: [bitcoin-dev] "Compressed" headers stream

2017-08-28 Thread Greg Sanders via bitcoin-dev
Is there any reason to believe that you need Bitcoin "full security" at all
for timestamping?

On Mon, Aug 28, 2017 at 11:50 AM, Riccardo Casatta via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> Hi everyone,
>
> the Bitcoin headers are probably the most condensed and important piece of
> data in the world, and demand for them is expected to grow.
>
> When sending a stream of continuous block headers, a common case in IBD
> and in disconnected clients, I think there is a possible optimization of
> the transmitted data:
> The headers after the first could avoid transmitting the previous hash,
> because the receiver can compute it by double hashing the previous header
> (an operation it needs to do anyway to verify PoW).
> In a long stream, for example 2016 headers, the savings in bandwidth are
> about 32/80 ~= 40%
> without compressed headers 2016*80=161280 bytes
> with compressed headers 80+2015*48=96800 bytes
>
> What do you think?
>
>
> In OpenTimestamps calendars we are going to use this compression to give
> lite-clients reasonably secure proofs (a full node gives higher security,
> but isn't feasible in all situations, for example for in-browser
> verification).
> To speed up the sync of a new client, Electrum starts with the download of
> a file (~36MB) containing the first 477637 headers.
> For this kind of client, a common HTTP API with fixed-position chunks could
> be useful to leverage HTTP caching. For example, /headers/2016/0 returns
> the headers from the genesis to the 2015th header included, while
> /headers/2016/1 gives the headers from the 2016th to the 4031st.
> Other endpoints could have chunks of 20160 or 201600 blocks, so that with
> about 10 HTTP requests a client could fast-sync the headers.
>
>
> --
> Riccardo Casatta - @RCasatta 
>
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>


[bitcoin-dev] "Compressed" headers stream

2017-08-28 Thread Riccardo Casatta via bitcoin-dev
Hi everyone,

the Bitcoin headers are probably the most condensed and important piece of
data in the world, and demand for them is expected to grow.

When sending a stream of continuous block headers, a common case in IBD and
in disconnected clients, I think there is a possible optimization of the
transmitted data:
The headers after the first could avoid transmitting the previous hash,
because the receiver can compute it by double hashing the previous header
(an operation it needs to do anyway to verify PoW).
In a long stream, for example 2016 headers, the savings in bandwidth are
about 32/80 ~= 40%:
without compressed headers: 2016*80 = 161280 bytes
with compressed headers: 80 + 2015*48 = 96800 bytes
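
The scheme is simple enough to sketch end to end. This assumes the standard 80-byte header layout (version 4 | prev 32 | merkle 32 | time 4 | nBits 4 | nonce 4); the helper names are illustrative:

```python
import hashlib

def dsha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def compress_stream(headers):
    """headers: list of 80-byte serialized headers forming a chain.
    Keep the first header whole; for the rest, drop the 32-byte
    prev_block field (bytes 4..36), since the receiver recomputes it."""
    out = [headers[0]]
    for h in headers[1:]:
        out.append(h[:4] + h[36:])       # version + merkle..nonce = 48 bytes
    return out

def decompress_stream(chunks):
    headers = [chunks[0]]
    for c in chunks[1:]:
        prev = dsha256(headers[-1])      # the PoW check computes this anyway
        headers.append(c[:4] + prev + c[4:])
    return headers
```

For 2016 headers this reproduces the figures above: 2016*80 = 161280 bytes uncompressed versus 80 + 2015*48 = 96800 bytes compressed.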

What do you think?


In OpenTimestamps calendars we are going to use this compression to give
lite-clients reasonably secure proofs (a full node gives higher security,
but isn't feasible in all situations, for example for in-browser
verification).
To speed up the sync of a new client, Electrum starts with the download of a
file (~36MB) containing the first 477637 headers.
For this kind of client, a common HTTP API with fixed-position chunks could
be useful to leverage HTTP caching. For example, /headers/2016/0 returns the
headers from the genesis to the 2015th header included, while
/headers/2016/1 gives the headers from the 2016th to the 4031st.
Other endpoints could have chunks of 20160 or 201600 blocks, so that with
about 10 HTTP requests a client could fast-sync the headers.


-- 
Riccardo Casatta - @RCasatta 