Re: [Mimblewimble] Compact blocks

2017-04-27 Thread John Tromp
dear Igno,

> BIP152 introduced a good solution by introducing short transaction ids
> (which we can generalize to inputs/outputs/kernels):
>
> https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#Short_transaction_IDs
>
> It does force a full re-hashing of the pool on each block but siphash 2-4 is
> fast and we don't need to introduce a new crypto primitive as siphash 2-4 is
> already used by Cuckoo Cycle.

Not only that, but Cuckoo Cycle proceeds with exactly the same steps:

"Short transaction IDs are used to represent a transaction without
sending a full 256-bit hash. They are calculated by:
1. single-SHA256 hashing the block header with the nonce appended (in
little-endian)
2. Running SipHash-2-4 with the input being the transaction ID and the
keys (k0/k1) set to the first two little-endian 64-bit integers from
the above hash, respectively.
3. Dropping the 2 most significant bytes from the SipHash output to
make it 6 bytes."

Cuckoo Cycle on 2^{k+1} nodes looks for a cycle on the graph of
short transaction IDs truncated to k bits, where edge e, for 0<=e<2^k,
connects the transaction IDs 2*e and 2*e+1..

regards,
-John

-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-04-27 Thread Ignotus Peverell
I wanted to amend my proposal on compact blocks to refine further the "first n 
bytes of input/output/kernel hashes". Obviously this opens up a possibility of 
collision forgery, which can then increase network chattiness.

BIP152 introduced a good solution by introducing short transaction ids (which 
we can generalize to inputs/outputs/kernels):

https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#Short_transaction_IDs

It does force a full re-hashing of the pool on each block but siphash 2-4 is 
fast and we don't need to introduce a new crypto primitive as siphash 2-4 is 
already used by Cuckoo Cycle.

I was also playing with the idea of committing to the short tx ids in the block 
header (instead of the tx hashes) to avoid network-level malleability or 
compact blocks but after checking with instagibbs on bitcoin-wizards (thanks!) 
that seems like a premature optimization right now. Bitcoin doesn't have such a 
thing and just plain ban on misbehavior seems to work well enough so far.

- Igno

 Original Message 
Subject: Re: [Mimblewimble] Compact blocks
Local Time: March 4, 2017 4:20 PM
UTC Time: March 5, 2017 12:20 AM
From: igno.pever...@protonmail.com
To: John Tromp <john.tr...@gmail.com>
mimblewimble@lists.launchpad.net

Now that we're going down that path, I'm starting to wonder whether we should 
treat inputs, outputs and kernels all in that same way. It's definitely simpler.

Here's the proposal:
- Across the wire, blocks have the regular list of inputs, outputs and kernels. 
At the minimum, they must have one output and one kernel (coinbase).
- Blocks can also include the first n bytes of input/output/kernel hashes that 
are part of the block but not fully included.
- A node receiving a block with partial hashes checks them against its the 
transaction pool.
- Known input/output/kernels are added, missing ones (or collisions) are asked 
to the block sending node all in one batch.
- Once everything is known, the full block can be validated.

This still leaves the possibility to have full blocks, including all the data. 
In the general case the blocks should be very compact and requests for more 
data will only be commonplace for nodes newly started. It also gives an 
incentive to miners to have a fairly full transaction pool (to avoid further 
requests).

As far as how many bytes of the hash, as John mentioned, 8 bytes (a u64) seem 
to suffice. I believe the chances of collisions with a pool of 100,000 elements 
would still be less than 0.0003%. 4 bytes would likely be too small, a 
transaction pool of the same size would have 69% of collisions.

- Igno

 Original Message --------l
Subject: Re: [Mimblewimble] Compact blocks
Local Time: March 3, 2017 2:47 PM
UTC Time: March 3, 2017 10:47 PM
From: john.tr...@gmail.com
To: Ignotus Peverell <igno.pever...@protonmail.com>, 
mimblewimble@lists.launchpad.net

dear Igno,

> That is, the receiver will look up how many kernel she has matching the hint.
> If 0 or more than 1, it will query peers for a list of all known
> matching kernels.

Uhm, that should be: ask the sending peer to expand the hint to a full kernel.

regards,
-John-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-07 Thread Ignotus Peverell
I'm fine with 2 Merkle trees. Seems like there won't be much that's non-witness 
in the kernel though.

- Igno

P.S. Loving all the mailing list activity today :)




 Original Message 
Subject: Re: [Mimblewimble] Compact blocks
Local Time: March 7, 2017 11:24 AM
UTC Time: March 7, 2017 7:24 PM
From: merop...@protonmail.com
To: mimblewimble@lists.launchpad.net <mimblewimble@lists.launchpad.net>


From: moaningmyr...@protonmail.com

It should be sufficient for the output and its rangeproof to be separately 
committed to the chain to prevent ambiguity. Committing to rangeproofs, which 
are witness data and can be ignored (at a trust tradeoff), will reduce 
flexibility.

This is a good point. I'm not opposed to the rangeproof and output having 
separate identifiers. In the context of the compact block discussion I wanted 
to emphasize the ability to unambiguously commit to a specific (output, range 
proof) pair to avoid a search problem that quickly becomes untenable, but if 
this comes in the form of two separate identifiers resolved independently, that 
should be fine.

Right. It might be better to to have two Merkle structures. I know Igno wanted 
only one, for simplicity, but I think it's easiest and most efficient for 
witness data to be in a parallel tree. So you have a tree with kernels, inputs, 
outputs, and in parallel you have a tree with the respective signatures, "input 
witnesses", and rangeproofs.

This makes things unambiguous: to find a rangeproof for an output, you look in 
the witness tree at the same position that you found the output in the data 
tree.

Input witnesses could be nothing, or maybe it makes sense for them to be full 
Merkle proofs that the referenced inputs are currently in the UTXO tree.-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-07 Thread Myrtle Warren
It should be sufficient for the output and its rangeproof to be separately 
committed to the chain to prevent ambiguity. Committing to rangeproofs, which 
are witness data and can be ignored (at a trust tradeoff), will reduce 
flexibility.

This is a good point. I'm not opposed to the rangeproof and output having 
separate identifiers. In the context of the compact block discussion I wanted 
to emphasize the ability to unambiguously commit to a specific (output, range 
proof) pair to avoid a search problem that quickly becomes untenable, but if 
this comes in the form of two separate identifiers resolved independently, that 
should be fine.-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-07 Thread Merope Riddle
From: moaningmyr...@protonmail.com

It should be sufficient for the output and its rangeproof to be separately 
committed to the chain to prevent ambiguity. Committing to rangeproofs, which 
are witness data and can be ignored (at a trust tradeoff), will reduce 
flexibility.

This is a good point. I'm not opposed to the rangeproof and output having 
separate identifiers. In the context of the compact block discussion I wanted 
to emphasize the ability to unambiguously commit to a specific (output, range 
proof) pair to avoid a search problem that quickly becomes untenable, but if 
this comes in the form of two separate identifiers resolved independently, that 
should be fine.

Right. It might be better to to have two Merkle structures. I know Igno wanted 
only one, for simplicity, but I think it's easiest and most efficient for 
witness data to be in a parallel tree. So you have a tree with kernels, inputs, 
outputs, and in parallel you have a tree with the respective signatures, "input 
witnesses", and rangeproofs.

This makes things unambiguous: to find a rangeproof for an output, you look in 
the witness tree at the same position that you found the output in the data 
tree.

Input witnesses could be nothing, or maybe it makes sense for them to be full 
Merkle proofs that the referenced inputs are currently in the UTXO tree.-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-06 Thread Myrtle Warren
Definitely a fan of compact blocks to mitigate some impact of a faster block 
time. A few random thoughts below:

On range proof validation: Doing all we can to reduce duplicate range proof 
transmission seems like a big win. I prefer the amended version in the last 
email to the original proposal. Using a partial output rather than an output 
identifier introduces a fair bit of complexity.

As far as I can tell, the specific construction of the output hash's preimage 
has not been determined. If we move forward with this mechanism, it becomes 
critically important that the hash covers the range proof itself. A situation 
where the hash resolves ambiguously should be avoided at all costs.

On inputs being part of the scheme: Since an input is by itself little more 
than a hash, it seems like the gain there is likely fairly small. We have thus 
far avoided the need to uniquely identify inputs by anything other than what 
output it is spending. I think I'd prefer inputs to be in the block in their 
entirety.

This is a cool design! I think nailing down some of the specific aspects of the 
design will require a bit of protocol testing but this makes sense as a 
foundation to start with.-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-04 Thread Ignotus Peverell
Now that we're going down that path, I'm starting to wonder whether we should 
treat inputs, outputs and kernels all in that same way. It's definitely simpler.

Here's the proposal:

- Across the wire, blocks have the regular list of inputs, outputs and kernels. 
At the minimum, they must have one output and one kernel (coinbase).

- Blocks can also include the first n bytes of input/output/kernel hashes that 
are part of the block but not fully included.

- A node receiving a block with partial hashes checks them against its the 
transaction pool.

- Known input/output/kernels are added, missing ones (or collisions) are asked 
to the block sending node all in one batch.

- Once everything is known, the full block can be validated.


This still leaves the possibility to have full blocks, including all the data. 
In the general case the blocks should be very compact and requests for more 
data will only be commonplace for nodes newly started. It also gives an 
incentive to miners to have a fairly full transaction pool (to avoid further 
requests).

As far as how many bytes of the hash, as John mentioned, 8 bytes (a u64) seem 
to suffice. I believe the chances of collisions with a pool of 100,000 elements 
would still be less than 0.0003%. 4 bytes would likely be too small, a 
transaction pool of the same size would have 69% of collisions.

- Igno




 Original Message l
Subject: Re: [Mimblewimble] Compact blocks
Local Time: March 3, 2017 2:47 PM
UTC Time: March 3, 2017 10:47 PM
From: john.tr...@gmail.com
To: Ignotus Peverell <igno.pever...@protonmail.com>, 
mimblewimble@lists.launchpad.net

dear Igno,

> That is, the receiver will look up how many kernel she has matching the hint.
> If 0 or more than 1, it will query peers for a list of all known
> matching kernels.

Uhm, that should be: ask the sending peer to expand the hint to a full kernel.

regards,
-John-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-03 Thread John Tromp
dear Igno,

>> The second largest piece of data is composed of the transaction kernels
>> (~100 bytes each). We can't fully avoid sending those as they're not
>> "anchored" on anything, like the range proof is anchored on an output.
>> However we could send a shortened version of their hash. In this scheme,
>> when sent across the wire, a block would only contain the first 16 bytes (or
>> maybe even 12) of the hash of the whole kernel. Upon reception, an exchange
>> similar to the one for range proofs would occur. Note that this requires the
>> pool maintains a mapping of those truncated hashes with the actual kernels.
>
> I feel less enthusiastic about this. The reduction is block size is
> rather modest,
> and somewhat negated by the need for extra querying traffic, which will be
> exacerbated by clients submitting transactions directly to mining pool(s) for
> improved aggregation, rather than through the peer-to-peer network.

If you do pursue this path, then the hash-prefix serves merely as a kernel hint.
That is, the receiver will look up how many kernel she has matching the hint.
If 0 or more than 1, it will query peers for a list of all known
matching kernels.
If exactly 1, then it will use that, but if validation fails, it will
again have to go back
to querying peers for ALL the hints this time, so that it may learn of other
matching kernels that would validate. In this approach, even 12 bytes
is overkill,
and somewhere between  and 8 bytes suffices.

Btw, how should we encode inputs? There's two ways to get away with 8 bytes:

1) a 32 bit block number and 32 bit index in sorted outputs of that block
2) a single 64 bit index in list all of outputs, sorted first by block.

regards,
-John

-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp


Re: [Mimblewimble] Compact blocks

2017-03-01 Thread John Tromp
dear Igno,

> In this discussion, I'm only interested in the size of a block when sent
> across the wire during propagation. It's desirable in this context to have
> small block sizes (in bytes) so that blocks can traverse the network quickly
> without causing too much additional traffic. As most nodes will generally
> receive the data included in a transaction twice (as the transaction is
> propagated and in a block), some obvious optimizations are possible.
>
> As a reminder, a Grin block is composed of the following:
>
> A header, including the proof of work, some Merkle-ish roots, a reference to
> the previous block hash, etc. All the header is always required.
> A set of inputs as a direct link to previous outputs.
> A set of outputs, containing a commitment and a range proof.
> Transaction kernels, containing a signature, a public key and the explicit
> fee.

Why does the kernel need the fee? Is that to be hashed into the
Schnorr signature?
And if so, why is that necessary?

> The largest piece of data by a good margin right now is the range proof (~5
> KB each). Therefore we can introduce a first optimization:
>
> Block gets propagated sans range proof.
> Upon reception, a node checks whether it has all the range proofs for all
> outputs included in the block.
> If any missing, the node requests them from the peer that sent the block.
> Validation can proceed, skipping the range proofs that were already known
> (assuming they were already validated by the transaction pool).

Excellent idea.

> The second largest piece of data is composed of the transaction kernels
> (~100 bytes each). We can't fully avoid sending those as they're not
> "anchored" on anything, like the range proof is anchored on an output.
> However we could send a shortened version of their hash. In this scheme,
> when sent across the wire, a block would only contain the first 16 bytes (or
> maybe even 12) of the hash of the whole kernel. Upon reception, an exchange
> similar to the one for range proofs would occur. Note that this requires the
> pool maintains a mapping of those truncated hashes with the actual kernels.

I feel less enthusiastic about this. The reduction is block size is
rather modest,
and somewhat negated by the need for extra querying traffic, which will be
exacerbated by clients submitting transactions directly to mining pool(s) for
improved aggregation, rather than through the peer-to-peer network.

> Beyond this, optimizations become non-trivial. We do *not* want to introduce
> some transaction-level hash as transactions are disaggregated in blocks by
> design. We could introduce something like inverted bloom filters for inputs
> or outputs but those are still rather small (32 to 40 bytes each) and so the
> tradeoff of space saving over complexity doesn't seem worth it at this
> point.

regards,
-John

-- 
Mailing list: https://launchpad.net/~mimblewimble
Post to : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp