Re: [Mimblewimble] Compact blocks
dear Igno, > BIP152 introduced a good solution by introducing short transaction ids > (which we can generalize to inputs/outputs/kernels): > > https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#Short_transaction_IDs > > It does force a full re-hashing of the pool on each block but siphash 2-4 is > fast and we don't need to introduce a new crypto primitive as siphash 2-4 is > already used by Cuckoo Cycle. Not only that, but Cuckoo Cycle proceeds with exactly the same steps: "Short transaction IDs are used to represent a transaction without sending a full 256-bit hash. They are calculated by: 1. single-SHA256 hashing the block header with the nonce appended (in little-endian) 2. Running SipHash-2-4 with the input being the transaction ID and the keys (k0/k1) set to the first two little-endian 64-bit integers from the above hash, respectively. 3. Dropping the 2 most significant bytes from the SipHash output to make it 6 bytes." Cuckoo Cycle on 2^{k+1} nodes looks for a cycle on the graph of short transaction IDs truncated to k bits, where edge e, for 0<=e<2^k, connects the transaction IDs 2*e and 2*e+1.. regards, -John -- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
I wanted to amend my proposal on compact blocks to refine further the "first n bytes of input/output/kernel hashes". Obviously this opens up a possibility of collision forgery, which can then increase network chattiness. BIP152 introduced a good solution by introducing short transaction ids (which we can generalize to inputs/outputs/kernels): https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#Short_transaction_IDs It does force a full re-hashing of the pool on each block but siphash 2-4 is fast and we don't need to introduce a new crypto primitive as siphash 2-4 is already used by Cuckoo Cycle. I was also playing with the idea of committing to the short tx ids in the block header (instead of the tx hashes) to avoid network-level malleability or compact blocks but after checking with instagibbs on bitcoin-wizards (thanks!) that seems like a premature optimization right now. Bitcoin doesn't have such a thing and just plain ban on misbehavior seems to work well enough so far. - Igno Original Message Subject: Re: [Mimblewimble] Compact blocks Local Time: March 4, 2017 4:20 PM UTC Time: March 5, 2017 12:20 AM From: igno.pever...@protonmail.com To: John Tromp <john.tr...@gmail.com> mimblewimble@lists.launchpad.net Now that we're going down that path, I'm starting to wonder whether we should treat inputs, outputs and kernels all in that same way. It's definitely simpler. Here's the proposal: - Across the wire, blocks have the regular list of inputs, outputs and kernels. At the minimum, they must have one output and one kernel (coinbase). - Blocks can also include the first n bytes of input/output/kernel hashes that are part of the block but not fully included. - A node receiving a block with partial hashes checks them against its the transaction pool. - Known input/output/kernels are added, missing ones (or collisions) are asked to the block sending node all in one batch. - Once everything is known, the full block can be validated. This still leaves the possibility to have full blocks, including all the data. In the general case the blocks should be very compact and requests for more data will only be commonplace for nodes newly started. It also gives an incentive to miners to have a fairly full transaction pool (to avoid further requests). As far as how many bytes of the hash, as John mentioned, 8 bytes (a u64) seem to suffice. I believe the chances of collisions with a pool of 100,000 elements would still be less than 0.0003%. 4 bytes would likely be too small, a transaction pool of the same size would have 69% of collisions. - Igno Original Message --------l Subject: Re: [Mimblewimble] Compact blocks Local Time: March 3, 2017 2:47 PM UTC Time: March 3, 2017 10:47 PM From: john.tr...@gmail.com To: Ignotus Peverell <igno.pever...@protonmail.com>, mimblewimble@lists.launchpad.net dear Igno, > That is, the receiver will look up how many kernel she has matching the hint. > If 0 or more than 1, it will query peers for a list of all known > matching kernels. Uhm, that should be: ask the sending peer to expand the hint to a full kernel. regards, -John-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
I'm fine with 2 Merkle trees. Seems like there won't be much that's non-witness in the kernel though. - Igno P.S. Loving all the mailing list activity today :) Original Message Subject: Re: [Mimblewimble] Compact blocks Local Time: March 7, 2017 11:24 AM UTC Time: March 7, 2017 7:24 PM From: merop...@protonmail.com To: mimblewimble@lists.launchpad.net <mimblewimble@lists.launchpad.net> From: moaningmyr...@protonmail.com It should be sufficient for the output and its rangeproof to be separately committed to the chain to prevent ambiguity. Committing to rangeproofs, which are witness data and can be ignored (at a trust tradeoff), will reduce flexibility. This is a good point. I'm not opposed to the rangeproof and output having separate identifiers. In the context of the compact block discussion I wanted to emphasize the ability to unambiguously commit to a specific (output, range proof) pair to avoid a search problem that quickly becomes untenable, but if this comes in the form of two separate identifiers resolved independently, that should be fine. Right. It might be better to to have two Merkle structures. I know Igno wanted only one, for simplicity, but I think it's easiest and most efficient for witness data to be in a parallel tree. So you have a tree with kernels, inputs, outputs, and in parallel you have a tree with the respective signatures, "input witnesses", and rangeproofs. This makes things unambiguous: to find a rangeproof for an output, you look in the witness tree at the same position that you found the output in the data tree. Input witnesses could be nothing, or maybe it makes sense for them to be full Merkle proofs that the referenced inputs are currently in the UTXO tree.-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
It should be sufficient for the output and its rangeproof to be separately committed to the chain to prevent ambiguity. Committing to rangeproofs, which are witness data and can be ignored (at a trust tradeoff), will reduce flexibility. This is a good point. I'm not opposed to the rangeproof and output having separate identifiers. In the context of the compact block discussion I wanted to emphasize the ability to unambiguously commit to a specific (output, range proof) pair to avoid a search problem that quickly becomes untenable, but if this comes in the form of two separate identifiers resolved independently, that should be fine.-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
From: moaningmyr...@protonmail.com It should be sufficient for the output and its rangeproof to be separately committed to the chain to prevent ambiguity. Committing to rangeproofs, which are witness data and can be ignored (at a trust tradeoff), will reduce flexibility. This is a good point. I'm not opposed to the rangeproof and output having separate identifiers. In the context of the compact block discussion I wanted to emphasize the ability to unambiguously commit to a specific (output, range proof) pair to avoid a search problem that quickly becomes untenable, but if this comes in the form of two separate identifiers resolved independently, that should be fine. Right. It might be better to to have two Merkle structures. I know Igno wanted only one, for simplicity, but I think it's easiest and most efficient for witness data to be in a parallel tree. So you have a tree with kernels, inputs, outputs, and in parallel you have a tree with the respective signatures, "input witnesses", and rangeproofs. This makes things unambiguous: to find a rangeproof for an output, you look in the witness tree at the same position that you found the output in the data tree. Input witnesses could be nothing, or maybe it makes sense for them to be full Merkle proofs that the referenced inputs are currently in the UTXO tree.-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
Definitely a fan of compact blocks to mitigate some impact of a faster block time. A few random thoughts below: On range proof validation: Doing all we can to reduce duplicate range proof transmission seems like a big win. I prefer the amended version in the last email to the original proposal. Using a partial output rather than an output identifier introduces a fair bit of complexity. As far as I can tell, the specific construction of the output hash's preimage has not been determined. If we move forward with this mechanism, it becomes critically important that the hash covers the range proof itself. A situation where the hash resolves ambiguously should be avoided at all costs. On inputs being part of the scheme: Since an input is by itself little more than a hash, it seems like the gain there is likely fairly small. We have thus far avoided the need to uniquely identify inputs by anything other than what output it is spending. I think I'd prefer inputs to be in the block in their entirety. This is a cool design! I think nailing down some of the specific aspects of the design will require a bit of protocol testing but this makes sense as a foundation to start with.-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
Now that we're going down that path, I'm starting to wonder whether we should treat inputs, outputs and kernels all in that same way. It's definitely simpler. Here's the proposal: - Across the wire, blocks have the regular list of inputs, outputs and kernels. At the minimum, they must have one output and one kernel (coinbase). - Blocks can also include the first n bytes of input/output/kernel hashes that are part of the block but not fully included. - A node receiving a block with partial hashes checks them against its the transaction pool. - Known input/output/kernels are added, missing ones (or collisions) are asked to the block sending node all in one batch. - Once everything is known, the full block can be validated. This still leaves the possibility to have full blocks, including all the data. In the general case the blocks should be very compact and requests for more data will only be commonplace for nodes newly started. It also gives an incentive to miners to have a fairly full transaction pool (to avoid further requests). As far as how many bytes of the hash, as John mentioned, 8 bytes (a u64) seem to suffice. I believe the chances of collisions with a pool of 100,000 elements would still be less than 0.0003%. 4 bytes would likely be too small, a transaction pool of the same size would have 69% of collisions. - Igno Original Message l Subject: Re: [Mimblewimble] Compact blocks Local Time: March 3, 2017 2:47 PM UTC Time: March 3, 2017 10:47 PM From: john.tr...@gmail.com To: Ignotus Peverell <igno.pever...@protonmail.com>, mimblewimble@lists.launchpad.net dear Igno, > That is, the receiver will look up how many kernel she has matching the hint. > If 0 or more than 1, it will query peers for a list of all known > matching kernels. Uhm, that should be: ask the sending peer to expand the hint to a full kernel. regards, -John-- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
dear Igno, >> The second largest piece of data is composed of the transaction kernels >> (~100 bytes each). We can't fully avoid sending those as they're not >> "anchored" on anything, like the range proof is anchored on an output. >> However we could send a shortened version of their hash. In this scheme, >> when sent across the wire, a block would only contain the first 16 bytes (or >> maybe even 12) of the hash of the whole kernel. Upon reception, an exchange >> similar to the one for range proofs would occur. Note that this requires the >> pool maintains a mapping of those truncated hashes with the actual kernels. > > I feel less enthusiastic about this. The reduction is block size is > rather modest, > and somewhat negated by the need for extra querying traffic, which will be > exacerbated by clients submitting transactions directly to mining pool(s) for > improved aggregation, rather than through the peer-to-peer network. If you do pursue this path, then the hash-prefix serves merely as a kernel hint. That is, the receiver will look up how many kernel she has matching the hint. If 0 or more than 1, it will query peers for a list of all known matching kernels. If exactly 1, then it will use that, but if validation fails, it will again have to go back to querying peers for ALL the hints this time, so that it may learn of other matching kernels that would validate. In this approach, even 12 bytes is overkill, and somewhere between and 8 bytes suffices. Btw, how should we encode inputs? There's two ways to get away with 8 bytes: 1) a 32 bit block number and 32 bit index in sorted outputs of that block 2) a single 64 bit index in list all of outputs, sorted first by block. regards, -John -- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp
Re: [Mimblewimble] Compact blocks
dear Igno, > In this discussion, I'm only interested in the size of a block when sent > across the wire during propagation. It's desirable in this context to have > small block sizes (in bytes) so that blocks can traverse the network quickly > without causing too much additional traffic. As most nodes will generally > receive the data included in a transaction twice (as the transaction is > propagated and in a block), some obvious optimizations are possible. > > As a reminder, a Grin block is composed of the following: > > A header, including the proof of work, some Merkle-ish roots, a reference to > the previous block hash, etc. All the header is always required. > A set of inputs as a direct link to previous outputs. > A set of outputs, containing a commitment and a range proof. > Transaction kernels, containing a signature, a public key and the explicit > fee. Why does the kernel need the fee? Is that to be hashed into the Schnorr signature? And if so, why is that necessary? > The largest piece of data by a good margin right now is the range proof (~5 > KB each). Therefore we can introduce a first optimization: > > Block gets propagated sans range proof. > Upon reception, a node checks whether it has all the range proofs for all > outputs included in the block. > If any missing, the node requests them from the peer that sent the block. > Validation can proceed, skipping the range proofs that were already known > (assuming they were already validated by the transaction pool). Excellent idea. > The second largest piece of data is composed of the transaction kernels > (~100 bytes each). We can't fully avoid sending those as they're not > "anchored" on anything, like the range proof is anchored on an output. > However we could send a shortened version of their hash. In this scheme, > when sent across the wire, a block would only contain the first 16 bytes (or > maybe even 12) of the hash of the whole kernel. Upon reception, an exchange > similar to the one for range proofs would occur. Note that this requires the > pool maintains a mapping of those truncated hashes with the actual kernels. I feel less enthusiastic about this. The reduction is block size is rather modest, and somewhat negated by the need for extra querying traffic, which will be exacerbated by clients submitting transactions directly to mining pool(s) for improved aggregation, rather than through the peer-to-peer network. > Beyond this, optimizations become non-trivial. We do *not* want to introduce > some transaction-level hash as transactions are disaggregated in blocks by > design. We could introduce something like inverted bloom filters for inputs > or outputs but those are still rather small (32 to 40 bytes each) and so the > tradeoff of space saving over complexity doesn't seem worth it at this > point. regards, -John -- Mailing list: https://launchpad.net/~mimblewimble Post to : mimblewimble@lists.launchpad.net Unsubscribe : https://launchpad.net/~mimblewimble More help : https://help.launchpad.net/ListHelp