Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
On Thu, Jun 23, 2016 at 03:58:29PM +0300, Alex Mizrahi wrote: > > > > The point I'm making is simply that to be useful, when you close a seal you > > have to be able to close it over some data, in particular, another seal. > > That's > > the key thing that makes the idea a useful construct for smart contacts, > > value > > transfer/currency systems, etc. > > > > OK, your second post ("Closed Seal Sets and Truth Lists for Better Privacy > and Censorship Resistance") seems to clarify that this data is one of > arguments to the condition function. > Frankly this stuff is rather hard to follow. (Or maybe I'm dumb.) > > Now I don't get scability properties. Let's consider a simplest scenario > where Alice creates some token, sends it to Bob, who sends it to Claire. So > now Claire needs to get both a proof that Alice sent it to Bob and that Bob > sent it to Claire, right? So Claire needs to verify 2 proofs, and for a > chain of N transfers one would need to verify N proofs, right? Not necessarily. In my writeup I outlined two ways that those chains can be shortened: trusted validity oracles and the probabalistic, inflationary, history proof concept. Equally, even if history grows over time, that's no worse than Bitcoin. > And how it works in general: > > 1. Alice creates a token. To do that she constructs an unique expression > which checks her signature and signs a message "This token has such and > such meaning and its ownership originally associated with seal expression>" with her PGP key. Alice isn't _creating_ a tokne, she's _defining_ a token. > 2. To transfer this token to Bob, she asks Bob for his auth expression and > sends a seal oracle a message (Alice_expression (Bob_expression . > signature)) where signatures is constructed in such a way that it evaluates > as true. Oracle stores this in a map: Alice_expression -> (Bob_expression . > signatures) Nope. In Alice's token definition, the genesis state of the token is defined to be associated with a specific single-use seal. To transfer the token to Bob, she asks Bob for the seal he wishes to use, and then closes the genesis seal over a new state committing to Bob's seal. Now Alice could construct the seal for Bob, in which case she'd just need to know the auth expression Bob wants to use, but that's not the most fundamental way of implementing this. Regardless, the seal oracle doesn't need to know that any of the above is happening; all it needs to do is spit out seal closed witnesses when the authorization expressions are satisfied appropriately; the oracle does not and should not know what the seals have been closed over. Whether or not the oracle stores anything when seals are closed is an implementation decision - see my original writeup on the unbounded vs. bounded oracle case. And of course, seals implemented with decentralized blockchains are a different matter entirely. > 3. Bob sends token to Claire in a same way: (Bob_expression > (Claire_expression . signature)) > 4. Now Claire asks if Alice_expression->(Bob_expression . _) and > Bob_expression->(Claire_expression . _) are in oracle's map. She might > trust the oracle to verify signatures, but oracle doesn't understand token > semantics. Thus she needs to check if these entries were added. > If I understand correctly, Alice_expression->(Bob_expression . _) record > can be communicated in just 3 * size_of_hash_digest bytes. > > So this seems to have rather bad scalability even with trusted oracles, am > I missing something? Yes, as I mentioned above, there exists multiple techniques that can shorten history proofs in a variety of ways, depending on what kinds of tradeoffs your application needs. -- https://petertodd.org 'peter'[:-1]@petertodd.org signature.asc Description: Digital signature ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
> > The point I'm making is simply that to be useful, when you close a seal you > have to be able to close it over some data, in particular, another seal. > That's > the key thing that makes the idea a useful construct for smart contacts, > value > transfer/currency systems, etc. > OK, your second post ("Closed Seal Sets and Truth Lists for Better Privacy and Censorship Resistance") seems to clarify that this data is one of arguments to the condition function. Frankly this stuff is rather hard to follow. (Or maybe I'm dumb.) Now I don't get scability properties. Let's consider a simplest scenario where Alice creates some token, sends it to Bob, who sends it to Claire. So now Claire needs to get both a proof that Alice sent it to Bob and that Bob sent it to Claire, right? So Claire needs to verify 2 proofs, and for a chain of N transfers one would need to verify N proofs, right? And how it works in general: 1. Alice creates a token. To do that she constructs an unique expression which checks her signature and signs a message "This token has such and such meaning and its ownership originally associated with seal " with her PGP key. 2. To transfer this token to Bob, she asks Bob for his auth expression and sends a seal oracle a message (Alice_expression (Bob_expression . signature)) where signatures is constructed in such a way that it evaluates as true. Oracle stores this in a map: Alice_expression -> (Bob_expression . signatures) 3. Bob sends token to Claire in a same way: (Bob_expression (Claire_expression . signature)) 4. Now Claire asks if Alice_expression->(Bob_expression . _) and Bob_expression->(Claire_expression . _) are in oracle's map. She might trust the oracle to verify signatures, but oracle doesn't understand token semantics. Thus she needs to check if these entries were added. If I understand correctly, Alice_expression->(Bob_expression . _) record can be communicated in just 3 * size_of_hash_digest bytes. So this seems to have rather bad scalability even with trusted oracles, am I missing something? ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
On Mon, Jun 20, 2016 at 01:26:22PM +, Police Terror via bitcoin-dev wrote: > Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer > the current protocol into lisp (inside C++), run this alternative engine > alongside the current one as an option for some years (only for fine > tuning) then eventually fade this lisp written validation code instead > of the current one. You know, I'm kinda regretting not making it sufficiently clear that Dex isn't Lisp... It may look like it with all the braces, but expressions in it are evaluated without any global state (they can be evaluated in parallel) and I've got a lot of work ahead of me in type safety. > Scheme is small and minimal, and embeds easily in C++. This could be a > better option than the libconsensus library - validation in a functional > scripting language. I'd be surprised if you could find a scheme interpreter that's sufficiently well defined to be suitable for that; starting with an existing one and whipping it into shape would very likely be more work than starting from scratch. > That doesn't mean people can't program the validation code in other > languages (maybe they'd want to optimize), but this code would be the > standard. Yeah, in general I'd expect most of these systems to be layered to a degree; after all even in something like MAST you need tooling to manage the fact that the opcodes that end up public, on-chain, are only a subset of the script. > I wouldn't be so quick to deride good engineering over systematic > provable systems for all domains. Bitcoin being written in C++ is not a > defect. It's actually a strong language for what it does. Especially > when used correctly (which is not often and takes years to master). It's probably the best of a lot of bad alternatives... We use C++ not because it's good, but because there's no other option. In particular, we have enormous cost and risk in moving to other things due to consensus, so making use of other languages is very difficult; my work with dex/proofchains does not have that constraint. > With the seals idea- am I understand this correctly?: Every transaction > has a number (essentially the index starting from 0 upwards) depending > on where it is in the blockchain. > > Then there is an array (probably an on disk array mapping transaction > indexes to hashes). Each hash entry in the array must be unique (the > hashes) otherwise the transaction will be denied. This is a great idea > to solve transaction hash collisions and simple to implement. No, I think you've very much misunderstood things. The abstract notion of a single-use seal doesn't even need global consensus on anything to implement; it does not require transactions to have "indexes" -- https://petertodd.org 'peter'[:-1]@petertodd.org signature.asc Description: Digital signature ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
On Mon, Jun 20, 2016 at 04:21:39PM +, zaki--- via bitcoin-dev wrote: > Hi Peter, > > I didn't entirely understand the process of transaction linearization. > > What I see is a potential process where when the miner assembles the block, > he strips all but one sigscript per tx. The selection of which sigscript > is retained is determined by the random oracle. Is this is primary benefit > you are suggesting? > > It appears to me that blocks still need to contain a list of full TX Input > and Tx Outputs with your approach. Some of the description seems to > indicate that there are opportunities to elide further data but it's > unclear to me how. I think you've misunderstood what I'm proposing. The state machine approach I described doesn't necessarily require blocks or even miners to exist at all. Rather, it assumes that a single-use seal primitive is available, and a random beacon primitive for tx linearization, and then builds a system on top of those primitives. Transaction data - the proofs that certain states have been reached in the system - does not need to be broadcast publicly; if Alice wants to convince Bob that she has given him money, the only person who needs that transaction (and transactions prior to it in the tx history) is Bob. So as to your question about miners assembling blocks, and what blocks contain: there doesn't need to be blocks at all! Transaction history linearization is something your wallet would do for you. -- https://petertodd.org 'peter'[:-1]@petertodd.org signature.asc Description: Digital signature ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
> All practical single-use seals will be associated with some kind of > condition, > such as a pubkey, or deterministic expression, that needs to be satisfied > for > the seal to be closed. I think it would be useful to classify systems w.r.t. what data is available to condition. I imagine it might be useful if status of other seals is available. > Secondly, the contents of the proof will be able to > commit to new data, such as the transaction spending the output associated > with > the seal. > So basically a "condition" returns that "new data", right? If it commits to a data in a recognizable way, then it's practically a function which yields a tuple (valid, new_data). If an oracle doesn't care about data then you can convert it to a predicate using a simple projection. But from point of view of a client, it is a function which returns a tuple. It might help if you describe a type of the condition function. Some related work on UTXO-based smart contracts: 1. Typecoin described in the paper "Peer-to-peer Affine Commitment using Bitcoin" Karl Crary and Michael J. Sullivan Carnegie Mellon University PLDI ’15, Portland June 17, 2015 I don't see the paper in open access and I've lost my copy, but there are slides: https://www.msully.net/stuff/typecoin-slides.pdf The paper is written by programming language researchers, and thus use fairly complex constructs. The idea is to use the language of linear logic, but it's actually implemented using type-oriented programming. So, basically, they associate logical propositions with transaction outputs. Transactions proof that output-propositions logically follow from input-propositions. The paper first describes as a colored coin kind of a system, where color values are propositions/types. But in the implementation part it became more like a metacoin, as it uses a complete transaction history. A setup with a trusted server is also mentioned. The interesting thing about Typecoin is that a contract language is based on logic, which makes it powerful and -- I guess -- analyzable. However, the paper doesn't mention any performance details, and I guess it's not good. Another problem is that it looks very unusual to people who aren't used to type-oriented programming. 2. Generic coins Seeing how much Typecoin people had to struggle to describe a Bitcoin-style system I decided to describe a generalized Bitcoin-style system, so it can be easily referenced in research. Sadly all I got so far is a draft of an introduction/definition sections: https://github.com/chromaway/ngcccbase/wiki/gc In the first section I described a transaction graph model which is supposed to be general enough to describe any kind of a transaction graph system with explicit dependencies and no "spooky action at distance". As it turns out, any such system can be defined in terms of few predicate functions, however, using these functions directly might be very inefficient. The next section introduces a coin-based model. A coin-based system can be described using a single function called coin kernel which is applied to a transaction and a list of input coinstates. It is then described how to go from a coin-based model to a transaction-graph model. The reverse should also be possible if we add additional restrictions on a transaction-graph model, it's probably enough to define that coin can be spent only once. (Partial coin spends were described in Freimarkets.) There is a fairly shitty prototype in Haskell: https://github.com/baldmaster/ColorCoin 3. flexichains This is a prototype done by me more recently, the interesting thing about it is that it unifies account-based and UTXO-based models in a single model. We first introduce a notion of record. A record can be of an arbitrary type, the only restriction is that it must have a key which must be unique within a system. Then transaction model can be introduced using two function: txDependencies returns a list of keys of records transaction depends on applyTx takes a transaction and a list of records it depends on and returns either a list of records or an error. A list of records includes * new records which are created by a transaction * updated records will have the same key but different content A simple account-based system can be implement using tuples (pubkey, balance, last_update) as records. In an UTXO-based system records are transaction output, and they should include a spent flag. (Obviously, records with spent flag can be pruned.) A system with custom smart contracts can be implemented by adding some sort of a function or bytecode to records. A Haskell prototype is here: https://bitbucket.org/chromawallet/flexichains/src/21059080bed6?at=develop (It's kinda broken and incomplete, though.) ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
Hi Peter, I didn't entirely understand the process of transaction linearization. What I see is a potential process where when the miner assembles the block, he strips all but one sigscript per tx. The selection of which sigscript is retained is determined by the random oracle. Is this is primary benefit you are suggesting? It appears to me that blocks still need to contain a list of full TX Input and Tx Outputs with your approach. Some of the description seems to indicate that there are opportunities to elide further data but it's unclear to me how. On Mon, Jun 20, 2016 at 7:14 AM Police Terror via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer > the current protocol into lisp (inside C++), run this alternative engine > alongside the current one as an option for some years (only for fine > tuning) then eventually fade this lisp written validation code instead > of the current one. > > Scheme is small and minimal, and embeds easily in C++. This could be a > better option than the libconsensus library - validation in a functional > scripting language. > > That doesn't mean people can't program the validation code in other > languages (maybe they'd want to optimize), but this code would be the > standard. > > It's really good how you are thinking deeply how Bitcoin can be used, > and the implications of everything. Also there's a lot of magical utopic > thinking in Ethereum, which is transhumanist nonsense that is life > denying. Bitcoin really speaks to me because it is real and a great tool > following the UNIX principle. > > I wouldn't be so quick to deride good engineering over systematic > provable systems for all domains. Bitcoin being written in C++ is not a > defect. It's actually a strong language for what it does. Especially > when used correctly (which is not often and takes years to master). > > With the seals idea- am I understand this correctly?: Every transaction > has a number (essentially the index starting from 0 upwards) depending > on where it is in the blockchain. > > Then there is an array (probably an on disk array mapping transaction > indexes to hashes). Each hash entry in the array must be unique (the > hashes) otherwise the transaction will be denied. This is a great idea > to solve transaction hash collisions and simple to implement. > > Probabilistic validation is a good idea, although the real difficulty > now seems to be writing and indexing all the blockchain data for > lookups. And validation is disabled for most of the blocks. Pruning is > only a stop gap measure (which loses data) that doesn't solve the issue > of continually growing resource consumption. Hardware and implementation > can only mitigate this so much. If only there was a way to simplify the > underlying protocol to make it more resource efficient... > > Peter Todd via bitcoin-dev: > > In light of Ethereum's recent problems with its imperative, > account-based, > > programming model, I thought I'd do a quick writeup outlining the > building > > blocks of the state-machine approach to so-called "smart contract" > systems, an > > extension of Bitcoin's own design that I personally have been developing > for a > > number of years now as my Proofchains/Dex research work. > > > > > > # Deterministic Code / Deterministic Expressions > > > > We need to be able to run code on different computers and get identical > > results; without this consensus is impossible and we might as well just > use a > > central authoritative database. Traditional languages and surrounding > > frameworks make determinism difficult to achieve, as they tend to be > filled > > with undefined and underspecified behavior, ranging from signed integer > > overflow in C/C++ to non-deterministic behavior in databases. While some > > successful systems like Bitcoin are based on such languages, their > success is > > attributable to heroic efforts by their developers. > > > > Deterministic expression systems such as Bitcoin's scripting system and > the > > author's Dex project improve on this by allowing expressions to be > precisely > > specified by hash digest, and executed against an environment with > > deterministic results. In the case of Bitcoin's script, the expression > is a > > Forth-like stack-based program; in Dex the expression takes the form of a > > lambda calculus expression. > > > > > > ## Proofs > > > > So far the most common use for deterministic expressions is to specify > > conditions upon which funds can be spent, as seen in Bitcoin > (particularly > > P2SH, and the upcoming Segwit). But we can generalize their use to > precisely > > defining consensus protocols in terms of state machines, with each state > > defined in terms of a deterministic expression that must return true for > the > > state to have been reached. The data that causes a given expression to > return > > true is then a "proof", and that proof can be passed from one
Re: [bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
Bitcoin could embed a lisp interpreter such as Scheme, reverse engineer the current protocol into lisp (inside C++), run this alternative engine alongside the current one as an option for some years (only for fine tuning) then eventually fade this lisp written validation code instead of the current one. Scheme is small and minimal, and embeds easily in C++. This could be a better option than the libconsensus library - validation in a functional scripting language. That doesn't mean people can't program the validation code in other languages (maybe they'd want to optimize), but this code would be the standard. It's really good how you are thinking deeply how Bitcoin can be used, and the implications of everything. Also there's a lot of magical utopic thinking in Ethereum, which is transhumanist nonsense that is life denying. Bitcoin really speaks to me because it is real and a great tool following the UNIX principle. I wouldn't be so quick to deride good engineering over systematic provable systems for all domains. Bitcoin being written in C++ is not a defect. It's actually a strong language for what it does. Especially when used correctly (which is not often and takes years to master). With the seals idea- am I understand this correctly?: Every transaction has a number (essentially the index starting from 0 upwards) depending on where it is in the blockchain. Then there is an array (probably an on disk array mapping transaction indexes to hashes). Each hash entry in the array must be unique (the hashes) otherwise the transaction will be denied. This is a great idea to solve transaction hash collisions and simple to implement. Probabilistic validation is a good idea, although the real difficulty now seems to be writing and indexing all the blockchain data for lookups. And validation is disabled for most of the blocks. Pruning is only a stop gap measure (which loses data) that doesn't solve the issue of continually growing resource consumption. Hardware and implementation can only mitigate this so much. If only there was a way to simplify the underlying protocol to make it more resource efficient... Peter Todd via bitcoin-dev: > In light of Ethereum's recent problems with its imperative, account-based, > programming model, I thought I'd do a quick writeup outlining the building > blocks of the state-machine approach to so-called "smart contract" systems, an > extension of Bitcoin's own design that I personally have been developing for a > number of years now as my Proofchains/Dex research work. > > > # Deterministic Code / Deterministic Expressions > > We need to be able to run code on different computers and get identical > results; without this consensus is impossible and we might as well just use a > central authoritative database. Traditional languages and surrounding > frameworks make determinism difficult to achieve, as they tend to be filled > with undefined and underspecified behavior, ranging from signed integer > overflow in C/C++ to non-deterministic behavior in databases. While some > successful systems like Bitcoin are based on such languages, their success is > attributable to heroic efforts by their developers. > > Deterministic expression systems such as Bitcoin's scripting system and the > author's Dex project improve on this by allowing expressions to be precisely > specified by hash digest, and executed against an environment with > deterministic results. In the case of Bitcoin's script, the expression is a > Forth-like stack-based program; in Dex the expression takes the form of a > lambda calculus expression. > > > ## Proofs > > So far the most common use for deterministic expressions is to specify > conditions upon which funds can be spent, as seen in Bitcoin (particularly > P2SH, and the upcoming Segwit). But we can generalize their use to precisely > defining consensus protocols in terms of state machines, with each state > defined in terms of a deterministic expression that must return true for the > state to have been reached. The data that causes a given expression to return > true is then a "proof", and that proof can be passed from one party to another > to prove desired states in the system have been reached. > > An important implication of this model is that we need deterministic, and > efficient, serialization of proof data. > > > ## Pruning > > Often the evaluation of an expression against a proof doesn't require all all > data in the proof. For example, to prove to a lite client that a given block > contains a transaction, we only need the merkle path from the transaction to > the block header. Systems like Proofchains and Dex generalize this process - > called "pruning" - with built-in support to both keep track of what data is > accessed by what operations, as well as support in their underlying > serialization schemes for unneeded data to be elided and replaced by the hash > digest of the pruned data. > > > # Transactions > > A common type of state
[bitcoin-dev] Building Blocks of the State Machine Approach to Consensus
In light of Ethereum's recent problems with its imperative, account-based, programming model, I thought I'd do a quick writeup outlining the building blocks of the state-machine approach to so-called "smart contract" systems, an extension of Bitcoin's own design that I personally have been developing for a number of years now as my Proofchains/Dex research work. # Deterministic Code / Deterministic Expressions We need to be able to run code on different computers and get identical results; without this consensus is impossible and we might as well just use a central authoritative database. Traditional languages and surrounding frameworks make determinism difficult to achieve, as they tend to be filled with undefined and underspecified behavior, ranging from signed integer overflow in C/C++ to non-deterministic behavior in databases. While some successful systems like Bitcoin are based on such languages, their success is attributable to heroic efforts by their developers. Deterministic expression systems such as Bitcoin's scripting system and the author's Dex project improve on this by allowing expressions to be precisely specified by hash digest, and executed against an environment with deterministic results. In the case of Bitcoin's script, the expression is a Forth-like stack-based program; in Dex the expression takes the form of a lambda calculus expression. ## Proofs So far the most common use for deterministic expressions is to specify conditions upon which funds can be spent, as seen in Bitcoin (particularly P2SH, and the upcoming Segwit). But we can generalize their use to precisely defining consensus protocols in terms of state machines, with each state defined in terms of a deterministic expression that must return true for the state to have been reached. The data that causes a given expression to return true is then a "proof", and that proof can be passed from one party to another to prove desired states in the system have been reached. An important implication of this model is that we need deterministic, and efficient, serialization of proof data. ## Pruning Often the evaluation of an expression against a proof doesn't require all all data in the proof. For example, to prove to a lite client that a given block contains a transaction, we only need the merkle path from the transaction to the block header. Systems like Proofchains and Dex generalize this process - called "pruning" - with built-in support to both keep track of what data is accessed by what operations, as well as support in their underlying serialization schemes for unneeded data to be elided and replaced by the hash digest of the pruned data. # Transactions A common type of state machine is the transaction. A transaction history is a directed acyclic graph of transactions, with one or more genesis transactions having no inputs (ancestors), and one or more outputs, and zero or more non-genesis transactions with one or more inputs, and zero or more outputs. The edges of the graph connect inputs to outputs, with every input connected to exactly one output. Outputs with an associated input are known as spent outputs; outputs with out an associated input are unspent. Outputs have conditions attached to them (e.g. a pubkey for which a valid signature must be produced), and may also be associated with other values such as "# of coins". We consider a transaction valid if we have a set of proofs, one per input, that satisfy the conditions associated with each output. Secondly, validity may also require additional constraints to be true, such as requiring the coins spent to be >= the coins created on the outputs. Input proofs also must uniquely commit to the transaction itself to be secure - if they don't the proofs can be reused in a replay attack. A non-genesis transaction is valid if: 1. Any protocol-specific rules such as coins spent >= coins output are followed. 2. For every input a valid proof exists. 3. Every input transaction is itself valid. A practical implementation of the above for value-transfer systems like Bitcoin could use two merkle-sum trees, one for the inputs, and one for the outputs, with inputs simply committing to the previous transaction's txid and output # (outpoint), and outputs committing to a scriptPubKey and output amount. Witnesses can be provided separately, and would sign a signature committing to the transaction or optionally, a subset of of inputs and/or outputs (with merkle trees we can easily avoid the exponential signature validation problems bitcoin currently has). As so long as all genesis transactions are unique, and our hash function is secure, all transaction outputs can be uniquely identified (prior to BIP34 the Bitcoin protocol actually failed at this!). ## Proof Distribution How does Alice convince Bob that she has done a transaction that puts the system into the state that Bob wanted? The obvious answer is she gives Bob data proving that the system is now in the desired