Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript

2019-12-05 Thread Anthony Towns via bitcoin-dev
On Thu, Dec 05, 2019 at 03:24:46PM -0500, Russell O'Connor wrote:

Thanks for the careful write up! That matches what I was thinking.

> This analysis suggests that we should amend CODESEPARATORs behaviour to update
> an accumulator (presumably a running hash value), so that all executed
> CODESEPARATOR positions end up covered by the signature.

On IRC, gmaxwell suggests "OP_BREADCRUMB" as a name for (something like)
this functionality.

(I think it's a barely plausible stretch to use the name "CODESEPARATOR"
for marking a position in the script -- that separates what was before
and after, at least; anything more general seems like it warrants a
better name though)

> That would provide a
> solution to the above problem for those cases where taproot's MAST cannot be
> used.  I'm not sure if it is better to propose such an amendment to
> CODESEPARATOR's behaviour now, or to propose to soft-fork in such optional
> behaviour at a later time.
> However, what I said above was even too simplified.  

FWIW, I think it's too soon to propose this because (a) it's not clear
there's a practical need for it, (b) it's not clear the functionality is
quite right (opcode vs more automatic sighash flag?), and (c) as you say,
it's not clear it's powerful enough.

> In general, a policy of the form.
>     (Exists w[1]. C[1](w[1]) && PK[1,1](w[1]) && ... && PK[1,m[1]](w[1]) || 
> ...
> || (Exists w[n]. C[n](w[n]) && PK[n,1](w[n]) && ... && PK[n,m[n]](w[n]))
> where each term could possibly be parameterized by some witness value (though
> at the moment there isn't enough functionality in Script to parameterize the
> pubkeys in any reasonably way and it maybe isn't even possible to parameterise
> the conditions in any reasonable way).  In general, you might want your
> signature to cover (some function of) this witness value.  This suggests that
> we would actually want a CODESEPARATOR variant that pushes a stack item into
> the accumulator that gets covered by the signature rather than pushing the
> CODESEPARATOR position.  Though at this point the name CODESEPARATOR is
> probably not suitable, even if it subsumes the functionality.

> Again, I'm not
> sure if it is better to propose such a replacement for CODESEPARATOR's
> behaviour now, or to propose to soft-fork in such optional behaviour at a 
> later
> time.

Last bit first, it seems pretty clear to me that this is too novel an
idea to propose it immediately -- we should explore the problem space
more first to see what's the best way of doing it before coding it into
consensus. And (guessing) I think the tapscript upgrade methods should
be fine for handling this later.

I think the annex is also not general enough for what you're thinking
here, in that it wouldn't allow for one signature to constrain the witness
data more than some other signature -- so you'd need to determine all
the constraints for all signatures to finish filling out the annex,
and could only then start signing.

I think you could conceivably do any/all of:

 * commit to a hash of all the witness data that hasn't been popped off
   the stack ("suffix" commitment -- the data will be used by later script
   opcodes)
 * commit to a hash of all the witness data that has been popped off the
   stack ("prefix" commitment -- this is the data that's been used by
   earlier script opcodes)
 * commit to the hash of the current stack

That would be expensive, but still doable as O(1) per opcode / stack
element. I think any other masking would mean you'd have potentially
O(size of witness data) or O(size of stack) runtime per signature which
I think would be unacceptable...

I guess a general implementation to at least think about the possibilities
might be an "OP_DATACOMMIT" opcode that pops an element from the stack,
does hash_"DataCommit"(element), and then any later signatures commit
to that value (maybe with OP_0 OP_DATACOMMIT allowing you to get back to
the default state). You'd either need to write your script carefully to
commit to witness data you're using elsewhere, or have some other new
opcodes to do that more conveniently...

CODESEP at position "x" in the script is equivalent to " DATACOMMIT"
here, I think. "BREADCRUMB .. BREADCRUMB" could be something like:

   OP_0 TOALT [at start of script]
   ..
   FROMALT x CAT SHA256 DUP TOALT DATACOMMIT   
   ..
   FROMALT y CAT SHA256 DUP TOALT DATACOMMIT   

if the altstack was otherwise unused, I guess; so the accumulator
behaviour probably warrants something better.

It also more or less gives you CHECKSIGFROMSTACK behaviour by doing
"SWAP OP_DATACOMMIT OP_CHECKSIG" and a SIGHASH_NONE|ANYPREVOUTANYSCRIPT
signature.

But that seems like a plausible generalisation to think about?

Cheers,
aj

___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript

2019-12-05 Thread Russell O'Connor via bitcoin-dev
After chatting with andytoshi and others, and some more thinking I've been
convinced that my specific concern about other users masquerading other
people pubkeys as their own in complex scripts is actually a non-issue.

No matter what you write in Script (today), you are limited to expressing
some policy that is logically equivalent to a set of conditions and
signatures on pubkeys that can be expressed in disjunctive normal form.  We
can write such a policy as

(C[1] && PK[1,1] && ... && PK[1,m[1]]) || ... || (C[n] && PK[n,1] && ... &&
PK[n,m[n]])

where C[i] is some conjunction of conditions such as timelock constraints,
or hash-lock constraints or any other kind of proof of publication, and
where PK[i,j] is a requirement of a signature against a specific public key.

>From Alice's point of view, she can divide set of clauses under the
disjunction into those that contain a pubkey that she considers (partially)
under her control and those clauses that she does not control (even though
as we shall see those other keys might actually be under Alice's control,
unbeknownst to her). To that end, let us consider a specific representative
policy.

(C[1] && APK[1]) || (C[2] && APK[2] && BPK[2]) || (C[3] && BPK[3])

where Alice considers herself in control of APK[1] and APK[2], and where
she considers Bob in control of BPK[2] and BPK[3] and where C[1], C[2], and
C[3] are different conditions, let's say three different hash-locks.  We
will also say that Alice has ensured that her pubkeys in different clauses
are different (i.e. APK[1] != APK[2]), but she doesn't make any such
assumption for Bob's keys and neither will we.

When Alice funded this Script, or otherwise accepted it for consideration,
she agreed that she wouldn't control the redemption of the UTXO as long as
the preimage for C[3] is published.  In particular, Alice doesn't even need
to fully decode the Script semantics for that clause beyond determining
that it enforces the C[3] requirement that she cares about. Even if Bob was
masquerading Alice's pubkey as his own (as in BPK[3] = APK[1] or BPK[3] =
APK[2]), and he ends up copying her signature into that clause, Alice ends
up with C[3] published as she originally accepted as a possibility.  Bob
masquerading Alice's pubkey as his own only serves to hamper his own
ability to sign for his clauses (I mean, Bob might be trying to convince
some third party that Alice signed for something she didn't actually sign
for, but such misrepresentations of the meaning of digital signatures is
outside our scope and this just serves as a reminder not to be deceived by
Bob's tricks here).

And the same argument holds for BPK[2].  Even if BPK[2] = APK[1] and Bob
tries to copy Alice's signature into the C[2] condition, he still needs a
countersignature with her APK[2], so Alice remains in control of that
clause.  And if BPK[2] = APK[2] then Bob can only copy Alice's signature on
the C[2] condition, but in that case she has already authorised that
condition.  Again, Bob masquerading Alice's pubkey as his own only serves
to hamper his own ability to sign for his clauses.

So what makes our potential issue here safe, versus the dangers that would
happen in  where Bob
masqurades Alice's UTXO as his own?  The key problem in the UTXO case isn't
so much Bob masquerading Alice's pubkey as his own, as it is an issue with
Alice reusing her pubkeys and Bob taking advantage of that.  We do, in
fact, have exactly the same issue in Script.  If Alice were to reuse
pubkeys such that APK[1] = APK[2], then Bob could take her signature for
C[1] and transplant it to redeem under condition C[2].  We see that it is
imperative that Alice ensures that she doesn't reuse pubkeys that she
considers under her control for different conditions when she wants her
signature to distinguish between them.

For various reasons, some historical, it is much harder to avoid pubkey
reuse for different UTXOs than it is to avoid pubkey reuse within a single
script.  We often use Bitcoin addresses in non-interactive ways, such as
putting them on flyers or painting them on walls and such.  Without a
standard for tweaking such pubkeys in a per-transaction way, we end up with
a lot of pubkey reuse between various UTXOs.  However, within a Script,
avoiding pubkey reuse ought to be easier.  Alice must communicate different
pubkeys intended for different clauses, or if Bob is constructing a whole
complex script on Alice's behalf, he may need to add CODESEPARATORs if
tweaking Alice's pubkey isn't an option.

The conversion of a policy to disjunctive normal form can involve an
exponential blowup (see <
https://en.wikipedia.org/wiki/Disjunctive_normal_form#Conversion_to_DNF>).
For instance, if Alice's policy (not in disjunctive normal form) is of the
form

(C[1] || D[1]) && ... && (C[n] || D[n]) && APK

where C[i] and D[i] are all distinct hashlocks, we require O(2^n) clauses
to put it in disjunctive normal form.  If 

Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript

2019-12-03 Thread Anthony Towns via bitcoin-dev
On Sun, Dec 01, 2019 at 11:09:54AM -0500, Russell O'Connor wrote:
> On Thu, Nov 28, 2019 at 3:07 AM Anthony Towns  wrote:
> First, it seems like a bad idea for Alice to have put funds behind a
> script she doesn't understand in the first place. There's plenty of
> scripts that are analysable, so just not using ones that are too hard to
> analyse sure seems like an option.
> I don't think this is true in general.  When constructing a script it seems
> quite reasonable for one party to come to the table with their own custom
> script that they want to use because they have some sort of 7-of-11 scheme but
> in one of those cases is really a 2-of-3 and another is 5-of-6.  The point is
> that you shouldn't need to decode their exact policy in order to collaborate
> with them.

Hmm, I take the opposite lesson from your scenario -- it's only fine for
people to bring their own 2-of-3 or 5-of-6 or whatever and replace a
simple key if you've got something like miniscript where you understand
the script completely enough that you can be sure those changes are
fine. 

For contrast, with ECDSA and pre-miniscript, the above scenario might
have gone like someone proposing to change:

  7 A B C1 C2 C3 C4 C5 C6 C7 C8 C9 11 CHECKMULTISIG

for something like

  7
  SWAP IF TOALT 2 A1 A2 A3 3 CHECKMULTISIGVERIFY FROMALT 1SUB ENDIF
  SWAP IF TOALT 5 B1 B2 B3 B4 B5 B6 6 CHECKMULTISIGVERIFY FROMALT 1SUB ENDIF
  C1 C2 C3 C4 C5 C6 C7 C8 C9 11 CHECKMULTISIG

but I think you'd want to be pretty sure you can decode those added
policies rather than just accepting it because your "C4" key is still
there. (In particular, any script fragment that uses an opcode that used
to be OP_SUCCESS could have arbitrary effects on the script)

[0]

> This notion is captured quite clearly in the MAST aspect of
> taproot. In many circumstances, it is sufficient for you to know that there
> exists a branch that contains a particular script without need to know what
> every branch contains.

(I'm trying to avoid using MAST in the context of taproot, despite the
backronym, so please excuse the rephrasing--)

I think if you're going to start using a taproot address with multiple
tapscripts, either as a participant in a multiparty smart contract,
or just to have different ways of spending your funds, then you do have
to analyse all the branches to make sure there's no hidden "all the
money goes to the Lizard People" script.

Once you've done that, you can then simplify things -- maybe some
scripts are only useful for other participants in the contract, or maybe
you've got a few different hardware wallets and one only needs to know
about one branch, while the other only needs to know about some other
branch, but you still need to have done the analysis in the first place.

Of course, probably most of the time that "analysis" is just making sure
the scripts match some well known, hardcoded template, as filled out
with various (tweaked) keys that you've checked elsewhere, but that
still ensures you know all the scripts do what you need them too.

> Third, if you are doing something crazy complex where a particular key
> could appear in different CHECKSIG operators and they should have
> independent signatures, that seems like you're at the level of
> complexity where learning about CODESEPARATOR is a reasonable thing to
> do.
> So while I agree that learning about CODESEPARATOR is a reasonable thing to 
> do,
> given that I haven't heard the CODESEPARATOR being proposed as protection
> against this sort of signature-copying attack before

Err? The current behaviour of CODESEP with taproot was first discussed in
[1], which summarised it as "CODESEP -- lets you require different sigs
for different parts of a single script" which seems to me like just a
different way of saying the same thing.

[1] 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-November/016500.html

I don't think tapscript's CODESEP or the current CODESEP can be used
for anything other than preventing a signature from being reused for a
different CHECKSIG operation on the same pubkey within the same script.

> and given the subtle
> nature of the issue, I'm not sure people will know to use it to protect
> themselves.  We should aim for a Script design that makes the cheaper default
> Script programming choices the safer one.

I think techniques like miniscript and having fixed templates specified
in BIPs and BOLTs and the like are better approaches -- both let you
easily allow a limited set of changes that can be safely made to a policy
(maybe just substituting keys, hashes and times, maybe allowing more
general changes).

> On the other hand, in a previous thread a while ago I was also arguing that
> sophisticated people are plausibly using CODESEPARATOR today, hidden away in
> unredeemed P2SH UTXOs.  So perhaps I'm right about at least one of these two
> points. :)

Sounds like an economics argument :)

>  IF HASH160 x EQUALVERIFY groupa 

Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript

2019-12-01 Thread Russell O'Connor via bitcoin-dev
On Thu, Nov 28, 2019 at 3:07 AM Anthony Towns  wrote:

> FWIW, there's discussion of this at
> http://www.erisian.com.au/taproot-bip-review/log-2019-11-28.html#l-65
>

I think variants like signing the position of the enclosing
OP_IF/OP_NOTIF/OP_ELSE of the OP_IF/OP_NOTIF/OP_ELSE block that the
checksig is within, or signing the byte offset instead of the opcode number
offset are all fine.  In particular, signing the enclosing OP_IF... would
allow sharing of the hashed signed data in a normal multisig sequence of
operations.  Below I'll continue to refer to my proposal as signing the
CHECKSIG position, but please take it to mean any of these proposed,
semantically equivalent, realizations of this idea.

I also think that it is quite reasonable to have a sighash flag control
whether or not the signature covers the CHECKSIG position or not, with
SIGHASH_ALL including the CHECKSIG position.


> First, it seems like a bad idea for Alice to have put funds behind a
> script she doesn't understand in the first place. There's plenty of
> scripts that are analysable, so just not using ones that are too hard to
> analyse sure seems like an option.
>

I don't think this is true in general.  When constructing a script it seems
quite reasonable for one party to come to the table with their own custom
script that they want to use because they have some sort of 7-of-11 scheme
but in one of those cases is really a 2-of-3 and another is 5-of-6.  The
point is that you shouldn't need to decode their exact policy in order to
collaborate with them.  This notion is captured quite clearly in the MAST
aspect of taproot.  In many circumstances, it is sufficient for you to know
that there exists a branch that contains a particular script without need
to know what every branch contains.  Because we include the tapleaf in the
signature, we already prevent this signature copying attack against
attempts to transplant one's signature from one tapleaf to another.  My
proposal is to simply extend this same protection to branches within a
single tapscript.

Second, if there are many branches in the script, it's probably more
> efficient to do them via different branches in the merkle tree, which
> at least for this purpose would make them easier to analyse as well
> (since you can analyse them independently).
>

Of course this should be done when practical.  This point isn't under
dispute.


> Third, if you are doing something crazy complex where a particular key
> could appear in different CHECKSIG operators and they should have
> independent signatures, that seems like you're at the level of
> complexity where learning about CODESEPARATOR is a reasonable thing to
> do.
>

So while I agree that learning about CODESEPARATOR is a reasonable thing to
do, given that I haven't heard the CODESEPARATOR being proposed as
protection against this sort of signature-copying attack before and given
the subtle nature of the issue, I'm not sure people will know to use it to
protect themselves.  We should aim for a Script design that makes the
cheaper default Script programming choices the safer one.

On the other hand, in a previous thread a while ago I was also arguing that
sophisticated people are plausibly using CODESEPARATOR today, hidden away
in unredeemed P2SH UTXOs.  So perhaps I'm right about at least one of these
two points. :)

I think CODESEPARATOR is a better solution to this problem anyway. In
> particular, consider a "leaf path root OP_MERKLEPATHVERIFY" opcode,
> and a script that says "anyone in group A can spend if the preimage for
> X is revelaed, anyone in group B can spend unconditionally":
>
>  IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF
>  MERKLEPATHVERIFY CHECKSIG
>
> spendable by
>
>  siga keya path preimagex 1
>
> or
>
>  sigb keyb path 0
>
> With your proposed semantics, if my pubkey is in both groups, my signature
> will sign for position 10, and still be valid on either path, even if
> the signature commits to the CHECKSIG position.
>
> I could fix my script either by having two CHECKSIG opcodes (one for
> each branch) and also duplicating the MERKLEPATHVERIFY; or I could
> add a CODESEPARATOR in either IF branch.
>

> (Or I could just not reuse the exact same pubkey across groups; or I could
> have two separate scripts: "HASH160 x EQUALVERIFY groupa MERKLEPATHVERIFY
> CHECKSIG" and "groupb MERKLEPATHVERIFY CHECKSIG")
>

I admit my proposal doesn't automatically prevent this signature-copying
attack against every Script template.  To be fully effective you need to be
aware of this signature-copying attack vector to ensure your scripts are
designed so that your CHECKSIG operations are protected by being within the
IF block that does the verification of the hash-preimage.  My thinking is
that my proposal is effective enough to save most people most of the time,
even if it doesn't save everyone all the time, all while having no
significant burden otherwise.  Therefore, I don't think your point that
there still exists a 

Re: [bitcoin-dev] Signing CHECKSIG position in Tapscript

2019-11-28 Thread Anthony Towns via bitcoin-dev
On Wed, Nov 27, 2019 at 04:29:32PM -0500, Russell O'Connor via bitcoin-dev 
wrote:
> The current tapscript proposal requires a signature on the last executed
> CODESEPRATOR position.  I'd like to propose an amendment whereby instead of
> signing the last executed CODESEPRATOR position, we simply always sign the
> position of the CHECKSIG (or other signing opcode) being executed.

FWIW, there's discussion of this at
http://www.erisian.com.au/taproot-bip-review/log-2019-11-28.html#l-65

> However, unless CODESEPARATOR is explicitly used, there is no protection
> against these sorts of attacks when there are multiple participants that have
> signing conditions within a single UTXO (or rather within a single tapleaf in
> the tapscript case).

(You already know this, but:)

With taproot key path spending, the only other conditions that can be
placed on a transaction are nSequence, nLockTime, and the annex, all of
which are committed to via the signature; so I think this concern only
applies to taproot script path spending.

The proposed sighashes for taproot script path spending all commit to
the script being used, so you can't reuse the signature in a different
leaf of the merkle tree of scripts for the UTXO, only in a separate
execution path within the script you're already looking at.

> So for example, if Alice and Bob are engaged in some kind of multi-party
> protocol, and Alice wants to pre-sign a transaction redeeming a UTXO but
> subject to the condition that a certain hash-preimage is revealed, she might
> verify the Script template shows that the code path to her public key enforces
> that the hash pre-image is revealed (using a toolkit like miniscript can aid 
> in
> this), and she might make her signature feeling secure that it, if her
> signature is used, the required preimage must be revealed on the blockchain. 
> But perhaps Bob has masquated Alice's pubkey as his own, and maybe he has
> inserted a copy of Alice's pubkey into a different path of the Script
> template.
>
> Now Alice's signature can be copied and used in this alternate path,
> allowing the UTXO to be redeemed under circumstances that Alice didn't believe
> she was authorizing.  In general, to protect herself, Alice needs to inspect
> the Script to see if her pubkey occurs in any other branch.  Given that her
> pubkey, in principle, could be derived from a computation rather that pushed
> directly into the stack, it is arguably infeasible for Alice to perform the
> required check in general.

First, it seems like a bad idea for Alice to have put funds behind a
script she doesn't understand in the first place. There's plenty of
scripts that are analysable, so just not using ones that are too hard to
analyse sure seems like an option.

Second, if there are many branches in the script, it's probably more
efficient to do them via different branches in the merkle tree, which
at least for this purpose would make them easier to analyse as well
(since you can analyse them independently).

Third, if you are doing something crazy complex where a particular key
could appear in different CHECKSIG operators and they should have
independent signatures, that seems like you're at the level of
complexity where learning about CODESEPARATOR is a reasonable thing to
do.

I think CODESEPARATOR is a better solution to this problem anyway. In
particular, consider a "leaf path root OP_MERKLEPATHVERIFY" opcode,
and a script that says "anyone in group A can spend if the preimage for
X is revelaed, anyone in group B can spend unconditionally":

 IF HASH160 x EQUALVERIFY groupa ELSE groupb ENDIF
 MERKLEPATHVERIFY CHECKSIG

spendable by

 siga keya path preimagex 1

or

 sigb keyb path 0

With your proposed semantics, if my pubkey is in both groups, my signature
will sign for position 10, and still be valid on either path, even if
the signature commits to the CHECKSIG position.

I could fix my script either by having two CHECKSIG opcodes (one for
each branch) and also duplicating the MERKLEPATHVERIFY; or I could
add a CODESEPARATOR in either IF branch.

(Or I could just not reuse the exact same pubkey across groups; or I could
have two separate scripts: "HASH160 x EQUALVERIFY groupa MERKLEPATHVERIFY
CHECKSIG" and "groupb MERKLEPATHVERIFY CHECKSIG")

> I believe that it would be safer, and less surprising to users, to always sign
> the CHECKSIG position by default.

> As a side benefit, we get to eliminate CODESEPARATOR, removing a fairly 
> awkward
> opcode from this script version.

As it stands, ANYPREVOUTANYSCRIPT proposes to not sign the script code
(allowing the signature to be reused in different scripts) but does
continue signing the CODESEPARATOR position, allowing you to optionally
restrict how flexibly you can reuse signatures. That seems like a better
tradeoff than having ANYPREVOUTANYSCRIPT signatures commit to the CHECKSIG
position which would make it a fair bit harder to design scripts that
can share signatures, or not having any way to restrict