Re: [bitcoin-dev] Codex32

2023-02-23 Thread Russell O'Connor via bitcoin-dev
One more thing (Again apologies. This idea of doing partial verification is
novel to me, and I see now that I should have just waited to give a
consolidated reply).

Focusing in on the example of performing 2 character quick checks.  There
are 7 different ways of building the table used in this quick check
verification process (actually there are 8, but we only need 7 of them for
our purposes here).  Fun fact: If you perform the 2 character quick check
in all 7 different ways, this is equivalent to doing a full checksum
verification.

This suggests a strategy of visiting your shares on a regular basis and
performing a different 2 character quick check each time, rotating through
the 7 different ways of performing it.

Moreover, these 7 different 2 character quick checks come with a prescribed
order that will accumulate BCH guarantees as you go along.  Assuming the
string isn't changing between visits then

* After the 1st table you are guaranteed to detect any 1 character error.
* After the 2nd table you are guaranteed to detect any 2 character error.
* After the 3rd table you are guaranteed to detect any 4 character error.
* After the 4th table you are guaranteed to detect any 5 character error.
* After the 5th table you are guaranteed to detect any 6 character error.
* After the 6th table you are guaranteed to detect any 7 character error.
* After the 7th table you are guaranteed to detect any 8 character error,
which is the guarantee of the full 13 character checksum.  You are also
guaranteed that the full 13 character checksum is now correct.

You could perform the checks out of order, and that is okay.  You will
eventually reach these BCH levels of guarantees, just not as quickly as if
you follow the prescribed order.

Of course, doing a series of 7 different 2 character quick checks is
overall more work than doing the full 13 character checksum validation.
But there is certainly an advantage in spreading the work out over time.
Each time you visit you still have the guarantee of catching any new 1
character error introduced since the last time you visited and a 99.9%
chance of catching any other random errors introduced since your last
visit.  Personally I am likely to follow such a validation strategy myself,
now that I am aware of it.


On Wed, Feb 22, 2023 at 10:30 PM Russell O'Connor 
wrote:

> After some consultation, I now see that generators for all degree 2 BCH
> codes, such as ours, are smooth and factor into quadratic and linear
> components.
>
> Anyhow the upshot of all this is that you can perform a "quickcheck"
> verification of the codex32 strings for whatever size of verification you
> want to do, 1 character, 2 characters, 3 characters, upto the full 13
> characters.  Each of these partial verifications will have BCH properties.
> A 1 character quickchecksum will guarantee to detect any 1 character
> error.  A 3 character quickchecksum will guarantee to detect any 2
> character error, etc.  There remains a 1 in 32^n chance of failing to
> detect larger numbers of errors where n is the size of your quickcheck.
>
> To illustrate, let's consider a quickcheck of size 2.  This can detect any
> 1 character error and will only have a 1/1024 chance of failing to detect
> other random errors.  Let's take the second test vector as our example: "
> MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"
>
> You start in a specified initial state with a pair of bech32 characters.
> For MS1 strings and a size 2 quickcheck it would be the pair of Bech32
> characters 'AS'.
>
> Next we "add" the first character after the prefix, which is '2' by using
> the addition volvelle or lookup table.  "Adding" '2' to 'S' yields '6' and
> our state becomes 'A6'.
>
> Next we have to look up 'A6' in a lookup table.  This table is too big to
> fit in the margin of this email, so I will have to omit it.  But it would
> have an entry mapping 'A6' -> 'QM'.  Our state becomes 'QM'
>
> From this point we have an even number of remaining characters in the
> input string and we can work 2 characters at a time. We "add" the next two
> data characters from our string, which are 'NA'.  Again, using the volvelle
> or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to
> 'M' yields 'X'.  So our state is now 'NX'
>
> Next we look up 'NX' in this table I haven't given you and we will find an
> entry mapping 'NX' -> 'DX', making 'DX' our new state.
>
> We keep repeating this process alternating between adding pairs of
> characters and using this unstated lookup table all the way until the end
> where we will reach a final state which will be 'H9'.
>
> If you follow this procedure with any string (upto 400 bit master seeds)
> you will always end up in the state 'H9'.
>
> A specialized worksheet would help guide the process making the process
> easier to follow.
>
>
> This process is somewhat close to Peter Todd's suggestion of using a
> lookup table on "words", which in this case would be pairs of bech32
> 

Re: [bitcoin-dev] Codex32

2023-02-23 Thread Russell O'Connor via bitcoin-dev
Sorry for the repeated replies, but I would like to make one more remark
regarding the 1 character "quick check".

Because the 1 character "quick check" state is so small, the procedure
becomes simplified to just using a single table.  You start with the
specified initial state, which would be the bech32 character '9', and then
you have a lookup mapping (, ) ->
.  You go through the table for each character after the
prefix, updating the state as you go along. ('9','2') -> '0', then
('0','N') -> '4', and so on until you reach the final state which should be
'5'.  If you like volvelles, one could be designed to implement this lookup
table.

However, I do want to note that an adjustment could be made to the codex32
generator so that this 1 character "quick check" table would become
identical to the Bech32 addition table.  In other words the 1 character
quick check would become the same as adding up all the characters and
checking that you get the required final constant.

If this change were free to make, I would probably make it.  However such
an adjustment would come at a cost.  The current generator was chosen to
have three identical coefficients in a row (you can find the generator in
the appendix of the draft BIP).  This special property slightly eases the
computation needed when creating checksums by hand (no compromise to the
quality of the checksum itself).  If we made the above adjustment to the
codex32 generator, we would lose this property of having three identical
coefficients in a row.

Therefore, I am pretty hesitant to make this adjustment.  Firstly the 1
character quick check is simply too small to be safely used.  While it does
guarantee to detect single character errors, it has a 1 in 32 chance of
failing to detect more errors.  I think a 3% failure rate is pretty bad,
and would definitely recommend people performing quick checks use a minimum
size of 2 (which has a 0.1% failure rate).  Secondly the difference between
using the addition table/volvelle versus a specific table/volvelle for the
purpose of performing 1 character quick checks (which you ought not to be
doing anyways) is pretty minimal.  The addition table is possibly slightly
less error prone to use because it is symmetric, but other than that the
amount of work to do is pretty much the same either way.

My conclusion is that it isn't worth compromising the ease of hand
computation for the sake of possibly making a
too-low-quality-checksum-that-no-one-should-be-using slightly more
convenient, but I thought I should mention it at least.


On Wed, Feb 22, 2023 at 10:30 PM Russell O'Connor 
wrote:

> After some consultation, I now see that generators for all degree 2 BCH
> codes, such as ours, are smooth and factor into quadratic and linear
> components.
>
> Anyhow the upshot of all this is that you can perform a "quickcheck"
> verification of the codex32 strings for whatever size of verification you
> want to do, 1 character, 2 characters, 3 characters, upto the full 13
> characters.  Each of these partial verifications will have BCH properties.
> A 1 character quickchecksum will guarantee to detect any 1 character
> error.  A 3 character quickchecksum will guarantee to detect any 2
> character error, etc.  There remains a 1 in 32^n chance of failing to
> detect larger numbers of errors where n is the size of your quickcheck.
>
> To illustrate, let's consider a quickcheck of size 2.  This can detect any
> 1 character error and will only have a 1/1024 chance of failing to detect
> other random errors.  Let's take the second test vector as our example: "
> MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"
>
> You start in a specified initial state with a pair of bech32 characters.
> For MS1 strings and a size 2 quickcheck it would be the pair of Bech32
> characters 'AS'.
>
> Next we "add" the first character after the prefix, which is '2' by using
> the addition volvelle or lookup table.  "Adding" '2' to 'S' yields '6' and
> our state becomes 'A6'.
>
> Next we have to look up 'A6' in a lookup table.  This table is too big to
> fit in the margin of this email, so I will have to omit it.  But it would
> have an entry mapping 'A6' -> 'QM'.  Our state becomes 'QM'
>
> From this point we have an even number of remaining characters in the
> input string and we can work 2 characters at a time. We "add" the next two
> data characters from our string, which are 'NA'.  Again, using the volvelle
> or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to
> 'M' yields 'X'.  So our state is now 'NX'
>
> Next we look up 'NX' in this table I haven't given you and we will find an
> entry mapping 'NX' -> 'DX', making 'DX' our new state.
>
> We keep repeating this process alternating between adding pairs of
> characters and using this unstated lookup table all the way until the end
> where we will reach a final state which will be 'H9'.
>
> If you follow this procedure with any string (upto 400 bit master seeds)
> you will always 

Re: [bitcoin-dev] Codex32

2023-02-22 Thread Russell O'Connor via bitcoin-dev
After some consultation, I now see that generators for all degree 2 BCH
codes, such as ours, are smooth and factor into quadratic and linear
components.

Anyhow the upshot of all this is that you can perform a "quickcheck"
verification of the codex32 strings for whatever size of verification you
want to do, 1 character, 2 characters, 3 characters, upto the full 13
characters.  Each of these partial verifications will have BCH properties.
A 1 character quickchecksum will guarantee to detect any 1 character
error.  A 3 character quickchecksum will guarantee to detect any 2
character error, etc.  There remains a 1 in 32^n chance of failing to
detect larger numbers of errors where n is the size of your quickcheck.

To illustrate, let's consider a quickcheck of size 2.  This can detect any
1 character error and will only have a 1/1024 chance of failing to detect
other random errors.  Let's take the second test vector as our example: "
MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM"

You start in a specified initial state with a pair of bech32 characters.
For MS1 strings and a size 2 quickcheck it would be the pair of Bech32
characters 'AS'.

Next we "add" the first character after the prefix, which is '2' by using
the addition volvelle or lookup table.  "Adding" '2' to 'S' yields '6' and
our state becomes 'A6'.

Next we have to look up 'A6' in a lookup table.  This table is too big to
fit in the margin of this email, so I will have to omit it.  But it would
have an entry mapping 'A6' -> 'QM'.  Our state becomes 'QM'

>From this point we have an even number of remaining characters in the input
string and we can work 2 characters at a time. We "add" the next two data
characters from our string, which are 'NA'.  Again, using the volvelle or
lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to
'M' yields 'X'.  So our state is now 'NX'

Next we look up 'NX' in this table I haven't given you and we will find an
entry mapping 'NX' -> 'DX', making 'DX' our new state.

We keep repeating this process alternating between adding pairs of
characters and using this unstated lookup table all the way until the end
where we will reach a final state which will be 'H9'.

If you follow this procedure with any string (upto 400 bit master seeds)
you will always end up in the state 'H9'.

A specialized worksheet would help guide the process making the process
easier to follow.


This process is somewhat close to Peter Todd's suggestion of using a lookup
table on "words", which in this case would be pairs of bech32 characters,
and adding values together.  The catch is that the addition is done with
Bech32 addition rather than calculator addition, which I accept is a
moderately large catch.

Anyhow, the point is that you can do this sort of partial verification by
hand to whatever degree you like, if you are in a rush and are willing to
accept larger chances of failing to catch random errors.


On Wed, Feb 22, 2023 at 2:01 PM Russell O'Connor 
wrote:

> After some poking around at the math, I do see that the 13 character
> generator (for regular sized shares) is reasonably "smooth", having roots
> at T{11}, S{16}, and C{24}.
>
> This means we could build a "quick check" worksheet to evaluate the string
> modulo (x - T) to verify a 5 bit checksum, whose operation would be similar
> to the existing checksum worksheet in structure but significantly less work.
>
> Perhaps 5 bits is too short, and it is more reasonable working modulo (x -
> T)*(x - S) to get a 10 bit checksum.  A worksheet for a 15 bit checksum is
> also an option, and possibly others well depending on the size of the other
> factors.  I think this process is would be about as simple as any other
> comparable hand operated checksum over the bech32 alphabet would be.
>
> I haven't looked into the smoothness of the long generator for seeds that
> are greater than 400 bits.  If it isn't smooth and smoother options are
> available, perhaps it should be changed.
>
> On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> On Sun, Feb 19, 2023 at 10:12:51PM +, Andrew Poelstra via bitcoin-dev
>> wrote:
>> > > What really did catch my attention, but which was kind of buried in
>> the
>> > > project documentation, is the ability to verify the integrity of each
>> > > share independently without using a computer.  For example, if I
>> store a
>> > > share with some relative who lives thousands of kilometers away, I'll
>> be
>> > > able to take that share out of its tamper-evident bag on my annual
>> > > holiday visit, verify that I can still read it accurately by
>> validating
>> > > its checksum, and put it into a new bag for another year.  For this
>> > > procedure, I don't need to bring copies of any of my other shares,
>> > > allowing them (and my seed) to stay safe.
>> > >
>> >
>> > This is good feedback. I strongly agree that this is the big selling
>> > point for this -- that you can vet 

Re: [bitcoin-dev] Codex32

2023-02-22 Thread Russell O'Connor via bitcoin-dev
After some poking around at the math, I do see that the 13 character
generator (for regular sized shares) is reasonably "smooth", having roots
at T{11}, S{16}, and C{24}.

This means we could build a "quick check" worksheet to evaluate the string
modulo (x - T) to verify a 5 bit checksum, whose operation would be similar
to the existing checksum worksheet in structure but significantly less work.

Perhaps 5 bits is too short, and it is more reasonable working modulo (x -
T)*(x - S) to get a 10 bit checksum.  A worksheet for a 15 bit checksum is
also an option, and possibly others well depending on the size of the other
factors.  I think this process is would be about as simple as any other
comparable hand operated checksum over the bech32 alphabet would be.

I haven't looked into the smoothness of the long generator for seeds that
are greater than 400 bits.  If it isn't smooth and smoother options are
available, perhaps it should be changed.

On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Sun, Feb 19, 2023 at 10:12:51PM +, Andrew Poelstra via bitcoin-dev
> wrote:
> > > What really did catch my attention, but which was kind of buried in the
> > > project documentation, is the ability to verify the integrity of each
> > > share independently without using a computer.  For example, if I store
> a
> > > share with some relative who lives thousands of kilometers away, I'll
> be
> > > able to take that share out of its tamper-evident bag on my annual
> > > holiday visit, verify that I can still read it accurately by validating
> > > its checksum, and put it into a new bag for another year.  For this
> > > procedure, I don't need to bring copies of any of my other shares,
> > > allowing them (and my seed) to stay safe.
> > >
> >
> > This is good feedback. I strongly agree that this is the big selling
> > point for this -- that you can vet shares/seeds which *aren't* being
> > actively used, without exposing them to the sorts of threats associated
> > with active use.
> >
> > We should make this more prominent in the BIP motivation.
>
> I don't think that use-case is a good selling point. The checksum that
> Codex32
> uses is much more complex than necessary if you are simply verifying a
> share by
> itself.
>
> A *much* simpler approach would be to use a simple mod N = 0 checksum,
> either
> by creating the seed such that each share passes, or by just storing an
> additional word/symbol with the seed in such a way that sum(words) mod N =
> 0
> passes. This approach is not only possible to compute by hand with a
> word/symbol->number lookup table, and pen and paper or a calculator. It's
> so
> simple they could probably *remember* how to do it themselves.
>
>
> Secondly, if all shares have mod N checksums, it may be sufficient for
> everyone
> to write down the checksums of the *other* shares, to verify they are the
> correct ones and a different (otherwise correct) share hasn't accidentally
> been
> substituted.
>
> Indeed, with some brute forcing and small checksums, I'd expect it to be
> mathematically possible to generate Shamir's secret sharing shards such
> that
> every shard can share the *same* checksum. In which case the share
> verification
> procedure would be to simply ask every share holder to verify the checksum
> manually using the mod N procedure, and then verify that each share holder
> has
> the same checksum. This would be less error prone in terms of leaking
> information accidentally if the checksum was obviously *not* part of the
> share:
> eg by encoding the share with words, and the checksum with a number.
>
> Obviously, small checksums aren't fool proof. But we're probably better off
> creating a relatively easy procedure with a 1-in-1000 chance of an error
> going
> undetected than a complex procedure that people don't actually do at all.
>
> --
> https://petertodd.org 'peter'[:-1]@petertodd.org
> ___
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-22 Thread Russell O'Connor via bitcoin-dev
On Sun, Feb 19, 2023 at 3:13 PM David A. Harding via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

>Codex32 allows the individual to periodically perform their
>recollection on paper in a private room without electronics and use
>nothing but a pen and some loookup tables (or a paper device) to
>verify that they recalled the string correctly (and its checksum can
>help with correcting up to several errors, although you might need a
>computer for error location and correction assistance).
>

While perhaps not entirely impossible, doing error correction by hand is
far more difficult than just the error detection process.
Fortunately, error correction can be aided by a computer needing very
little trust.

When doing hand computation of the checksum, there is an error if the final
"residue" is different from the value given in the codex32 spec.
After double checking your hand computation by redoing it, if you still get
the same incorrect residue, there is an error in your share data somewhere.

What you can do is plug in the residue that you did compute into a
computer, and have it produce an error correction string.
You can then, by hand, add this error correction string to your share to
get a corrected share.

If it were the case that all types of character errors were equally likely
(as in during an error any character is equally likely to be transformed
into any other character), then the computer would gain zero information
about your actual share data.  Of course, it is not in fact the case that
all transcription errors are equally likely, and so the computer can learn
a little bit about your share. The fewer errors that are made, the less
data it can recover. If you only have one character in error, then 5 bits
is the most data it can recover, and that is assuming that it can somehow
infer your error perfectly from the delta of the error correction, which
isn't actually going to be the case.

Of course, a single share from a split secret has no information about your
master seed (the master seed is hidden in the correlation between different
shares).  So learning partial information about one share isn't going to be
enough by itself to even begin compromising your master key.

This all still needs to be written up in more detail, but I figured I would
give a preview here.

- Hierarchy: Codex32 does not natively provide support for nested 
>whereas SLIP39 does.  E.g., in SLIP39, you can require 2-of-3 for
>{me, family, friends} where me is 2-of-3 {fire_safe, bank_safe,
>buried_in_woods}, family is 1-of-3 {alice, bob, carol}, and friends
>are 2-of-5 {d, e, f, g, h}.  I assume you can do the same with Codex32
>by using the share for one level as the secret for the next level,
>although this is not described in the protocol.
>

There was a design for a second level share scheme floating around
somewhere. I'll have to dig it up. As I recall this is made slightly more
complicated by needing to incorporate share metadata (i.e. the share index)
when doing a second split, but it seemed doable at the time.


> - Versioning: Codex32's metadata can store version information for
>wallets that use implicit BIP32 paths (e.g. BIP44/49/84/86), although
>this would cut into the space available for users to set their own
>metadata and it is not specified in the draft BIP.  SLIP39 also
>doesn't specify anything about implicit path versioning and, AFAICT,
>doesn't have any room to store such metadata without reducing seed
>entropy.
>

Personally, I don't consider the derivation path / descriptor data as that
sensitive, and I would recommend making wide backups of that data.
It certainly would make sense to store descriptor data alongside wherever
you keep your shares, and more places beyond that.
On the other hand, if you are trying to keep your shares innocuous somehow,
perhaps you won't be able to keep the descriptor data alongside your shares.

When I first saw the post about this, it was unclear to me that it was a
> serious project, but I've become increasingly interested as I researched
> it.  I'm not personally that interested in generating entropy from dice
> or encoding shares by hand---it's already imperative that I acquire a
> trustworthy computer and load it with trustworthy software in order to
> use my seed securely, so I might as well have it generate my seeds and
> my
> recovery codes for me.
>

I do think hardware wallets are great, and overall provide a lot of
practical protection.  What is notable is that once the secrets are loaded
onto a hardware wallet, as long as that wallet remains isolated, it cannot
leak any secrets.

Of course, a wallet needs to interact with the Bitcoin protocol and P2P
network, at least indirectly, in order to  function, so we must break that
isolation.  However, if we can limit the communication capabilities of a
hardware wallet, even a malicious wallet shouldn't be able to leak the

Re: [bitcoin-dev] Codex32

2023-02-22 Thread Peter Todd via bitcoin-dev
On Sun, Feb 19, 2023 at 10:12:51PM +, Andrew Poelstra via bitcoin-dev wrote:
> > What really did catch my attention, but which was kind of buried in the
> > project documentation, is the ability to verify the integrity of each
> > share independently without using a computer.  For example, if I store a
> > share with some relative who lives thousands of kilometers away, I'll be
> > able to take that share out of its tamper-evident bag on my annual
> > holiday visit, verify that I can still read it accurately by validating
> > its checksum, and put it into a new bag for another year.  For this
> > procedure, I don't need to bring copies of any of my other shares,
> > allowing them (and my seed) to stay safe.
> > 
> 
> This is good feedback. I strongly agree that this is the big selling
> point for this -- that you can vet shares/seeds which *aren't* being
> actively used, without exposing them to the sorts of threats associated
> with active use.
> 
> We should make this more prominent in the BIP motivation.

I don't think that use-case is a good selling point. The checksum that Codex32
uses is much more complex than necessary if you are simply verifying a share by
itself.

A *much* simpler approach would be to use a simple mod N = 0 checksum, either
by creating the seed such that each share passes, or by just storing an
additional word/symbol with the seed in such a way that sum(words) mod N = 0
passes. This approach is not only possible to compute by hand with a
word/symbol->number lookup table, and pen and paper or a calculator. It's so
simple they could probably *remember* how to do it themselves.


Secondly, if all shares have mod N checksums, it may be sufficient for everyone
to write down the checksums of the *other* shares, to verify they are the
correct ones and a different (otherwise correct) share hasn't accidentally been
substituted.

Indeed, with some brute forcing and small checksums, I'd expect it to be
mathematically possible to generate Shamir's secret sharing shards such that
every shard can share the *same* checksum. In which case the share verification
procedure would be to simply ask every share holder to verify the checksum
manually using the mod N procedure, and then verify that each share holder has
the same checksum. This would be less error prone in terms of leaking
information accidentally if the checksum was obviously *not* part of the share:
eg by encoding the share with words, and the checksum with a number.

Obviously, small checksums aren't fool proof. But we're probably better off
creating a relatively easy procedure with a 1-in-1000 chance of an error going
undetected than a complex procedure that people don't actually do at all.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org


signature.asc
Description: PGP signature
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-20 Thread Pavol Rusnak via bitcoin-dev
Thanks Andrew!

New draft makes it much more obvious what are the differences between
Codex32 and SLIP-0039 scheme.
-- 
Best Regards / S pozdravom,

Pavol "Stick" Rusnak
Co-Founder, SatoshiLabs
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-20 Thread Andrew Poelstra via bitcoin-dev
On Wed, Feb 15, 2023 at 09:16:02PM -0500, Russell O'Connor via bitcoin-dev 
wrote:
> I've been asked by Dr. Curr and Professor Snead to forward this message to
> this mailing list, as it may be of general interest to Bitcoin users.
>
> 
>

I have opened a PR to the BIPs repo for this scheme: 
https://github.com/bitcoin/bips/pull/1425

Thanks very much to Pavol Rusnak, David Harding, and Christopher Allen
for their comments on this list. We've updated the draft text to try to
address your concerns. In particular:

  * Added more text to "Motivation" distinguishing the scheme from SLIP-39
  * Added more details to "Rationale"  about error correction capabilities
of the code, and to explain that the code does not defend against
malicious errors
  * Added a note to use uppercase for QR codes
  * Rearranged and clarified the "creating shares" instructions
  * Added text about hand-computation, in particular hand-computation
of share verification, to "Motivation".

If any of you would like to be listed under Acknowledgements, please let
me know here or on the PR.


Cheers
Andrew


-- 
Andrew Poelstra
Director of Research, Blockstream
Email: apoelstra at wpsoftware.net
Web:   https://www.wpsoftware.net/andrew

The sun is always shining in space
-Justin Lewis-Webster



signature.asc
Description: PGP signature
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-19 Thread Russell O'Connor via bitcoin-dev
On Sun, Feb 19, 2023 at 5:13 PM Andrew Poelstra via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> On Sun, Feb 19, 2023 at 10:13:33AM -1000, David A. Harding wrote:
>
> I'm curious about whether there's a way to prevent this attack without
> > otherwise compromising the properties of the code?  For example, some
> > extra data that Bob can carry around (or memorize) for verifying the
> > shares haven't changed, but which is not otherwise needed for recovery
> > (so there's no problem if it's lost).
> >
>
> Unfortunately not, as near as I can tell ... one way to think of this is
> that Alice can flip a lot of random tiles then "error correct" it to get
> a new valid, but incorrect, seed. So as long as we support error
> correction it'll be possible to wreck seeds in this way.
>
> It's actually even worse than this ... as long as there's a clearly
> defined "checksum" at the end of a share, Alice will be able to mangele
> tiles and then just re-compute the checksum at the end.
>
> So what we really need to prevent this is something like a MAC: where
> Bob has a secret value which gets input into the checksum somehow, which
> Alice can't create valid checksums without knowing. Unfortunately I
> don't see any way to do this with linear codes. With a hash-based
> "checksum" a la BIP39 it would definitely be possible, but of course,
> not hand-computable.
>

Speaking off the cuff and as a non-cryptographer (i.e do NOT rush off and
do this without any vetting) but Christopher Allen once referred me to an
cypher tile set called LS47 .  If we
set aside the cypertext, I suspect we can form a MAC by recording some
random initial tile configuration, running the LS47 algorithm, and
recording the final tile configuration.  These records are not sensitive as
(presumably!) the share data is not recoverable from just knowing these two
configurations.  So one can keep these records with you, digitally sign
them or whatever, and then take them to your share on a regular basis to
rerun the LS47 algorithm to see if you still get the same final state from
the initial state.

Perhaps something more specific to Bech32 could be designed, but otherwise
this (alleged) MAC process isn't Codex32 specific.
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-19 Thread Christopher Allen via bitcoin-dev
An easy but possibly very important tip:

If you use only UPPER CASE alpha and numbers in codex32, and avoid most
punctuation, it makes QR rendering of it significantly smaller. This is
because the QR code to the ISO SPEC, when seeing lowercase, assumes the
value is binary, then converts it to a two byte value. If instead, the
codex32 is all upper, and it uses only the 45 allowed characters (see
https://www.thonky.com/qr-code-tutorial/alphanumeric-table) , it will leave
it single byte and try to compress it. Of course it doesn’t compress well,
but that is OK because it at least didn’t double the size first.

See our research on this topic at
https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-003-uri-binary-compatibility.md

A superior QR codec can do better (see our
https://github.com/BlockchainCommons/QRCodeGenerator or
https://www.nayuki.io/page/qr-code-generator-library) but many platforms
and more basic QR codecs will double the size of the QR if you have any
lower case.

— Christopher Allen
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-19 Thread Andrew Poelstra via bitcoin-dev
On Sun, Feb 19, 2023 at 10:13:33AM -1000, David A. Harding wrote:
> On 2023-02-16 03:49, Andrew Poelstra via bitcoin-dev wrote:
> > the draft lists several benefits over SLIP-0039.
> 
> The only benefit over SLIP39 that I see explicitly mentioned in the
> draft BIP is "simple enough for hand computation".  In the FAQ[1] on the
> project's website, I see some additional reasons:
>

Oh, you're right! I think we removed this text in one of our revisions.
I'll see if it makes sense to put it back.

> | This scheme is essentially the same as SLIP39, with the following
> differences:
> |
> | - The checksum is longer, slightly stronger, and designed to be
> |   computable by hand.
> |
> | - Our encoding is more compact, giving us room for a bit of more
> |   metadata, which is also designed to be readable by hand.
> |
> | - Unlike SLIP39, we do not support passphrases or hardening of any
> |   form.
> |
> | - Unlike SLIP39, we have no hardware wallet support. But we hope that
> |   will change!
> 

These are roughly the benefits -- a more compact encoding which is
always a fixed width. We also list "not supporting features such as
passphrases" as a benefit, for users who don't need/want this.

> 
> 
> When I first saw the post about this, it was unclear to me that it was a
> serious project, but I've become increasingly interested as I researched
> it.  I'm not personally that interested in generating entropy from dice
> or encoding shares by hand---it's already imperative that I acquire a
> trustworthy computer and load it with trustworthy software in order to
> use my seed securely, so I might as well have it generate my seeds and my
> recovery codes for me.
>

Yes, we've been a bit coy about how serious this project is, because on
its face it's such a silly thing. But for my part, I've been using it
for real coins for over a year and I consider it to be serious.

> What really did catch my attention, but which was kind of buried in the
> project documentation, is the ability to verify the integrity of each
> share independently without using a computer.  For example, if I store a
> share with some relative who lives thousands of kilometers away, I'll be
> able to take that share out of its tamper-evident bag on my annual
> holiday visit, verify that I can still read it accurately by validating
> its checksum, and put it into a new bag for another year.  For this
> procedure, I don't need to bring copies of any of my other shares,
> allowing them (and my seed) to stay safe.
> 

This is good feedback. I strongly agree that this is the big selling
point for this -- that you can vet shares/seeds which *aren't* being
actively used, without exposing them to the sorts of threats associated
with active use.

We should make this more prominent in the BIP motivation.

> 
> I do have one question after watching an excellent video[2] about the
> motivation for this system.  In the video, one of the threat models
> described is a disarrangement of the words in a metal backup system.
> The implication seems to be that this would be an accidental
> disarrangement, which obviously the Codex32 checksum would catch during
> periodic offline verification.  But what about deliberate modification
> of a recovery code?  For example, Bob doesn't keep his seed loaded on
> any computer; it only exists in Codex32 shares which Bob plans to
> combine together in 20 years when he retires, although he makes regular
> deposits to the pubkeys derived from the seed's master xpub.  Mallory is
> able to obtain access to Bob's shares, allowing her to immediately steal
> his current funds---but also allowing her to replace them with
> similar-looking
> shares with the same metadata and valid checksums so that Bob
> continues making deposits to the wallet.
> 
> I'm curious about whether there's a way to prevent this attack without
> otherwise compromising the properties of the code?  For example, some
> extra data that Bob can carry around (or memorize) for verifying the
> shares haven't changed, but which is not otherwise needed for recovery
> (so there's no problem if it's lost).
>

Unfortunately not, as near as I can tell ... one way to think of this is
that Alice can flip a lot of random tiles then "error correct" it to get
a new valid, but incorrect, seed. So as long as we support error
correction it'll be possible to wreck seeds in this way.

It's actually even worse than this ... as long as there's a clearly
defined "checksum" at the end of a share, Alice will be able to mangele
tiles and then just re-compute the checksum at the end.

So what we really need to prevent this is something like a MAC: where
Bob has a secret value which gets input into the checksum somehow, which
Alice can't create valid checksums without knowing. Unfortunately I
don't see any way to do this with linear codes. With a hash-based
"checksum" a la BIP39 it would definitely be possible, but of course,
not hand-computable.

BTW, to execute this attack Alice doesn't 

Re: [bitcoin-dev] Codex32

2023-02-19 Thread David A. Harding via bitcoin-dev

On 2023-02-16 03:49, Andrew Poelstra via bitcoin-dev wrote:

the draft lists several benefits over SLIP-0039.


The only benefit over SLIP39 that I see explicitly mentioned in the
draft BIP is "simple enough for hand computation".  In the FAQ[1] on the
project's website, I see some additional reasons:

| This scheme is essentially the same as SLIP39, with the following 
differences:

|
| - The checksum is longer, slightly stronger, and designed to be
|   computable by hand.
|
| - Our encoding is more compact, giving us room for a bit of more
|   metadata, which is also designed to be readable by hand.
|
| - Unlike SLIP39, we do not support passphrases or hardening of any
|   form.
|
| - Unlike SLIP39, we have no hardware wallet support. But we hope that
|   will change!

From having perused the extended documentation myself, I think I would
personally note the following differences.

- Alphabet: Codex32 uses the bech32 alphabet rather than SLIP39's
  alphabet consisting of English words.  The benefit to human-language
  words is easier memorization for those proficient in the particular
  language (in this case, SLIP39 only allows the use of English).  A
  disadvantage, IMO, is that it encourages the practice of memorization
  (which does have a few advantages but also a lot of drawbacks).

  Interestingly, Codex32 addresses what I think is the main problems of
  memorization: difficult-to-prove successful recollection.  Someone who
  wants to reliably keep seed-related material only in their head
  needs to practice recalling it on a regular basis, but for BIP39,
  SLIP39, Aezeed, etc... there's no way for them to confirm they
  successfully recalled it short of going through the entire recovery
  process; they probably just judge how confident they feel about the
  recollection and assume that feeling like they recalled it correctly
  is the same thing as recalling it correctly.

  Codex32 allows the individual to periodically perform their
  recollection on paper in a private room without electronics and use
  nothing but a pen and some loookup tables (or a paper device) to
  verify that they recalled the string correctly (and its checksum can
  help with correcting up to several errors, although you might need a
  computer for error location and correction assistance).

- Hierarchy: Codex32 does not natively provide support for nested 
  whereas SLIP39 does.  E.g., in SLIP39, you can require 2-of-3 for
  {me, family, friends} where me is 2-of-3 {fire_safe, bank_safe,
  buried_in_woods}, family is 1-of-3 {alice, bob, carol}, and friends
  are 2-of-5 {d, e, f, g, h}.  I assume you can do the same with Codex32
  by using the share for one level as the secret for the next level,
  although this is not described in the protocol.

- Versioning: Codex32's metadata can store version information for
  wallets that use implicit BIP32 paths (e.g. BIP44/49/84/86), although
  this would cut into the space available for users to set their own
  metadata and it is not specified in the draft BIP.  SLIP39 also
  doesn't specify anything about implicit path versioning and, AFAICT,
  doesn't have any room to store such metadata without reducing seed
  entropy.

- Plausible deniability dummy wallets: Codex32 doesn't support this;
  SLIP39 does.  Much has been written by other people about whether
  dummy wallets are a good idea or not, with strong opinions on both
  sides, so maybe we can just leave it at that.

---

When I first saw the post about this, it was unclear to me that it was a
serious project, but I've become increasingly interested as I researched
it.  I'm not personally that interested in generating entropy from dice
or encoding shares by hand---it's already imperative that I acquire a
trustworthy computer and load it with trustworthy software in order to
use my seed securely, so I might as well have it generate my seeds and 
my

recovery codes for me.

What really did catch my attention, but which was kind of buried in the
project documentation, is the ability to verify the integrity of each
share independently without using a computer.  For example, if I store a
share with some relative who lives thousands of kilometers away, I'll be
able to take that share out of its tamper-evident bag on my annual
holiday visit, verify that I can still read it accurately by validating
its checksum, and put it into a new bag for another year.  For this
procedure, I don't need to bring copies of any of my other shares,
allowing them (and my seed) to stay safe.

---

I do have one question after watching an excellent video[2] about the
motivation for this system.  In the video, one of the threat models
described is a disarrangement of the words in a metal backup system.
The implication seems to be that this would be an accidental
disarrangement, which obviously the Codex32 checksum would catch during
periodic offline verification.  But what about deliberate modification
of a recovery code?  For example, Bob doesn't 

Re: [bitcoin-dev] Codex32

2023-02-16 Thread Andrew Poelstra via bitcoin-dev
On Thu, Feb 16, 2023 at 12:50:12PM +0100, Pavol Rusnak via bitcoin-dev wrote:
> Hi!
> 
> The BIP states that its only advantage over SLIP-0039, which has been used
> in production for nearly three years (in at at least 3 SW/HW wallet
> implementations), is that it aims to be simple enough for hand computation.
> However, the BIP also indicates that "details of hand computation are
> outside the scope of this standard, and implementers do not need to be
> concerned with this possibility." Therefore, I am curious about how
> significant this advantage over SLIP-0039 really is. If hand computation is
> not straightforward and there are no other substantial advantages over
> SLIP-0039, I cannot help but feel that this BIP is simply a result of
> not-invented-here syndrome, but please correct me if I am wrong.
>

In my view, the hand computation is actually the main benefit of this
scheme. The process *is* straightforward, but tedious enough (and the
security benefits obscure enough, though they really shouldn't be...
"computers are opaque and untrustworthy" should be a common sentiment)
that it's hard to expect more than a small absolute number of users to
actually do it.

But for the purpose of the *standard*, what is important is that it is
possible to implement and use this within a normal hww workflow. This is
important for hand-computing users who know that their coins will not
die with them (since the 'standard' has fallen into obscurity), and
important for "normal" users who have the option to seamlessly switch
over to hand computation as the BTC price goes up or the world becomes
scarier.

For what it's worth, the draft lists several benefits over SLIP-0039.
I agree that none of them are particularly strong [1], and even together
they probably wouldn't meet the threshold to take the time to write a
standard, but I assure you the motivation was not NIH :).

> Keep in mind that the encoded shares in SLIP-0039 consist of exactly 200 or
> 330 bits, both of which are divisible by 5. This makes it straightforward
> to encode them as Bech32 strings.
> 

This is true! And very convenient for people who may want to simply
"layer on" the codex32 checksum/splitting logic onto their SLIP39 words.
They can use a lookup table to do the conversion, spend years or
whataever doing hand-computation on them, and then use a lookup table
to go back.


[1] One listed reason is that "a SLIP is not a BIP". I have heard people
speculate that this is one reason SLIP-0039 is not nearly as
widespread as BIP-0039, even though it is objectively a far better
standard. I'm unsure whether I believe this, but "there is no other
BIP" does seem like a good reason for BIP-0039's continued
dominance.

At the very least, it means that on BIP-0039 itself we have nothing
that we could say "supercedes" or "is recommended instead of" the
BIP. See https://github.com/bitcoin/bips/pull/1413

So it's something of an aside, but I think it would probably be good
for the ecosystem (though maybe bad for this BIP's prospects :)) if
you would request a BIP number for SLIP-0039.


-- 
Andrew Poelstra
Director of Research, Blockstream
Email: apoelstra at wpsoftware.net
Web:   https://www.wpsoftware.net/andrew

The sun is always shining in space
-Justin Lewis-Webster



signature.asc
Description: PGP signature
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Codex32

2023-02-16 Thread Pavol Rusnak via bitcoin-dev
Hi!

The BIP states that its only advantage over SLIP-0039, which has been used
in production for nearly three years (in at at least 3 SW/HW wallet
implementations), is that it aims to be simple enough for hand computation.
However, the BIP also indicates that "details of hand computation are
outside the scope of this standard, and implementers do not need to be
concerned with this possibility." Therefore, I am curious about how
significant this advantage over SLIP-0039 really is. If hand computation is
not straightforward and there are no other substantial advantages over
SLIP-0039, I cannot help but feel that this BIP is simply a result of
not-invented-here syndrome, but please correct me if I am wrong.

Keep in mind that the encoded shares in SLIP-0039 consist of exactly 200 or
330 bits, both of which are divisible by 5. This makes it straightforward
to encode them as Bech32 strings.

On Thu, 16 Feb 2023 at 09:30, Russell O'Connor via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> I've been asked by Dr. Curr and Professor Snead to forward this message to
> this mailing list, as it may be of general interest to Bitcoin users.
>
> Dear Colleague:
>
> In 1967, during excavation for the construction of a new shopping center
> in
> Monroeville, Pennsylvania, workers uncovered a vault containing a cache of
> ancient scrolls[1].  Most were severely damaged, but those that could be
> recovered confirmed the existence of a secret society long suspected to
> have
> been active in the region around the year 200 BC.
>
> Based on a translation of these documents, we now know that the society,
> the
> Cult of the Bound Variable, was devoted to the careful study of
> computation,
> over two millennia before the invention of the digital computer.
>
> While the Monroeville scrolls make reference to computing machines made of
> sandstone, most researchers believed this to be a poetic metaphor and that
> the
> "computers" were in fact the initiates themselves, carrying out the
> unimaginably tedious steps of their computations with reed pens on
> parchment.
>
> Within the vault, a collection of sandstone wheels marked in a language
> consisting of 32 glyphs was found. After 15 years of study, we have
> successfully
> completed the translation of what is known as "Codex32," a document that
> describes the functions of the wheels. It was discovered that the wheels
> operate
> a system of cryptographic computations that was used by cult members to
> safeguard their most valuable secrets.
>
> The Codex32 system allows secrets to be carved into multiple tablets and
> scattered to the far corners of the earth. When a sufficient number of
> tablets are
> brought together the stone wheels are manipulated in a manner to recover
> the
> secrets. This finding may be of particular interest to the Bitcoin
> community.
>
> Below we provide a summary of the cult's secret sharing system, which is
> graciously hosted at
> <
> https://github.com/apoelstra/bips/blob/2023-02--volvelles/bip-.mediawiki
> >.
> We are requesting a record assignment in the Bibliography of Immemorial
> Philosophy (BIP) repository.
>
> Thank you for your consideration.
>
> Dr. Leon O. Curr and Professor Pearlwort Snead
> Department of Archaeocryptography
> Harry Q. Bovik Institute for the Advancement
>
> [1] http://www.boundvariable.org/task.shtml
>
> -BEGIN BIP-
>
> 
>   BIP: 
>   Layer: Applications
>   Title: codex32
>   Author: Leon Olsson Curr and Pearlwort Sneed 
>   Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-
>   Status: Draft
>   Type: 
>   Created: 2023-02-13
>   License: BSD-3-Clause
>   Post-History: FIXME
> 
>
> ==Introduction==
>
> ===Abstract===
>
> This document describes a standard for backing up and restoring the master
> seed of a
> [https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032]
> hierarchical deterministic wallet, using Shamir's secret sharing.
> It includes an encoding format, a BCH error-correcting checksum, and
> algorithms for share generation and secret recovery.
> Secret data can be split into up to 31 shares.
> A minimum threshold of shares, which can be between 1 and 9, is needed to
> recover the secret, whereas without sufficient shares, no information about
> the secret is recoverable.
>
> ===Copyright===
>
> This document is licensed under the 3-clause BSD license.
>
> ===Motivation===
>
> BIP-0032 master seed data is the source entropy used to derive all private
> keys in an HD wallet.
> Safely storing this secret data is the hardest and most important part of
> self-custody.
> However, there is a tension between security, which demands limiting the
> number of backups, and resilience, which demands widely replicated backups.
> Encrypting the seed does not change this fundamental tradeoff, since it
> leaves essentially the same problem of how to back up the encryption key(s).
>
> To allow users freedom to make this tradeoff, we use Shamir's secret