Re: [bitcoin-dev] A design for Probabilistic Partial Pruning

2021-02-27 Thread Yuval Kogman via bitcoin-dev
On Fri, 26 Feb 2021 at 20:57, Keagan McClelland via bitcoin-dev

> The goals are to increase data redundancy in a way that more uniformly shares 
> the load across nodes, alleviating some of the pressure of full archive nodes 
> on the IBD problem. I am working on a draft BIP for this proposal but figured 
> I would submit it as a high level idea in case anyone had any feedback on the 
> initial design before I go into specification levels of detail.

You might be interested in an approach (henceforth "SeF") by Swanand
Kadhe, Jichan Chung and Kannan Ramchandran which employs fountain
codes, presented at Scaling Bitcoin 2019:

>From a cursory search it appears there is quite a bit of
related/followup work as well.

The simplest fountain code, the Luby Transform (applied in this work)
the encoder divides a message into smaller chunks, and then constructs
an infinite stream of codewords which are XORs of d randomly selected
chunks where d is sampled from the robust soliton distribution. The
simplest decoder simply XORs new k=1 codewords from the relevant k>1
codewords, and the full message can be recovered with overwhelming
probability and in linear time with a sufficiently large random sample
of codewords from the encoded stream. Note that the decoder must know
which chunks went into a codeword, but this is usually addressed using
pseudorandomness, which has other motivations in an adversarial

In SeF, the general idea is that "droplet nodes" are pruning nodes
which retain some number (equivalent to your threshold parameter) of
codewords from the encoding concatenated blocks (to obtain a fixed
message size), and serve these to compatible clients. This is more
robust than retaining a random sample of blocks, and also performs
well according to their simulations.

Even if this paper is not directly applicable to your efforts, the
theory of fountain codes should be of interest (many variants exist),
and there is work on fountain codes. There is also some work on
fountain codes in an adversarial setting (Falcon codes) but I'm only
vaguely familiar with it, and if i'm not mistaken most of the
considerations are either already implicitly addressed by the Bitcoin
protocol or explicitly addressed in the SeF paper, whose results also
take into account malicious droplet nodes.
bitcoin-dev mailing list

Re: [bitcoin-dev] A design for Probabilistic Partial Pruning

2021-02-27 Thread Yuval Kogman via bitcoin-dev
On Sat, 27 Feb 2021 at 22:09, Yuval Kogman  wrote:
> and there is work on fountain codes. There is also some work on

apologies, the first half of this line should have been deleted as it
was worked into the next sentence.
bitcoin-dev mailing list

[bitcoin-dev] WabiSabi: a building block for coordinated CoinJoins

2020-06-11 Thread Yuval Kogman via bitcoin-dev

As part of research into how CoinJoins in general and Wasabi in
particular can be improved, we'd like to share a new building block
we're calling WabiSabi, which utilizes keyed verification anonymous
credentials instead of blind signatures to verify the honest
participation of users in centrally coordinated CoinJoin protocols.

Blind signatures have been used to facilitate centrally coordinated
CoinJoins, but require standard denominations, each associated with a
key, because blind signatures can only convey a single bit of
information from the signer to the verifier (both roles are the
coordinator in this setting). Anonymous credentials carry attributes,
and in our case these are homomorphic value commitments as in
Confidential Transactions.

Note that this is an early draft with a deliberately narrow scope, and
only introduces this abstract building block. At this stage we'd like
to solicit feedback and criticism about our scheme and inputs with
regards to its potential applications before proceeding. We do not not
(yet) address the structure of the CoinJoin transactions, fee
structures, or other implementation details, but discussion of these
aspects is welcome.

The repository is, and the latest
version is available here:
bitcoin-dev mailing list

Re: [bitcoin-dev] Non-equal value CoinJoins. Opinions.

2019-12-29 Thread Yuval Kogman via bitcoin-dev

On Sun, 29 Dec 2019 at 10:23, ZmnSCPxj  wrote:

> Indeed, this is a problem still of equal-valued CoinJoin.
> In theory the ZeroLink protocol fixes this by strongly constraining user
> behavior, but ZeroLink is not "purely" implemented in e.g. Wasabi: Wasabi
> still allows spending pre- and post-mix coins in the same tx (ZeroLink
> disallows this) and any mix change should be considered as still linked to
> the inputs (though could be unlinked from the equal-valued output), i.e.
> returned to pre-mix wallet.

Yes, although since the base denomination size is pretty large this can be
very limiting, possibly forcing these change outputs to be linked to each
other or, worse, with unmixed inputs which is still a serious linkage
concern. This is further complicated due to variability of the denomination
(which makes remixing possible due to the fee structure, but see below)
also leaks some information or requires linking of mixed outputs in
addition (although this resets its notion of anonymity set size, so I don't
consider this unsafe or misleading, just wasteful) or in change being
donated to the coordinator due to not meeting the threshold, depending on
the "phase angle" between a user's past mixes and the coordinator's current

> Equal-valued CoinJoins fix this by using a Chaumian bank, which constrains
> value transfers to specific fixed amounts.
> Since an equal-valued CoinJoin uses a single fixed amount anyway, it is
> not an additional restriction.
> CashFusion cannot use the same technique without dropping into something
> very much like an equal-valued CoinJoin.

I concur.

I need to write a proper account of what I alluded to in my last email, but
here's a summary (allowing myself to keep it in this thread as the subject
was unequal amounts and opinions ;-)

1. introduce another stage between the input/output phases - at input
registration you would receive chaumian reissuable/redenominatable tokens
after deduction of per input fees, which you can then "spend" to create
outputs (instead of the chaumian token itself being an output script)

2. replace the current notion of a mixing round into several sub-types:
  - "decompose" - take an arbitrary amount and produce
popcount(amount-fees) outputs with popcount = 1 (anon set size assumed to
be 1)
  - "mix" - mix popcount == 1 amounts with equal sized outputs - this is
the only round type that can increase anon set size
  - "optimize" - convert mixed, popcount == 1 (e.g. { 2^n} <-> { 2^(n-1),
2^(n-1) } ) - it's not clear to me to what anon set size should be
considered after this, probably reset to somewhere between 1 and the
minimum size of the inputs, depending on degree of linkage
  - "combine" - spend popcount == 1 outputs to produce arbitrary amounts

Note that simultaneous rounds can be merged by the server during the
signing phase, such so that for example a "decompose" output may benefit
from inhabiting the same CoinJoin transaction as a mixing round with the
same denomination, but the coordinator would still be able to categorically
distinguish between these, so this should not be thought of as a robust
privacy improvement (but it does make some other adversary's job harder
given only public data).

In order to preserve the exact denomination size for mixing transactions,
such rounds would need to have their mining fees subsidized - this can be
accomplished by such merging, with the coordinator discounting or
subsidizing input/output registration fees depending on the degree of
mixing (a la Samourai/Whirlpool's mechanism design), or using some sort of
prepaid mechanism (e.g. as part of a mixing round instead of a registered
output you might elect to receive long lived - as in not per round -
chaumian tokens that can be redeemed for fee-less, round denomination
mixing, which can be reissued if the signing phase fails). In both cases
I'm assuming the coordinator includes an input to cover the mining fees.

I find the privacy aspects much easier to think about in this model, and it
addresses many things of zerolink's weaknesses:

1. allows unequal amounts but retains many of the advantages of fixed
denomination - the number of separate mixing pools would be at most
log(2.1e13), with their sizes corresponding to actual amount distribution
being used (for mining & coordination fees too, but more generally any
amounts used for any Bitcoin payment)

2. it can potentially eliminate post mix change (if I want to spend some
amount x = \sum{i=1..~40} a_i*2^i, and i have exactly the combination
specified by the a_i's) which the server can also enforce by restricting
"combine" rounds to require "optimize" rounds before them

3. increases privacy benefit of remixing while still removing Wasabi's
round denomination decreases, especially over long time intervals

The main issue, which I stress is significant, is the bloat - many such
rounds are required, with many inputs and outputs per user per round, and
many UTXOs per user on 

Re: [bitcoin-dev] Non-equal value CoinJoins. Opinions.

2019-12-29 Thread Yuval Kogman via bitcoin-dev
On Sun, 29 Dec 2019, 05:31 Yuval Kogman,  wrote:

> n = # inputs + # indistinguishable outputs

sorry, this is really wrong (although of no consequence to my arguments) -
n is the smaller of these two numbers, not their sum.
bitcoin-dev mailing list

Re: [bitcoin-dev] Non-equal value CoinJoins. Opinions.

2019-12-28 Thread Yuval Kogman via bitcoin-dev

On Sat, 28 Dec 2019 at 01:29, nopara73 via bitcoin-dev <> wrote:

I haven't read the whole thing in detail (and fwiw, I don't think I will by
this point), but I do want to respond to section about the combinatorics as
well as the proof, since both the premises and the implications don't seem
very solid to me, especially in light of the other replies in this thread.

It appears to be a step up from the Knapsack paper in terms of the
specificity of a concrete mixing protocol (which again, I did not
scrutinize, but see below), but a regression in terms of privacy (see other
replies), which even in the Knapsack paper's approach raises some concerns:

Now, there are 100!/(10!)^10 ~= 10^92 ways to partition the inputs into a
> list of 10 sets of 10 inputs, but only a tiny fraction of these partitions
> will produce the precise output list.
In the equal amount case, the search space of possible interpretations with
n = # inputs + # indistinguishable outputs is proportional to the nth Bell
number, i.e. it's exponential in the size of the transaction, which is an
inviting intuition. But this is an *upper* bound on the difficulty of
deanonymization, given no additional information.

This quantitative framing is potentially misleading because:

1. attributing inputs/outputs (sub-transactions in the Knapsack paper's
terminology) is arguably not a search problem, but an optimization problem,
since approximate results are still partly useful to the adversary
2. there are many computational strategies, heuristics, etc that in
practice can make this more efficient than brute force[1], so framing it
that as a security parameter doesn't sit right with me
3. as individual sub-transactions are identified (for example using out of
band information), the computational challenge also *drops* exponentially

Additionally (though is a broader criticism of CoinJoin based privacy and
not specific to unequal amounts, and in particular refers to ZmnSCPxj's
assertion of 0 linkability) I am very worried that perspectives that focus
on linkability information revealed by a single coinjoin transaction in
isolation. This problem was alluded in the document, to but I don't see
that it was addressed. Naively the post/pre mix transaction graph would
seem to present a computationally much harder problem when looking at the
combinatorics through the same lens, but reality it can also be used to
place many constraints on valid partitions/sub-transaction assignments for
a single transaction with equal amounts. The trivial example is post mix
linking of outputs, but there are many other ways to draw inferences or
eliminate possible interpretations of a single transaction based on its
wider context, which in turn may be used to attack other transactions.

> Based on the example above, we can see that not only are there a huge
> number of partitions, but that even with a fast algorithm that could find
> matching partitions, it would produce around 10^20 possible valid
> configurations. With 10^20 possibilities, there is essentially no linkage.
This is a better framing, but still doesn't address my third bullet, since
"Attacks always get better; they never get worse." In other words
"essentially no linkage" due to multiple possible interpretation is still
strictly more meaningful if you can add constraints out of band.

To be fair in equal amount CoinJoins this is also the case, but it's a much
simpler model to consider in the context of other privacy leak vectors
(e.g. transaction graph connectivity beyond a single coinjoin, wallet
fingerprinting, temporal patterns, network privacy leaks, etc etc), since
analyzing your level of exposure is *also* complicated by unequal amounts,
in other words higher chance of privacy leaks due to misuse, or ignorance
of some of the implications under intended use. Thinking through these
implications is much easier when the information content in the amounts is

The Cash Fusion scheme actually extends this obfuscation even further. Not
> only can players bring many inputs, they can also have multiple outputs
And, quoting another section:

Unfortunately, the production of equal-amount coins is impractical for
> various reasons. Foremost, it has a "toxic waste"

I'm still cautiously optimistic about the potential of multiple
inputs/outputs per user (c.f. 3-phase chaumian CoinJoin ideas we've
previously discussed in the context of Wasabi, though I don't recall any
public discussion I can link to, sorry list), but with the additional
assumption of amounts with small popcounts/Hamming weights (e.g. only
amounts that are 2^n sat in size, or based on 1-2-5 series, and for a
rationale see Ethan's reply).

Unfortunately this trades off that "toxic waste" problem for a very large
on chain footprint (e.g. if the popcount of the amount of a wallet is
limited to 1, the number of inputs and change outputs required in the worst
case is proportional to log of the 

Re: [bitcoin-dev] PubRef - Script OP Code For Public Data References

2019-07-19 Thread Yuval Kogman via bitcoin-dev

On Fri, 19 Jul 2019 at 17:49, Mike Brooks  wrote:

> Users will adopt whatever the client accepts - this feature would be
> transparent.

My skepticism was based in an assumption on my part that most such data is
produced by actors with a track record of neglecting transaction
efficiency. I'd be happy to be proven wrong, but considering say usage
rates of native witness outputs, or the fraction of transactions with more
than 2 outputs so far I see little evidence in support of widespread
adoption of cost saving measures. Assuming this is intended as a new script
version, all fully validating nodes need to commit to keeping all data
indefinitely before they can enforce the rules that make transactions
benefiting from this opcode safe to broadcast.

That said, if successful, the main concern is still that of address reuse -
currently there is no incentive in the protocol to do that, and with BIP32
wallets fewer reasons to do it as well, but this proposal introduces such
an incentive for users which would otherwise generate new addresses
(disregarding the ones that already reuse addresses anyway), and this is
problematic for privacy and fungibility.

Since address reuse has privacy concerns, I think it's important to draw a
distinction between clients accepting and producing such transactions, if
the latter were done transparently that would be very concerning IMO, and
even the former would not be transparent for users who run currently
pruning nodes.

I'm not sure how an O(1) time complexity leads to DoS, that seems like a
> very creative jump.

For what it's worth, that was in reference to hypothetical deduplication
only at the p2p layer, similar to compact blocks, but on further reflection
I'd like to retract that, as since both scenarios which I had in mind seem
easily mitigated.

  Based on this response, it makes me want to calculate the savings, what
> if it was a staggering reduction in the tree size?

Assuming past address reuse rates are predictive this only places an upper
bound on the potential size savings, so personally I would not find that
very compelling. Even if the realized size savings would be substantial,
I'm not convinced the benefits outweigh the downsides (increased validation
costs, monotonically growing unprunable data, and direct incentive for
address reuse), especially when compared to other methods/proposals that
can reduce on chain footprint generally improve privacy while reducing
validation costs for others (e.g. batching, lightning, MAST for
sufficiently large scripts, Schnorr signatures (musig, adaptor signatures),
{tap,graft}root, ).

bitcoin-dev mailing list

Re: [bitcoin-dev] PubRef - Script OP Code For Public Data References

2019-07-19 Thread Yuval Kogman via bitcoin-dev

On Fri, 19 Jul 2019 at 14:00, Mike Brooks via bitcoin-dev <> wrote:

Giving scripts the ability to refer to data on the blockchain will reduce
> transaction sizes because key material does not have to be repeated in
> every Script.

Given that address reuse is discouraged, and as far as I know is
predominantly utilized for customer deposit addresses by exchanges, many of
which have not invested resources in batching withdrawals or consolidating
small UTXOs, I am skeptical that such a feature would actually be utilized
by users for whom a potential use exists, especially as mining fees are
usually pushed onto customers anyway.

Unless extensively utilized such that costs outweigh benefits, this change
would impose an externality on validating nodes:

With this list a newly created script can refer to a specific PUSHDATA that
> was used in any previously confirmed block.

This would make pruning impossible, and also relaxes the bounds on
validation costs since it would allow random reads on all historical data
as opposed to just the UTXO set.

Although it would do nothing for block capacity, perhaps this redundancy
might be better addressed as opt-in functionality in the p2p layer? It
might help with IBD, though at least in my experience peer latency (as
opposed to throughput) is the limiting factor, and as far as I can tell
this would increase it. Somewhat relatedly, transaction IDs are another
type of pointer which might benefit from being encoded as a (block height,
offset). However, here too it seems to me like the complexity is
substantial, potentially introducing new DoS vectors, while saving several
bytes per input at most.

bitcoin-dev mailing list

Re: [bitcoin-dev] draft proposal: change forwarding (improved fungibility through wallet interoperability)

2018-12-02 Thread Yuval Kogman via bitcoin-dev

On Sat, 1 Dec 2018 at 05:06, James MacWhyte  wrote:

> Is the intention that someone would open their non-private wallet, and
> choose an option that slowly siphons their funds into a different app?

Yes, that's the idea. And then send them back in a controlled manner.

> Why would anyone want that feature?

Most mobile wallets have no coin control features, which are tricky to
implement (lots of design/UI effort required). Some special purpose wallets
have additional privacy concerns and also lack coin control - for example may inadvertently disclose browsing habits, or bisq arbitration
contracts are easily identifiable on the blockchain. The latter is actually
an inspiration to this, since it has functionality to allow funding
transactions from an external wallet, as well as withdrawing to one.

Conversely, fungibility focused wallets are highly specialized and limited
in scope. As far as I'm aware, JoinMarket and Wasabi are the only
maintained implementations of mixing wallets available today, and both are
desktop apps, with no hardware wallet integration. It is unlikely that e.g.
coinjoin functionality would be added the application specific wallets,
especially as these features require a great deal of care and effort to do
correctly (cf. SharedCoin)

The goal then is to allow people who are privacy conscious to utilize a
specialized wallet automatically, to isolate the activity of wallets which
don't provide a sufficient degree of control in order to achieve that
manually, and reducing the possibility of operator error.

Could you describe what the UX would be like

>From a payment standpoint the main difference is that change outputs would
not be usable, so the spendable balance would drop. The best idea I have
for handling that is to still display that balance but conveying that is
locked. However, I think simply removing it from the balance is also
acceptable. Funds would simply be added to the fungibility wallet similarly
to how they are used manually today.

For setup, the fungibility wallet would need to add functionality to export
these xpub variants, perhaps with a way of annotating what each account is
for (but see concerns about BIP44 recoverability). Like standard xpubs,
these would be easily conveyed by QR code.

The forwarding wallet would then offer an advanced configuration feature,
that allows adding and enabling the alternate change address chain. If the
fungibility wallet derives addresses differently, then the forwarding
wallet should reject the configuration value (which is the main technical
point of the writeup), to ensure funds are not misplaced.

> or how a wallet developer might implement this?

For fungibility wallets, this requires keeping track of these address
chains, and allowing them to be exported. This is similar to any sort of
scanning functionality implemented in a BIP32 capable wallet, plus the UI
to display them.

In the forwarding wallet, derivation of addresses is again already
implemented in any BIP32 capable wallet (i.e. checking for the next free
address), with the main change in the spending path being dependency
injection required to change the address chain parameters (from what I know
most BIP32 implementations are polymorphic with respect to derivations made
from a public extended key vs. a private extended key). The main effort
then is the setup functionality, which obviously will vary considerably
between wallets, but I imagine it would still be a simpler and safer change
than integrating comprehensive privacy features into the spend path

If the user is privacy-conscious, why did they choose the non-private
> wallet to begin with? Why wouldn't they just move all their funds to the
> private wallet so they can continue to use just one app?

Platform limitations, or application specific use cases, see above. More
broadly, the main rationale is that diverse, specialized wallets should be
used in a complementary way, as that is more achievable than expecting all
application specific wallets to have robust privacy features.

And if the user is not privacy-conscious, they would never choose to enable
> this option, so why would the wallet developer even bother to implement it?

I believe this is a low hanging fruit, easier to implement than coin
control, far easier to implement than safe mixing functionality, so wallet
developers (or contributes) would implement this to allow users more
reliable access to privacy features implemented by other wallets.

> From a product standpoint, I can't see how this would be useful, and
> therefore I'm not sure why it needs to be a BIP. If I'm missing something,
> please let me know!

The reason for documenting it in this way is because if deemed desirable
functionality (which itself is something the BIP process can help
determine), different implementations would need to agree on the details. I
hope I've managed to convince you of the usefulness, though I'm still not
sure about 

[bitcoin-dev] draft proposal: change forwarding (improved fungibility through wallet interoperability)

2018-11-06 Thread Yuval Kogman via bitcoin-dev

I would like to propose a method based on BIP32 (and optionally BIP44) for
improving fungibility and on chain privacy with wallets for which this is
not a primary concern, requiring minimal changes to allow such wallets to
safely forward change outputs to more specialized wallets. This is intended
to complement more comprehensive proposals such as BIP79.

Note that this draft is still incomplete, there are open questions about
the particular format to use. In its current form it proposes two viable
options (and two more are included completeness) and though I have a slight
preference for the first option, I remain undecided given the tradeoffs,
and so I am writing the mailing list to solicit inputs/criticism.

Thanks to SirMeow, Adam Ficsor, and Adam Gibson for reviewing earlier
versions and providing valuable feedback and suggestions.

bitcoin-dev mailing list