Re: [Bitcoin-development] Privacy and blockchain data

2014-01-10 Thread Peter Todd
On Tue, Jan 07, 2014 at 10:34:46PM -0800, Jeremy Spilman wrote:
 
 2) Common prefixes: Generate addresses such that for a given wallet they
all share a fixed prefix. The length of that prefix determines the
anonymity set and associated privacy/bandwidth tradeoff, which
remainds a fixed ratio of all transactions for the life of the
wallet.
 
 
 Interesting thought to make the privacy/bandwidth trade-off using
 vanitygen and prefix filters.
 
 But doesn't this effectively expand the universe of potential spies
 from 'the global attacker' who is watching your SPV queries, to
 simply 'the globe' -- anyone with a copy of the blockchain?

It's a trade-off. Most people are going to use public peers for their
SPV nodes - they're not going to run full nodes. They also are going to
want to limit how much bandwidth they use to sync their wallets; if they
don't care the can use a very short, or no, prefix and the problem goes
away.

If you make that bandwidth/privacy trade-off by using very specific
filters and non-specific addresses then you have a situation where those
public peers are learning a lot of potentially valuable data. It's easy
to imagine, say, the IRS being willing to pay for data on how many
Bitcoins people have in their wallets to try to catch tax cheats for
instance, and that can easily fund a lot of fast and high-quality peers
that don't advertise the fact that they're selling data on the contents
of your wallet.

On the other hand if you use non-specific filters, and prefixed
addresses for incoming payments, then you're not leaking high-quality
information to anyone. I think this makes for a more robust Bitcoin
system, especially as we need things like CoinJoin for privacy that make
*everyones* privacy matter to you; CoinJoin could easily be defeated by
aquiring lots of good info on the contents of wallets through SPV
queries.

 Some stats on UTXO set size:  (slightly stale -- as of block 270733)
 
7.4m unspent outputs
2.2m transactions with unspent outputs
2.1m unique unspent scriptPubKeys
Side note: the top 1,000 scriptPubKeys have 10% of all unspent outputs.
 
 Let's say you use an 8-bit prefix (1/256) that would be ~10,000
 transactions in the UTXO you would be monitoring. But if I knew a
 few different days / time-periods you transacted, I could figure out
 your prefix.

Actually UTXO isn't the right way to look at this; prefix filters would
be almost certainly matched against all txouts in blocks. Or put another
way, UTXO isn't the right way to look at it because the attacker will
have some rough idea of the time period, and wants to know about
transactions made.

 Of course, anyone you transact with would know your prefix outright.

Well what good, in your example, is it for the attacker to go from I
know my target gets a paycheck every two weeks for $x to His wallet
prefix is abcd with y% probability? Even once you learn the prefix of
your target's wallet, what funds they actually own is still embedded in
a much larger anonymity set of hundreds to thousands of transactions
that had nothing to do with them.

 Wouldn't this also allow obvious identification of spend versus
 change addresses in a transaction?

No, I specifically said that you don't want to use prefixes with change
txouts for that reason. Fortunately while the set of all scriptPubKey's
ever used for change txouts will grow over time, as long as you are not
watching for new payments on any key in that set you only need to query
for the ones that still have funds on them, and that's only because you
want to be able to detect unauthorized spends of them.

-- 
'peter'[:-1]@petertodd.org
00028a5c9edabc9697fe96405f667be1d6d558d1db21d49b8857


signature.asc
Description: Digital signature
--
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments  Everything In Between.
Get a Quote or Start a Free Trial Today. 
http://pubads.g.doubleclick.net/gampad/clk?id=119420431iu=/4140/ostg.clktrk___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development


Re: [Bitcoin-development] Privacy and blockchain data

2014-01-07 Thread Jeremy Spilman

 2) Common prefixes: Generate addresses such that for a given wallet they
all share a fixed prefix. The length of that prefix determines the
anonymity set and associated privacy/bandwidth tradeoff, which
remainds a fixed ratio of all transactions for the life of the
wallet.


Interesting thought to make the privacy/bandwidth trade-off using  
vanitygen and prefix filters.

But doesn't this effectively expand the universe of potential spies from  
'the global attacker' who is watching your SPV queries, to simply 'the  
globe' -- anyone with a copy of the blockchain?

Some stats on UTXO set size:  (slightly stale -- as of block 270733)

7.4m unspent outputs
2.2m transactions with unspent outputs
2.1m unique unspent scriptPubKeys
Side note: the top 1,000 scriptPubKeys have 10% of all unspent outputs.

Let's say you use an 8-bit prefix (1/256) that would be ~10,000  
transactions in the UTXO you would be monitoring. But if I knew a few  
different days / time-periods you transacted, I could figure out your  
prefix.

Of course, anyone you transact with would know your prefix outright.

Wouldn't this also allow obvious identification of spend versus change  
addresses in a transaction?


--
Rapidly troubleshoot problems before they affect your business. Most IT 
organizations don't have a clear picture of how application performance 
affects their revenue. With AppDynamics, you get 100% visibility into your 
Java,.NET,  PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831iu=/4140/ostg.clktrk
___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development