subject:"Re\: \[Bitcoin\-development\] Bloom bait"

Re: [Bitcoin-development] Bloom bait

2014-06-11 Thread Mike Hearn

>
> Is this any different from my bloom filter IO attack code? Nope.
>

It's obviously different; a thin client trying to obtain more privacy is
not attempting to deny service to anyone. You can't simply state that a
feature which uses resources for a legitimate reason is a DoS attack,
that's a spurious definition that would reclassify innocuous things like
web browser prefetching as DoS attacks too.

With respect to the work you're proposing, trying to retroactively make
Bloom filtering an optional part of the protocol is just wasting people's
time at this point:

   - There is no evidence there's an actual problem today.
   - There is no evidence there will be a problem any time soon, given the
   meagre levels of traffic growth we have.
   - It involves a complicated rollout strategy that would create work for
   many people.
   - Gavin, Wladimir and myself have all concluded it's not worth the cost.
   - The only justification you have provided is that two node
   implementations hardly anyone uses can't be bothered to implement Bloom
   filtering, but want to advertise support in their ver message anyway. They
   can simply lower the number they advertise in their ver message.

That said, if you want to implement better support for archival nodes then
that'd be great. The way to do it would be either to implement block ranges
in the addr announcements as has been discussed many times before, or
perhaps introduce a new protocol optimised for serving the chain. Nodes
that are CPU limited won't want to take part in the rest of the P2P
protocol anyway, it's not just Bloom filtering that uses CPU time but also
block and tx relay.

But until you have done these things, please stop attempting to reclassify
any feature you can imagine a more efficient version of as an "attack".
It's just silly.
--
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-10 Thread Peter Todd

On Tue, Jun 10, 2014 at 06:38:23PM +0800, Mike Hearn wrote:
> >
> > As I explained in the email you're replying to and didn't quote, bloom
> > filters has O(n) cost per query, so sending different bloom filters to
> > different peers for privacy reasons costs the network significant disk
> > IO resources. If I were to actually implement it it'd look like a DoS
> > attack on the network.
> >
> 
> DoS attack? Nice try.

Suppose I wrote an single address lookup tool for Android that connected
to multiple peers and used bloom filters to find the history of a
specific address. Of course, I don't want to use too much bandwidth
being on mobile, so I'll use as specific a bloom filter as possible. I
might even connect to multiple peers to speed up the lookup.

Is this any different from my bloom filter IO attack code? Nope. Hence,
splitting up bloom filter requests for better privacy will certainly
look like a DoS attack and will certainly greatly increase the load on
the network.

> Now consider a prefix filtering implementation. You need to calculate a
> sorted list of all the data elements and tx hashes in the block, that maps
> to the location in the block where the tx data can be found. These
> per-block indexes take up extra disk space and, realistically, would likely
> be implemented using LevelDB as that's a tool which is designed for
> creating and using these kinds of tables, so then you're both loading the
> block data itself (blocks are sized about right currently to always fit in
> the default kernel readahead window) AND also seeking through the indexes,
> and building them too. A smart implementation might try and pack the index
> next to each block so it's possible to load both at once with a single
> seek, but that would probably be more work, as it'd force building of the
> index to be synchronous with saving the block to disk thus slowing down
> block relay. In contrast a LevelDB based index would do the bulk of the
> index-building work on a separate core.

That's exactly the kinds of optimizations obelisk is implementing to
make its prefix lookup database fast. Also those optimizations are
situation dependent, for instance "packing the index next to each block"
is irrelevant if you put archival blockchain data on a slow HD, and
indexes on a fast SSD, something some obelisk servers do.

More to the point, your showing quite clearly there isn't just one
optimal way to do it. Applying a bloom filter, or a prefix filter, or
some as yet unknown filter, to blockchain data is a service and that
service has different tradeoffs compared to just serving up archival
block history. There is zero reason not to make that service something
you advertise with NODE_BLOOM - after all, you already have the code in
bitcoinj to do the exact same thing by checking the advertised protocol
version.

On Tue, Jun 10, 2014 at 09:02:00AM -0400, Jeff Garzik wrote:
> Most of this description of disk activity is true, but it omits one
> key point:  Total cached data (working set).  It is a binary, first
> order question:  are you hitting pagecache, or the disk?  When nodes
> act as archival data sources, the pagecache pressure is immense.  When
> nodes just primarily serve recent blocks, that data is being served
> out of pagecache. As I directly observed running public nodes, the
> disks were running constantly, impacting all clients, even clients
> downloading only recent blocks.
> 
> Luckily, headers are served out of RAM, so that part of the sync is always 
> fast.
> 
> NODE_BLOOM -- and block download in general -- will tend to be slower
> than it could be, due to the working set almost always being larger
> than available pagecache.  Fix that problem, NODE_BLOOM will always
> operate out of pagecache, and disk activity will not be an issue.
> 
> Once you start hitting the disk, you've already lost.

Yup. I discussed this with Matt Corallo at the financial crypto
conference a few months back and he made the same point. Unfortunately
we'll need an upgrade to let nodes advertise ranges of blocks to begin
to fix that issue, and even then it still shows quite clearly how it's
not optimal if we force everyone to share blockchain data in the same
way.

-- 
'peter'[:-1]@petertodd.org
23c7fc084ed84b891cc2fa90e4a34708d6b2370d3ec1c85d

signature.asc
Description: Digital signature
--
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-10 Thread Jeff Garzik

Most of this description of disk activity is true, but it omits one
key point:  Total cached data (working set).  It is a binary, first
order question:  are you hitting pagecache, or the disk?  When nodes
act as archival data sources, the pagecache pressure is immense.  When
nodes just primarily serve recent blocks, that data is being served
out of pagecache. As I directly observed running public nodes, the
disks were running constantly, impacting all clients, even clients
downloading only recent blocks.

Luckily, headers are served out of RAM, so that part of the sync is always fast.

NODE_BLOOM -- and block download in general -- will tend to be slower
than it could be, due to the working set almost always being larger
than available pagecache.  Fix that problem, NODE_BLOOM will always
operate out of pagecache, and disk activity will not be an issue.

Once you start hitting the disk, you've already lost.



On Tue, Jun 10, 2014 at 6:38 AM, Mike Hearn  wrote:
>> As I explained in the email you're replying to and didn't quote, bloom
>> filters has O(n) cost per query, so sending different bloom filters to
>> different peers for privacy reasons costs the network significant disk
>> IO resources. If I were to actually implement it it'd look like a DoS
>> attack on the network.
>
>
> DoS attack? Nice try.
>
> Performance is subtle, disk iops especially so. I suspect you'd find - if
> you implemented it - that for the kinds of loads Bitcoin is processing both
> today and tomorrow prefix filtering either doesn't save any disk seeks or
> actively makes it worse.
>
> Consider a client that is syncing the last 24 hours of chain. bitcoind
> pre-allocates space for blocks in large chunks, so most blocks are laid out
> sequentially on disk. Almost all the cost of a disk read is rotational
> latency. Once the head is in place data can be read in very fast and modern
> kernels will attempt to adaptively read ahead in order to exploit this,
> especially if a program seems to be working through a disk file
> sequentially. The work of Bloom filtering parts of the chain for this client
> boils down to a handful of disk seeks at best and the rest of the work is
> all CPU/memory bound as the block is parsed into objects and tested against
> the filter. A smarter filtering implementation than ours could do SAX-style
> parsing of the block and avoid the overhead of turning it all into objects.
>
> Now consider a prefix filtering implementation. You need to calculate a
> sorted list of all the data elements and tx hashes in the block, that maps
> to the location in the block where the tx data can be found. These per-block
> indexes take up extra disk space and, realistically, would likely be
> implemented using LevelDB as that's a tool which is designed for creating
> and using these kinds of tables, so then you're both loading the block data
> itself (blocks are sized about right currently to always fit in the default
> kernel readahead window) AND also seeking through the indexes, and building
> them too. A smart implementation might try and pack the index next to each
> block so it's possible to load both at once with a single seek, but that
> would probably be more work, as it'd force building of the index to be
> synchronous with saving the block to disk thus slowing down block relay. In
> contrast a LevelDB based index would do the bulk of the index-building work
> on a separate core.
>
> At some block size and client load the additional data storage and increased
> pressure on the page cache would probably make it worthwhile. But I find it
> unlikely to be true at current traffic levels, or double or triple today's
> levels. So I'd rather we spend our very limited collective time on finding
> ways to increase usage rather than worrying about resources which are not
> presently scarce.
>
> (as an aside, some of the above analysis would be invalidated if most nodes
> end up running on SSDs, but I doubt most are. It'd be neat to export storage
> tech via some kind of stats message - LevelDB is designed for HDDs not SSDs
> so at some point a new storage subsystem might make sense if the network
> switched over).
>
>
>
> --
> HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
> Find What Matters Most in Your Big Data with HPCC Systems
> Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
> Leverages Graph Analysis for Fast Processing & Easy Data Exploration
> http://p.sf.net/sfu/hpccsystems
> ___
> Bitcoin-development mailing list
> Bitcoin-development@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>



-- 
Jeff Garzik
Bitcoin core developer and open source evangelist
BitPay, Inc.  https://bitpay.com/

--
HPCC Systems Open Source Big Data Platform from LexisN

Re: [Bitcoin-development] Bloom bait

2014-06-10 Thread Mike Hearn

>
>  A NODE_BLOOM service bit is a very reasonable
> and simple way to do exactly that, and is defacto what implementations
> that don't support bloom filters do anyway.
>

BTW, I find it curious that any nodes have code to disconnect peers that
send Bloom filters. It shouldn't be necessary. Bitcoinj is the only large
scale user of filtering and it will disconnect itself if a peer advertises
support for a version lower than 7. If a node advertises support for
this version or higher then it is supposed to implement BIP37.

It sounds like some node authors decided to advertise support for a
protocol version they didn't bother implementing, which would be a bug.
--
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-10 Thread Mike Hearn

>
> As I explained in the email you're replying to and didn't quote, bloom
> filters has O(n) cost per query, so sending different bloom filters to
> different peers for privacy reasons costs the network significant disk
> IO resources. If I were to actually implement it it'd look like a DoS
> attack on the network.
>

DoS attack? Nice try.

Performance is subtle, disk iops especially so. I suspect you'd find - if
you implemented it - that for the kinds of loads Bitcoin is processing both
today and tomorrow prefix filtering either doesn't save any disk seeks or
actively makes it worse.

Consider a client that is syncing the last 24 hours of chain. bitcoind
pre-allocates space for blocks in large chunks, so most blocks are laid out
sequentially on disk. Almost all the cost of a disk read is rotational
latency. Once the head is in place data can be read in very fast and modern
kernels will attempt to adaptively read ahead in order to exploit this,
especially if a program seems to be working through a disk file
sequentially. The work of Bloom filtering parts of the chain for this
client boils down to a handful of disk seeks at best and the rest of the
work is all CPU/memory bound as the block is parsed into objects and tested
against the filter. A smarter filtering implementation than ours could do
SAX-style parsing of the block and avoid the overhead of turning it all
into objects.

Now consider a prefix filtering implementation. You need to calculate a
sorted list of all the data elements and tx hashes in the block, that maps
to the location in the block where the tx data can be found. These
per-block indexes take up extra disk space and, realistically, would likely
be implemented using LevelDB as that's a tool which is designed for
creating and using these kinds of tables, so then you're both loading the
block data itself (blocks are sized about right currently to always fit in
the default kernel readahead window) AND also seeking through the indexes,
and building them too. A smart implementation might try and pack the index
next to each block so it's possible to load both at once with a single
seek, but that would probably be more work, as it'd force building of the
index to be synchronous with saving the block to disk thus slowing down
block relay. In contrast a LevelDB based index would do the bulk of the
index-building work on a separate core.

At *some* block size and client load the additional data storage and
increased pressure on the page cache would probably make it worthwhile. But
I find it unlikely to be true at current traffic levels, or double or
triple today's levels. So I'd rather we spend our very limited collective
time on finding ways to increase usage rather than worrying about resources
which are not presently scarce.

(as an aside, some of the above analysis would be invalidated if most nodes
end up running on SSDs, but I doubt most are. It'd be neat to export
storage tech via some kind of stats message - LevelDB is designed for HDDs
not SSDs so at some point a new storage subsystem might make sense if the
network switched over).
--
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-08 Thread Peter Todd

On Sat, Jun 07, 2014 at 03:44:07PM -0400, Alan Reiner wrote:
> 
> On 06/07/2014 07:22 AM, Mike Hearn wrote:
> >
> > You can send different bloom filters to different peers too, so I'm
> > not sure why you're listing subsetting as a unique advantage of prefix
> > filters.
> >
> >
> 
> Please let me know if we've gone down this path before, but it would
> seem that the more different bloom filters you create, the more
> information you give away.  It would be most useful to create a single
> bloom filter that captures every address you ever intend to use (say a
> look ahead of 1000 addresses), and then only ever communicate that. 
> Once people see multiple filters that you produce, they can start
> looking at the intersection of them to reduce the identity space.  I
> would expect that after enough bloom variants, they could figure out a
> perfect subset of blockchain addresses in your wallet.  (I suppose you
> could intentionally select an extra 20% addresses to include in every
> bloom filter, but it's a hack).
> 
> Similarly, if you keep updating your bloom filter to include more
> addresses, the difference in what passes through the previous one and
> the new one gives away information about new addresses you created.

You're completely correct. You can use the same nTweak value for each
filter and then slice up the filters bitwise, but then you end up
linking every query you make to your identity unless you just used a
fixed nTweak that everyone else uses.  (e.g. zero) If you do that, you
still have the problem that you're greatly increasing the load on the
network.

In any case, all this shows is that in the future bloom filters will
very likely go away eventually, and to make that upgrade path smooth it
only makes sense to define a way for nodes to let others know whether or
not bloom is supported. A NODE_BLOOM service bit is a very reasonable
and simple way to do exactly that, and is defacto what implementations
that don't support bloom filters do anyway.

Note BTW that re: DNS seeds, once the NODE_BLOOM BIP is accepted and the
NODE_BLOOM patch merged into bitcoind, I'll write a patch for sipa's
seeder to make it only return seeds with bloom filter support.

-- 
'peter'[:-1]@petertodd.org
3afb1fdf0867fc063775e69f9ae79870bb8727f25b49e88f

signature.asc
Description: Digital signature
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-08 Thread Peter Todd

On Sat, Jun 07, 2014 at 07:22:56PM +0800, Mike Hearn wrote:
> You can send different bloom filters to different peers too, so I'm not
> sure why you're listing subsetting as a unique advantage of prefix filters.

As I explained in the email you're replying to and didn't quote, bloom
filters has O(n) cost per query, so sending different bloom filters to
different peers for privacy reasons costs the network significant disk
IO resources. If I were to actually implement it it'd look like a DoS
attack on the network.

Essentially with bloom filters you have to make a tradeoff between
scalability and privacy; with prefix filters you don't have to make that
ugly tradeoff. Notably that tradeoff gets worse if we ever increase the
Bitcoin blocksize.

-- 
'peter'[:-1]@petertodd.org
3afb1fdf0867fc063775e69f9ae79870bb8727f25b49e88f

signature.asc
Description: Digital signature
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-07 Thread Alan Reiner

On 06/07/2014 07:22 AM, Mike Hearn wrote:
>
> You can send different bloom filters to different peers too, so I'm
> not sure why you're listing subsetting as a unique advantage of prefix
> filters.
>
>

Please let me know if we've gone down this path before, but it would
seem that the more different bloom filters you create, the more
information you give away.  It would be most useful to create a single
bloom filter that captures every address you ever intend to use (say a
look ahead of 1000 addresses), and then only ever communicate that. 
Once people see multiple filters that you produce, they can start
looking at the intersection of them to reduce the identity space.  I
would expect that after enough bloom variants, they could figure out a
perfect subset of blockchain addresses in your wallet.  (I suppose you
could intentionally select an extra 20% addresses to include in every
bloom filter, but it's a hack).

Similarly, if you keep updating your bloom filter to include more
addresses, the difference in what passes through the previous one and
the new one gives away information about new addresses you created.

--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech
___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-07 Thread Mike Hearn

You can send different bloom filters to different peers too, so I'm not
sure why you're listing subsetting as a unique advantage of prefix filters.

The main advantage of prefix filters seems to be faster lookups if the node
is calculating a sorted index for each block, and the utxo commitment
stuff, both of those would be cool but involve imposing extra costs on
nodes. We lack models that let us understand the tradeoffs involved in
various indexing schemes, I feel.
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-06 Thread Peter Todd

On Fri, Jun 06, 2014 at 10:10:51AM -0700, Gregory Maxwell wrote:
> On Fri, Jun 6, 2014 at 10:05 AM, Peter Todd  wrote:
> > Again, you *don't* have to use brute-force prefix selection. You can
> > just as easily give your peer multiple prefixes, each of which
> > corresponds at least one address in your wallet with some false positive
> > rate. I explained all this in detail in my original blockchain data
> > privacy writeup months ago.
> 
> I'm not trying to pick nits about all the options,  I just found it
> surprising that you were saying that data published in a super public
> manner is no different than something used between nodes.

Because I was designing a system under the assumption that you were
highly likely to connect to an attacker at some point, and the trade-off
available with the available math was to either give very detailed info
to that attacker, or give away some probabalistic info to everyone.

> > I explained all this in detail in my original blockchain data privacy 
> > writeup months ago.
> 
> Communication is a two way street, Adam and I (and others) are
> earnestly trying— that we're not following your arguments may be a
> suggestion that they need to be communicated somewhat differently.

Quite likely - I think most of this disagreement stems from the fact
that we have different starting assumptions. In particular my assumption
that you are likely to end up connecting to an attacker logging data,
and my desire to have a standard that can be implemented with existing
cryptographic primatives. Remember that I'm spending a lot of time
working with wallet authors; they have approximately zero interest in
standards that require crypto any more fancy than HD wallets do.

> I'm still failing to see the usefulness of having any prefix filtering
> for DH-private outputs. It really complicates the security story— in
> particular you don't know _now_ what activities will turn your prior
> information leaks into compromising ones retrospectivelly, and doesn't
> seem at very necessary for scanning performance.

Scanning performance is different from bandwidth performance. Prefix
brute-forcing was designed to address the latter concern for cases where
you are bandwidth limited and don't have a trusted peer to do the
scanning for you.

-- 
'peter'[:-1]@petertodd.org
1c5e0fca7bd6d96211a37543c1d0cc2f594c15423ee3cdd8

signature.asc
Description: Digital signature
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-06 Thread Gregory Maxwell

On Fri, Jun 6, 2014 at 10:05 AM, Peter Todd  wrote:
> Again, you *don't* have to use brute-force prefix selection. You can
> just as easily give your peer multiple prefixes, each of which
> corresponds at least one address in your wallet with some false positive
> rate. I explained all this in detail in my original blockchain data
> privacy writeup months ago.

I'm not trying to pick nits about all the options,  I just found it
surprising that you were saying that data published in a super public
manner is no different than something used between nodes.

> I explained all this in detail in my original blockchain data privacy writeup 
> months ago.

Communication is a two way street, Adam and I (and others) are
earnestly trying— that we're not following your arguments may be a
suggestion that they need to be communicated somewhat differently.

I'm still failing to see the usefulness of having any prefix filtering
for DH-private outputs. It really complicates the security story— in
particular you don't know _now_ what activities will turn your prior
information leaks into compromising ones retrospectivelly, and doesn't
seem at very necessary for scanning performance.

--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech
___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-06 Thread Peter Todd

On Fri, Jun 06, 2014 at 09:58:19AM -0700, Gregory Maxwell wrote:
> On Fri, Jun 6, 2014 at 9:46 AM, Peter Todd  wrote:
> > transactions against. Where they differ is that bloom filters has O(n)
> > scaling, where n is the size of a block, and prefix filters have O(log n)
> > scaling with slightly(1) higher k. Again, if you *don't* use brute forcing
> > in conjunction with prefixes they have no different transactional graph
> > privacy than bloom filters,
> 
> Huh? How are you thinking that something that gets put in transactions
> and burned forever into the blockchain that lets you (statically) link
> txout ownership is "no different" from something which is shared
> directly with a couple peers, potentially peers you trust and which
> are run by yourself or your organization?

Again, you *don't* have to use brute-force prefix selection. You can
just as easily give your peer multiple prefixes, each of which
corresponds at least one address in your wallet with some false positive
rate. I explained all this in detail in my original blockchain data
privacy writeup months ago.

-- 
'peter'[:-1]@petertodd.org
29d945c3832c7f4afabce11e6cb1c27b6f5e8c0f2bbb356c


signature.asc
Description: Digital signature
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-06 Thread Gregory Maxwell

On Fri, Jun 6, 2014 at 9:46 AM, Peter Todd  wrote:
> transactions against. Where they differ is that bloom filters has O(n)
> scaling, where n is the size of a block, and prefix filters have O(log n)
> scaling with slightly(1) higher k. Again, if you *don't* use brute forcing
> in conjunction with prefixes they have no different transactional graph
> privacy than bloom filters,

Huh? How are you thinking that something that gets put in transactions
and burned forever into the blockchain that lets you (statically) link
txout ownership is "no different" from something which is shared
directly with a couple peers, potentially peers you trust and which
are run by yourself or your organization?

--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech
___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

2014-06-06 Thread Peter Todd

On Fri, Jun 06, 2014 at 12:45:43PM +0200, Adam Back wrote:

(changed subject line as this discussion has nothing to do with
NODE_BLOOM)

> As I recall prefix brute forcing was a bit twiddle saving, ie searching for
> EDH key that has the users public prefix.  That does not improve privacy
> over an explicit prefix, it saves a byte or so at the expense of average 128
> EDH exchanges to send and provides also some probably relatively ineffective
> plausible deniability by enabling the transaction to be indistinguishable
> from some other (not very common) types of transaction.

I think you should re-read my original proposal; there's a whole host of
misunderstandings above, for instance I have no idea where you got the
idea that it has anything to do with "saving a byte" came from, or where
the number 128 came from.

> >don't do that they have privacy equal or better than bloom filters, but
> >with better scalability.
> 
> either its full node only where prefixes are not used, which is less
> scalable than bloom; or prefixes are used explicitly or implicitly
> (brute-force) and either way privacy is weakened by the extra correlation
> hook provided by elimination from the network graph of payments with
> mismatched prefixes.

Again, you have a misunderstanding. Both bloom filters and prefix
filters are just ways of giving a peer a probabalistic filter to match
transactions against. Where they differ is that bloom filters has O(n)
scaling, where n is the size of a block, and prefix filters have O(log n)
scaling with slightly(1) higher k. Again, if you *don't* use brute forcing
in conjunction with prefixes they have no different transactional graph
privacy than bloom filters, but the better scalability lets you do
things like split your queries across multiple peers that give you
better protection against hostile nodes.  Additionally prefix filters
are compatible with future miner committed indexes to make the data
authenticated.

1) see Amir's experience implementing prefix lookup in Obelisk

> We need to consider the two types of privacy involved.  Privacy from the
> full node an SPV client is relying on to find its payments, vs privacy from
> analysis of the public transaction graph.

Agreed

> The latter is more damaging.

Maybe! If adversaries are operating a significant fraction of the peers
you are connecting to the current design of bloom filters + HD wallets
results in situations where those adversaries have better transactional
graph information than the alternative.

> Better to design for privacy against future analysis of
> public info, than
> privacy by argument to select non-hostile nodes.  Tor has changed recently
> to account for the fact that chosing fresh random nodes is actually
> worse. ie you have a probability of identity/address identification
> per route/node,
> and repeatedly selecting routes/nodes just cumulatively increases the chance
> you'll be identified.  Better to pick a random node, identify it and stick
> to it and hope you chose well.

That's basically what Electrum and Obelisk already do - by default you
connect to a relatively small set of blockchain data servers operated by
well known people and use the same server repeatedly.

Applying that to the P2P network however is tricky as there is a huge
amount of churn in the nodes:

#bitcoin-dev/14/06/14-06-06.log:11:18 < hearn> bitcoinj can't use
[service bits] as it relies on DNS seeds and that is unlikely to change
any time soon due to the general churn rate in the network making it
hard to bootstrap quickly using just remembered sets of IPs.

> Maybe other simpler, but yes there was the proof of concept that the math
> exists in the form of the weil pairing.
> 
> https://bitcointalk.org/index.php?topic=431756.new

I know, where can I find running code? Remember that a bug can easily
lose thousands of dollars worth of Bitcoins.

-- 
'peter'[:-1]@petertodd.org
1d2af1653c415b7801ce4c9b18ac7e87bef597e652b203e6

signature.asc
Description: Digital signature
--
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

Re: [Bitcoin-development] Bloom bait

14 matches

Site Navigation

Mail list logo

Footer information