[Bitcoin-development] Service bits for pruned nodes
Hello all,

I think it is time to move forward with pruning nodes, i.e. nodes that fully validate and relay blocks and transactions, but which do not keep (all) historic blocks around, and thus cannot be queried for these.

The biggest roadblock is making sure new and old nodes that start up are able to find nodes to synchronize from. To help them find peers, I would like to propose adding two extra service bits to the P2P protocol:

* NODE_VALIDATE: relays and validates blocks and transactions, but is only guaranteed to answer getdata requests for (recently) relayed blocks and transactions, and mempool transactions.
* NODE_BLOCKS_2016: can be queried for the last 2016 blocks, but without guarantee for relaying/validating new blocks and transactions.
* NODE_NETWORK (which existed before) will imply NODE_VALIDATE and guarantee availability of all historic blocks.

The idea is to separate the different responsibilities of network nodes into separate bits, so they can - at some point - be implemented independently. Perhaps we want more than just one degree (2016 blocks), maybe also 144 or 21, but those can be added later if necessary.

I monitored the frequency of block depths requested from my public node, and got this frequency distribution:

http://bitcoin.sipa.be/depth-small.png

so it seems 2016 nicely matches the set of frequently-requested blocks (indicating that few nodes are offline for more than 2 weeks consecutively).

I'll write a BIP to formalize this, but wanted to get an idea of how much support there is for a change like this.

Cheers,

-- 
Pieter

Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
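The split of responsibilities proposed above could be modelled roughly as follows. This is only a sketch: NODE_NETWORK is the existing service bit (value 1), but the bit positions chosen here for NODE_VALIDATE and NODE_BLOCKS_2016 are hypothetical, as are the helper names `validates` and `can_serve_block`.

```python
from enum import IntFlag


class Service(IntFlag):
    NODE_NETWORK = 1 << 0      # existing bit: serves all historic blocks
    NODE_VALIDATE = 1 << 1     # hypothetical bit position
    NODE_BLOCKS_2016 = 1 << 2  # hypothetical bit position


RECENT_WINDOW = 2016  # number of most recent blocks covered by NODE_BLOCKS_2016


def validates(services):
    """True if the peer relays and validates (NODE_NETWORK implies NODE_VALIDATE)."""
    return bool(services & (Service.NODE_VALIDATE | Service.NODE_NETWORK))


def can_serve_block(services, tip_height, block_height):
    """True if a peer advertising `services` should be able to serve the block."""
    if services & Service.NODE_NETWORK:
        return True
    if services & Service.NODE_BLOCKS_2016:
        # Depth 0 is the tip itself; depths 0..2015 are "the last 2016 blocks".
        return 0 <= tip_height - block_height < RECENT_WINDOW
    return False
```

A node that has been offline for a week would look for any peer where `can_serve_block` holds for its first missing block, while a node syncing from scratch would still need a NODE_NETWORK peer.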
Re: [Bitcoin-development] Service bits for pruned nodes
I'd imagined that nodes would be able to pick their own ranges to keep rather than have fixed chosen intervals. "Everything or two weeks" is rather restrictive - presumably node operators are constrained by physical disk space, which means the quantity of blocks they would want to keep can vary with sizes of blocks, cost of storage, etc.

Adding new fields to the addr message and relaying those fields to newer nodes means every node could advertise the height at which it pruned. I know it means a longer time before the data is available everywhere vs service bits, but it seems like most nodes won't be pruning right away anyway. There's plenty of time for upgrades.

If an old node connected to a new node and requested (via getdata) blocks that had been pruned, immediate disconnection should make the old node go find a different one. It means the combination of old node + not run for a long time might take a while before it can find a node that has what it wants, but that doesn't seem like a big deal.

What is the use case for NODE_VALIDATE? Nodes that throw away blocks almost immediately? Why would a node do that?
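To make the "new fields in the addr message" idea concrete, here is a minimal sketch of an address entry that carries a pruning height. The first four fields follow the existing addr entry layout (timestamp, services, 16-byte IP, big-endian port); the trailing `pruned_height` field, its width, and the rule that blocks at or above that height are still kept are all assumptions made for illustration, not part of any agreed format.

```python
import ipaddress
import struct
from dataclasses import dataclass


@dataclass
class AddrEntry:
    time: int           # uint32, little-endian
    services: int       # uint64, little-endian
    ip: str             # IPv4 or IPv6 address as text
    port: int           # uint16, big-endian (network byte order)
    pruned_height: int  # hypothetical extra field: lowest block height still kept

    def pack(self):
        addr = ipaddress.ip_address(self.ip)
        if addr.version == 4:
            # IPv4 addresses are carried as IPv4-mapped IPv6 addresses.
            raw_ip = b"\x00" * 10 + b"\xff\xff" + addr.packed
        else:
            raw_ip = addr.packed
        return (struct.pack("<IQ", self.time, self.services) + raw_ip
                + struct.pack(">H", self.port)
                + struct.pack("<I", self.pruned_height))

    @classmethod
    def unpack(cls, data):
        time, services = struct.unpack_from("<IQ", data, 0)
        ip6 = ipaddress.IPv6Address(data[12:28])
        ip = str(ip6.ipv4_mapped) if ip6.ipv4_mapped is not None else str(ip6)
        (port,) = struct.unpack_from(">H", data, 28)
        (pruned_height,) = struct.unpack_from("<I", data, 30)
        return cls(time, services, ip, port, pruned_height)

    def keeps_block(self, height):
        """Assumption: every block at or above pruned_height is still stored."""
        return height >= self.pruned_height
```

An old node would simply not understand the extra field, which is why the disconnect-on-pruned-request fallback described above would still be needed.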
Re: [Bitcoin-development] Service bits for pruned nodes
On Sun, Apr 28, 2013 at 6:29 PM, Mike Hearn m...@plan99.net wrote:

> I'd imagined that nodes would be able to pick their own ranges to keep rather than have fixed chosen intervals. Everything or two weeks is rather restrictive - presumably node operators are constrained by physical disk space, which means the quantity of blocks they would want to keep can vary with sizes of blocks, cost of storage, etc.

Sure, that's why eventually several levels may be useful.

> Adding new fields to the addr message and relaying those fields to newer nodes means every node could advertise the height at which it pruned. I know it means a longer time before the data is available everywhere vs service bits, but it seems like most nodes won't be pruning right away anyway. There's plenty of time for upgrades.

That's a more flexible model, indeed. I'm not sure how important speed of propagation will be though - it may be very slow, given that there are 10s of IPs circulating, and only a few are relayed in one go between nodes. Even then, I'd like to see the relay/validation responsibility split off from the serve-historic-data one, and have separate service bits for those.

> If an old node connected to a new node and getdata-d blocks that had been pruned, immediate disconnection should make the old node go find a different one. It means the combination of old node+not run for a long time might take a while before it can find a node that has what it wants, but that doesn't seem like a big deal.

Disconnecting in case something is requested that isn't served seems like an acceptable behaviour, yes. A specific message indicating data is pruned may be more flexible, but more complex to handle too.

> What is the use case for NODE_VALIDATE? Nodes that throw away blocks almost immediately? Why would a node do that?

NODE_VALIDATE doesn't say anything about which blocks are available, it just means it relays and validates (and thus is not an SPV node). It can be combined with NODE_BLOCKS_2016 if those blocks are also served.

The reason for splitting them is that I think over time these may be handled by different implementations. You could have stupid storage/bandwidth nodes that just keep the blockchain around, and others that validate it. Even if that doesn't happen implementation-wise, I think these are sufficiently independent functions to start thinking about them as such.

-- 
Pieter
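The two reactions to a request for pruned data discussed above (disconnect, or send a dedicated "pruned" message) can be sketched as a small decision function. Everything named here is hypothetical: the `Reply` values, the `peer_understands_pruning` flag, and the idea of choosing between the two behaviours per peer are illustrative assumptions rather than anything specified in the thread.

```python
from enum import Enum


class Reply(Enum):
    SEND_BLOCK = "send_block"
    SEND_PRUNED_NOTICE = "send_pruned_notice"  # hypothetical dedicated message
    DISCONNECT = "disconnect"


def handle_block_request(stored_heights, height, peer_understands_pruning):
    """Decide how a pruning node reacts to a request for the block at `height`.

    stored_heights: any container supporting `in`, e.g. a range of kept heights.
    peer_understands_pruning: assumed flag, e.g. derived from the protocol version.
    """
    if height in stored_heights:
        return Reply.SEND_BLOCK
    if peer_understands_pruning:
        # More flexible, but the requester needs code to handle the notice.
        return Reply.SEND_PRUNED_NOTICE
    # Old peers just see the connection drop and go look for another node.
    return Reply.DISCONNECT


# Example: a node at height 233000 that keeps only the last 2016 blocks.
kept = range(233000 - 2015, 233000 + 1)
```

With `kept` as above, a request for block 100000 from an old peer yields `Reply.DISCONNECT`, while a request for block 232000 is served.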
Re: [Bitcoin-development] Service bits for pruned nodes
On Sun, Apr 28, 2013 at 9:29 AM, Mike Hearn m...@plan99.net wrote:

> I'd imagined that nodes would be able to pick their own ranges to keep rather than have fixed chosen intervals. Everything or two weeks is rather

N most recent is special for two reasons: it meshes well with actual demand, and the data is required for reorganization. So whatever we do for historic data, N most recent should be treated specially.

But I also agree that it's important that everything be splittable into ranges, because otherwise, when having to choose between serving historic data and, say, 40 GB of storage, a great many are going to choose not to serve historic data. And so, although nodes may be willing to contribute 4-39 GB of storage to the network, there will be no good way for them to do so, and we may end up with too few copies of the historic data available.

As can be seen in the graph, once you get past the most recent 4000 blocks the probability is fairly uniform, so N most recent is not a good way to divide load for the older blocks. But simple ranges, perhaps quantized to groups of 100 or 1000 blocks or something, would work fine. This doesn't have to come in the first cut, however, and it needs new addr messages in any case.
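One way to picture the "N most recent plus quantized ranges" idea is the sketch below. It always keeps the recent window and then fills any remaining budget with randomly chosen, aligned groups of 1000 older blocks. The budget is expressed in blocks rather than bytes to keep the example short, and the constants, the random selection, and the function name `choose_ranges` are assumptions for illustration only.

```python
import random

GROUP = 1000   # quantization of historic ranges (100 would work the same way)
RECENT = 2016  # most recent blocks, always kept because reorganization needs them


def choose_ranges(tip_height, budget_blocks, rng):
    """Return sorted (first, last) height ranges a node would keep and advertise."""
    recent_start = max(0, tip_height - RECENT + 1)
    ranges = [(recent_start, tip_height)]

    # Budget left over after the mandatory recent window, in blocks.
    spare = max(0, budget_blocks - (tip_height - recent_start + 1))

    # Only whole groups lying entirely below the recent window are candidates.
    n_groups = recent_start // GROUP
    n_pick = min(n_groups, spare // GROUP)

    # Uniform random choice spreads copies of old data evenly across nodes.
    for g in rng.sample(range(n_groups), n_pick):
        ranges.append((g * GROUP, g * GROUP + GROUP - 1))
    return sorted(ranges)
```

For example, `choose_ranges(233000, 5016, random.Random())` keeps blocks 230985 to 233000 plus three aligned 1000-block groups of history. A node with a budget below the recent window still keeps the whole window.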
Re: [Bitcoin-development] Service bits for pruned nodes
On Sun, Apr 28, 2013 at 7:57 PM, John Dillon john.dillon...@googlemail.com wrote:

> Have we considered just leaving that problem to a different protocol such as BitTorrent? Offering up a few GB of storage capacity is a nice idea but it means we would soon have to add structure to the network to allow nodes to find each other to actually get that data. BitTorrent already has that issue thought through carefully with its DHT support.

I think this is not a great idea on a couple of levels.

Least importantly, our own experience with tracker-less torrents for the bootstrap files is that they don't work very well in practice, and that's without someone trying to DoS attack it.

More importantly, I think it's very important that the process of offering up more storage not take any more steps. The software could have user-overridable defaults based on free disk space to make contributing painless. This isn't possible if it takes extra software, requires opening additional ports, etc. It also means that someone would have to be constantly creating new torrents, there would be issues with people only seeding the old ones, etc.

It's also the case that BitTorrent is blocked on many networks and is confused with illicit copying. We would have the same problems with that as we had with IRC being confused with botnets.

We already have to worry about nodes finding each other just for basic operation. The only addition this requires is being able to advertise what parts of the chain they have.

> What are the logistics of either integrating a DHT capable BitTorrent client, or just calling out to some library? We could still use the Bitcoin network to bootstrap the BitTorrent DHT.

Using Bitcoin to bootstrap the BitTorrent DHT would probably make it more reliable, but then again it might cause commercial services that are in the business of poisoning the BitTorrent DHT to target the Bitcoin network.

Integration also brings up the question of network-exposed attack surface. Seems like it would be more work than just adding the ability to add ranges to address messages. I think we already want to revise the address message format in order to have signed flags and to support I2P peers.
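The "user-overridable defaults based on free disk space" mentioned above might look something like this. It is a sketch under stated assumptions: the 25% fraction, the 40 GB cap, and both function names are invented for illustration; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil


def default_storage_bytes(free_bytes, fraction=0.25, cap_bytes=40 * 10**9):
    """Default contribution: a fraction of free space, capped (both values assumed)."""
    return min(int(free_bytes * fraction), cap_bytes)


def storage_to_offer(path=".", user_override=None):
    """Bytes of block storage to contribute; an explicit user setting always wins."""
    if user_override is not None:
        return user_override
    free = shutil.disk_usage(path).free  # free space on the volume holding `path`
    return default_storage_bytes(free)
```

The point of the design is that a user who does nothing still contributes a sensible amount, with no extra software or ports involved.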
Re: [Bitcoin-development] Service bits for pruned nodes
While I like the idea of a client using a DHT blockchain or UTXO list, I don't think that the reference client is the place for it. But it would make for a very interesting experimental project!
Re: [Bitcoin-development] Service bits for pruned nodes
On Mon, Apr 29, 2013 at 3:36 AM, Gregory Maxwell gmaxw...@gmail.com wrote:

> On Sun, Apr 28, 2013 at 7:57 PM, John Dillon john.dillon...@googlemail.com wrote:
> > Have we considered just leaving that problem to a different protocol such as BitTorrent? Offering up a few GB of storage capacity is a nice idea but it means we would soon have to add structure to the network to allow nodes to find each other to actually get that data. BitTorrent already has that issue thought through carefully with its DHT support.
>
> I think this is not a great idea on a couple of levels. Least importantly, our own experience with tracker-less torrents for the bootstrap files is that they don't work very well in practice, and that's without someone trying to DoS attack it.

Unfortunate. What makes them not work out? DHT torrents seem pretty popular.

> More importantly, I think it's very important that the process of offering up more storage not take any more steps. The software could have user-overridable defaults based on free disk space to make contributing painless. This isn't possible if it takes extra software, requires opening additional ports, etc. It also means that someone would have to be constantly creating new torrents, there would be issues with people only seeding the old ones, etc.

Now don't get me wrong, I'm not proposing we do this if it requires additional steps or other software. I only mean if it is possible in an easy way to integrate the BitTorrent technology into Bitcoin in an automatic fashion. Yes, part of that may have to be finding a way to re-use the existing port, for instance.

> We already have to worry about nodes finding each other just for basic operation. The only addition this requires is being able to advertise what parts of the chain they have.

Sure, I guess my concern is more: how do you find the specific part of the chain you need without some structure to the network? Although I guess it may be enough to just add that structure, or depend on just walking the nodes advertising themselves until you find what you want.

We can build this stuff incrementally, I'll agree. It won't be the case that one in a thousand nodes serve up the part of the chain you need overnight. So maybe I am over-engineering the solution with BitTorrent.

> Using Bitcoin to bootstrap the BitTorrent DHT would probably make it more reliable, but then again it might cause commercial services that are in the business of poisoning the BitTorrent DHT to target the Bitcoin network.

Good point. Sadly one that may apply to the Tor network too in the future.
Re: [Bitcoin-development] Service bits for pruned nodes
On Mon, Apr 29, 2013 at 03:48:18AM, John Dillon wrote:

> We can build this stuff incrementally I'll agree. It won't be the case that one in a thousand nodes serve up the part of the chain you need overnight. So maybe I am over-engineering the solution with BitTorrent.

I think that pretty much sums it up.

With the block range served in the announce message you just need to find an announcement with the right range, and at worst connect to a few more nodes to get what you need. It will be a long time before the bandwidth used for finding a node with the part of the chain that you need is a significant fraction of the load required for downloading the data itself.

Remember that BitTorrent's DHT is a system giving you access to tens of petabytes worth of data. The Bitcoin blockchain on the other hand simply can't grow more than 57GiB per year.

It's a cute idea though.

Also, while we're talking about the initial download: http://blockchainbymail.com

Lots of options out there.

-- 
'peter'[:-1]@petertodd.org
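Finding "an announcement with the right range" amounts to a simple scan over what peers have advertised, which can be sketched as follows. The shape of `announcements` (peer address mapped to a list of inclusive height ranges) and both function names are assumptions; the thread does not define an announcement format.

```python
import random


def peers_serving(announcements, height):
    """Return the peers whose announced (first, last) ranges include `height`."""
    return sorted(
        peer for peer, ranges in announcements.items()
        if any(first <= height <= last for first, last in ranges)
    )


def pick_peer(announcements, height, rng=random):
    """Pick one suitable peer at random, or None if nobody announced the block."""
    candidates = peers_serving(announcements, height)
    return rng.choice(candidates) if candidates else None


# Hypothetical announcements gathered from address messages.
announcements = {
    "198.51.100.1:8333": [(0, 999)],
    "198.51.100.2:8333": [(228000, 230015)],
    "198.51.100.3:8333": [(0, 230015)],
}
```

If `pick_peer` returns `None`, or the chosen peer turns out not to have the block after all, the caller would simply try another address, which matches the "at worst connect to a few more nodes" expectation.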
[Bitcoin-development] Downloading blockchain
I would like to download the entire blockchain by hand (kind of), but I can't find software for it. The closest thing I found is the Bitcoin-protocol-test-harness code, but it is two years old and does not seem to work with the current Bitcoin network.

Python appreciated.

Thanks for your help.

-- 
Jesús Cea Avión
j...@jcea.es - http://www.jcea.es/
Twitter: @jcea
jabber / xmpp:j...@jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz
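As a possible starting point for a hand-rolled downloader in Python, the sketch below covers only the framing of P2P messages: the 24-byte header made of the network magic, a null-padded command name, the payload length, and a checksum taken from the first four bytes of a double SHA-256 of the payload. The function names are made up for this example, and a real downloader would still need the version handshake and the block request messages, none of which are shown here.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # main network message start bytes


def checksum(payload):
    """First four bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def frame(command, payload=b""):
    """Wrap `payload` in a P2P message header for `command`."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command name longer than 12 bytes")
    return (MAINNET_MAGIC
            + name.ljust(12, b"\x00")          # command, null-padded to 12 bytes
            + struct.pack("<I", len(payload))  # payload length, little-endian
            + checksum(payload)
            + payload)


def parse_header(header):
    """Split a 24-byte header into (command, payload_length, checksum)."""
    if len(header) != 24 or header[:4] != MAINNET_MAGIC:
        raise ValueError("not a mainnet message header")
    command = header[4:16].rstrip(b"\x00").decode("ascii")
    (length,) = struct.unpack("<I", header[16:20])
    return command, length, header[20:24]
```

For instance, `frame("verack")` produces a 24-byte message with an empty payload. After reading a header from the socket, `parse_header` tells you how many payload bytes to read next and which checksum to verify them against.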