Anecdotally, over the past year I've experienced performance issues running both Ripple and Parity nodes. These issues were generally related to disk I/O and more specifically pointed toward RocksDB being the culprit.
I solved the Ripple I/O issue by changing nodes to use Ripple's implementation-specific "NuDB". The Parity issue got a lot better when they upgraded the version of RocksDB that they were using, though I heard that there is also an initiative to write a Parity-specific DB.

On Fri, Mar 9, 2018 at 1:29 PM, Ignotus Peverell <igno.pever...@protonmail.com> wrote:

> I'm not sure why but RocksDb seems really unpopular and lmdb very popular
> these days. Honestly, I didn't put that much thought into RocksDb
> originally. When I started on grin, I looked at the code of other Rust
> blockchain implementations. Parity was the more advanced one (on Ethereum)
> and they were using RocksDb, so I figured it would work out okay and the
> bindings would at least be decent. One often overlooked aspect of a
> database is the quality of the bindings in your PL, because poorly written
> bindings can make all the database guarantees go away. And I was a lot more
> worried about the cryptography and the size of range proofs back then.
>
> I know the opinions of the lmdb author and others regarding atomicity in
> storages and frankly, I think they're a little too storage-focused (I've
> known some Oracle DBAs with similar positions). In my experience, from an
> application standpoint, putting too much trust in storage guarantees is a
> bad idea. Everything fails eventually, and when it does storage people are
> pretty quick to put the blame on disks (gotta do Raid 60), networks,
> language bindings, or you. Btw I'm guilty as well, I have implemented some
> simple storages in the past.
>
> Truth is, it's actually rather easy to write a resilient blockchain node
> on a not-so-resilient storage (note: I'm talking about a node here, not
> wallets). The data is immutable and can be replayed at will. You messed up
> on the last block? Fine, restart on the one before that and just make sure
> it's all idempotent. If you're dealing with balances it's a little more
> complicated, but a node does not. And with careful design, you can make a
> lot of things idempotent. It's also practically impossible for grin to rely
> on an atomic storage because we have a separate state (Merkle Mountain
> Ranges) that are specifically designed to be easy to store in a flat file,
> while very unwieldy and slow to store in a k/v db. They're append-only for
> the most part so dealing with failure is also very easy (note: does not
> preclude bugs, but those get fixed). And when you squint right, the whole
> blockchain storage is append-only. From a storage standpoint, it's hard to
> find a more fault-tolerant use case.
>
> So anyway, I'm definitely not married to RocksDb, but I don't think it
> matters enormously either. My biggest beef with it at this point is that
> it's a pain to build and has probably 10x the number of features we need.
> But swapping it out is just 200 LOC [1]. So maybe it's worth doing it just
> for this reason.
>
> Now I'm going to link to this email on the 10 other places where I've been
> asked about this :-)
>
> - Igno
>
> [1] https://github.com/mimblewimble/grin/blob/master/store/src/lib.rs
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On 8 March 2018 10:44 PM, Luke Kenneth Casson Leighton <l...@lkcl.net>
> wrote:
>
> > On Thu, Mar 8, 2018 at 8:03 PM, Ignotus Peverell
> > igno.pever...@protonmail.com wrote:
> >
> > > > > There is a denial-of-service option when a user downloads the chain,
> > > > > the peer can give gigabytes of data and list the wrong unspent outputs.
> > > > > The user will see that the result do not add up to 0, but cannot tell where
> > > > > the problem is.
> > > >
> > > > which to be honest I do not quite understand. The user normally downloads
> > > > the chain by requesting blocks from peers, starting with just the headers
> > > > which can be checked for proof-of-work.
> > >
> > > The paper here refers to the MimbleWimble-style fast sync (IBD),
> >
> > hiya igno,
> >
> > lots of techie TLAs here that clearly tell me you're on the case and
> > know what you're doing. it'll take me a while to catch up / get to
> > the point where i could usefully contribute, i must apologise.
> >
> > in the meantime (switching tracks), one way i can definitely
> > contribute to the underlying reliability is to ask why rocksdb has
> > been chosen?
> >
> > https://www.reddit.com/r/Monero/comments/4rdnrg/lmdb_vs_rocksdb/
> > https://github.com/AltSysrq/lmdb-zero
> >
> > rocksdb is based on leveldb, which was designed to hammer both the
> > CPU and the storage, on the assumption by google engineers that
> > everyone will be using leveldb in google data centres, with google's
> > money, and with google's resources, i.e. CPU is cheap and there will
> > be nothing else going on. they also didn't do their homework in many
> > other ways, resulting in an unstable pile of poo. and rocksdb is
> > based on that.
> >
> > many people carrying out benchmark tests forget to switch off the
> > compression, or they forget to compress the key and/or the value being
> > stored when comparing against lmdb, or bdb, and so on.
> >
> > so. why was rocksdb chosen?
> >
> > l.
>
> --
> Mailing list: https://launchpad.net/~mimblewimble
> Post to     : mimblewimble@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~mimblewimble
> More help   : https://help.launchpad.net/ListHelp
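
To make the replay argument in the quoted message concrete ("restart on the one before that and just make sure it's all idempotent"), here is a minimal Rust sketch. It is not grin's code: the `Block` and `Node` types and the `apply_block` and `recover` functions are hypothetical names invented for illustration. A `Vec` stands in for an append-only flat file, and blocks are assumed to arrive in increasing height order.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified block: a height plus the outputs it adds.
#[derive(Clone, Debug)]
struct Block {
    height: u64,
    outputs: Vec<String>,
}

/// Hypothetical node state. `log` stands in for an append-only flat file;
/// `applied` records the log length after each fully applied block.
#[derive(Default)]
struct Node {
    log: Vec<String>,
    applied: BTreeMap<u64, usize>, // height -> log length after that block
}

impl Node {
    /// Apply a block idempotently: replaying a block that was already
    /// fully applied leaves the state unchanged.
    /// Assumes blocks are applied in increasing height order.
    fn apply_block(&mut self, block: &Block) {
        if self.applied.contains_key(&block.height) {
            return; // already applied, nothing to do
        }
        self.log.extend(block.outputs.iter().cloned());
        self.applied.insert(block.height, self.log.len());
    }

    /// Recover after a failure in the middle of a block: truncate the log
    /// back to the end of the last fully applied block, discarding any
    /// partial appends. The block can then simply be replayed.
    fn recover(&mut self) {
        let good_len = self.applied.values().next_back().copied().unwrap_or(0);
        self.log.truncate(good_len);
    }

    /// Height of the last fully applied block, if any.
    fn tip(&self) -> Option<u64> {
        self.applied.keys().next_back().copied()
    }
}
```

Because the log is only ever appended to or truncated back to a known-good length, recovery never needs the underlying storage to be atomic; it only needs the "fully applied" marker to be written after the data it covers.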
--
Mailing list: https://launchpad.net/~mimblewimble
Post to     : mimblewimble@lists.launchpad.net
Unsubscribe : https://launchpad.net/~mimblewimble
More help   : https://help.launchpad.net/ListHelp