Thanks for the feedback, Jameson. Obviously we're still in the pre-alpha stage, 
closing in on alpha, but I think we have a good shot at having a fairly 
reliable first release when we get there. I'm hoping node operators like you 
won't get woken up in the middle of the night because of us too often.

I expect most of our I/O is spent in our MMR storage implementation; we don't 
ask very much from the k/v store (RocksDB until someone hates it enough to step 
up :-)). And that storage, while not easy to wrap your head around at first 
(or second), is very simple from a storage standpoint [1]. Contrast that with 
Parity or go-ethereum, which store the entire Patricia tree in RocksDB/LevelDB.
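
To make the "very simple" part concrete, here is a minimal sketch of an 
append-only flat file with crash recovery and idempotent replay. It is in 
Python for brevity and is not grin's actual code: the names AppendOnlyFile, 
apply_block and RECORD_SIZE are hypothetical, and fixed 32-byte records (one 
hash per position) are an assumption made for illustration.

```python
import os
import tempfile

RECORD_SIZE = 32  # assumption: fixed-size entries, e.g. one hash per position


class AppendOnlyFile:
    """Flat file of fixed-size records; record i lives at offset i * RECORD_SIZE."""

    def __init__(self, path):
        self.path = path
        # Create the file if it is missing, then drop any torn trailing
        # record left behind by a crash in the middle of an append.
        open(path, "ab").close()
        self.rewind(self.count())

    def count(self):
        # Number of complete records currently on disk.
        return os.path.getsize(self.path) // RECORD_SIZE

    def rewind(self, n_records):
        """Discard everything after the first n_records records."""
        if n_records > self.count():
            raise ValueError("cannot rewind past the end of the file")
        with open(self.path, "r+b") as f:
            f.truncate(n_records * RECORD_SIZE)

    def append(self, record):
        if len(record) != RECORD_SIZE:
            raise ValueError("record must be exactly RECORD_SIZE bytes")
        with open(self.path, "ab") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())

    def get(self, pos):
        with open(self.path, "rb") as f:
            f.seek(pos * RECORD_SIZE)
            return f.read(RECORD_SIZE)


def apply_block(store, start_pos, records):
    """Idempotent: rewind to where this block starts, then append its records."""
    store.rewind(start_pos)
    for record in records:
        store.append(record)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "mmr_data.bin")
    store = AppendOnlyFile(path)
    apply_block(store, 0, [b"\x01" * RECORD_SIZE, b"\x02" * RECORD_SIZE])
    apply_block(store, 2, [b"\x03" * RECORD_SIZE])
    apply_block(store, 2, [b"\x03" * RECORD_SIZE])  # replaying is harmless
    print(store.count())
```

Because a block always starts at a known position, replaying it after a crash 
just truncates back to that position and appends again, so no atomic write 
from the storage layer is needed.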

- Igno


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On 9 March 2018 7:56 PM, Jameson Lopp <> wrote:

> Anecdotally, over the past year I've experienced performance issues running 
> both Ripple and Parity nodes. These issues were generally related to disk I/O 
> and more specifically pointed toward RocksDB being the culprit.
>
> I solved the Ripple I/O issue by changing nodes to use Ripple's 
> implementation-specific "NuDB".
>
> The Parity issue got a lot better when they upgraded the version of RocksDB 
> that they were using, though I heard that there is also an initiative to 
> write a Parity-specific DB.
>
> On Fri, Mar 9, 2018 at 1:29 PM, Ignotus Peverell 
> <> wrote:
>> I'm not sure why, but RocksDB seems really unpopular and lmdb very popular 
>> these days. Honestly, I didn't put that much thought into RocksDB 
>> originally. When I started on grin, I looked at the code of other Rust 
>> blockchain implementations. Parity was the more advanced one (on Ethereum) 
>> and they were using RocksDB, so I figured it would work out okay and the 
>> bindings would at least be decent. One often overlooked aspect of a 
>> database is the quality of the bindings in your PL, because poorly written 
>> bindings can make all the database guarantees go away. And I was a lot more 
>> worried about the cryptography and the size of range proofs back then.
>>
>> I know the opinions of the lmdb author and others regarding atomicity in 
>> storage and frankly, I think they're a little too storage-focused (I've 
>> known some Oracle DBAs with similar positions). In my experience, from an 
>> application standpoint, putting too much trust in storage guarantees is a 
>> bad idea. Everything fails eventually, and when it does, storage people are 
>> pretty quick to put the blame on disks (gotta do RAID 60), networks, 
>> language bindings, or you. Btw, I'm guilty as well: I have implemented some 
>> simple storage layers in the past.
>>
>> Truth is, it's actually rather easy to write a resilient blockchain node on 
>> a not-so-resilient storage (note: I'm talking about a node here, not 
>> wallets). The data is immutable and can be replayed at will. You messed up 
>> on the last block? Fine, restart on the one before that and just make sure 
>> it's all idempotent. If you're dealing with balances it's a little more 
>> complicated, but a node isn't. And with careful design, you can make a 
>> lot of things idempotent. It's also practically impossible for grin to rely 
>> on an atomic storage because we have separate state (Merkle Mountain 
>> Ranges) that is specifically designed to be easy to store in a flat file, 
>> while very unwieldy and slow to store in a k/v db. MMRs are append-only for 
>> the most part, so dealing with failure is also very easy (note: does not 
>> preclude bugs, but those get fixed). And when you squint right, the whole 
>> blockchain storage is append-only. From a storage standpoint, it's hard to 
>> find a more fault-tolerant use case.
>>
>> So anyway, I'm definitely not married to RocksDB, but I don't think it 
>> matters enormously either. My biggest beef with it at this point is that 
>> it's a pain to build and has probably 10x the number of features we need. 
>> But swapping it out is just 200 LOC [1]. So maybe it's worth doing just 
>> for this reason.
>>
>> Now I'm going to link to this email in the 10 other places where I've been 
>> asked about this :-)
>>
>> - Igno
>> [1]
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> On 8 March 2018 10:44 PM, Luke Kenneth Casson Leighton <> wrote:
>>> On Thu, Mar 8, 2018 at 8:03 PM, Ignotus Peverell
>>> wrote:
>>> > > > There is a denial-of-service option when a user downloads the chain,
>>> > > > the peer can give gigabytes of data and list the wrong unspent
>>> > > > outputs. The user will see that the result do not add up to 0, but
>>> > > > cannot tell where the problem is.
>>> >
>>> > > which to be honest I do not quite understand. The user normally
>>> > > downloads the chain by requesting blocks from peers, starting with just
>>> > > the headers which can be checked for proof-of-work.
>>> >
>>> > The paper here refers to the MimbleWimble-style fast sync (IBD),
>>> hiya igno,
>>> lots of techie TLAs here that clearly tell me you're on the case and
>>> know what you're doing. it'll take me a while to catch up / get to
>>> the point where i could usefully contribute, i must apologise.
>>> in the meantime (switching tracks), one way i can definitely
>>> contribute to the underlying reliability is to ask why rocksdb has
>>> been chosen?
>>> rocksdb is based on leveldb, which was designed to hammer both the
>>> CPU and the storage, on the assumption by google engineers that
>>> everyone will be using leveldb in google data centres, with google's
>>> money, and with google's resources, i.e. CPU is cheap and there will
>>> be nothing else going on. they also didn't do their homework in many
>>> other ways, resulting in an unstable pile of poo. and rocksdb is
>>> based on that.
>>> many people carrying out benchmark tests forget to switch off the
>>> compression, or they forget to compress the key and/or the value being
>>> stored when comparing against lmdb, or bdb, and so on.
>>> so. why was rocksdb chosen?
>>> l.
>> --
>> Mailing list:
>> Post to     :
>> Unsubscribe :
>> More help   :