Greetings Ben,

Also, leveldb stores data in "levels".  The very first storage level and the 
runtime data recovery log are not compressed.

That said, I agree with Tom that you are most likely seeing Riak store 3 copies 
of your data versus only one for mongodb.  It is possible to dumb down Riak so 
that it is closer to mongodb:

1.  in app.config, look for the riak_core options and add the following line:

          {default_bucket_props, [{n_val,1}]},

This will default the system to only storing one copy of your data.


2. if you are using Riak 1.3, again in app.config, look for the riak_kv options:

    change this

       {anti_entropy, {on, []}},

    to

       {anti_entropy, {off, []}},

This will disable Riak's automatic detection and correction of data loss / 
corruption.  The feature adds roughly 1 to 2% to the data on disk.
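
Put together, the two changes might look roughly like this in app.config.  This 
is only a sketch: the "..." comments stand in for whatever settings your file 
already has, and I am assuming the usual layout of one section per application. 
Riak most likely needs a restart to pick up the edit, and data already written 
with three copies is not shrunk by it.

    [
     %% ... other sections unchanged ...
     {riak_core, [
         %% ... existing riak_core settings ...
         {default_bucket_props, [{n_val, 1}]}   %% one copy instead of three
     ]},
     {riak_kv, [
         %% ... existing riak_kv settings ...
         {anti_entropy, {off, []}}              %% Riak 1.3 only
     ]}
    ].

For a disk space comparison it is cleanest to make the change on an empty node 
and then reload the test data.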


Matthew



On Apr 10, 2013, at 9:01 AM, Tom Santero <[email protected]> wrote:

> Hi Ben,
> 
> First, allow me to welcome you to the list! Stick around, I think you'll like 
> it here. :)
> 
> How many nodes of Riak are you running vs how many nodes of Mongo?
> 
> How much more disk space did Riak take?
> 
> Riak is designed to run as a cluster of several nodes, utilizing replication 
> to provide resiliency and high-availability during partial failure. By 
> default Riak stores three replicas of every object you persist. If you are 
> only running a single node of Riak for your testing purposes, I suspect this 
> may explain the significant divergence you're seeing when compared to the 
> disk space used vs a single mongo, as each replica in Riak is being stored to 
> the same disk.
> 
> Also, Snappy is optimized for speed over disk utility, which will have a 
> negligible impact on total disk usage when compared to other compression 
> libraries such as zlib, etc. That said, for sufficiently large JSON files I 
> know that BSON's prefixes can add significant overhead to object sizes such 
> that BSON is actually heavier than the JSON it represents. What is the 
> average size of the documents you're seeking to store?
> 
> Could you tell us a bit more about what you're trying to achieve with both 
> Riak and Mongo, respectively?
> 
> Tom
> 
> On Wed, Apr 10, 2013 at 12:39 AM, Ben McCann <[email protected]> wrote:
> Hi, 
> 
> I'm currently storing data in MongoDB and would like to evaluate Riak as an 
> alternative. Riak is appealing to me because LevelDB uses Snappy, so I would 
> expect it to take less disk space to store my data set than MongoDB which 
> does not use compression. However, when I benchmarked it by inserting a few 
> hundred thousand JSON records into each datastore, Riak in fact took far more 
> disk space. I'm wondering if there's something I might be missing here as a 
> newcomer to Riak. E.g. I checked the disk space used by running "du -ch 
> /var/lib/riak/leveldb". Is this perhaps not a good way to check disk space 
> usage because perhaps Riak/LevelDB preallocates files? (I know MongoDB does 
> this and has a built-in db.collection.stats command to provide true disk 
> usage information). Are there any other reasons why Riak might be taking more 
> space or anything I could have screwed up? 
> 
> Thanks, 
> Ben 
> 
> -- 
> about.me/benmccann 
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
