If you use the LevelDB backend, it already comes with good compression (snappy), plus CRC checks and all sorts of data corruption checks. I have read on this mailing list about people who had to disable snappy compression because it renders ZFS compression useless (there is not much left to compress after that).

Hence, my point is tied to whether you use ZFS or not: if you go for ZFS, whatever the variant, you will have to support two subsystems; if you leave LevelDB snappy compression on, you won't have to worry about it.
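
To illustrate the "not much left to compress" point, here is a minimal sketch. It assumes zlib as a stand-in for snappy (snappy is not in the Python standard library), and the sample records and the function names are made up for illustration; it is not a measurement of LevelDB or ZFS themselves.

```python
import random
import zlib


def sample_records(count: int, seed: int = 0) -> bytes:
    """Build synthetic JSON-like records (hypothetical data, not real Riak values)."""
    rng = random.Random(seed)
    lines = [
        f'{{"user": "user{rng.randrange(10_000)}", "status": "active", '
        f'"visits": {rng.randrange(1_000_000)}}}\n'
        for _ in range(count)
    ]
    return "".join(lines).encode()


def compressed_sizes(payload: bytes) -> tuple[int, int, int]:
    """Return (raw size, size after one pass, size after a second pass)."""
    once = zlib.compress(payload)   # stands in for compression at the storage engine
    twice = zlib.compress(once)     # stands in for a second layer, e.g. the file system
    return len(payload), len(once), len(twice)


if __name__ == "__main__":
    raw, once, twice = compressed_sizes(sample_records(2000))
    print(f"raw={raw} bytes, one pass={once} bytes, two passes={twice} bytes")
```

The first pass should shrink the data considerably, while the second pass should gain little or nothing, which is why stacking two compression layers tends to cost CPU without saving much space.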

As for backups, Basho provides a cluster-to-cluster replication tool; we built our own in Java. Making backups per storage on every node won't make much sense due to the CAP/distributed nature of Riak; replicating the keys to another cluster is what makes sense.
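
The key-replication idea can be sketched roughly as follows. This is not our Java tool and not Basho's replication; `replicate_bucket` and the three callables are hypothetical names, and the cluster access is left abstract so that no particular client API is implied. A real tool would also need to deal with siblings, deletes and retries, which this sketch ignores.

```python
from typing import Callable, Iterable, Optional


def replicate_bucket(
    list_keys: Callable[[], Iterable[str]],
    fetch: Callable[[str], Optional[bytes]],
    store: Callable[[str, bytes], None],
) -> int:
    """Copy every key from a source cluster to a target cluster.

    list_keys and fetch read from the source; store writes to the target.
    Returns the number of objects copied.
    """
    copied = 0
    for key in list_keys():
        value = fetch(key)
        if value is None:
            # The key disappeared between listing and fetching; skip it.
            continue
        store(key, value)
        copied += 1
    return copied


if __name__ == "__main__":
    # In-memory dictionaries stand in for the two clusters.
    source = {"user:1": b"alice", "user:2": b"bob"}
    target = {}
    count = replicate_bucket(
        lambda: list(source),
        source.get,
        target.__setitem__,
    )
    print(f"copied {count} objects")
```

In practice the three callables would wrap whatever client library talks to each cluster, which keeps the copy loop independent of the transport.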

Hope that helps and is understandable,

Guido.

On 03/10/13 13:54, Pedram Nimreezi wrote:
Not sure what ZFS has to do with snappy compression, as it's a file system, not a compression algorithm. Feature-wise, ZFS is quite possibly the most enterprise-grade file system around, including advanced data corruption prevention and remote backups.

This would be a viable option in BSD/Solaris environments, at least for making snapshots.
Might make a nice write up for the Basho blog..

Backups for Riak, I think, require a bit more consideration than a file system snapshot send, and should include provisions for transferring data to smaller/larger clusters, transferring ownership properly, etc.


On Thu, Oct 3, 2013 at 7:15 AM, Guido Medina <[email protected]> wrote:

    And as for ZFS? I wouldn't recommend it. After Riak 1.4, snappy
    LevelDB compression does a nice job, so why take the risk of yet
    another, not so enterprise-ready compression algorithm?

    I could be wrong though,

    Guido.


    On 03/10/13 12:11, Guido Medina wrote:
    I have heard some SAN horror stories too. Riak nodes are so
    cheap that I don't see the point in even having a mirror on the
    node. Here are my points:

     1. Erlang interprocess communication already brings some network
        usage, so why add yet more network usage for replicating the
        data, when the whole idea of Riak is to have your data
        replicated across different nodes?
     2. If a node goes down or dies for whatever reason, bring up
        another node and rebuild it.
     3. If you really want to replicate your cluster, Riak offers
        enterprise replication, which I'm quite sure will be less
        expensive than a SAN and will guarantee that your cluster is
        ready to go somewhere else as a backup.
     4. I would go even further: SSDs and Riak nodes are so cheap
        nowadays that I would even build a cluster using RAID 0 or
        RAID 5 SSDs (yes, no mirror with RAID 1; if too afraid,
        RAID 5), which will have a great impact on performance.
        Again, if something goes wrong with one node, refer to
        point 2.

    SANs and all those "legacy" backup and replication tools are, IMHO,
    meant for other products, like a money-eating Oracle DB server.

    HTH,

    Guido.

    On 03/10/13 12:00, Brian Akins wrote:
    So, call me naive, but couldn't ZFS be used as Heinze suggested?

    I have some SAN horror stories - both operationally and from an
    economic perspective.


    _______________________________________________
    riak-users mailing list
    [email protected]  <mailto:[email protected]>
    http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com







--
/* Sincerely
--------------------------------------------------------------
Pedram Nimreezi - Chief Technology Officer  */

// The hardest part of design … is keeping features out. - Donald Norman


