[jira] [Commented] (CASSANDRA-1657) support in-memory column families

Edward Capriolo (JIRA) Wed, 12 Mar 2014 18:27:12 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932748#comment-13932748 ]


Edward Capriolo commented on CASSANDRA-1657:
--------------------------------------------


{quote}
"yet you most definitely want all the things that Cassandra offers in terms of 
replication, consistency, durability etc."

"In order to semi-deterministically ensure acceptable performance for such 
data, Cassandra could support in-memory column families. Such an in-memory 
column family would imply that mlock() be used on sstables for this column 
family. On start-up and on compaction completion, they could be mmap():ed with 
MAP_POPULATE (Linux specific) or else just mmap():ed + mlock():ed in such a way 
as to otherwise guarantee it is in-memory (such as userland traversal of the 
entire file)." 
{quote}
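
For reference, a minimal sketch of the approach quoted above, in Python rather than Cassandra's Java, and Unix only. The function name map_resident is hypothetical. It maps a file with MAP_POPULATE where the platform exposes it, and otherwise falls back to the "userland traversal" by touching one byte per page. It does not call mlock(), so the kernel is still free to evict the pages later; it only illustrates the pre-faulting half of the proposal.

```python
import mmap
import os


def map_resident(path):
    """Map a file read-only and try to fault all of its pages into memory.

    Hypothetical helper for illustration. Raises ValueError on an empty file,
    because an empty file cannot be mapped.
    """
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        # MAP_POPULATE is Linux specific; use 0 where the constant is missing.
        populate = getattr(mmap, "MAP_POPULATE", 0)
        m = mmap.mmap(fd, size, flags=mmap.MAP_SHARED | populate,
                      prot=mmap.PROT_READ)
    finally:
        # The mapping keeps its own duplicate of the descriptor.
        os.close(fd)
    if not populate:
        # Userland traversal: reading one byte per page faults each page in.
        for offset in range(0, size, mmap.PAGESIZE):
            m[offset]
    return m
```

Calling map_resident("/path/to/some-sstable-file") (path is a placeholder) returns the mmap object; actually pinning the pages would additionally need mlock(2) and a sufficient RLIMIT_MEMLOCK.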

I totally understand this perspective of letting Cassandra operate as it 
currently does and simply keeping the data on a RAM disk or locked in memory. 
However, this seems wasteful to me, since it would use more memory than the 
physical data requires.

I wonder whether the AOF (append-only file) from Redis fits our needs well. 
It is durable; see http://redis.io/topics/persistence

{quote}
AOF advantages

    Using AOF Redis is much more durable: you can have different fsync 
policies: no fsync at all, fsync every second, fsync at every query. With the 
default policy of fsync every second write performances are still great (fsync 
is performed using a background thread and the main thread will try hard to 
perform writes when no fsync is in progress.) but you can only lose one second 
worth of writes.
    The AOF log is an append only log, so there are no seeks, nor corruption 
problems if there is a power outage. Even if the log ends with an half-written 
command for some reason (disk full or other reasons) the redis-check-aof tool 
is able to fix it easily.
    Redis is able to automatically rewrite the AOF in background when it gets 
too big. The rewrite is completely safe as while Redis continues appending to 
the old file, a completely new one is produced with the minimal set of 
operations needed to create the current data set, and once this second file is 
ready Redis switches the two and starts appending to the new one.
    AOF contains a log of all the operations one after the other in an easy to 
understand and parse format. You can even easily export an AOF file. For 
instance even if you flushed everything for an error using a FLUSHALL command, 
if no rewrite of the log was performed in the meantime you can still save your 
data set just stopping the server, removing the latest command, and restarting 
Redis again.
{quote}
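
To make the quoted properties concrete, here is a rough sketch of an append-only log with replay and rewrite. It is not Redis's implementation or file format: the JSON-lines encoding and the function names append, replay and rewrite are assumptions made for illustration. It shows two of the points above: a half-written final record is skipped on replay, and a rewrite produces the minimal set of operations for the current data set before swapping files. Unlike Redis, this rewrite does not handle appends that arrive while it is running.

```python
import json
import os


def append(log, op, key, value=None):
    """Append one operation as a JSON line; the newline marks a complete record."""
    log.write(json.dumps([op, key, value]) + "\n")


def replay(path):
    """Rebuild the data set, stopping at a half-written final record."""
    data = {}
    with open(path, "r") as f:
        for line in f:
            if not line.endswith("\n"):
                break  # truncated tail, e.g. a crash in the middle of a write
            op, key, value = json.loads(line)
            if op == "set":
                data[key] = value
            elif op == "del":
                data.pop(key, None)
    return data


def rewrite(path):
    """Write a minimal log for the current data set, then swap it in."""
    data = replay(path)
    tmp = path + ".rewrite"
    with open(tmp, "w") as f:
        for key, value in data.items():
            append(f, "set", key, value)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX file systems
```

After a rewrite, a key that was set many times or later deleted costs at most one record, which is what keeps the log from growing without bound.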

I feel like we may want to expose these settings as parameters of the storage 
engine. If people want to play fast and loose, they can turn off the durability.
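
As a sketch of what such a parameter could look like (not a proposal for the actual Cassandra API), the class below takes an fsync policy whose values mirror Redis's appendfsync setting: always, everysec, no. The class name DurableLog and the syncs counter are hypothetical. One simplification to note: Redis performs the every-second fsync on a background thread, whereas this sketch only checks the clock when a record is appended, so an idle log can stay unsynced for longer than a second.

```python
import os
import time


class DurableLog:
    """Append-only log whose fsync policy is a constructor parameter."""

    def __init__(self, path, fsync="everysec"):
        if fsync not in ("always", "everysec", "no"):
            raise ValueError("unknown fsync policy: %r" % (fsync,))
        self.fsync = fsync
        self.f = open(path, "ab")
        self.last_sync = time.monotonic()
        self.syncs = 0  # counts fsync calls, for illustration only

    def append(self, record):
        """Append one record (bytes) and fsync according to the policy."""
        self.f.write(record + b"\n")
        self.f.flush()  # hand the data to the OS; not yet durable
        now = time.monotonic()
        due = self.fsync == "everysec" and now - self.last_sync >= 1.0
        if self.fsync == "always" or due:
            os.fsync(self.f.fileno())  # force the data to stable storage
            self.last_sync = now
            self.syncs += 1

    def close(self):
        self.f.close()
```

With fsync="no" the operating system decides when data reaches disk, which is the "fast and loose" end of the range; fsync="always" pays one fsync per write.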

> support in-memory column families
> ---------------------------------
>
>                 Key: CASSANDRA-1657
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1657
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Peter Schuller
>            Assignee: Edward Capriolo
>            Priority: Minor
>
> Some workloads are such that you absolutely depend on column families being 
> in-memory for performance, yet you most definitely want all the things that 
> Cassandra offers in terms of replication, consistency, durability etc.
> In order to semi-deterministically ensure acceptable performance for such 
> data, Cassandra could support in-memory column families. Such an in-memory 
> column family would imply that mlock() be used on sstables for this column 
> family. On start-up and on compaction completion, they could be mmap():ed 
> with MAP_POPULATE (Linux specific) or else just mmap():ed + mlock():ed in 
> such a way as to otherwise guarantee it is in-memory (such as userland 
> traversal of the entire file).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
