Kevin, Stefan, thanks for the positive feedback and questions. Stefan, the blog post is written generally based on Apache Cassandra defaults. The memtable cleanup threshold is 1/(1 + memtable_flush_writers), and memtable_flush_writers defaults to two. This comes to roughly 33 percent of the allocated memtable memory. I have updated the blog post to add this missing detail :)
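[A minimal illustrative sketch of the derivation above, assuming the defaults mentioned; this is not Cassandra source, just the arithmetic with names mirroring the cassandra.yaml settings.]

```java
// Sketch: how the default memtable_cleanup_threshold is derived from
// memtable_flush_writers (assumption: default of two flush writers).
public class DefaultCleanupThreshold {
    public static void main(String[] args) {
        int memtableFlushWriters = 2; // Cassandra default
        double cleanupThreshold = 1.0 / (1 + memtableFlushWriters);
        System.out.printf("memtable_cleanup_threshold defaults to %.2f (~33%% of memtable memory)%n",
                cleanupThreshold);
    }
}
```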
In the email I was trying to address the OP's original question. I mentioned .5 because the OP had set memtable_cleanup_threshold to 0.50, which is 50% of the allocated memtable memory. I was also mentioning that cleanup is triggered when either the on-heap or the off-heap memory reaches the cleanup threshold (a short sketch of this arithmetic appears after the quoted thread below). Please refer to https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/utils/memory/MemtableCleanerThread.java#L46-L49

I hope that helps.

Regards,
Akhil

> On 2/06/2017, at 2:04 AM, Stefan Litsche <stefan.lits...@zalando.de> wrote:
>
> Hello Akhil,
>
> thanks for your great blog post.
> One thing I cannot bring together:
> In the answer mail you write:
> "Note the cleanup threshold is .50 of 1GB and not a combination of heap and off heap space."
> In your blog post you write:
> "memtable_cleanup_threshold is the default value i.e. 33 percent of the total memtable heap and off heap memory."
>
> Could you clarify this?
>
> Thanks
> Stefan
>
>
> 2017-05-30 2:43 GMT+02:00 Akhil Mehra <akhilme...@gmail.com>:
> Hi Preetika,
>
> After thinking about your scenario I believe your small SSTable size might be due to data compression. By default, all tables enable SSTable compression.
>
> Let's go through your scenario. Let's say you have allocated 4GB to your Cassandra node. Your memtable_heap_space_in_mb and memtable_offheap_space_in_mb will roughly come to around 1GB. Since you have set memtable_cleanup_threshold to .50, memtable cleanup will be triggered when the total allocated memtable space exceeds 1/2GB. Note the cleanup threshold is .50 of 1GB and not a combination of heap and off heap space. This memtable allocation size is the total amount available for all tables on your node. This includes all system related keyspaces. The cleanup process will write the largest memtable to disk.
>
> For your case, I am assuming that you are on a single node with only one table with insert activity. I do not think the commit log will trigger a flush in this circumstance, as by default the commit log has 8192 MB of space unless the commit log is placed on a very small disk.
>
> I am assuming your table on disk is smaller than 500MB because of compression. You can disable compression on your table and see if this helps get the desired size.
>
> I have written up a blog post explaining memtable flushing (http://abiasforaction.net/apache-cassandra-memtable-flush/).
>
> Let me know if you have any other questions.
>
> I hope this helps.
>
> Regards,
> Akhil Mehra
>
>
> On Fri, May 26, 2017 at 6:58 AM, preetika tyagi <preetikaty...@gmail.com> wrote:
> I agree that for such a small amount of data, Cassandra is obviously not needed. However, this is purely an experimental setup through which I'm trying to understand how and exactly when a memtable flush is triggered. As I mentioned in my post, I read the documentation and tweaked the parameters accordingly so that I never hit a memtable flush, but it is still happening. As far as the setup is concerned, I'm just using 1 node, running Cassandra with the "cassandra -R" option, and then running some queries to insert some dummy data.
>
> I use the schema from CASSANDRA_HOME/tools/cqlstress-insanity-example.yaml and add "durable_writes=false" in the keyspace_definition.
>
> @Daemeon - The previous post led to this post, but since I was unaware of memtable flushes and assumed a memtable flush wasn't happening, the previous post was about something else (throughput/latency etc.). This post is explicitly about exactly when the memtable is being dumped to disk. I didn't want to confuse two different goals; that's why I posted a new one.
>
> On Thu, May 25, 2017 at 10:38 AM, Avi Kivity <a...@scylladb.com> wrote:
> It doesn't have to fit in memory. If your key distribution has strong temporal locality, then a larger memtable that can coalesce overwrites greatly reduces the disk I/O load for the memtable flush and subsequent compactions. Of course, I have no idea if this is what the OP had in mind.
>
>
> On 05/25/2017 07:14 PM, Jonathan Haddad wrote:
>> Sorry for the confusion. That was for the OP. I wrote it quickly right after waking up.
>>
>> What I'm asking is why does the OP want to keep his data in the memtable exclusively? If the goal is to "make reads fast", then just turn on row caching.
>>
>> If there's so little data that it fits in memory (300MB), and there aren't going to be any writes past the initial small dataset, why use Cassandra? It sounds like the wrong tool for this job. Sounds like something that could easily be stored in S3 and loaded into memory when the app is fired up.
>>
>> On Thu, May 25, 2017 at 8:06 AM Avi Kivity <a...@scylladb.com> wrote:
>> Not sure whether you're asking me or the original poster, but the more times data gets overwritten in a memtable, the less it has to be compacted later on (and even without overwrites, larger memtables result in less compaction).
>>
>> On 05/25/2017 05:59 PM, Jonathan Haddad wrote:
>>> Why do you think keeping your data in the memtable is what you need to do?
>>> On Thu, May 25, 2017 at 7:16 AM Avi Kivity <a...@scylladb.com> wrote:
>>> Then it doesn't have to (it still may, for other reasons).
>>>
>>> On 05/25/2017 05:11 PM, preetika tyagi wrote:
>>>> What if the commit log is disabled?
>>>>
>>>> On May 25, 2017 4:31 AM, "Avi Kivity" <a...@scylladb.com> wrote:
>>>> Cassandra has to flush the memtable occasionally, or the commit log grows without bounds.
>>>>
>>>> On 05/25/2017 03:42 AM, preetika tyagi wrote:
>>>>> Hi,
>>>>>
>>>>> I'm running Cassandra with a very small dataset so that the data can exist in the memtable only. Below are my configurations:
>>>>>
>>>>> In jvm.options:
>>>>>
>>>>> -Xms4G
>>>>> -Xmx4G
>>>>>
>>>>> In cassandra.yaml:
>>>>>
>>>>> memtable_cleanup_threshold: 0.50
>>>>> memtable_allocation_type: heap_buffers
>>>>>
>>>>> As per the documentation in cassandra.yaml, memtable_heap_space_in_mb and memtable_offheap_space_in_mb will be set to 1/4 of the heap size, i.e. 1000MB.
>>>>>
>>>>> According to the documentation here (http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__memtable_cleanup_threshold), the memtable flush will trigger if the total size of the memtable(s) goes beyond (1000+1000)*0.50=1000MB.
>>>>>
>>>>> Now if I perform several write requests which result in almost ~300MB of data, the memtable still gets flushed, since I see SSTables being created on the file system (Data.db etc.), and I don't understand why.
>>>>>
>>>>> Could anyone explain this behavior and point out if I'm missing something here?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Preetika
>
> --
> Stefan Litsche | Mobile: +49 176 12759436 E-Mail: stefan.lits...@zalando.de
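[A minimal illustrative sketch of the flush-trigger arithmetic discussed in the thread above, referenced from the reply at the top. Assumptions: a 4GB heap as in the OP's jvm.options, the documented default of roughly 1/4 of the heap for each memtable pool, and the OP's memtable_cleanup_threshold of 0.50. This is not Cassandra source; cleanup fires when either pool crosses its threshold, not the combined total.]

```java
// Sketch of the OP's scenario: when a flush of the largest memtable would be triggered.
public class FlushTriggerEstimate {
    public static void main(String[] args) {
        int heapMb = 4096;                        // -Xms4G / -Xmx4G
        int memtableHeapSpaceMb = heapMb / 4;     // memtable_heap_space_in_mb default (~1GB)
        int memtableOffheapSpaceMb = heapMb / 4;  // memtable_offheap_space_in_mb default (~1GB)
        double cleanupThreshold = 0.50;           // OP's memtable_cleanup_threshold

        // Each pool is checked against its own threshold.
        double heapTriggerMb = memtableHeapSpaceMb * cleanupThreshold;
        double offheapTriggerMb = memtableOffheapSpaceMb * cleanupThreshold;
        System.out.printf(
                "Flush the largest memtable when on-heap usage exceeds ~%.0f MB or off-heap usage exceeds ~%.0f MB%n",
                heapTriggerMb, offheapTriggerMb);
    }
}
```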