Great explanation and blog post, Akhil.
Sorry for the delayed response (somehow didn't notice the email in my
inbox), but this is what I concluded as well.
In addition to compression, I believe the sstable is serialized as well, and
the combination of both results in a much smaller sstable.
On 2017-05-24 17:42 (-0700), preetika tyagi wrote:
> Hi,
>
> I'm running Cassandra with a very small dataset so that the data can exist
> on memtable only. Below are my configurations:
>
> In jvm.options:
>
> -Xms4G
> -Xmx4G
>
> In cassandra.yaml,
>
>
Kevin, Stefan, thanks for the positive feedback and questions.
Stefan, in the blog post I am writing generally based on Apache Cassandra
defaults. The memtable cleanup threshold is 1/(1 + memtable_flush_writers). By
default, memtable_flush_writers is two. This comes to 33 percent of
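
As a rough illustration of the threshold formula quoted above, here is a minimal sketch. The function names and the 1024 MB memtable space are assumptions chosen for the example, not values taken from Cassandra's code; the formula and the flush-writer count of two come from the message itself.

```python
def cleanup_threshold(flush_writers: int) -> float:
    """Threshold as stated in the thread: 1 / (1 + memtable_flush_writers)."""
    return 1.0 / (1 + flush_writers)


def flush_trigger_mb(memtable_space_mb: float, threshold: float) -> float:
    """Approximate memtable usage (in MB) at which a flush would be triggered."""
    return memtable_space_mb * threshold


# Two flush writers, as described in the thread.
default_threshold = cleanup_threshold(2)

# Hypothetical 1024 MB of memtable space, used only for illustration.
print(f"threshold = {default_threshold:.2f}")
print(f"flush at about {flush_trigger_mb(1024, default_threshold):.0f} MB")
```

With an explicitly configured threshold, the second function can be called with that value instead of the computed default.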
Hello Akhil,
thanks for your great blog post.
One thing I cannot bring together:
In the answer mail you write:
"Note the cleanup threshold is .50 of 1GB and not a combination of heap and
off heap space."
In your blog post you write:
"memtable_cleanup_threshold is the default value i.e. 33 percent
Great post Akhil! Thanks for explaining that.
On Mon, May 29, 2017 at 5:43 PM, Akhil Mehra wrote:
> Hi Preetika,
>
> After thinking about your scenario I believe your small SSTable size might
> be due to data compression. By default, all tables enable SSTable
>
Hi Preetika,
After thinking about your scenario I believe your small SSTable size might
be due to data compression. By default, all tables enable SSTable
compression.
Let's go through your scenario. Let's say you have allocated 4GB to your
Cassandra node. Your *memtable_heap_space_in_mb* and
I agree that for such a small dataset, Cassandra is obviously not needed.
However, this is purely an experimental setup that I'm using to
understand how and exactly when a memtable flush is triggered. As I mentioned
in my post, I read the documentation and tweaked the parameters accordingly
It doesn't have to fit in memory. If your key distribution has strong
temporal locality, then a larger memtable that can coalesce overwrites
greatly reduces the disk I/O load for the memtable flush and subsequent
compactions. Of course, I have no idea if this is what the OP had in mind.
On
This sounds exactly like a previous post that ended when I asked the person
to document the number of nodes, EC2 instance type, and size. I suspected a
single-node system. So the poster reposts? Hmm.
“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds
Sorry for the confusion. That was for the OP. I wrote it quickly right
after waking up.
What I'm asking is why does the OP want to keep his data in the memtable
exclusively? If the goal is to "make reads fast", then just turn on row
caching.
If there's so little data that it fits in memory
Not sure whether you're asking me or the original poster, but the more
times data gets overwritten in a memtable, the less it has to be
compacted later on (and even without overwrites, larger memtables result
in less compaction).
On 05/25/2017 05:59 PM, Jonathan Haddad wrote:
Why do you
Why do you think keeping your data in the memtable is what you need to do?
On Thu, May 25, 2017 at 7:16 AM Avi Kivity wrote:
> Then it doesn't have to (it still may, for other reasons).
>
> On 05/25/2017 05:11 PM, preetika tyagi wrote:
>
> What if the commit log is disabled?
Then it doesn't have to (it still may, for other reasons).
On 05/25/2017 05:11 PM, preetika tyagi wrote:
What if the commit log is disabled?
On May 25, 2017 4:31 AM, "Avi Kivity" wrote:
Cassandra has to flush the memtable occasionally, or
What if the commit log is disabled?
On May 25, 2017 4:31 AM, "Avi Kivity" wrote:
> Cassandra has to flush the memtable occasionally, or the commit log grows
> without bounds.
>
> On 05/25/2017 03:42 AM, preetika tyagi wrote:
>
> Hi,
>
> I'm running Cassandra with a very small
Cassandra has to flush the memtable occasionally, or the commit log
grows without bounds.
On 05/25/2017 03:42 AM, preetika tyagi wrote:
Hi,
I'm running Cassandra with a very small dataset so that the data can
exist on memtable only. Below are my configurations:
In jvm.options:
-Xms4G