Dean, what is your row size approximately?

We've been using ii = 512 for a long time because of memory issues, but now - since the bloom filters are kept off-heap and memory is no longer an issue - I've reverted it to 128 to see if that improves anything. It seems it doesn't (except that Munin's netstat plugin reports fewer connection resets, though I'm not 100% sure that's related to the lower ii, as I doubt the disk-scan delay difference with ii = 512 is large enough to time out connections). Still, I'm curious how "far" we are from the point where it will matter, to know whether this might become an issue soon (our rows do grow over time - not very fast, but they grow), so I'm looking for some "reference" / comparison ;-)

Currently, according to cfhistograms, the vast majority (~70%) of our rows are up to 20 KB and the rest are up to 50 KB. I wonder whether row size is really what matters when choosing the ii value.
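For reference, this is roughly how we pull those numbers (the config path and keyspace / column family names below are placeholders - adjust for your setup):

  # current sampling interval
  grep index_interval /etc/cassandra/cassandra.yaml

  # row size / column count percentiles for one column family
  nodetool cfhistograms MyKeyspace MyColumnFamily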

M.


On 20.03.2013 13:54, Hiller, Dean wrote:
Oh, and to give you an idea of the memory savings: we had a node at 10G RAM
usage. We had upped a few nodes to 16G from 8G as we don't have our new
nodes ready yet (we know we should be at 8G, but we would have a dead
cluster if we did that).

On startup, the initial RAM used to be around 6-8G.  Startup with
index_interval=512 resulted in 2.5G-2.8G initial RAM, and I have seen it
grow to 3.3G and back down to 2.8G.  We rolled this out just an hour ago.
Our website response time is the same as before as well.

We rolled this out to only 2 nodes (out of 6) in our cluster so far, to
test it out and let it soak a bit.  We will slowly roll it out to more
nodes, monitoring performance as we go.  Also, since the dynamic snitch
does not work with SimpleSnitch, we know that just one slow node affects
our website (from the personal pain/experience of nodes hitting the RAM
limit and slowing down, which made the website really slow).

Dean

On 3/20/13 6:41 AM, "Andras Szerdahelyi"
<andras.szerdahe...@ignitionone.com> wrote:

2. Upping index_interval from 128 to 512 (this seemed to reduce our memory
usage significantly!!!)


I'd be very careful with that as a one-stop improvement solution, for two
reasons AFAIK:
1) you have to rebuild your sstables ( not an issue if you are evaluating,
doing test writes etc., not so much in production )
2) it can affect reads ( how much index data has to be scanned to serve a
read ), especially if your key/row cache is ineffective
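Roughly, the knob lives in cassandra.yaml and existing sstables have to be
rewritten to pick up the new sampling. A sketch only - keyspace / column
family names are placeholders, and whether upgradesstables or scrub is the
right rebuild step depends on your version:

  # cassandra.yaml - takes effect after a restart
  index_interval: 512

  # then rewrite existing sstables so they use the new sampling
  nodetool upgradesstables MyKeyspace MyColumnFamily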

On 20/03/13 13:34, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

Also, look at the cassandra logs.  I bet you see the typical "...blah blah
is at 0.85, doing memory cleanup..." lines, which are not exactly GC but
Cassandra's own memory management, and of course you have GC on top of
that.
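A quick way to check for those lines (the log path is an assumption, and
the exact wording varies between versions):

  # GC pauses as reported by Cassandra
  grep GCInspector /var/log/cassandra/system.log | tail -20

  # heap-pressure / "memory cleanup" style messages
  grep -i "memory cleanup" /var/log/cassandra/system.log | tail -20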

If you need to get your memory down, there are multiple ways:
1. Switch from size-tiered compaction to leveled compaction (with 1 billion
narrow rows, this helped us quite a bit - see the sketch below this list)
2. Up index_interval from 128 to 512 (this seemed to reduce our memory
usage significantly!!!)
3. Just add more nodes, as moving rows to other servers reduces the memory
from #1 and #2 above since each server then holds fewer rows
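
For #1, a minimal sketch of the switch in cassandra-cli (keyspace / column
family names are placeholders; existing data gets reorganized by compaction
afterwards, so expect some extra I/O for a while):

  use MyKeyspace;
  update column family MyColumnFamily
      with compaction_strategy = 'LeveledCompactionStrategy';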

Later,
Dean

On 3/20/13 6:29 AM, "Andras Szerdahelyi"
<andras.szerdahe...@ignitionone.com> wrote:


I'd say GC. Please fill in form CASS-FREEZE-001 below and get back to us
:-) ( sorry )

How big is your JVM heap? How many CPUs?
Garbage collection taking long? ( look for log lines from GCInspector )
Running out of heap? ( "heap is .. full" log lines )
Any tasks backing up / being dropped? ( nodetool tpstats and ".. dropped
in last .. ms" log lines )
Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily )

How much is "lots of data"? Wide or skinny rows? Mutations/sec?
Which Compaction Strategy are you using? Output of show schema
( cassandra-cli ) for the relevant Keyspace/CF might help as well.

What consistency level are you doing your writes with? I assume ONE or ANY
if you have a single node.
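
For the schema / compaction strategy part, in cassandra-cli (keyspace name
is a placeholder):

  use MyKeyspace;
  show schema;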

What are the values for these settings in cassandra.yaml?

memtable_total_space_in_mb:
memtable_flush_writers:
memtable_flush_queue_size:
compaction_throughput_mb_per_sec:

concurrent_writes:
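
If you're not sure where those are set, something like this pulls them out
of the config (the path is an assumption - it varies by install):

  grep -E 'memtable_total_space_in_mb|memtable_flush_writers|memtable_flush_queue_size|compaction_throughput_mb_per_sec|concurrent_writes' \
      /etc/cassandra/cassandra.yaml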



Which version of Cassandra?



Regards,
Andras

From:  Joel Samuelsson <samuelsson.j...@gmail.com>
Reply-To:  "user@cassandra.apache.org" <user@cassandra.apache.org>
Date:  Wednesday 20 March 2013 13:06
To:  "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject:  Cassandra freezes


Hello,

I've been trying to load-test a one-node Cassandra cluster. When I add
lots of data, the Cassandra node freezes for 4-5 minutes, during which
neither reads nor writes are served.
During this time, Cassandra takes 100% of a single CPU core.
My initial thought was that this was Cassandra flushing memtables to disk;
however, disk I/O is very low during this time.
Any idea what my problem could be?
I'm running in a virtual environment in which I have no control over the
drives, so the commit log and data directory are (probably) on the same
drive.
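
One quick way to see whether that busy core is the JVM collecting garbage
is to watch GC counters while the freeze happens - a sketch, assuming the
Cassandra process can be found by its main class name:

  # find the Cassandra JVM pid
  CASS_PID=$(pgrep -f CassandraDaemon)

  # heap occupancy and GC time, sampled every second; FGC/FGCT climbing
  # during the freeze points at full GC pauses
  jstat -gcutil $CASS_PID 1000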

Best regards,
Joel Samuelsson



