Re: index_interval

2018-02-03 Thread Jonathan Haddad
I would also optimize for your worst case, which is hitting zero caches.
If you're using the default settings when creating a table, you're going to
get compression settings that are terrible for reads.  If you've got memory
to spare, I suggest changing your chunk_length_in_kb to 4 and disabling
readahead on your drives entirely.  I've seen 50-100x improvement in read
latency and throughput just by changing those settings.  I just did a talk
on this topic last week, slides are here:
https://www.slideshare.net/JonHaddad/performance-tuning-86995333
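As a hedged sketch of the two changes Jon describes (keyspace/table and device names below are placeholders, not from the thread):

```shell
# Shrink the compression chunk size from the 64 KB default (table name assumed):
cqlsh -e "ALTER TABLE my_ks.my_table
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};"

# Existing SSTables keep the old chunk size until rewritten:
nodetool upgradesstables -a my_ks my_table

# Disable readahead on the data drive entirely (device is an assumption):
sudo blockdev --setra 0 /dev/sdb
```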

Jon

On Wed, Jul 12, 2017 at 2:03 PM Jeff Jirsa <jji...@apache.org> wrote:

>
>
> On 2017-07-12 12:03 (-0700), Fay Hou [Storage Service] <
> fay...@coupang.com> wrote:
> > First, a big thanks to Jeff, who has spent endless time helping this
> > mailing list. Agreed that we should tune the key cache. In my case, my
> > key cache hit rate is about 20%, mainly because we do random reads.
> > We're just going to leave index_interval as is for now.
> >
>
> That's pretty painful. If you can up that a bit, it'll probably help you
> out. You can adjust the index intervals, too, but I'd significantly
> increase key cache size first if it were my cluster.
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: index_interval

2017-07-12 Thread Jeff Jirsa


On 2017-07-12 12:03 (-0700), Fay Hou [Storage Service] <fay...@coupang.com> 
wrote: 
> First, a big thanks to Jeff, who has spent endless time helping this mailing list.
> Agreed that we should tune the key cache. In my case, my key cache hit rate
> is about 20%, mainly because we do random reads. We're just going to leave
> index_interval as is for now.
> 

That's pretty painful. If you can up that a bit, it'll probably help you out. 
You can adjust the index intervals, too, but I'd significantly increase key 
cache size first if it were my cluster.
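A sketch of that tuning path, with illustrative names and values:

```shell
# Check the current key cache hit rate (reported by nodetool):
nodetool info | grep "Key Cache"

# To give the cache more room, raise key_cache_size_in_mb in cassandra.yaml
# (the default is the smaller of 5% of the heap or 100 MB) and restart, e.g.:
#   key_cache_size_in_mb: 512    # illustrative value, size to your workload
```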





Re: index_interval

2017-07-12 Thread Fay Hou [Storage Service]
First, a big thanks to Jeff, who has spent endless time helping this mailing list.
Agreed that we should tune the key cache. In my case, my key cache hit rate
is about 20%, mainly because we do random reads. We're just going to leave
index_interval as is for now.

On Mon, Jul 10, 2017 at 8:47 PM, Jeff Jirsa <jji...@apache.org> wrote:

>
>
> On 2017-07-10 15:09 (-0700), Fay Hou [Storage Service] <
> fay...@coupang.com> wrote:
> > By default:
> >
> > AND max_index_interval = 2048
> > AND memtable_flush_period_in_ms = 0
> > AND min_index_interval = 128
> >
> > "Cassandra maintains index offsets per partition to speed up the lookup
> > process in the case of key cache misses (see cassandra read path overview
> > <http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/
> dml_about_reads_c.html>).
> > By default it samples a subset of keys, somewhat similar to a skip list.
> > The sampling interval is configurable with min_index_interval and
> > max_index_interval CQL schema attributes (see describe table). For
> > relatively large blobs like HTML pages we seem to get better read
> latencies
> > by lowering the sampling interval from 128 min / 2048 max to 64 min / 512
> > max. For large tables like parsoid HTML with ~500G load per node this
> > change adds a modest ~25mb off-heap memory."
> >
> > I wonder if anyone has experience working with max and min
> > index_interval to increase read speed.
>
> It's usually more efficient to try to tune the key cache, and hope you
> never have to hit the partition index at all. Do you have reason to believe
> you're spending an inordinate amount of IO scanning the partition index? Do
> you know what your key cache hit rate is?
>
>
>
>


Re: index_interval

2017-07-10 Thread Jeff Jirsa


On 2017-07-10 15:09 (-0700), Fay Hou [Storage Service] <fay...@coupang.com> 
wrote: 
> By default:
> 
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> 
> "Cassandra maintains index offsets per partition to speed up the lookup
> process in the case of key cache misses (see cassandra read path overview
> <http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_reads_c.html>).
> By default it samples a subset of keys, somewhat similar to a skip list.
> The sampling interval is configurable with min_index_interval and
> max_index_interval CQL schema attributes (see describe table). For
> relatively large blobs like HTML pages we seem to get better read latencies
> by lowering the sampling interval from 128 min / 2048 max to 64 min / 512
> max. For large tables like parsoid HTML with ~500G load per node this
> change adds a modest ~25mb off-heap memory."
> 
> I wonder if anyone has experience working with max and min index_interval
> to increase read speed.

It's usually more efficient to try to tune the key cache, and hope you never 
have to hit the partition index at all. Do you have reason to believe you're 
spending an inordinate amount of IO scanning the partition index? Do you know 
what your key cache hit rate is? 
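The hit rate Jeff asks about is simply cache hits divided by total requests, the same ratio `nodetool info` reports for the key cache. A trivial illustration with made-up numbers matching the ~20% figure mentioned in this thread:

```python
# Illustrative numbers only; real values come from nodetool info / JMX.
hits, requests = 1_000_000, 5_000_000

hit_rate = hits / requests  # fraction of reads served from the key cache
print(hit_rate)  # -> 0.2, i.e. the ~20% hit rate discussed above
```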





index_interval

2017-07-10 Thread Fay Hou [Storage Service]
By default:

AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128

"Cassandra maintains index offsets per partition to speed up the lookup
process in the case of key cache misses (see cassandra read path overview
<http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_reads_c.html>).
By default it samples a subset of keys, somewhat similar to a skip list.
The sampling interval is configurable with min_index_interval and
max_index_interval CQL schema attributes (see describe table). For
relatively large blobs like HTML pages we seem to get better read latencies
by lowering the sampling interval from 128 min / 2048 max to 64 min / 512
max. For large tables like parsoid HTML with ~500G load per node this
change adds a modest ~25mb off-heap memory."

I wonder if anyone has experience working with max and min index_interval
to increase read speed.
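The change described in the quoted post maps to a CQL schema change along these lines (keyspace/table name is a placeholder):

```shell
# Mirrors the 64 min / 512 max values from the quoted post:
cqlsh -e "ALTER TABLE my_ks.my_table
  WITH min_index_interval = 64 AND max_index_interval = 512;"
```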

Thanks,
Fay


Re: index_interval

2013-06-17 Thread Robert Coli
On Mon, May 13, 2013 at 9:19 PM, Bryan Talbot btal...@aeriagames.com wrote:
 Can the index sample storage be treated more like key cache or row cache
 where the total space used can be limited to something less than all
 available system ram, and space is recycled using an LRU (or configurable)
 algorithm?

Treating it with LRU doesn't seem to make much sense, but there are
seemingly trivial ways to prune an Index Sample [1], like
delete-every-other-key.

A brief conversation with driftx suggests a lack of enthusiasm about the
potential win from actively pruning the Index Sample,
especially given the relative size of bloom filters compared to the
Index Sample.

However if you are interested in this as a potential improvement, feel
free to file a JIRA! :D

=Rob

[1] New terminology Partition Summary per jbellis keynote @ summit2013
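Rob's delete-every-other-key idea is easy to picture: dropping alternate entries from an in-memory sample doubles its effective interval. A toy sketch (nothing to do with Cassandra's actual implementation; keys and positions are made up):

```python
# Toy partition summary sampled at interval 4 over keys key000..key028.
sample = [("key%03d" % i, i * 100) for i in range(0, 32, 4)]

# Delete every other key: the pruned sample behaves like interval 8.
pruned = sample[::2]

print(len(sample), len(pruned))  # -> 8 4
```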


Re: index_interval

2013-06-17 Thread Thomas Bernhardt





 From: Robert Coli rc...@eventbrite.com
To: user@cassandra.apache.org 
Sent: Monday, June 17, 2013 3:28 PM
Subject: Re: index_interval
 


Re: index_interval

2013-05-13 Thread Bryan Talbot
So will cassandra provide a way to limit its off-heap usage to avoid
unexpected OOM kills?  I'd much rather have performance degrade when the
index samples no longer all fit in memory than have the process killed with
no way to stabilize it without adding hardware or removing data.

-Bryan


On Fri, May 10, 2013 at 7:44 PM, Edward Capriolo edlinuxg...@gmail.comwrote:

 If you use up your off-heap memory, Linux has an OOM killer that will kill a
 random task.


 On Fri, May 10, 2013 at 11:34 AM, Bryan Talbot btal...@aeriagames.comwrote:

 If off-heap memory (for index samples, bloom filters, row caches, key
 caches, etc) is exhausted, will cassandra experience a memory allocation
 error and quit?  If so, are there plans to make the off-heap usage more
 dynamic to allow less used pages to be replaced with hot data and the
 paged-out / cold data read back in again on demand?





Re: index_interval

2013-05-10 Thread Bryan Talbot
If off-heap memory (for index samples, bloom filters, row caches, key
caches, etc) is exhausted, will cassandra experience a memory allocation
error and quit?  If so, are there plans to make the off-heap usage more
dynamic to allow less used pages to be replaced with hot data and the
paged-out / cold data read back in again on demand?

-Bryan



On Wed, May 8, 2013 at 4:24 PM, Jonathan Ellis jbel...@gmail.com wrote:

 index_interval won't be going away, but you won't need to change it as
 often in 2.0: https://issues.apache.org/jira/browse/CASSANDRA-5521

 On Mon, May 6, 2013 at 12:27 PM, Hiller, Dean dean.hil...@nrel.gov
 wrote:
  I heard a rumor that index_interval is going away?  What is the
 replacement for this?  (we have been having to play with this setting a lot
 lately; too big and it gets slow, yet too small and cassandra uses way too
 much RAM…we are still trying to find the right balance with this setting).
 
  Thanks,
  Dean



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder, http://www.datastax.com
 @spyced



Re: index_interval

2013-05-10 Thread Edward Capriolo
If you use up your off-heap memory, Linux has an OOM killer that will kill a
random task.


On Fri, May 10, 2013 at 11:34 AM, Bryan Talbot btal...@aeriagames.comwrote:

 If off-heap memory (for index samples, bloom filters, row caches, key
 caches, etc) is exhausted, will cassandra experience a memory allocation
 error and quit?  If so, are there plans to make the off-heap usage more
 dynamic to allow less used pages to be replaced with hot data and the
 paged-out / cold data read back in again on demand?

 -Bryan



 On Wed, May 8, 2013 at 4:24 PM, Jonathan Ellis jbel...@gmail.com wrote:

 index_interval won't be going away, but you won't need to change it as
 often in 2.0: https://issues.apache.org/jira/browse/CASSANDRA-5521

 On Mon, May 6, 2013 at 12:27 PM, Hiller, Dean dean.hil...@nrel.gov
 wrote:
  I heard a rumor that index_interval is going away?  What is the
 replacement for this?  (we have been having to play with this setting a lot
 lately; too big and it gets slow, yet too small and cassandra uses way too
 much RAM…we are still trying to find the right balance with this setting).
 
  Thanks,
  Dean



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder, http://www.datastax.com
 @spyced





Re: index_interval

2013-05-08 Thread Jonathan Ellis
index_interval won't be going away, but you won't need to change it as
often in 2.0: https://issues.apache.org/jira/browse/CASSANDRA-5521

On Mon, May 6, 2013 at 12:27 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
 I heard a rumor that index_interval is going away?  What is the replacement 
 for this?  (we have been having to play with this setting a lot lately; too 
 big and it gets slow, yet too small and cassandra uses way too much RAM…we are 
 still trying to find the right balance with this setting).

 Thanks,
 Dean



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: index_interval

2013-05-06 Thread aaron morton
This is the closest I can find in Jira 
https://issues.apache.org/jira/browse/CASSANDRA-4478

It's a pretty handy tool to have in your tool kit, especially when you start to 
have over 1 billion rows per node.

A
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 5:27 AM, Hiller, Dean dean.hil...@nrel.gov wrote:

 I heard a rumor that index_interval is going away?  What is the replacement 
 for this?  (we have been having to play with this setting a lot lately; too 
 big and it gets slow, yet too small and cassandra uses way too much RAM…we are 
 still trying to find the right balance with this setting).
 
 Thanks,
 Dean



Re: index_interval file size is the same after modifying 128 to 512?

2013-03-26 Thread Michal Michalski
Dean, as I can see you are satisfied with the result of increasing ii 
from 128 to 512 - didn't you observe any drawbacks from this change? I 
remember you mentioned no change in Read Latency and a significant drop 
in heap size, but did you check any other metrics?


I did the opposite (512 -> 128; before, we had problems with heap 
size, and now we can revert it, so I'm checking whether it makes sense) and I 
see almost no difference in Read Latency either, but I can see that the 
number of dropped READ messages has decreased significantly (it's 1 or 
even 2 orders of magnitude lower for the nodes where I set ii = 128 compared 
to the nodes with ii = 512; the exact value is about 0.005/sec 
compared to about 0.01 - 0.2 for the other nodes) and I have far fewer 
connection resets reported by netstat's Munin plugin. In other words, as 
I understand it, there are far fewer timeouts, which should improve 
overall C* performance, even if I can't see it in the read latency graph for 
CFs (unluckily I don't have a graph for StorageProxy latencies to easily 
check it).


To be sure about the reason for these differences and their effect on C* 
performance, I'm looking for references from other people's 
experience / observations :-)


M.

On 22.03.2013 17:17, Hiller, Dean wrote:

I was just curious.  Our RAM usage has significantly reduced but the *Index.db files 
are the same size as before.

Any ideas why this would be the case?

Basically, why is our disk usage not reduced since RAM is way lower?  We have been 
running strong with index_interval = 512 for the past 2-3 days and RAM has never 
looked better.  We were pushing 10G before; now we start at 2G, slowly increasing 
to 8G before gc compacts the long-lived stuff, which goes back down to 2G 
again…..very pleased with LCS in our system!

Thanks,
Dean





Re: index_interval file size is the same after modifying 128 to 512?

2013-03-26 Thread Hiller, Dean
We only look at our program's response time at a high level and have a
scatter plot.  The scatter plot shows no real differences, so even though
what you say may be true, our end users are not seeing any difference.  I
have not checked any further because the high-level use cases look
great.  

Dean

On 3/26/13 2:35 AM, Michal Michalski mich...@opera.com wrote:

Dean, as I can see you are satisfied with the result of increasing ii
from 128 to 512 - didn't you observe any drawbacks from this change? I
remember you mentioned no change in Read Latency and a significant drop
in heap size, but did you check any other metrics?

I did the opposite (512 -> 128; before, we had problems with heap
size, and now we can revert it, so I'm checking whether it makes sense) and I
see almost no difference in Read Latency either, but I can see that the
number of dropped READ messages has decreased significantly (it's 1 or
even 2 orders of magnitude lower for the nodes where I set ii = 128 compared
to the nodes with ii = 512; the exact value is about 0.005/sec
compared to about 0.01 - 0.2 for the other nodes) and I have far fewer
connection resets reported by netstat's Munin plugin. In other words, as
I understand it, there are far fewer timeouts, which should improve
overall C* performance, even if I can't see it in the read latency graph for
CFs (unluckily I don't have a graph for StorageProxy latencies to easily
check it).

To be sure about the reason for these differences and their effect on C*
performance, I'm looking for references from other people's
experience / observations :-)

M.

On 22.03.2013 17:17, Hiller, Dean wrote:
 I was just curious.  Our RAM usage has significantly reduced but the
*Index.db files are the same size as before.

 Any ideas why this would be the case?

 Basically, why is our disk usage not reduced since RAM is way lower?  We
have been running strong with index_interval = 512 for the past 2-3 days and
RAM has never looked better.  We were pushing 10G before; now we start at 2G,
slowly increasing to 8G before gc compacts the long-lived stuff, which
goes back down to 2G again…very pleased with LCS in our system!

 Thanks,
 Dean





index_interval file size is the same after modifying 128 to 512?

2013-03-22 Thread Hiller, Dean
I was just curious.  Our RAM usage has significantly reduced but the *Index.db files 
are the same size as before.

Any ideas why this would be the case?

Basically, why is our disk usage not reduced since RAM is way lower?  We have been 
running strong with index_interval = 512 for the past 2-3 days and RAM has never 
looked better.  We were pushing 10G before; now we start at 2G, slowly increasing 
to 8G before gc compacts the long-lived stuff, which goes back down to 2G 
again…..very pleased with LCS in our system!

Thanks,
Dean


Re: index_interval file size is the same after modifying 128 to 512?

2013-03-22 Thread Yuki Morishita
The Index.db file always contains *all* positions of the keys in the data file.
index_interval is the rate at which key positions from the index file are stored 
in memory, so that C* can begin scanning the index file from the closest 
preceding position.


On Friday, March 22, 2013 at 11:17 AM, Hiller, Dean wrote:

 I was just curious. Our RAM usage has significantly reduced but the *Index.db files 
 are the same size as before.
  
 Any ideas why this would be the case?
  
 Basically, why is our disk usage not reduced since RAM is way lower? We have been 
 running strong with index_interval = 512 for the past 2-3 days and RAM has never 
 looked better. We were pushing 10G before; now we start at 2G, slowly increasing 
 to 8G before gc compacts the long-lived stuff, which goes back down 
 to 2G again…..very pleased with LCS in our system!
  
 Thanks,
 Dean
  
  




Re: index_interval memory savings in our case (if you are curious)… (and performance result)...

2013-03-21 Thread Andras Szerdahelyi
Wow. So LCS with a bloom filter fp chance of 0.1 and an index sampling rate
of 512 on a column family of 1.7 billion rows per node yields a 100% result
on first sstable reads? That sounds amazing. And I assume this is
cfhistograms output from a node that has been on 512 for a while? ( I
still think it's unlikely 1.2x re-samples sstables on startup -- I'm on
1.1x though ) For LCS, same fp chance and sampling rate, with 300-500mil
rows per node ( 300-400GB ) on 1.1x my sstable reads for a single read got
pretty much out of control.

On 20/03/13 14:35, Hiller, Dean dean.hil...@nrel.gov wrote:

I am using LCS, so the bloom filter fp default for 1.2.2 is 0.1, and my
bloom filter size is 1.27G RAM (nodetool cfstats); 1.7 billion rows each
node.

My cfstats for this CF is attached (since cut and paste screwed up the
formatting).  During testing in QA, we were not sure if the index_interval
change was working, so we dug into the code to find out; it basically seems
to convert immediately on startup, though it doesn't log anything except at
debug level, which we don't have on.

Dean



On 3/20/13 6:58 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:

I am curious, thanks. ( I am in the same situation, big nodes choking
under 300-400G data load, 500mil keys )

What does your cfhistograms Keyspace CF output look like? How many
sstable reads?
What is your bloom filter fp chance?

Regards,
Andras

On 20/03/13 13:54, Hiller, Dean dean.hil...@nrel.gov wrote:

Oh, and to give you an idea of memory savings, we had a node at 10G RAM
usage...we had upped a few nodes to 16G from 8G as we don't have our new
nodes ready yet(we know we should be at 8G but we would have a dead
cluster if we did that).

On startup, the initial RAM is around 6-8G.  Startup with
index_interval=512 resulted in a 2.5G-2.8G initial RAM and I have seen
it
grow to 3.3G and back down to 2.8G.  We just rolled this out an hour
ago.
Our website response time is the same as before as well.

We rolled to only 2 nodes(out of 6) in our cluster so far to test it out
and let it soak a bit.  We will slowly roll to more nodes monitoring the
performance as we go.  Also, since dynamic snitch is not working with
SimpleSnitch, we know that just one slow node affects our website(from
personal pain/experience of nodes hitting RAM limit and slowing down
causing website to get real slow).

Dean

On 3/20/13 6:41 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:

2. Upping index_interval from 128 to 512 (this seemed to reduce our
memory
usage significantly!!!)


I'd be very careful with that as a one-stop improvement solution for
two
reasons AFAIK
1) you have to rebuild sstables ( not an issue if you are evaluating,
doing
test writes.. Etc, not so much in production )
2) it can affect reads ( number of sstable reads to serve a read )
especially if your key/row cache is ineffective

On 20/03/13 13:34, Hiller, Dean dean.hil...@nrel.gov wrote:

Also, look at the cassandra logs.  I bet you see the typical…blah blah is
at 0.85, doing memory cleanup which is not exactly GC but cassandra memory
management…..and of course, you have GC on top of that.

If you need to get your memory down, there are multiple ways
1. Switching size tiered compaction to leveled compaction(with 1
billion
narrow rows, this helped us quite a bit)
2. Upping index_interval from 128 to 512 (this seemed to reduce our
memory
usage significantly!!!)
3. Just add more nodes as moving the rows to other servers reduces
memory
from #1 and #2 above since the server would have less rows
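Option 1 in the list above corresponds, in CQL syntax, to a schema change like the following (keyspace/table name is a placeholder; switching strategies triggers recompaction of existing data):

```shell
cqlsh -e "ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy'};"
```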

Later,
Dean

On 3/20/13 6:29 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:


I'd say GC. Please fill in form CASS-FREEZE-001 below and get back to
us
:-) ( sorry )

How big is your JVM heap ? How many CPUs ?
Garbage collection taking long ? ( look for log lines from
GCInspector)
Running out of heap ? ( heap is .. full log lines )
Any tasks backing up / being dropped ? ( nodetool tpstats and ..
dropped
in last .. ms log lines )
Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily
)

How much is lots of data? Wide or skinny rows? Mutations/sec ?
Which Compaction Strategy are you using? Output of show schema (
cassandra-cli ) for the relevant Keyspace/CF might help as well

What consistency are you doing your writes with ? I assume ONE or ANY
if
you have a single node.

What are the values for these settings in cassandra.yaml

memtable_total_space_in_mb:
memtable_flush_writers:
memtable_flush_queue_size:
compaction_throughput_mb_per_sec:

concurrent_writes:



Which version of Cassandra?



Regards,
Andras

From:  Joel Samuelsson samuelsson.j...@gmail.com
Reply-To:  user@cassandra.apache.org user@cassandra.apache.org
Date:  Wednesday 20 March 2013 13:06
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Cassandra freezes


Hello,

I've been trying to load test a one node cassandra cluster. When I
add
lots of data, the Cassandra node freezes for 4-5 minutes during which
neither reads nor writes are served.
Re: index_interval memory savings in our case (if you are curious)… (and performance result)...

2013-03-21 Thread Michal Michalski

Dean, what is your row size approximately?

We've been using ii = 512 for a long time because of memory issues, but 
now - as the bloom filter is kept off-heap and memory is not an issue 
anymore - I've reverted it to 128 to see if this improves anything. It 
seems it doesn't (except that I have fewer connection resets reported by 
Munin's netstat plugin, but I'm not 100% sure it's related to the lower 
ii, as I don't really believe the disk scan delay difference with ii = 
512 could be so huge as to time out connections), but I'm just curious how 
far we are from the point where it will matter, to know if this might 
be an issue soon (our rows are growing over time - not very fast, but they 
do), so I'm looking for some reference / comparison ;-)


Currently, according to cfhistograms, the vast majority (~70%) of our rows' 
sizes are up to 20KB and the rest up to 50KB. I wonder if it's the size 
that really matters in terms of the ii value.


M.


On 20.03.2013 13:54, Hiller, Dean wrote:

Oh, and to give you an idea of memory savings, we had a node at 10G RAM
usage...we had upped a few nodes to 16G from 8G as we don't have our new
nodes ready yet(we know we should be at 8G but we would have a dead
cluster if we did that).

On startup, the initial RAM is around 6-8G.  Startup with
index_interval=512 resulted in a 2.5G-2.8G initial RAM and I have seen it
grow to 3.3G and back down to 2.8G.  We just rolled this out an hour ago.
Our website response time is the same as before as well.

We rolled to only 2 nodes(out of 6) in our cluster so far to test it out
and let it soak a bit.  We will slowly roll to more nodes monitoring the
performance as we go.  Also, since dynamic snitch is not working with
SimpleSnitch, we know that just one slow node affects our website(from
personal pain/experience of nodes hitting RAM limit and slowing down
causing website to get real slow).

Dean

On 3/20/13 6:41 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:


2. Upping index_interval from 128 to 512 (this seemed to reduce our memory
usage significantly!!!)


I'd be very careful with that as a one-stop improvement solution for two
reasons AFAIK
1) you have to rebuild sstables ( not an issue if you are evaluating, doing
test writes.. Etc, not so much in production )
2) it can affect reads ( number of sstable reads to serve a read )
especially if your key/row cache is ineffective

On 20/03/13 13:34, Hiller, Dean dean.hil...@nrel.gov wrote:


Also, look at the cassandra logs.  I bet you see the typical…blah blah is
at 0.85, doing memory cleanup which is not exactly GC but cassandra memory
management…..and of course, you have GC on top of that.

If you need to get your memory down, there are multiple ways
1. Switching size tiered compaction to leveled compaction(with 1 billion
narrow rows, this helped us quite a bit)
2. Upping index_interval from 128 to 512 (this seemed to reduce our
memory
usage significantly!!!)
3. Just add more nodes as moving the rows to other servers reduces memory

from #1 and #2 above since the server would have less rows


Later,
Dean

On 3/20/13 6:29 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:



I'd say GC. Please fill in form CASS-FREEZE-001 below and get back to us
:-) ( sorry )

How big is your JVM heap ? How many CPUs ?
Garbage collection taking long ? ( look for log lines from GCInspector)
Running out of heap ? ( heap is .. full log lines )
Any tasks backing up / being dropped ? ( nodetool tpstats and ..
dropped
in last .. ms log lines )
Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily )

How much is lots of data? Wide or skinny rows? Mutations/sec ?
Which Compaction Strategy are you using? Output of show schema (
cassandra-cli ) for the relevant Keyspace/CF might help as well

What consistency are you doing your writes with ? I assume ONE or ANY if
you have a single node.

What are the values for these settings in cassandra.yaml

memtable_total_space_in_mb:
memtable_flush_writers:
memtable_flush_queue_size:
compaction_throughput_mb_per_sec:

concurrent_writes:



Which version of Cassandra?



Regards,
Andras

From:  Joel Samuelsson samuelsson.j...@gmail.com
Reply-To:  user@cassandra.apache.org user@cassandra.apache.org
Date:  Wednesday 20 March 2013 13:06
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Cassandra freezes


Hello,

I've been trying to load test a one node cassandra cluster. When I add
lots of data, the Cassandra node freezes for 4-5 minutes during which
neither reads nor writes are served.
During this time, Cassandra takes 100% of a single CPU core.
My initial thought was that this was Cassandra flushing memtables to the
disk, however, the disk i/o is very low during this time.
Any idea what my problem could be?
I'm running in a virtual environment in which I have no control of
drives.
So commit log and data directory is (probably) on the same drive.

Best regards,
Joel Samuelsson









Re: index_interval memory savings in our case (if you are curious)… (and performance result)...

2013-03-21 Thread Michal Michalski
Argh, now I think that row size has nothing to do with the ii-based 
index size/efficiency (I was thinking about the need to read 
index_interval / 2 entries on average from the index file before finding the 
proper one, but that should have nothing to do with row size) - forget 
the question; I need to get a second coffee ;-)


M.

On 21.03.2013 09:29, Michal Michalski wrote:

Dean, what is your row size approximately?

We've been using ii = 512 for a long time because of memory issues, but
now - as the bloom filter is kept off-heap and memory is not an issue
anymore - I've reverted it to 128 to see if this improves anything. It
seems it doesn't (except that I have fewer connection resets reported by
Munin's netstat plugin, but I'm not 100% sure it's related to the lower
ii, as I don't really believe the disk scan delay difference with ii =
512 could be so huge as to time out connections), but I'm just curious how
far we are from the point where it will matter, to know if this might
be an issue soon (our rows are growing over time - not very fast, but they
do), so I'm looking for some reference / comparison ;-)

Currently, according to cfhistograms, the vast majority (~70%) of our rows'
sizes are up to 20KB and the rest up to 50KB. I wonder if it's the size
that really matters in terms of the ii value.

M.


On 20.03.2013 13:54, Hiller, Dean wrote:

Oh, and to give you an idea of memory savings, we had a node at 10G RAM
usage...we had upped a few nodes to 16G from 8G as we don't have our new
nodes ready yet(we know we should be at 8G but we would have a dead
cluster if we did that).

On startup, the initial RAM is around 6-8G.  Startup with
index_interval=512 resulted in a 2.5G-2.8G initial RAM and I have seen it
grow to 3.3G and back down to 2.8G.  We just rolled this out an hour ago.
Our website response time is the same as before as well.

We rolled to only 2 nodes(out of 6) in our cluster so far to test it out
and let it soak a bit.  We will slowly roll to more nodes monitoring the
performance as we go.  Also, since dynamic snitch is not working with
SimpleSnitch, we know that just one slow node affects our website(from
personal pain/experience of nodes hitting RAM limit and slowing down
causing website to get real slow).

Dean

On 3/20/13 6:41 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:


2. Upping index_interval from 128 to 512 (this seemed to reduce our
memory
usage significantly!!!)


I'd be very careful with that as a one-stop improvement solution for two
reasons AFAIK
1) you have to rebuild sstables ( not an issue if you are evaluating,
doing
test writes.. Etc, not so much in production )
2) it can affect reads ( number of sstable reads to serve a read )
especially if your key/row cache is ineffective

On 20/03/13 13:34, Hiller, Dean dean.hil...@nrel.gov wrote:


Also, look at the cassandra logs.  I bet you see the typical …blah blah is
at 0.85, doing memory cleanup… which is not exactly GC but cassandra memory
management. And of course, you have GC on top of that.

If you need to get your memory down, there are multiple ways:
1. Switching size-tiered compaction to leveled compaction (with 1 billion
narrow rows, this helped us quite a bit)
2. Upping index_interval from 128 to 512 (this seemed to reduce our memory
usage significantly!!!)
3. Just adding more nodes, as moving rows to other servers reduces the memory
from #1 and #2 above since each server would have fewer rows


Later,
Dean

On 3/20/13 6:29 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:



I'd say GC. Please fill in form CASS-FREEZE-001 below and get back
to us
:-) ( sorry )

How big is your JVM heap ? How many CPUs ?
Garbage collection taking long ? ( look for log lines from
GCInspector)
Running out of heap ? ( heap is .. full log lines )
Any tasks backing up / being dropped ? ( nodetool tpstats and ..
dropped
in last .. ms log lines )
Are writes really slow? ( nodetool cfhistograms Keyspace
ColumnFamily )

How much is lots of data? Wide or skinny rows? Mutations/sec ?
Which Compaction Strategy are you using? Output of show schema (
cassandra-cli ) for the relevant Keyspace/CF might help as well

What consistency are you doing your writes with ? I assume ONE or
ANY if
you have a single node.

What are the values for these settings in cassandra.yaml

memtable_total_space_in_mb:
memtable_flush_writers:
memtable_flush_queue_size:
compaction_throughput_mb_per_sec:

concurrent_writes:



Which version of Cassandra?



Regards,
Andras

From:  Joel Samuelsson samuelsson.j...@gmail.com
Reply-To:  user@cassandra.apache.org user@cassandra.apache.org
Date:  Wednesday 20 March 2013 13:06
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Cassandra freezes


Hello,

I've been trying to load test a one node cassandra cluster. When I add
lots of data, the Cassandra node freezes for 4-5 minutes during which
neither reads nor writes are served.
During this time, Cassandra takes 100% of a single CPU core.
My initial

Re: index_interval memory savings in our case(if you are curious)… (and performance result)...

2013-03-21 Thread Hiller, Dean
It had only been running for 2 hours back then, but it has been a full 24
hours now and our read ping program is still showing the same read times
pretty consistently.

Dean

On 3/21/13 1:51 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:

Wow. So LCS with a bloom filter fp chance of 0.1 and an index sampling rate
of 512, on a column family of 1.7 billion rows per node, yields 100% of
reads served from the first sstable? That sounds amazing. And I assume this is
cfhistograms output from a node that has been on 512 for a while? ( I
still think it's unlikely 1.2x re-samples sstables on startup -- I'm on
1.1x though ) For LCS, same fp chance and sampling rate, with 300-500mil
rows per node ( 300-400GB ) on 1.1x my sstable reads for a single read got
pretty much out of control.

On 20/03/13 14:35, Hiller, Dean dean.hil...@nrel.gov wrote:

I am using LCS, so the bloom filter fp default for 1.2.2 is 0.1, which puts
my bloom filter size at 1.27G RAM (per nodetool cfstats) for 1.7 billion
rows on each node.
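As a sanity check on that 1.27G figure: the textbook Bloom-filter sizing formula, bits per key = -ln(p) / (ln 2)^2, gives about 4.8 bits per key at p = 0.1, so 1.7 billion keys need on the order of 1 GB — the same ballpark cfstats reports. A small illustrative sketch (this is the theoretical optimum, not Cassandra's exact filter implementation):

```python
import math

def bloom_filter_bytes(num_keys, fp_chance):
    """Theoretical minimum Bloom filter size for a target false-positive rate."""
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return num_keys * bits_per_key / 8

gb = bloom_filter_bytes(1_700_000_000, 0.1) / 1e9
print(f"~{gb:.2f} GB for 1.7B keys at fp_chance=0.1")  # roughly 1 GB
```

Since -ln(0.01) is exactly twice -ln(0.1), an fp chance of 0.01 (the size-tiered default of that era) would roughly double the filter, which is why the LCS default of 0.1 is such a visible heap win at this row count.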

My cfstats for this CF is attached (since cut and paste screwed up the
formatting).  During testing in QA, we were not sure if the index_interval
change was working, so we dug into the code to find out; it basically seems
to convert immediately on startup, though it doesn't log anything except at
a debug level, which we don't have on.

Dean



On 3/20/13 6:58 AM, Andras Szerdahelyi
andras.szerdahe...@ignitionone.com wrote:

I am curious, thanks. ( I am in the same situation, big nodes choking
under 300-400G data load, 500mil keys )

How does your cfhistograms Keyspace CF output look like? How many
sstable reads ?
What is your bloom filter fp chance ?

Regards,
Andras


index_interval memory savings in our case(if you are curious)… (and performance result)...

2013-03-20 Thread Hiller, Dean
Oh, and to give you an idea of memory savings, we had a node at 10G RAM
usage... we had upped a few nodes to 16G from 8G as we don't have our new
nodes ready yet (we know we should be at 8G, but we would have a dead
cluster if we did that).

On startup, the initial RAM is around 6-8G.  Startup with
index_interval=512 resulted in a 2.5G-2.8G initial RAM and I have seen it
grow to 3.3G and back down to 2.8G.  We just rolled this out an hour ago.
Our website response time is the same as before as well.

We rolled to only 2 nodes (out of 6) in our cluster so far, to test it out
and let it soak a bit.  We will slowly roll to more nodes, monitoring the
performance as we go.  Also, since dynamic snitch is not working with
SimpleSnitch, we know that just one slow node affects our website (from
personal pain/experience of nodes hitting RAM limits, slowing down, and
causing the website to get really slow).

Dean


From:  Joel Samuelsson samuelsson.j...@gmail.com
Reply-To:  user@cassandra.apache.org user@cassandra.apache.org
Date:  Wednesday 20 March 2013 13:06
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Cassandra freezes


Hello,

I've been trying to load test a one node cassandra cluster. When I add
lots of data, the Cassandra node freezes for 4-5 minutes during which
neither reads nor writes are served.
During this time, Cassandra takes 100% of a single CPU core.
My initial thought was that this was Cassandra flushing memtables to the
disk, however, the disk i/o is very low during this time.
Any idea what my problem could be?
I'm running in a virtual environment in which I have no control of
drives.
So commit log and data directory is (probably) on the same drive.

Best regards,
Joel Samuelsson






Re: index_interval memory savings in our case(if you are curious)… (and performance result)...

2013-03-20 Thread Hiller, Dean
I am using LCS, so the bloom filter fp default for 1.2.2 is 0.1, which puts
my bloom filter size at 1.27G RAM (per nodetool cfstats) for 1.7 billion
rows on each node.

My cfstats for this CF is attached (since cut and paste screwed up the
formatting).  During testing in QA, we were not sure if the index_interval
change was working, so we dug into the code to find out; it basically seems
to convert immediately on startup, though it doesn't log anything except at
a debug level, which we don't have on.

Dean




[Attachment residue - truncated cfhistograms output. Columns: Offset, SSTables, Write Latency, Read Latency, Row Size, Column Count; only the first data row survived: 1, 15137, 0, 0]

configurable index_interval per keyspace

2011-11-10 Thread Radim Kolar
It would be good to have index_interval configurable per keyspace,
preferably in cassandra.yaml, because I use it to tune nodes that are
running out of memory, without noticeably affecting performance.
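For context on why this was a yaml-level ask: in the 1.x line the knob exists only globally. The fragment below shows the setting with its default of that era (note, as an aside, that later releases — 2.1+ — replaced it with per-table min_index_interval / max_index_interval CQL options, which is roughly what is being requested here):

```yaml
# cassandra.yaml (Cassandra 1.x) -- global setting, applies to all keyspaces.
# Larger values sample fewer keys into the heap-resident index, saving memory
# at the cost of scanning more index entries per read.
index_interval: 128
```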