Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread horschi
Hi Samuel, thanks a lot for the jira link. Another reason to upgrade to 2.1 :-) regards, Christian On Thu, Sep 10, 2015 at 1:28 PM, Samuel CARRIERE wrote: > Hi Christian, > The problem you mention (violation of consistency) is a true one. If I have > understood

Re: High CPU usage on some of nodes

2015-09-10 Thread Robert Wille
It sounds like it's probably GC. Grep for GC in system.log to verify. If it is GC, there are a myriad of issues that could cause it, but at least you’ve narrowed it down. On Sep 9, 2015, at 11:05 PM, Roman Tkachenko wrote: > Hey guys, > > We've been having issues in the

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread Samuel CARRIERE
Hi Christian, The problem you mention (violation of consistency) is a true one. If I have understood correctly, it is resolved in Cassandra 2.1 (see CASSANDRA-2434). Regards, Samuel horschi wrote on 10/09/2015 12:41:41: > From: horschi > To:

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread horschi
Hi Rob, regarding 1-3: Thank you for the step-by-step explanation :-) My mistake was to use join_ring=false during the initial start already. It now works for me as it's supposed to. Nevertheless it does not do what I want, as it does not take writes during the time of repair/rebuild: Running an 8

Re: High CPU usage on some of nodes

2015-09-10 Thread Samuel CARRIERE
Hi Roman, If it affects only a subset of nodes and it's always the same ones, it could be a "problem" with your data model: maybe some (too) wide rows on these nodes. If one of your rows is too wide, the deserialisation of the columns index of this row can take a lot of resources (disk, RAM,

Re: Network / GC / Latency spike

2015-09-10 Thread Alain RODRIGUEZ
Hi, just wanted to drop the follow up here. I finally figured out that the bigdata guys were basically hammering the cluster by reading 2 months of data as fast as possible on one table at boot time to cache it. As this table is storing 12 MB blobs (Bloom Filters), even if the number of reads was not

Re: Should replica placement change after a topology change?

2015-09-10 Thread Richard Dawe
Hi Robert, Firstly, thank you very much for your help. I have some comments inline below. On 10/09/2015 01:26, "Robert Coli" > wrote: On Wed, Sep 9, 2015 at 7:52 AM, Richard Dawe

Re: High CPU usage on some of nodes

2015-09-10 Thread Roman Tkachenko
Thanks for the responses guys. I also suspected GC and I guess it could be it, since during the spikes logs are filled with messages like "GC for ConcurrentMarkSweep: 5908 ms for 1 collections, 1986282520 used; max is 8375238656", often right before messages about dropped queries, unlike other,
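
As a rough illustration of the "grep for GC in system.log" suggestion from earlier in this thread, here is a minimal Python sketch that pulls the pause time and heap figures out of a line shaped like the one quoted above. The regular expression is modeled only on that single quoted message, so treat the exact format as an assumption; the helper names `parse_gc_line` and `long_pauses` and the 1000 ms threshold are hypothetical and not part of Cassandra.

```python
import re

# Pattern modeled on the one log message quoted in the thread; other
# Cassandra versions may format this line differently (assumption).
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def parse_gc_line(line):
    """Return a dict of GC fields, or None if the line does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    # Every captured group except the collector name is an integer.
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "collector"}
    fields["collector"] = m.group("collector")
    return fields

def long_pauses(lines, threshold_ms=1000):
    """Yield parsed GC entries whose pause is at least threshold_ms (hypothetical cutoff)."""
    for line in lines:
        entry = parse_gc_line(line)
        if entry and entry["pause_ms"] >= threshold_ms:
            yield entry

sample = ("GC for ConcurrentMarkSweep: 5908 ms for 1 collections, "
          "1986282520 used; max is 8375238656")
entry = parse_gc_line(sample)
# Print the collector, the pause in ms, and the fraction of max heap in use.
print(entry["collector"], entry["pause_ms"], round(entry["used"] / entry["max"], 2))
```

To scan a whole log, one could pass an open file object to `long_pauses` and compare the timestamps of the reported pauses against the dropped-query messages.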

Re: High CPU usage on some of nodes

2015-09-10 Thread Jeff Jirsa
With a 5s collection, the problem is almost certainly GC. GC pressure can be caused by a number of things, including normal read/write loads, but ALSO compaction calculation (pre-2.1.9 / #9882) and very large partitions (trying to load a very large partition with something like row cache in

Re: High CPU usage on some of nodes

2015-09-10 Thread Robert Coli
On Thu, Sep 10, 2015 at 10:54 AM, Roman Tkachenko wrote: > > [5 second CMS GC] Is my best shot to play with JVM settings trying to tune > garbage collection then? > Yep. As a minor note, if the machines are that beefy, they probably have a lot of RAM, you might wish to

Re: Should replica placement change after a topology change?

2015-09-10 Thread Robert Coli
On Thu, Sep 10, 2015 at 8:55 AM, Richard Dawe wrote: > So if you have a topology that would change if you switched from > SimpleStrategy to NetworkTopologyStrategy plus multiple racks, it sounds > like a different migration strategy would be needed? > > I am

Re: High CPU usage on some of nodes

2015-09-10 Thread Graham Sanderson
Haven’t been following this thread, but we run beefy machines with 8gig new gen, 12 gig old gen (down from 16g since moving memtables off heap, we can probably go lower)… Apart from making sure you have all the latest -XX: flags from cassandra-env.sh (and MALLOC_ARENA_MAX), I personally would

Re: confusion about nodetool cfstats

2015-09-10 Thread Chris Lohfink
All metrics reported in cfstats are for just the one node (it's pulled from JMX). To see cluster aggregates it's best to use a tool for monitoring like opscenter, graphite, influxdb, nagios etc. It's a good idea to have something like this set up for many reasons anyway. If you are using

confusion about nodetool cfstats

2015-09-10 Thread Shuo Chen
Hi! I want to monitor columnfamily space used with nodetool cfstats. The document says, Space used (live), bytes: 9592399 (Space that is measured depends on operating system)

Re: confusion about nodetool cfstats

2015-09-10 Thread Shuo Chen
Sorry to send the previous message. I want to monitor columnfamily space used with nodetool cfstats. The document says, Space used (live), bytes: 9592399 (Space that is measured depends on operating system). Does this metric show space used on one node or on the whole cluster? If it is just one

Re: Should replica placement change after a topology change?

2015-09-10 Thread Robert Coli
On Thu, Sep 10, 2015 at 12:33 PM, Nate McCall wrote: > I can confirm that the above process works (definitely include Rob's >> repair suggestion, though). It is really the only way we've found to safely >> go from SimpleSnitch to rack-aware NTS. >> > > The same process

Re: Should replica placement change after a topology change?

2015-09-10 Thread Nate McCall
> > > So if you have a topology that would change if you switched from >> SimpleStrategy to NetworkTopologyStrategy plus multiple racks, it sounds >> like a different migration strategy would be needed? >> >> I am imagining: >> >>1. Switch to a different snitch, and the keyspace from

Is it normal to see a node version handshake with itself?

2015-09-10 Thread Eric Plowe
I noticed in the system.log of one of my nodes INFO [HANDSHAKE-mia1-cas-001.bongojuice.com/172.16.245.1] 2015-09-10 16:00:37,748 OutboundTcpConnection.java:485 - Handshaking version with mia1-cas-001.bongojuice.com/172.16.245.1 The machine I am on is mia1-cas-001. If it's nothing, never mind,