Re: nodetool status shows large numbers of up nodes are down

2015-02-09 Thread Carlos Rolo
Hi Cheng,

Are all machines configured with NTP and all clocks in sync? If that is not
the case, fix it.

Clocks that are out of sync cause weird issues like the ones you are seeing,
but also schema disagreements and, in some cases, corrupted data.
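
If you want to verify that before changing anything, the check can be scripted; the
sketch below is illustrative only - the host list, SSH access and the ntplib package
are assumptions, not something from this thread.

# Rough sketch: report the local clock's NTP offset and each node's drift
# relative to this machine. Hostnames are placeholders; requires `pip install ntplib`.
import subprocess
import time

import ntplib

NODES = ["cass-node-1", "cass-node-2"]  # placeholder hostnames

# Offset of the local clock against a public NTP pool, in seconds.
offset = ntplib.NTPClient().request("pool.ntp.org", version=3).offset
print("local clock offset vs NTP: %+.3fs" % offset)

# Rough drift of each node's clock relative to this machine. SSH round-trip
# time adds noise, but multi-second drift stands out clearly.
for node in NODES:
    remote = float(subprocess.check_output(["ssh", node, "date", "+%s.%N"]).decode().strip())
    print("%s drift vs local: %+.3fs" % (node, remote - time.time()))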

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Tue, Feb 10, 2015 at 3:40 AM, Cheng Ren  wrote:

> Hi,
> We have a two-DC cluster, with 21 nodes in one DC and 27 in the other. Over the
> past few months, we have seen nodetool status mark 4-8 nodes as down while
> they are actually functioning. Today in particular we noticed that running
> nodetool status on some nodes shows a higher number of nodes down than
> before, while those nodes are actually up and serving requests.
> For example, on one node it shows 42 nodes are down.
>
> phi_convict_threshold is set to 12 on all nodes, and we are running
> Cassandra 2.0.4 on AWS EC2 machines.
>
> Does anyone have a recommendation for identifying the root cause of this?
> Will it have any consequences?
>
> Thanks,
> Cheng
>


nodetool status shows large numbers of up nodes are down

2015-02-09 Thread Cheng Ren
Hi,
We have a two-DC cluster, with 21 nodes in one DC and 27 in the other. Over the
past few months, we have seen nodetool status mark 4-8 nodes as down while
they are actually functioning. Today in particular we noticed that running
nodetool status on some nodes shows a higher number of nodes down than
before, while those nodes are actually up and serving requests.
For example, on one node it shows 42 nodes are down.

phi_convict_threshold is set to 12 on all nodes, and we are running
Cassandra 2.0.4 on AWS EC2 machines.

Does anyone have a recommendation for identifying the root cause of this? Will
it have any consequences?

Thanks,
Cheng
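
One way to narrow the root cause down is to compare what each node's failure detector
is reporting; a minimal sketch that collects `nodetool status` from every node over SSH
and prints the peers each one marks DN (the addresses are placeholders, and SSH access
is assumed):

# Minimal sketch: ask every node which peers it considers down and compare the views.
# Node addresses are placeholders.
import subprocess

NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # placeholder addresses

def down_peers(host):
    out = subprocess.check_output(["ssh", host, "nodetool", "status"]).decode()
    # Lines for down nodes start with "DN", e.g. "DN  10.0.0.7  120.5 GB  256 ...".
    return {line.split()[1] for line in out.splitlines() if line.startswith("DN")}

for host in NODES:
    down = down_peers(host)
    print("%s marks %d nodes down: %s" % (host, len(down), sorted(down)))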


Re: about "insert into table with IF NOT EXISTS" error

2015-02-09 Thread 严超
How about deleting the previously inserted rows, then inserting again?

Best Regards!

Chao Yan
--
My twitter: Andy Yan @yanchao727
My Weibo: http://weibo.com/herewearenow
--

2015-02-10 9:52 GMT+08:00 Alex Popescu :

> Tom, this question would have a better chance of being answered on the Node.js
> driver mailing list:
> https://groups.google.com/a/lists.datastax.com/forum/#!forum/nodejs-driver-user
>
> On Mon, Feb 9, 2015 at 5:38 PM, tom  wrote:
>
>> Hi:
>>
>>   I set up a one-node Cassandra server and am using the Node.js driver (CQL)
>> to query the DB.
>>
>>   But when I insert into a table with an "IF NOT EXISTS" statement, it reports
>> the error below:
>>
>> :ResponseError: Cannot achieve consistency level QUORUM
>>
>> I also tried setting the Node.js CQL query consistency to ONE, and I still see
>> that error.
>>
>> If I remove "IF NOT EXISTS" from the CQL, the insert passes.
>>
>>  Please advise. Thanks.
>>
>> best regards
>>   Tom
>>
>
>
>
> --
>
> [:>-a)
>
> Alex Popescu
> Sen. Product Manager @ DataStax
> @al3xandru
>


Re: about "insert into table with IF NOT EXISTS" error

2015-02-09 Thread Alex Popescu
Tom, this question would have a better chance of being answered on the Node.js
driver mailing list:
https://groups.google.com/a/lists.datastax.com/forum/#!forum/nodejs-driver-user

On Mon, Feb 9, 2015 at 5:38 PM, tom  wrote:

> Hi:
>
>   I set up a one-node Cassandra server and am using the Node.js driver (CQL)
> to query the DB.
>
>   But when I insert into a table with an "IF NOT EXISTS" statement, it reports
> the error below:
>
> :ResponseError: Cannot achieve consistency level QUORUM
>
> I also tried setting the Node.js CQL query consistency to ONE, and I still see
> that error.
>
> If I remove "IF NOT EXISTS" from the CQL, the insert passes.
>
>  Please advise. Thanks.
>
> best regards
>   Tom
>



-- 

[:>-a)

Alex Popescu
Sen. Product Manager @ DataStax
@al3xandru


about "insert into table with IF NOT EXISTS" error

2015-02-09 Thread tom
Hi:

  I set up a one-node Cassandra server and am using the Node.js driver (CQL)
to query the DB.

  But when I insert into a table with an "IF NOT EXISTS" statement, it reports
the error below:

:ResponseError: Cannot achieve consistency level QUORUM

I also tried setting the Node.js CQL query consistency to ONE, and I still see
that error.

If I remove "IF NOT EXISTS" from the CQL, the insert passes.

 Please advise. Thanks.

best regards
  Tom
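
For context: "IF NOT EXISTS" turns the insert into a lightweight transaction, and the
Paxos round behind it needs a quorum of replicas regardless of the consistency level
set on the statement itself, so on a one-node cluster this error usually means the
keyspace's replication factor is larger than the number of live nodes. A minimal
sketch of the same scenario with the DataStax Python driver (keyspace and table names
are made up for illustration):

# Sketch: a conditional insert (lightweight transaction) with the Python driver.
# The Paxos phase needs a quorum of replicas, so a keyspace whose replication
# factor exceeds the number of live nodes fails with "Cannot achieve consistency
# level ..." no matter what consistency the statement itself uses.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()

# On a one-node cluster, replication_factor must be 1 for LWTs to succeed.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text)")

result = session.execute(
    "INSERT INTO demo.users (id, name) VALUES (%s, %s) IF NOT EXISTS", (1, "tom")
)
print(result.one())  # the returned row carries an [applied] flag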


Re: Question about adding nodes to a cluster

2015-02-09 Thread Seth Edwards
I see what you are saying. So basically take whatever existing token I have
and divide it by 2, give or take a couple of tokens?

On Mon, Feb 9, 2015 at 5:17 PM, Robert Coli  wrote:

> On Mon, Feb 9, 2015 at 4:59 PM, Seth Edwards  wrote:
>
>> We are choosing to double our cluster from six to twelve. I ran the token
>> generator. Based on what I read in the documentation, I expected to see the
>> same first six tokens and six new tokens. Instead I see almost the same
>> tokens but off by a few numbers. Is this expected? Should I change the
>> similar tokens to the new ones? Am I doing it wrong?
>>
>
> In your existing cluster, your first token is at
> 28356863910078205288614550619314017621, which ends in an odd number.
>
> You cannot therefore choose a new token which exactly bisects its range,
> because a node cannot own the token 28356863910078205288614550619314017621
> /2 =
> 14178431955039102644307275309657008810.5 ... because tokens are integers.
>
> You will however notice that floor() of your current token divided by two
> is your new token (14178431955039102644307275309657008810).
>
> I would personally keep my existing 6 tokens and do the simple math myself
> of bisecting their ranges, not move my existing tokens around by one or two
> tokens.
>
> =Rob
>
>
>
>
>
>
>


Re[2]: Question about adding nodes to a cluster

2015-02-09 Thread Plotnik, Alexey
Sorry, No - you are not doing it wrong ^)


Yes, Cassandra's partitioner is based on a hash ring. Doubling the number of nodes is
the best cluster-extending policy I've ever seen, because it's zero-overhead.

Hash ring - you take the partitioner's token range (0 to 2^127 - 1 for
RandomPartitioner), divide it by the number of nodes (partitions) to get N points,
and then distribute them evenly across your ring. You can open the Python script you
used to generate the following output and see how it works.



I am on Cassandra 1.2.19 and I am following the documentation for adding
nodes to an existing cluster.

We are choosing to double our cluster from six to twelve. I ran the token 
generator. Based on what I read in the documentation, I expected to see the 
same first six tokens and six new tokens. Instead I see almost the same tokens 
but off by a few numbers. Is this expected? Should I change the similar tokens 
to the new ones? Am I doing it wrong?


Here is the output I am dealing with.

With six:

DC #1:
  Node #1:0
  Node #2:   28356863910078205288614550619314017621
  Node #3:   56713727820156410577229101238628035242
  Node #4:   85070591730234615865843651857942052863
  Node #5:  113427455640312821154458202477256070484
  Node #6:  141784319550391026443072753096570088105

With twelve:

DC #1:
  Node #01:0
  Node #02:   14178431955039102644307275309657008810
  Node #03:   28356863910078205288614550619314017620
  Node #04:   42535295865117307932921825928971026430
  Node #05:   56713727820156410577229101238628035240
  Node #06:   70892159775195513221536376548285044050
  Node #07:   85070591730234615865843651857942052860
  Node #08:   99249023685273718510150927167599061670
  Node #09:  113427455640312821154458202477256070480
  Node #10:  127605887595351923798765477786913079290
  Node #11:  141784319550391026443072753096570088100
  Node #12:  155962751505430129087380028406227096910


Re: Question about adding nodes to a cluster

2015-02-09 Thread Robert Coli
On Mon, Feb 9, 2015 at 4:59 PM, Seth Edwards  wrote:

> We are choosing to double our cluster from six to twelve. I ran the token
> generator. Based on what I read in the documentation, I expected to see the
> same first six tokens and six new tokens. Instead I see almost the same
> tokens but off by a few numbers. Is this expected? Should I change the
> similar tokens to the new ones? Am I doing it wrong?
>

In your existing cluster, your first token is at
28356863910078205288614550619314017621, which ends in an odd number.

You cannot therefore choose a new token which exactly bisects its range,
because a node cannot own the token 28356863910078205288614550619314017621
/2 =
14178431955039102644307275309657008810.5 ... because tokens are integers.

You will however notice that floor() of your current token divided by two
is your new token (14178431955039102644307275309657008810).

I would personally keep my existing 6 tokens and do the simple math myself
of bisecting their ranges, not move my existing tokens around by one or two
tokens.

=Rob


Re: Question about adding nodes to a cluster

2015-02-09 Thread Plotnik, Alexey
Yes, Cassandra's partitioner is based on a hash ring. Doubling the number of nodes is
the best cluster-extending policy I've ever seen, because it's zero-overhead.

Hash ring - you take the partitioner's token range (0 to 2^127 - 1 for
RandomPartitioner), divide it by the number of nodes (partitions) to get N points,
and then distribute them evenly across your ring. You can open the Python script you
used to generate the following output and see how it works.
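
A rough reconstruction of what such a generator does for RandomPartitioner - it
reproduces both token lists below, though the actual script may be written differently:

# Sketch of an evenly spaced RandomPartitioner token generator.
def generate_tokens(node_count, ring_max=2 ** 127):
    step = ring_max // node_count
    return [i * step for i in range(node_count)]

for n in (6, 12):
    print("DC #1 with %d nodes:" % n)
    for i, token in enumerate(generate_tokens(n), start=1):
        print("  Node #%02d: %d" % (i, token))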



I am on Cassandra 1.2.19 and I am following the documentation for adding
nodes to an existing cluster.

We are choosing to double our cluster from six to twelve. I ran the token 
generator. Based on what I read in the documentation, I expected to see the 
same first six tokens and six new tokens. Instead I see almost the same tokens 
but off by a few numbers. Is this expected? Should I change the similar tokens 
to the new ones? Am I doing it wrong?


Here is the output I am dealing with.

With six:

DC #1:
  Node #1:0
  Node #2:   28356863910078205288614550619314017621
  Node #3:   56713727820156410577229101238628035242
  Node #4:   85070591730234615865843651857942052863
  Node #5:  113427455640312821154458202477256070484
  Node #6:  141784319550391026443072753096570088105

With twelve:

DC #1:
  Node #01:0
  Node #02:   14178431955039102644307275309657008810
  Node #03:   28356863910078205288614550619314017620
  Node #04:   42535295865117307932921825928971026430
  Node #05:   56713727820156410577229101238628035240
  Node #06:   70892159775195513221536376548285044050
  Node #07:   85070591730234615865843651857942052860
  Node #08:   99249023685273718510150927167599061670
  Node #09:  113427455640312821154458202477256070480
  Node #10:  127605887595351923798765477786913079290
  Node #11:  141784319550391026443072753096570088100
  Node #12:  155962751505430129087380028406227096910


Question about adding nodes to a cluster

2015-02-09 Thread Seth Edwards
I am on Cassandra 1.2.19 and I am following the documentation for adding
nodes to an existing cluster.

We are choosing to double our cluster from six to twelve. I ran the token
generator. Based on what I read in the documentation, I expected to see the
same first six tokens and six new tokens. Instead I see almost the same
tokens but off by a few numbers. Is this expected? Should I change the
similar tokens to the new ones? Am I doing it wrong?


Here is the output I am dealing with.

With six:

DC #1:
  Node #1:0
  Node #2:   28356863910078205288614550619314017621
  Node #3:   56713727820156410577229101238628035242
  Node #4:   85070591730234615865843651857942052863
  Node #5:  113427455640312821154458202477256070484
  Node #6:  141784319550391026443072753096570088105

With twelve:

DC #1:
  Node #01:0
  Node #02:   14178431955039102644307275309657008810
  Node #03:   28356863910078205288614550619314017620
  Node #04:   42535295865117307932921825928971026430
  Node #05:   56713727820156410577229101238628035240
  Node #06:   70892159775195513221536376548285044050
  Node #07:   85070591730234615865843651857942052860
  Node #08:   99249023685273718510150927167599061670
  Node #09:  113427455640312821154458202477256070480
  Node #10:  127605887595351923798765477786913079290
  Node #11:  141784319550391026443072753096570088100
  Node #12:  155962751505430129087380028406227096910


Re: Fastest way to map/parallel read all values in a table?

2015-02-09 Thread graham sanderson
Depending on whether you have deletes/updates, if this is an ad-hoc thing, you
might want to just read the SSTables directly.

> On Feb 9, 2015, at 12:56 PM, Kevin Burton  wrote:
> 
> I had considered using spark for this but:
> 
> 1.  we tried to deploy spark only to find out that it was missing a number of 
> key things we need.  
> 
> 2.  our app needs to shut down to release threads and resources.  Spark 
> doesn’t have support for this so all the workers would have stale thread 
> leaking afterwards.  Though I guess if I can get workers to fork then I 
> should be ok.
> 
> 3.  Spark SQL actually returned invalid data to our queries… so that was kind 
> of a red flag and a non-starter
> 
> On Mon, Feb 9, 2015 at 2:24 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
> Just for the record, I was doing the exact same thing in an internal
> application at the start-up I used to work for. We had the need to write
> custom code to process in parallel all rows of a column family. Normally we
> would use Spark for the job, but in our case the logic was a little more
> complicated, so we wrote custom code.
> 
> What we did was to run N processes on M machines (N cores in each), each one
> processing tasks. The tasks were created by splitting the range -2^63 to
> 2^63 - 1 into N*M*10 tasks. Even if data was not evenly distributed across the
> tasks, no machines were idle, as when some task was completed another one was
> taken from the task pool.
> 
> It was fast enough for us, but I am interested in knowing if there is a 
> better way of doing it.
> 
> For your specific case, here is a tool we open-sourced that can be useful for
> simpler tests: https://github.com/s1mbi0se/cql_record_processor
> 
> Also, I guess you probably know that, but I would consider using Spark for 
> doing this.
> 
> Best regards,
> Marcelo.
> 
> From: user@cassandra.apache.org  
> Subject: Re:Fastest way to map/parallel read all values in a table?
> What’s the fastest way to map/parallel read all values in a table?
> 
> Kind of like a mini map only job.
> 
> I’m doing this to compute stats across our entire corpus.
> 
> What I did to begin with was use token() and then split it into the number of
> splits I needed.
> 
> So I just took the total key range space which is -2^63 to 2^63 - 1 and broke 
> it into N parts.
> 
> Then the queries come back as:
> 
> select * from mytable where token(primaryKey) >= x and token(primaryKey) < y
> 
> From reading on this list I thought this was the correct way to handle this 
> problem.
> 
> However, I’m seeing horrible performance doing this.  After about 1% it just 
> flat out locks up.
> 
> Could it be that I need to randomize the token order so that it’s not 
> contiguous?  Maybe it’s all mapping on the first box to begin with.
> 
> 
> 
> -- 
> 
> Founder/CEO Spinn3r.com 
> Location: San Francisco, CA
> blog: http://burtonator.wordpress.com 
> … or check out my Google+ profile 
> 
>  
> 
> 
> 
> -- 
> 
> Founder/CEO Spinn3r.com 
> Location: San Francisco, CA
> blog: http://burtonator.wordpress.com 
> … or check out my Google+ profile 
> 
>  
> 





Re: Fastest way to map/parallel read all values in a table?

2015-02-09 Thread Kevin Burton
I had considered using spark for this but:

1.  we tried to deploy spark only to find out that it was missing a number
of key things we need.

2.  Our app needs to shut down to release threads and resources.  Spark
doesn’t have support for this, so all the workers would have stale threads
leaking afterwards.  Though I guess if I can get workers to fork then I
should be OK.

3.  Spark SQL actually returned invalid data to our queries… so that was
kind of a red flag and a non-starter

On Mon, Feb 9, 2015 at 2:24 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net> wrote:

> Just for the record, I was doing the exact same thing in an internal
> application at the start-up I used to work for. We had the need to write
> custom code to process in parallel all rows of a column family. Normally we
> would use Spark for the job, but in our case the logic was a little more
> complicated, so we wrote custom code.
>
> What we did was to run N processes on M machines (N cores in each), each one
> processing tasks. The tasks were created by splitting the range -2^63 to
> 2^63 - 1 into N*M*10 tasks. Even if data was not evenly distributed across
> the tasks, no machines were idle, as when some task was completed another
> one was taken from the task pool.
>
> It was fast enough for us, but I am interested in knowing if there is a
> better way of doing it.
>
> For your specific case, here is a tool we open-sourced that can be useful
> for simpler tests:
> https://github.com/s1mbi0se/cql_record_processor
>
> Also, I guess you probably know that, but I would consider using Spark for
> doing this.
>
> Best regards,
> Marcelo.
>
> From: user@cassandra.apache.org
> Subject: Re:Fastest way to map/parallel read all values in a table?
>
> What’s the fastest way to map/parallel read all values in a table?
>
> Kind of like a mini map only job.
>
> I’m doing this to compute stats across our entire corpus.
>
> What I did to begin with was use token() and then split it into the number
> of splits I needed.
>
> So I just took the total key range space which is -2^63 to 2^63 - 1 and
> broke it into N parts.
>
> Then the queries come back as:
>
> select * from mytable where token(primaryKey) >= x and token(primaryKey) <
> y
>
> From reading on this list I thought this was the correct way to handle
> this problem.
>
> However, I’m seeing horrible performance doing this.  After about 1% it
> just flat out locks up.
>
> Could it be that I need to randomize the token order so that it’s not
> contiguous?  Maybe it’s all mapping on the first box to begin with.
>
>
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
> 
>
>
>


-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile




Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Nick Bailey
To clarify what Chris said, restarting opscenter will remove the
notification, but we also have a bug filed to make that behavior a little
better and allow dismissing that notification without a restart. Thanks for
reporting the issue!

-Nick

On Mon, Feb 9, 2015 at 9:00 AM, Chris Lohfink  wrote:

> Restarting opscenter service will get rid of it.
>
> Chris
>
> On Mon, Feb 9, 2015 at 3:01 AM, Björn Hachmann  > wrote:
>
>> Good morning,
>>
>> unfortunately my last rolling restart of our Cassandra cluster issued
>> from OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is
>> showing an error message at the top of its screen:
>> "Error restarting cluster: Timed out waiting for Cassandra to start.".
>>
>> Does anybody know how to remove that message permanently?
>>
>> Thank you very much in advance!
>>
>> Kind regards
>> Björn Hachmann
>>
>
>


Re: Adding more nodes causes performance problem

2015-02-09 Thread Dominic Letz
Can you copy an example of your read and write queries? Are they both
degrading in the same way performance-wise?

On Mon, Feb 9, 2015 at 8:39 PM, Laing, Michael 
wrote:

> Use token-awareness so you don't have as much coordinator overhead.
>
> ml
>
> On Mon, Feb 9, 2015 at 5:32 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
>
>> AFAIK, you were using RF 3 in a 3-node cluster, so all your nodes had
>> all your data.
>> When the number of nodes started to grow, this assumption stopped being
>> true.
>> I think Cassandra will scale linearly from 9 nodes on, but comparing against a
>> situation where all your nodes hold all your data is not really fair, as in
>> that situation Cassandra behaves like a database with two extra replicas,
>> for reads.
>> I could be wrong, but this is my take.
>>
>> From: user@cassandra.apache.org
>> Subject: Re:Adding more nodes causes performance problem
>>
>> I have a cluster with 3 nodes; the only keyspace has a replication factor
>> of 3. The application reads/writes UUID-keyed data. I use CQL
>> (cassandra-python), most writes are done by execute_async, most reads are
>> done with a consistency level of ONE, and overall performance in this setup
>> is better than I expected.
>>
>> Then I tested a 6-node cluster and a 9-node one. The performance (both read
>> and write) got worse and worse. Roughly speaking, 6 nodes is about 2~3 times
>> slower than 3 nodes, and 9 nodes is about 5~6 times slower than 3 nodes. All
>> tests were done with the same data set, same test program, same client
>> machines, multiple times. I'm running Cassandra 2.1.2 with the default
>> configuration.
>>
>> What I observed is that with 6 nodes and 9 nodes, the Cassandra servers were
>> doing OK with IO, but CPU utilization was about 60%~70% higher than with
>> 3 nodes.
>>
>> I'd like to get suggestions on how to troubleshoot this, as it goes totally
>> against what I have read, that Cassandra scales linearly.
>>
>>
>>
>>
>


-- 
Dominic Letz
Director of R&D
Exosite 


Re: High GC activity on node with 4TB on data

2015-02-09 Thread Chris Lohfink
 - number of tombstones - how can I reliably find it out?
https://github.com/spotify/cassandra-opstools
https://github.com/cloudian/support-tools

If you are not getting much compression, it may be worth trying to disable it; it
may contribute, but it's very unlikely to be the cause of the GC pressure
itself.

7000 sstables but STCS? Sounds like compactions couldn't keep up.  Do you
have a lot of pending compactions (nodetool)?  You may want to increase
your compaction throughput (nodetool) to see if you can catch up a little;
reads across that many sstables cause a lot of heap overhead. You may even
need to take more drastic measures if it can't catch back up.

May also be good to check `nodetool cfstats` for very wide partitions.
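
A quick way to spot-check that is to scan the cfstats output for the largest compacted
partition per column family; a small sketch - the field labels vary slightly between
Cassandra versions, and the 100 MB threshold is just an example:

# Sketch: flag column families whose largest compacted partition exceeds a threshold,
# by parsing `nodetool cfstats`. Field labels differ a bit across versions.
import subprocess

THRESHOLD_BYTES = 100 * 1024 * 1024  # example threshold (~100 MB)

out = subprocess.check_output(["nodetool", "cfstats"]).decode()
current_cf = None
for line in out.splitlines():
    line = line.strip()
    if line.startswith("Column Family:") or line.startswith("Table:"):
        current_cf = line.split(":", 1)[1].strip()
    elif line.startswith("Compacted row maximum size:") or \
            line.startswith("Compacted partition maximum bytes:"):
        size = int(line.split(":", 1)[1].strip())
        if size > THRESHOLD_BYTES:
            print("%s: largest partition %d bytes" % (current_cf, size))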

There's a good chance that, if you're under load with a heap over 8GB, your GCs
could use tuning.  The bigger the nodes, the more manual tweaking it will
take to get the most out of them;
https://issues.apache.org/jira/browse/CASSANDRA-8150 also has some ideas.

Chris

On Mon, Feb 9, 2015 at 2:00 AM, Jiri Horky  wrote:

>  Hi all,
>
> thank you all for the info.
>
> To answer the questions:
>  - we have 2 DCs with 5 nodes in each, each node has 256G of memory, 24x1T
> drives, 2x Xeon CPU - there are multiple cassandra instances running for
> different projects. The node itself is powerful enough.
>  - there are 2 keyspaces, one with 3 replicas per DC, one with 1 replica per
> DC (because of the amount of data and because it serves more or less like a
> cache)
>  - there are about 4k/s Request-response, 3k/s Read and 2k/s Mutation
> requests  - numbers are the sum of all nodes
>  - we use STCS (LCS would be quite IO-heavy for this amount of data)
>  - number of tombstones - how can I reliably find it out?
>  - the biggest CF (3.6T per node) has 7000 sstables
>
> Now, I understand that the best practice for Cassandra is to run "with the
> minimum size of heap which is enough", which for this case we thought was
> about 12G - there is always 8G consumed by the SSTable readers. Also, I
> thought that a high number of tombstones creates pressure in the new space
> (which can then cause pressure in the old space as well), but this is not what
> we are seeing. We see continuous GC activity in the Old generation only.
>
> Also, I noticed that the biggest CF has a compression ratio of 0.99, which
> basically means that the data arrives already compressed. Do you think that
> turning off compression would help with memory consumption?
>
> Also, I think that tuning CMSInitiatingOccupancyFraction=75 might help
> here, as it seems that 8G is something Cassandra needs for bookkeeping
> this amount of data, and that this was slightly above the 75% limit, which
> triggered the CMS again and again.
>
> I will definitely have a look at the presentation.
>
> Regards
> Jiri Horky
>
>
> On 02/08/2015 10:32 PM, Mark Reddy wrote:
>
> Hey Jiri,
>
>  While I don't have any experience running 4TB nodes (yet), I would
> recommend taking a look at a presentation by Aaron Morton on large nodes:
> http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
> to see if you can glean anything from that.
>
>  I would note that at the start of his talk he mentions that in version
> 1.2 we can now talk about nodes around 1 - 3 TB in size, so if you are
> storing anything more than that you are getting into very specialised use
> cases.
>
>  If you could provide us with some more information about your cluster
> setup (No. of CFs, read/write patterns, do you delete / update often, etc.)
> that may help in getting you to a better place.
>
>
>  Regards,
> Mark
>
> On 8 February 2015 at 21:10, Kevin Burton  wrote:
>
>> Do you have a lot of individual tables?  Or lots of small compactions?
>>
>>  I think the general consensus is that (at least for Cassandra), 8GB
>> heaps are ideal.
>>
>>  If you have lots of small tables it’s a known anti-pattern (I believe)
>> because the Cassandra internals could do a better job on handling the in
>> memory metadata representation.
>>
>>  I think this has been improved in 2.0 and 2.1 though, so the fact that
>> you’re on 1.2.18 could exacerbate the issue.  You might want to consider an
>> upgrade (though that has its own issues as well).
>>
>> On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky  wrote:
>>
>>> Hi all,
>>>
>>> we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
>>> on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
>>> (2G for new space). The node runs fine for couple of days when the GC
>>> activity starts to raise and reaches about 15% of the C* activity which
>>> causes dropped messages and other problems.
>>>
>>> Taking a look at heap dump, there is about 8G used by SSTableReader
>>> classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>>>
>>> Is this something expected and have we just reached the limit of how
>>> much data a single Cassandra instance can handle, or is it possible to
>>> tune it better?

Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Chris Lohfink
Restarting opscenter service will get rid of it.

Chris

On Mon, Feb 9, 2015 at 3:01 AM, Björn Hachmann 
wrote:

> Good morning,
>
> unfortunately my last rolling restart of our Cassandra cluster issued from
> OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing
> an error message at the top of its screen:
> "Error restarting cluster: Timed out waiting for Cassandra to start.".
>
> Does anybody know how to remove that message permanently?
>
> Thank you very much in advance!
>
> Kind regards
> Björn Hachmann
>


Re: Adding more nodes causes performance problem

2015-02-09 Thread Laing, Michael
Use token-awareness so you don't have as much coordinator overhead.

ml
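
Since the original poster is on cassandra-python, token awareness there is roughly a
matter of wrapping the load-balancing policy; a sketch with placeholder contact points
and keyspace (not the poster's actual configuration):

# Sketch: token-aware routing with the DataStax Python driver, so requests go
# straight to a replica instead of through an arbitrary coordinator.
from cassandra.cluster import Cluster
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

cluster = Cluster(
    contact_points=["10.0.0.1", "10.0.0.2"],  # placeholder addresses
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy()),
)
session = cluster.connect("my_keyspace")  # placeholder keyspace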

On Mon, Feb 9, 2015 at 5:32 AM, Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net> wrote:

> AFAIK, you were using RF 3 in a 3-node cluster, so all your nodes had
> all your data.
> When the number of nodes started to grow, this assumption stopped being
> true.
> I think Cassandra will scale linearly from 9 nodes on, but comparing against a
> situation where all your nodes hold all your data is not really fair, as in
> that situation Cassandra behaves like a database with two extra replicas,
> for reads.
> I could be wrong, but this is my take.
>
> From: user@cassandra.apache.org
> Subject: Re:Adding more nodes causes performance problem
>
> I have a cluster with 3 nodes; the only keyspace has a replication factor
> of 3. The application reads/writes UUID-keyed data. I use CQL
> (cassandra-python), most writes are done by execute_async, most reads are
> done with a consistency level of ONE, and overall performance in this setup
> is better than I expected.
>
> Then I tested a 6-node cluster and a 9-node one. The performance (both read
> and write) got worse and worse. Roughly speaking, 6 nodes is about 2~3 times
> slower than 3 nodes, and 9 nodes is about 5~6 times slower than 3 nodes. All
> tests were done with the same data set, same test program, same client
> machines, multiple times. I'm running Cassandra 2.1.2 with the default
> configuration.
>
> What I observed is that with 6 nodes and 9 nodes, the Cassandra servers were
> doing OK with IO, but CPU utilization was about 60%~70% higher than with
> 3 nodes.
>
> I'd like to get suggestions on how to troubleshoot this, as it goes totally
> against what I have read, that Cassandra scales linearly.
>
>
>
>


Re:Adding more nodes causes performance problem

2015-02-09 Thread Marcelo Valle (BLOOMBERG/ LONDON)
AFAIK, you were using RF 3 in a 3-node cluster, so all your nodes had all
your data.
When the number of nodes started to grow, this assumption stopped being true.
I think Cassandra will scale linearly from 9 nodes on, but comparing against a
situation where all your nodes hold all your data is not really fair, as in
that situation Cassandra behaves like a database with two extra replicas, for
reads.
I could be wrong, but this is my take.
From: user@cassandra.apache.org 
Subject: Re:Adding more nodes causes performance problem

I have a cluster with 3 nodes; the only keyspace has a replication factor of 3.
The application reads/writes UUID-keyed data. I use CQL (cassandra-python),
most writes are done by execute_async, most reads are done with a consistency
level of ONE, and overall performance in this setup is better than I expected.

Then I tested a 6-node cluster and a 9-node one. The performance (both read and
write) got worse and worse. Roughly speaking, 6 nodes is about 2~3 times slower
than 3 nodes, and 9 nodes is about 5~6 times slower than 3 nodes. All tests were
done with the same data set, same test program, same client machines, multiple
times. I'm running Cassandra 2.1.2 with the default configuration.

What I observed is that with 6 nodes and 9 nodes, the Cassandra servers
were doing OK with IO, but CPU utilization was about 60%~70% higher than with
3 nodes.

I'd like to get suggestions on how to troubleshoot this, as it goes totally
against what I have read, that Cassandra scales linearly.




Re:Fastest way to map/parallel read all values in a table?

2015-02-09 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Just for the record, I was doing the exact same thing in an internal
application at the start-up I used to work for. We had the need to write custom
code to process in parallel all rows of a column family. Normally we would use
Spark for the job, but in our case the logic was a little more complicated, so
we wrote custom code.

What we did was to run N processes on M machines (N cores in each), each one
processing tasks. The tasks were created by splitting the range -2^63 to 2^63 - 1
into N*M*10 tasks. Even if data was not evenly distributed across the tasks,
no machines were idle, as when some task was completed another one was
taken from the task pool.
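
A minimal sketch of that splitting approach, single-process, with the Python driver
(Murmur3 token range; the keyspace, table and key column are placeholders, and the
real setup fanned the splits out over N*M workers):

# Sketch: split the full Murmur3 token range (-2**63 .. 2**63 - 1) into sub-ranges
# and scan each one with a token() range query. Names are placeholders.
from cassandra.cluster import Cluster

MIN_TOKEN, MAX_TOKEN = -2 ** 63, 2 ** 63 - 1

def token_splits(num_splits):
    step = (MAX_TOKEN - MIN_TOKEN) // num_splits
    bounds = [MIN_TOKEN + i * step for i in range(num_splits)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))

session = Cluster(["127.0.0.1"]).connect("my_keyspace")

for lo, hi in token_splits(100):  # e.g. N*M*10 splits, as described above
    rows = session.execute(
        "SELECT * FROM mytable WHERE token(primary_key) >= %s AND token(primary_key) < %s",
        (lo, hi),
    )
    for row in rows:
        pass  # hand each row to the per-task processing code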

It was fast enough for us, but I am interested in knowing if there is a better 
way of doing it.

For your specific case, here is a tool we open-sourced that can be useful for
simpler tests: https://github.com/s1mbi0se/cql_record_processor

Also, I guess you probably know that, but I would consider using Spark for 
doing this.

Best regards,
Marcelo.

From: user@cassandra.apache.org 
Subject: Re:Fastest way to map/parallel read all values in a table?

What’s the fastest way to map/parallel read all values in a table?

Kind of like a mini map only job.

I’m doing this to compute stats across our entire corpus.

What I did to begin with was use token() and then split it into the number of
splits I needed.

So I just took the total key range space which is -2^63 to 2^63 - 1 and broke 
it into N parts.

Then the queries come back as:

select * from mytable where token(primaryKey) >= x and token(primaryKey) < y

From reading on this list I thought this was the correct way to handle this 
problem.

However, I’m seeing horrible performance doing this.  After about 1% it just 
flat out locks up.

Could it be that I need to randomize the token order so that it’s not 
contiguous?  Maybe it’s all mapping on the first box to begin with.


-- 

Founder/CEO Spinn3r.com
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile




Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Colin
Stop using opscenter?

:)

Sorry, couldnt resist...

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

> On Feb 9, 2015, at 3:01 AM, Björn Hachmann  wrote:
> 
> Good morning,
> 
> unfortunately my last rolling restart of our Cassandra cluster issued from 
> OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing an 
> error message at the top of its screen:
> "Error restarting cluster: Timed out waiting for Cassandra to start.".
> 
> Does anybody know how to remove that message permanently?
> 
> Thank you very much in advance!
> 
> Kind regards
> Björn Hachmann


How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Björn Hachmann
Good morning,

unfortunately my last rolling restart of our Cassandra cluster issued from
OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing
an error message at the top of its screen:
"Error restarting cluster: Timed out waiting for Cassandra to start.".

Does anybody know how to remove that message permanently?

Thank you very much in advance!

Kind regards
Björn Hachmann


Re: High GC activity on node with 4TB on data

2015-02-09 Thread Jiri Horky
Hi all,

thank you all for the info.

To answer the questions:
 - we have 2 DCs with 5 nodes in each, each node has 256G of memory,
24x1T drives, 2x Xeon CPU - there are multiple cassandra instances
running for different projects. The node itself is powerful enough.
 - there are 2 keyspaces, one with 3 replicas per DC, one with 1 replica per
DC (because of the amount of data and because it serves more or less like a
cache)
 - there are about 4k/s Request-response, 3k/s Read and 2k/s Mutation
requests  - numbers are the sum of all nodes
 - we use STCS (LCS would be quite IO-heavy for this amount of data)
 - number of tombstones - how can I reliably find it out?
 - the biggest CF (3.6T per node) has 7000 sstables

Now, I understand that the best practice for Cassandra is to run "with
the minimum size of heap which is enough", which for this case we thought
was about 12G - there is always 8G consumed by the SSTable readers.
Also, I thought that a high number of tombstones creates pressure in the new
space (which can then cause pressure in the old space as well), but this is
not what we are seeing. We see continuous GC activity in the Old generation
only.

Also, I noticed that the biggest CF has a compression ratio of 0.99, which
basically means that the data arrives already compressed. Do you think that
turning off compression would help with memory consumption?

Also, I think that tuning CMSInitiatingOccupancyFraction=75 might help
here, as it seems that 8G is something Cassandra needs for bookkeeping
this amount of data, and that this was slightly above the 75% limit, which
triggered the CMS again and again.

I will definitely have a look at the presentation.

Regards
Jiri Horky

On 02/08/2015 10:32 PM, Mark Reddy wrote:
> Hey Jiri, 
>
> While I don't have any experience running 4TB nodes (yet), I would
> recommend taking a look at a presentation by Aaron Morton on large
> nodes: 
> http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
> to see if you can glean anything from that.
>
> I would note that at the start of his talk he mentions that in version
> 1.2 we can now talk about nodes around 1 - 3 TB in size, so if you are
> storing anything more than that you are getting into very specialised
> use cases.
>
> If you could provide us with some more information about your cluster
> setup (No. of CFs, read/write patterns, do you delete / update often,
> etc.) that may help in getting you to a better place.
>
>
> Regards,
> Mark
>
> On 8 February 2015 at 21:10, Kevin Burton  > wrote:
>
> Do you have a lot of individual tables?  Or lots of small
> compactions?
>
> I think the general consensus is that (at least for Cassandra),
> 8GB heaps are ideal.  
>
> If you have lots of small tables it’s a known anti-pattern (I
> believe) because the Cassandra internals could do a better job on
> handling the in memory metadata representation.
>
> I think this has been improved in 2.0 and 2.1 though, so the fact
> that you’re on 1.2.18 could exacerbate the issue.  You might want
> to consider an upgrade (though that has its own issues as well).
>
> On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky  > wrote:
>
> Hi all,
>
> we are seeing quite high GC pressure (in old space by CMS GC
> Algorithm)
> on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap
> memory
> (2G for new space). The node runs fine for couple of days when
> the GC
> activity starts to raise and reaches about 15% of the C*
> activity which
> causes dropped messages and other problems.
>
> Taking a look at heap dump, there is about 8G used by
> SSTableReader
> classes in
> org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>
> Is this something expected and have we just reached the limit
> of how much data a single Cassandra instance can handle, or is it
> possible to tune it better?
>
> Regards
> Jiri Horky
>
>
>
>
> -- 
> Founder/CEO Spinn3r.com 
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
> 
>
>