Compacted_at timestamp

2015-02-08 Thread Havere Shanmukhappa, Santhosh
When I run the nodetool compactionhistory command, it displays the 'compacted_at' timestamp in a non-human-readable format. Is there any way to display that column in a readable format? I am using C* 2.0.11. Thanks, Santo

Re: Mutable primary key in a table

2015-02-08 Thread Colin
Another way to do this is to use a time-based uuid for the primary key (partition key) and to store the user name with that uuid. In addition, you'll need 2 additional tables: one that is used to get the uuid by user name, and another to track user name changes over time, which would be organized
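
A minimal sketch of that layout with the Python driver (keyspace, table, and column names here are illustrative, not from the thread):

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('my_keyspace')  # hypothetical keyspace

    # Immutable timeuuid as the partition key; the user name is a plain column.
    session.execute("""
        CREATE TABLE IF NOT EXISTS users (
            user_id   timeuuid PRIMARY KEY,
            user_name text)""")

    # Lookup table: resolve a user name to its uuid.
    session.execute("""
        CREATE TABLE IF NOT EXISTS users_by_name (
            user_name text PRIMARY KEY,
            user_id   timeuuid)""")

    # History of name changes per user, newest first.
    session.execute("""
        CREATE TABLE IF NOT EXISTS user_name_history (
            user_id    timeuuid,
            changed_at timeuuid,
            user_name  text,
            PRIMARY KEY (user_id, changed_at))
        WITH CLUSTERING ORDER BY (changed_at DESC)""")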

Re: Mutable primary key in a table

2015-02-08 Thread Eric Stevens
It sounds like changing user names is the kind of thing that doesn't happen often, in which case you probably don't have to worry too much about the additional overhead of using logged batches (it's not like you're going to be doing hundreds to thousands of these per second). You probably also want
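
A rough sketch of the rename as a logged batch with the Python driver, reusing the illustrative users/users_by_name tables sketched above:

    from cassandra.cluster import Cluster
    from cassandra.query import BatchStatement

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('my_keyspace')  # hypothetical keyspace

    def rename_user(user_id, old_name, new_name):
        # BatchStatement defaults to a LOGGED batch: all three mutations
        # eventually apply or none do (atomicity, not isolation).
        batch = BatchStatement()
        batch.add("UPDATE users SET user_name = %s WHERE user_id = %s",
                  (new_name, user_id))
        batch.add("INSERT INTO users_by_name (user_name, user_id) VALUES (%s, %s)",
                  (new_name, user_id))
        batch.add("DELETE FROM users_by_name WHERE user_name = %s",
                  (old_name,))
        session.execute(batch)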

Re: Mutable primary key in a table

2015-02-08 Thread Jack Krupansky
What is your full primary key? Specifically, what is the partition key, as opposed to clustering columns? The point is that the partition key for a row is hashed to determine the token for the partition, which in turn determines which node of the cluster owns that partition. Changing the
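
The hashing is easy to observe: CQL's token() function exposes the value the partitioner computes from the partition key alone, and clustering columns never affect it. A small illustration against the hypothetical users table above:

    import uuid
    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('my_keyspace')  # hypothetical keyspace

    # Same partition key -> same token -> same owning replicas.
    row = session.execute(
        "SELECT token(user_id) FROM users WHERE user_id = %s",
        (uuid.UUID('6f1c98e0-afd1-11e4-0000-000000000000'),)).one()
    print(row)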

Re: Mutable primary key in a table

2015-02-08 Thread Colin Clark
No need for CAS in my suggestion - I would try to avoid the use of CAS if at all possible. It’s better in a distributed environment to reduce dimensionality and isolate write/read paths (event sourcing and CQRS patterns). Also, just in general, changing the primary key on an update is

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Kevin Burton
Do you have a lot of individual tables? Or lots of small compactions? I think the general consensus is that (at least for Cassandra) 8GB heaps are ideal. If you have lots of small tables it’s a known anti-pattern (I believe) because the Cassandra internals could do a better job of handling the

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Mark Reddy
Hey Jiri, While I don't have any experience running 4TB nodes (yet), I would recommend taking a look at a presentation by Aaron Morton on large nodes: http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/ to see if you can glean

High GC activity on node with 4TB on data

2015-02-08 Thread Jiri Horky
Hi all, we are seeing quite high GC pressure (in old space, by the CMS GC algorithm) on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory (2G for new space). The node runs fine for a couple of days before the GC activity starts to rise and reaches about 15% of the C* activity, which

Fastest way to map/parallel read all values in a table?

2015-02-08 Thread Kevin Burton
What’s the fastest way to map/parallel read all values in a table? Kind of like a mini map-only job. I’m doing this to compute stats across our entire corpus. What I did to begin with was use token() and then split it into the number of splits I needed. So I just took the total key range space
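
A rough sketch of that token()-splitting approach with the Python driver, assuming the default Murmur3Partitioner and an illustrative corpus table:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('my_keyspace')  # hypothetical keyspace

    MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1  # Murmur3Partitioner token range

    def token_splits(n):
        # Carve the full token space into n contiguous [start, end) ranges.
        step = (MAX_TOKEN - MIN_TOKEN) // n
        bounds = [MIN_TOKEN + i * step for i in range(n)] + [MAX_TOKEN]
        return list(zip(bounds, bounds[1:]))

    def scan_split(start, end):
        # token() pins the query to one slice of the ring; each slice can
        # be handed to its own worker. (The single row at MAX_TOKEN, if
        # any, is skipped in this sketch.)
        return session.execute(
            "SELECT key, value FROM corpus "
            "WHERE token(key) >= %s AND token(key) < %s",
            (start, end))

    for start, end in token_splits(16):
        for row in scan_split(start, end):
            pass  # accumulate stats here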

Re: Compacted_at timestamp

2015-02-08 Thread Mark Reddy
Hi Santo, If you are seeing the compacted_at value as a raw timestamp and want to convert it to a human-readable date, this is not possible via nodetool. You will need to write a script that makes the compactionhistory call and then converts the output (the fourth column, compacted_at) to a readable date.
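
A minimal sketch of such a wrapper in Python (assumes the stock compactionhistory layout, with compacted_at as epoch milliseconds in the fourth column):

    import datetime
    import subprocess

    # Rewrite the fourth column of data rows as a readable date.
    out = subprocess.check_output(['nodetool', 'compactionhistory']).decode()
    for line in out.splitlines():
        cols = line.split()
        if len(cols) > 3 and cols[3].isdigit():
            cols[3] = datetime.datetime.fromtimestamp(
                int(cols[3]) / 1000).isoformat()
        print(' '.join(cols))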

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Colin
The most data I put on a node with spinning disk is 1TB. What are the machine specs? CPU, memory, etc.? And what is the read/write pattern (heavy ingest rate / heavy read rate), and how long do you keep data in the cluster? -- Colin Clark +1 612 859 6129 Skype colin.p.clark

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Francois Richard
Hi Jiri, We do run multiple nodes with 2TB to 4TB of data, and we usually see GC pressure when we create a lot of tombstones. With Cassandra 2.0.x you would be able to see a log line with the following pattern: WARN [ReadStage:7] 2015-02-08 22:55:09,621 SliceQueryFilter.java (line 225) Read 939

RE: Compacted_at timestamp

2015-02-08 Thread Andreas Finke
I recently created a small script that converts this timestamp into a human-readable string and sorts all entries ascending. nodetool compactionhistory | awk '{timestamp = strftime("%a %b %e %H:%M:%S %Z %Y", $4 / 1000); in_m=$5/1024/1024; out_m=$6/1024/1024;

Adding more nodes causes performance problem

2015-02-08 Thread C . B .
I have a cluster with 3 nodes; the only keyspace has a replication factor of 3, and the application reads/writes UUID-keyed data. I use CQL (cassandra-python); most writes are done by execute_async, most reads are done with a consistency level of ONE, and overall performance in this setup is better than I
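
For reference, a sketch of the write/read pattern described, with the Python driver (table and column names are illustrative):

    import uuid
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('my_keyspace')  # hypothetical keyspace

    insert = session.prepare("INSERT INTO data (id, payload) VALUES (?, ?)")
    select = session.prepare("SELECT payload FROM data WHERE id = ?")
    select.consistency_level = ConsistencyLevel.ONE  # reads at CL ONE

    # Fire the writes asynchronously, then block on the futures.
    futures = [session.execute_async(insert, (uuid.uuid4(), 'payload-%d' % i))
               for i in range(1000)]
    for f in futures:
        f.result()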