Question about read consistency level

2013-10-08 Thread graham sanderson
Apologies if this is an obvious question, I have looked but not seen too much (particularly about what exactly latest version means when there is no data on a node for a key - though I'd assume it has to be treated as unknown since you couldn't tell if the data had never been created or the

Question about consistency levels

2013-11-09 Thread graham sanderson
I’m trying to be more succinct this time since no answers on my last attempt. We are currently using 2.0.2 in test (no C* in production yet), and use (LOCAL_)QUORUM CL on read and writes which guarantees (if successful) that we read latest data. That said, it is highly likely that (LOCAL_)ONE
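The guarantee mentioned in the snippet above comes down to arithmetic: a read is certain to overlap the latest successful write when the number of replicas read plus the number of replicas written exceeds the replication factor. A minimal Python sketch of that check, assuming a single data center; the helper names `quorum` and `reads_see_latest_write` are hypothetical and not part of any Cassandra API:

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)  # 2 of 3 replicas
print(reads_see_latest_write(q, q, rf))  # QUORUM read + QUORUM write
print(reads_see_latest_write(1, q, rf))  # ONE read + QUORUM write
```

With a replication factor of 3, QUORUM on both sides overlaps, while a read at ONE after a QUORUM write may land on the one replica that has not yet received the data.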

Re: Question about consistency levels

2013-11-10 Thread graham sanderson
- From: graham sanderson [mailto:gra...@vast.com] Sent: 10 November 2013 06:12 To: user@cassandra.apache.org Subject: Question about consistency levels I'm trying to be more succinct this time since no answers on my last attempt. We are currently using 2.0.2 in test (no C

Disaster recovery question

2013-11-16 Thread graham sanderson
We are currently looking to deploy on the 2.0 line of cassandra, but obviously are watching for bugs (we are currently on 2.0.2) - we are aware of a couple of interesting known bugs to be fixed in 2.0.3 and one in 2.1, but none have been observed (in production use cases) or are likely to

Re: Disaster recovery question

2013-11-16 Thread graham sanderson
:13 PM, Mikhail Stepura mikhail.step...@outlook.com wrote: Looks like someone has the same (1-4) questions: https://issues.apache.org/jira/browse/CASSANDRA-6364 -M graham sanderson wrote in message news:7161e7e0-cf24-4b30-b9ca-2faafb0c4...@vast.com... We are currently looking to deploy

Re: What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-10 Thread graham sanderson
I should probably give you a number which is about 300 meg / s via thrift api and use 1mb batches On Dec 10, 2013, at 5:14 AM, graham sanderson gra...@vast.com wrote: Perhaps not the way forward, however I can bulk insert data via astyanax at a rate that maxes out our (fast) networks

Re: What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-10 Thread graham sanderson
of favour and the CQL interface is in. Where does that leave Astyanax? On Tue, Dec 10, 2013 at 1:14 PM, graham sanderson gra...@vast.com wrote: Perhaps not the way forward, however I can bulk insert data via astyanax at a rate that maxes out our (fast) networks. That said for our next release

Re: Clarification on how multi-DC replication works

2014-02-11 Thread graham sanderson
slightly off topic, but does anyone know off the top of their head what happens if data is being written at LOCAL_QUORUM to a multi data center setup faster than the inter data center link can handle… something has to block, throw an exception, die, or have unbounded growth (memory, threads, on

Re: Consistency Level One Question

2014-02-20 Thread graham sanderson
Writing at a consistency level of ONE means that your write will be acknowledged as soon as one replica confirms that it has made the write to memtable and the commit log (might not be quite synced to disk, but that’s a separate issue). All the writes are submitted in parallel, so it is very
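Because the writes go to all replicas in parallel, the consistency level only decides how many acknowledgements the coordinator waits for. The following Python sketch models that idea with made-up latencies; `ack_latency_ms` is a hypothetical helper for illustration, not Cassandra code:

```python
def ack_latency_ms(replica_latencies_ms, required_acks):
    """Time until the coordinator can acknowledge a write.

    Every replica receives the write at the same time, so the coordinator
    only waits for the `required_acks` fastest replicas to respond.
    """
    if not 1 <= required_acks <= len(replica_latencies_ms):
        raise ValueError("required_acks must be between 1 and the replica count")
    return sorted(replica_latencies_ms)[required_acks - 1]


latencies = [4.0, 9.0, 30.0]         # hypothetical per-replica times, in ms
print(ack_latency_ms(latencies, 1))  # ONE: the fastest replica
print(ack_latency_ms(latencies, 2))  # QUORUM with RF=3: the second fastest
print(ack_latency_ms(latencies, 3))  # ALL: the slowest replica
```

In this model a write at ONE returns after the fastest replica answers, yet the other replicas still receive the same write.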

Re: Consistency Level One Question

2014-02-20 Thread graham sanderson
Note also that when reading at ONE there will be no read repair, since the coordinator does not know that another replica has stale data (remember at ONE, basically only one node is asked for the answer). In practice for our use cases, we always write at LOCAL_QUORUM (failing the whole update if

Re: Consistency Level One Question

2014-02-21 Thread graham sanderson
carry out read repair by getting data from all the nodes. */ On Feb 21, 2014, at 3:10 AM, Duncan Sands duncan.sa...@gmail.com wrote: Hi Graham, On 21/02/14 07:54, graham sanderson wrote: Note also; that reading at ONE there will be no read repair, since the coordinator does not know

Re: read one -- internal behavior

2014-03-08 Thread graham sanderson
Note that article pretty much covers it all; the nice thing about rapid-read protection is that the dynamic snitch works on a per node statistics level to pick which node(s) (in this case one), so a single poorly performing table (perhaps corrupted SSTables on that node causing no responses and

binary protocol server side sockets

2014-04-08 Thread graham sanderson
Is there a way to configure KEEPALIVE on the server end sockets of the binary protocol. rpc_keepalive only affects thrift. This is on 2.0.5 Thanks, Graham

Re: binary protocol server side sockets

2014-04-09 Thread graham sanderson
DOAN On Wed, Apr 9, 2014 at 12:59 AM, graham sanderson gra...@vast.com wrote: Is there a way to configure KEEPALIVE on the server end sockets of the binary protocol. rpc_keepalive only affects thrift. This is on 2.0.5 Thanks, Graham

Re: binary protocol server side sockets

2014-04-09 Thread graham sanderson
be the VPNs fault in this case… that said and maybe this is a dev list question, it seems like the option to set keepalive should exist. On Apr 9, 2014, at 12:25 PM, Michael Shuler mich...@pbandjelly.org wrote: On 04/09/2014 11:39 AM, graham sanderson wrote: Thanks, but I would think that just sets

Re: binary protocol server side sockets

2014-04-09 Thread graham sanderson
particular harm to setting keepalive) On Apr 9, 2014, at 1:34 PM, Michael Shuler mich...@pbandjelly.org wrote: On 04/09/2014 12:41 PM, graham sanderson wrote: Michael, it is not that the connections are being dropped, it is that the connections are not being dropped. Thanks for the clarification

Re: Question about READS in a multi DC environment.

2014-05-11 Thread graham sanderson
You have a read_repair_chance of 1.0 which is probably why your query is hitting all data centers. On May 11, 2014, at 3:44 PM, Mark Farnan devm...@petrolink.com wrote: Im trying to understand READ load in Cassandra across a multi-datacenter cluster. (Specifically why it seems to be

Re: Cyclop - CQL web based editor has been released!

2014-05-11 Thread graham sanderson
Looks cool - giving it a try now (note FYI when building, TestDataConverter.java line 46 assumes a specific time zone) On May 11, 2014, at 12:41 AM, Maciej Miklas mac.mik...@gmail.com wrote: Hi everybody, I am aware that this mailing list is meant for Cassandra users, but I’ve developed

Re: Question about READS in a multi DC environment.

2014-05-15 Thread graham sanderson
at 2:07 AM, graham sanderson gra...@vast.com wrote: You have a read_repair_chance of 1.0 which is probably why your query is hitting all data centers. On May 11, 2014, at 3:44 PM, Mark Farnan devm...@petrolink.com wrote: Im trying to understand READ load in Cassandra across a multi

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-16 Thread graham sanderson
Just a few data points from our experience. One of our use cases involves storing a periodic full base state for millions of records, then fairly frequent delta updates to subsets of the records in between. C* is great for this because we can read the whole row (or up to the clustering

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread graham sanderson
My 2 cents… A motivation for CQL3 AFAIK was to make Cassandra more familiar to SQL users. This is a valid goal, and works well in many cases. Equally there are use cases (that some might find ugly) where Cassandra is chosen explicitly because of the sorts of things you can do at the thrift

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread graham sanderson
, e.g. schema changes - in general, 'triggers' become possible. ml On Fri, Jun 13, 2014 at 6:21 PM, graham sanderson gra...@vast.com wrote: My 2 cents… A motivation for CQL3 AFAIK was to make Cassandra more familiar to SQL users. This is a valid goal, and works well in many cases

Re: Pattern to store maps of maps...

2014-06-13 Thread graham sanderson
My personal opinion is that unless you are doing map operations on a CQL3 map and will always intend to read the whole thing (you don’t have any choice today), don’t use one at all - use a blob of whatever variety makes sense (e.g. Json, AVRO, Protobuf etc) On Jun 13, 2014, at 7:17 PM, Kevin

Re: Write Inconsistency to update a row

2014-07-03 Thread graham sanderson
What is your keyspace replication_factor? What consistency level are you reading/writing with? Does the data show up eventually? I’m assuming you don’t have any errors (timeouts etc) on the write side. On Jul 3, 2014, at 7:55 AM, Sávio S. Teles de Oliveira savio.te...@cuia.com.br wrote: I

Re: ghost table is breaking compactions and won't go away… even during a drop.

2014-07-16 Thread graham sanderson
Known issue deleting and recreating a CF with the same name, fixed in 2.1 (manifests in lots of ways) https://issues.apache.org/jira/browse/CASSANDRA-5202 On Jul 16, 2014, at 8:53 PM, Kevin Burton bur...@spinn3r.com wrote: looks like a restart of cassandra and a nodetool compact fixed this…

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-22 Thread graham sanderson
base on growing from 1-2 nodes? frustrating :-( On Tue, Jul 22, 2014 at 8:13 PM, graham sanderson gra...@vast.com wrote: Incorrect, ONE does not refer to the number of “other” nodes, it just refers to the number of nodes. So ONE under normal circumstances would only require one node

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread graham sanderson
Hey now; it is GREAT for a 100% write only use case ;-) On Jul 23, 2014, at 12:15 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 22, 2014 at 7:46 PM, Andrew redmu...@gmail.com wrote: ONE means write to one replica (in addition to the original). If you want to write to any of them,

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread graham sanderson
I was being a little tongue in cheek! On Jul 23, 2014, at 3:20 PM, Jack Krupansky j...@basetechnology.com wrote: Granted, for “normal” apps it is unlikely to be appropriate but... From an old post by Jonathan: --- Extreme write availability For applications that want Cassandra to

Re: Does SELECT … IN () use parallel dispatch?

2014-07-25 Thread Graham Sanderson
Of course the driver in question is allowed to be smarter and can do so if you use a ? parameter for a list or even individual elements. I'm not sure which if any drivers currently do this, but we plan to combine this with token aware routing in our scala driver in the future Sent from my

Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
We recently upgraded C* from 2.0.5 to 2.0.9. We have some data that is partitioned in tables created periodically (once a day). This morning, this automated process timed out because the schema did not reach agreement quickly enough after we created a new empty table. I was able to reproduce

Re: Delete By Partition Key Implementation

2014-08-08 Thread graham sanderson
A deletion of an entire row is a single row tombstone, and yes there are range tombstones for marking deletion of a range of columns also On Aug 8, 2014, at 2:17 PM, Kevin Burton bur...@spinn3r.com wrote: This is a good question.. I'd love to find out the answer. Seems like a tombstone with

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
Ok thanks - I guess I can at least enable the debug logging added for that issue to see if it is deliberately choosing not to pull the schema… no repro case, but it may happen again! On Aug 8, 2014, at 4:21 PM, Robert Coli rc...@eventbrite.com wrote: On Fri, Aug 8, 2014 at 1:45 PM, graham

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
… if it happens again, I’ll have some more context to dig deeper, before just getting in and fixing the problem by restarting the nodes which I did today. On Aug 8, 2014, at 4:37 PM, graham sanderson gra...@vast.com wrote: Ok thanks - I guess I can at least enable the debug logging added for that issue

Re: Is per-table memory overhead due to SSTables or tables?

2014-08-08 Thread graham sanderson
See https://issues.apache.org/jira/browse/CASSANDRA-5935 2.1 has a radically different implementation that side steps this (with off heap memtables), but if you really want lots of tables now you can do so as a trade off against GC behavior. The problem is not SSTables per se, but more

Re: Is per-table memory overhead due to SSTables or tables?

2014-08-08 Thread graham sanderson
, graham sanderson gra...@vast.com wrote: See https://issues.apache.org/jira/browse/CASSANDRA-5935 2.1 has a radically different implementation that side steps this (with off heap memtables), but if you really want lots of tables now you can do so as a trade off against GC behavior

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this? - knowsVersion may get stuck as false?

2014-08-10 Thread graham sanderson
version information for node B. On Aug 8, 2014, at 5:06 PM, graham sanderson gra...@vast.com wrote: Actually I think it is a different issue (or a freak issue)… the invocation in InternalResponseStage is part of the “schema pull” mechanism this ticket relates to, and in my case

Re: OOM(Java heap space) on start-up during commit log replaying

2014-08-12 Thread graham sanderson
Agreed need more details; and just start by increasing heap because that may well solve the problem. I have just observed (which makes sense when you think about it) while testing fix for https://issues.apache.org/jira/browse/CASSANDRA-7546, that if you are replaying a commit log which has a

Re: update static column using partition key

2014-09-07 Thread graham sanderson
Presumably you meant unread_ids to be a static column (it isn’t in your table definition) On Sep 7, 2014, at 10:14 AM, tommaso barbugli tbarbu...@gmail.com wrote: Hi, I am trying to use a couple of static columns; I am using cassandra 2.0.7 and when I try to set a value using the partition

Re: update static column using partition key

2014-09-07 Thread graham sanderson
Note also (though you are likely not hitting them) there were a bunch of static column related edge cases fixed in 2.0.10 On Sep 7, 2014, at 1:18 PM, graham sanderson gra...@vast.com wrote: Presumably you meant unread_ids to be a static column (it isn’t in your table definition) On Sep 7

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread graham sanderson
delete inserts a tombstone which is likely smaller than the original record (though it still (currently) has the overhead cost of the full key/column name). The data for the insert after a delete would be identical to the data if you just inserted/updated. No real benefit I can think of for doing the

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread graham sanderson
one op more to compute resulting row. cheers, Olek 2014-09-10 22:18 GMT+02:00 graham sanderson gra...@vast.com: delete inserts a tombstone which is likely smaller than the original record (though still (currently) has overhead of cost for full key/column name the data for the insert after

Re: java.lang.OutOfMemoryError: unable to create new native thread

2014-09-17 Thread graham sanderson
Are you running on a 32 bit JVM? On Sep 17, 2014, at 9:43 AM, Yatong Zhang bluefl...@gmail.com wrote: Hi there, I am using leveled compaction strategy and have many sstable files. The error was during the startup, so any idea about this? ERROR [FlushWriter:4] 2014-09-17 22:36:59,383

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
It is expecting a 64 bit value … Murmur3 partitioner uses 64 bit long tokens… where did you get your 128 bit long from, and what partitioner are you using? On Sep 28, 2014, at 1:39 PM, Kevin Burton bur...@spinn3r.com wrote: I’m trying to query an entire table in parallel by splitting it up in

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
throughout the entire possible range of tokens (0 to 2^127 - 1) so it would need to be 2^63 - 1 or 2^127 - 1 On Sun, Sep 28, 2014 at 1:19 PM, graham sanderson gra...@vast.com wrote: It is expecting a 64 bit value … Murmur3 partitioner uses 64 bit long tokens… where did you get your 128 bit
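For the parallel scan discussed in this thread, the 64 bit token space can be cut into equal sub-ranges. A minimal Python sketch, assuming the Murmur3 partitioner (signed 64 bit tokens); `my_table` and `pk` are placeholder names, and `split_token_range` is a hypothetical helper rather than a driver function:

```python
MIN_TOKEN = -2**63      # smallest signed 64 bit token
MAX_TOKEN = 2**63 - 1   # largest signed 64 bit token


def split_token_range(num_splits):
    """Return (start, end) token pairs that together cover the whole ring."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + total * i // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


for i, (start, end) in enumerate(split_token_range(4)):
    # The first range includes its lower bound so no token is skipped.
    lower_op = ">=" if i == 0 else ">"
    print(f"SELECT * FROM my_table WHERE token(pk) {lower_op} {start} "
          f"AND token(pk) <= {end};")
```

Each printed statement can then be run by a separate worker; every bound fits in a signed 64 bit long, which is what the error in the subject line is complaining about.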

Re: best practice for waiting for schema changes to propagate

2014-09-30 Thread graham sanderson
Also be aware of https://issues.apache.org/jira/browse/CASSANDRA-7734 if you are using C* 2.0.6+ (2.0.6 introduced a change that can sometimes cause initial schema propagation not to happen, introducing potentially long delays until some other code path repairs it later) On Sep 30, 2014, at

Re: Bitmaps

2014-10-06 Thread graham sanderson
You certainly have plenty of freedom to trade off size vs access granularity using multiple blobs. It really depends on how mutable the data is, how you intend to read it, whether it is highly sparse and/or highly dense (in which case you perhaps don’t need to store every bit) etc. On Oct 6,

Re: describe tables… and vertical formatting?

2014-10-12 Thread graham sanderson
select keyspace_name, columnfamily_name from system.schema_columns; ? On Oct 12, 2014, at 10:29 AM, Kevin Burton bur...@spinn3r.com wrote: It seems annoying that I can’t get “describe tables” to vertical. maybe there’s some option I’m missing? Kevin -- Founder/CEO Spinn3r.com

Re: LOCAL_* consistency levels

2014-10-14 Thread graham sanderson
There were some versions of C* that didn’t allow you to use LOCAL_* and a single DC NetworkTopologyStrategy, or with SimpleStrategy. https://issues.apache.org/jira/browse/CASSANDRA-6238 I think you should use a NetworkTopologyStrategy with one DC for now. On Oct 14, 2014, at 7:39 AM,

Re: describe tables… and vertical formatting?

2014-10-14 Thread graham sanderson
. The problem now is that there are multiple entries per table... On Sun, Oct 12, 2014 at 10:39 AM, graham sanderson gra...@vast.com wrote: select keyspace_name, columnfamily_name from system.schema_columns; ? On Oct 12, 2014, at 10:29 AM, Kevin Burton bur...@spinn3r.com wrote: It seems

Re: Intermittent long application pauses on nodes

2014-10-24 Thread graham sanderson
Actually - there is -XX:+SafepointTimeout which will print out offending threads (assuming you reach a 10 second pause)… That is probably your best bet. On Oct 24, 2014, at 2:38 PM, graham sanderson gra...@vast.com wrote: This certainly sounds like a JVM bug. We are running C* 2.0.9

Re: Intermittent long application pauses on nodes

2014-10-24 Thread graham sanderson
And -XX:SafepointTimeoutDelay=xxx to set how long before it dumps output (defaults to 1 I believe)… Note it doesn’t actually timeout by default, it just prints the problematic threads after that time and keeps on waiting On Oct 24, 2014, at 2:44 PM, graham sanderson gra...@vast.com wrote

Re: Intermittent long application pauses on nodes

2014-10-31 Thread graham sanderson
if that gives us anything to act on. On Fri, Oct 24, 2014 at 3:52 PM, graham sanderson gra...@vast.com wrote: And -XX:SafepointTimeoutDelay=xxx to set how long before it dumps output (defaults to 1 I believe)… Note it doesn’t actually timeout by default, it just

Re: Client-side compression, cassandra or both?

2014-11-03 Thread graham sanderson
I wouldn’t do both. Unless a little server CPU or (and you’d have to measure it - I imagine it is probably not significant - as you say C* has more context, and hopefully most things can compress “0, “ repeatedly) disk space are an issue, I wouldn’t bother to compress yourself. Compression

Re: Why is one query 10 times slower than the other?

2014-11-05 Thread graham sanderson
In your “lookup_code” example “type” is not a clustering column, it is the partition key, and hence the first query only hits one partition. The second query is a range slice across all possible keys, so the sub-ranges are farmed out to nodes with the data. You are likely at CL_ONE, so it only needs

Re: What actually causing java.lang.OutOfMemoryError: unable to create new native thread

2014-11-10 Thread graham sanderson
First question are you running 32bit or 64bit… on 32bit you can easily run out of virtual address space for thread stacks. On Nov 10, 2014, at 8:25 AM, Jason Wee peich...@gmail.com wrote: Hello people, below is an extraction from cassandra system log. ERROR [Thread-273] 2012-04-10

Re: Trying to build Cassandra for FreeBSD 10.1

2014-11-17 Thread graham sanderson
Only thing I can see from looking at the exception, is that it looks like - I didn’t disassemble the code from hex - that the “peer” value in the RefCountedMemory object is probably 0 Given that Unsafe.allocateMemory should not return 0 even on allocation failure (which should throw OOM) -

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
Your GC settings would be helpful, though you can guesstimate by eyeballing (assuming settings are the same across all 4 images). Bursty load can be a big cause of old gen fragmentation (as small working set objects tend to get spilled (promoted) along with memtable slabs which aren’t

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
, 2014, at 6:54 PM, graham sanderson gra...@vast.com wrote: Your GC settings would be helpful, though you can see guesstimate by eyeballing (assuming settings are the same across all 4 images) Bursty load can be a big cause of old gen fragmentation (as small working set objects tends

Re: Error when dropping keyspaces; One row required, 0 found

2014-12-02 Thread graham sanderson
I don’t know what it is but I also saw “empty” keyspaces via CQL while migrating an existing test cluster from 2.0.9 to 2.1.0 (final release bits prior to labelling). Since I was doing this manually (and had cqlsh problems due to python change) I figured it might have been me. My observation

Re: No schema agreement from live replicas?

2015-02-03 Thread graham sanderson
What version of C* are you using; you could be seeing https://issues.apache.org/jira/browse/CASSANDRA-7734 which I think affects 2.0.7 thru 2.0.10 On Feb 3, 2015, at 9:47 AM, Clint Kelly clint.ke...@gmail.com wrote: FWIW increasing the

Re: Versioning in cassandra while indexing ?

2015-01-21 Thread graham sanderson
I believe you can use “USING TIMESTAMP XXX” with your inserts which will set the actual cell write times to the timestamp you provide. Then at least on read you’ll get the “latest” value… you may or may not incur an actual write of the old data to disk, but either way it’ll get cleaned up for

Re: Startup failure (Core dump) in Solaris 11 + JDK 1.8.0

2015-01-13 Thread graham sanderson
This might well be https://issues.apache.org/jira/browse/CASSANDRA-8325 try the latest patch for that if you can. On Jan 13, 2015, at 4:50 AM, Bernardino Mota bernardino.m...@inovaworks.com wrote: Hi, Yes, with JDK1.7 it works but

Re: Fastest way to map/parallel read all values in a table?

2015-02-09 Thread graham sanderson
Depending on whether you have deletes/updates, if this is an ad-hoc thing, you might want to just read the SSTables directly. On Feb 9, 2015, at 12:56 PM, Kevin Burton bur...@spinn3r.com wrote: I had considered using spark for this but: 1. we tried to deploy spark only to find out that

Re: Upgrade from 2.0.9 to 2.1.3

2015-03-06 Thread graham sanderson
:15 PM, Robert Coli rc...@eventbrite.com wrote: On Fri, Mar 6, 2015 at 6:25 AM, graham sanderson gra...@vast.com wrote: I would definitely wait for at least 2.1.4 +1 https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ https

Re: best practices for time-series data with massive amounts of records

2015-03-06 Thread graham sanderson
Note that using static column(s) for the “head” value, and trailing TTLed values behind is something we’re considering. Note this is especially nice if your head state includes say a map which is updated by small deltas (individual keys). We have not yet studied the effect of static columns on

Re: What are the reasons for holding off on 2.1.x at this point?

2015-03-09 Thread graham sanderson
2.1.3 has a few memory leaks/issues, resource management race conditions. That is horribly vague, however looking at some of the fixes in 2.1.4 I’d be tempted to wait on that. 2.1.3 is fine for testing though. On Mar 9, 2015, at 6:42 PM, Jacob Rhoden jacob.rho...@me.com wrote: I notice

Re: Upgrade from 2.0.9 to 2.1.3

2015-03-06 Thread graham sanderson
I would definitely wait for at least 2.1.4 On Mar 6, 2015, at 8:13 AM, Fredrik Larsson Stigbäck fredrik.l.stigb...@sitevision.se wrote: So no upgradeSSTables are required? /Fredrik 6 mar 2015 kl. 15:11 skrev Carlos Rolo r...@pythian.com: I would not

Re: Disastrous profusion of SSTables

2015-03-26 Thread graham sanderson
you may be seeing https://issues.apache.org/jira/browse/CASSANDRA-8860 https://issues.apache.org/jira/browse/CASSANDRA-8635 related issues (which ends up with excessive numbers of

Re: OOM and high SSTables count

2015-03-04 Thread graham sanderson
We can confirm a problem on 2.1.3 (sadly our beta sstable state obviously did not match our production ones in some critical way). We have about 20k sstables on each of 6 nodes right now; actually a quick glance shows 15k of those are from OpsCenter, which may have something to do with

DateTieredCompactionStrategy and static columns

2015-04-30 Thread graham sanderson
I have a potential use case I haven’t had a chance to prototype yet, which would normally be a good candidate for DTCS (i.e. data delivered in order and a fixed TTL), however with every write we’d also be updating some static cells (namely a few key/values in a static map<text, text> CQL column).

cassandra 2.2

2015-05-11 Thread graham sanderson
I think vast may have changed the release schedule of cassandra. I talk a lot with one of their key developers, and 3.0 was going to drop off heap memtables for several releases due to a rewrite of the storage engine to be more CQL friendly. 2.2 will take all of the improvements in 3.0 but not

Re: Uderstanding Read after update

2015-04-13 Thread Graham Sanderson
Yes it will look in each sstable that according to the bloom filter may have data for that partition key and use time stamps to figure out the latest version (or none in case of newer tombstone) to return for each clustering key Sent from my iPhone On Apr 12, 2015, at 11:18 PM, Anishek
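The read path described in this reply can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: each SSTable is a pair of a key set standing in for its bloom filter (which may contain false positives) and a dict of cells, a `value` of `None` marks a tombstone, and timestamp ties are not modeled. `Cell` and `read_cell` are hypothetical names:

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    timestamp: int          # write time in microseconds
    value: Optional[str]    # None marks a tombstone (deletion)


def read_cell(sstables, partition_key):
    """Return the newest value for a key, or None if absent or deleted."""
    candidates = []
    for might_contain, data in sstables:
        if partition_key not in might_contain:
            continue  # filter says "definitely not here": skip this SSTable
        if partition_key in data:  # the filter can be wrong, so check the data
            candidates.append(data[partition_key])
    if not candidates:
        return None
    newest = max(candidates, key=lambda cell: cell.timestamp)
    return newest.value  # a newer tombstone hides older values


sstables = [
    ({"k1"}, {"k1": Cell(100, "old")}),
    ({"k1", "k2"}, {"k1": Cell(200, "new")}),   # "k2" is a false positive here
    ({"k2"}, {"k2": Cell(150, None)}),          # tombstone for "k2"
]
print(read_cell(sstables, "k1"))
print(read_cell(sstables, "k2"))
```

The same rule explains why an update does not need to find and rewrite the old value: the older cell simply loses the timestamp comparison at read time and is dropped later by compaction.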

Re: Huge number of sstables after adding server to existing cluster

2015-04-03 Thread graham sanderson
As does 2.1.3 On Apr 3, 2015, at 5:36 PM, Robert Coli rc...@eventbrite.com wrote: On Fri, Apr 3, 2015 at 1:04 PM, Thomas Borg Salling tbsall...@tbsalling.dk wrote: I agree with Pranay. I have experienced exactly the same on C* 2.1.2. 2.1.2 had a serious bug

Re: Astyanax Thrift Frame Size Hardcoded - Breaks Ring Describe

2015-04-03 Thread graham sanderson
It is very stable for us; we don’t use it in many cases (generally older stuff where it was the best choice), but I think it is a little harsh to write it off On Apr 3, 2015, at 1:55 PM, Robert Coli rc...@eventbrite.com wrote: On Fri, Apr 3, 2015 at 11:16 AM, Eric Stevens migh...@gmail.com

Re: Huge number of sstables after adding server to existing cluster

2015-04-04 Thread graham sanderson
number for sstables for normally operating cassandra node? Best regards Mantas On Sat, Apr 4, 2015 at 4:47 AM, graham sanderson gra...@vast.com wrote: As does 2.1.3 On Apr 3, 2015, at 5:36 PM, Robert Coli rc...@eventbrite.com wrote

Re: Throttle Heavy Read / Write Loads

2015-06-05 Thread Graham Sanderson
Are you doing large batch inserts via thrift - you need to be careful there Sent from my iPhone On Jun 4, 2015, at 11:37 PM, Anishek Agarwal anis...@gmail.com wrote: may be just increase the read and write timeouts at cassandra currently at 5 sec i think. i think the datastax java client

Re: Question about consistency in cassandra 2.0.9

2015-06-11 Thread graham sanderson
It looks (I’m guessing with entirely not enough info) that you only have two nodes in DC4, and are probably writing at QUORUM reading at LOCAL_ONE. But please specify your configuration On Jun 11, 2015, at 7:01 PM, K F kf200...@yahoo.com wrote: Hi, I am running a cassandra cluster with 4

Re: Cassandra 2.2, 3.0, and beyond

2015-06-11 Thread graham sanderson
I think the point is that 2.2 will replace 2.1.x + (i.e. the done/safe bits of 3.0 are included in 2.2).. so 2.2.x and 2.1.x are somewhat synonymous. On Jun 11, 2015, at 8:14 PM, Mohammed Guller moham...@glassbeam.com wrote: Considering that 2.1.6 was just released and it is the first

Re: GC pauses affecting entire cluster.

2015-06-01 Thread graham sanderson
Yes native_objects is the way to go… you can tell if memtables are your problem because you’ll see promotion failures of objects sized 131074 dwords. If your h/w is fast enough make your young gen as big as possible - we can collect 8G in sub second always, and this gives you your best chance of

Re: 10000+ CF support from Cassandra

2015-05-26 Thread graham sanderson
Are the CFs different, or all the same schema? Are you contractually obligated to actually separate data into separate CFs? It seems like you’d have a lot simpler time if you could use part of the partition key to separate data. Note also, I don’t know what disks you are using, but disk

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Graham Sanderson
Depending on your use case and data types (for example if you can have a minimally nested JSON representation of the objects; then you could go with a common map<string,string> representation where keys are top level object fields and values are valid JSON literals as strings; eg unquoted

Re: 10000+ CF support from Cassandra

2015-06-01 Thread graham sanderson
I strongly advise against this approach. Jon, I think so too. But do you actually foresee any problems with this approach? I can think of a few. [I want to evaluate if we can live with this problem] Just to be clear, I’m not saying this is a great approach, I AM saying that it may be better

Re: How to measure disk space used by a keyspace?

2015-07-01 Thread graham sanderson
If you are pushing metric data to graphite, there is org.apache.cassandra.metrics.keyspace.keyspace_name.LiveDiskSpaceUsed.value … for each node; Easy enough to graph the sum across machines. Metrics/JMX are tied together in C*, so there is an equivalent value exposed via JMX… I don’t know

Re: What are problems with schema disagreement

2015-07-02 Thread graham sanderson
What version of C* are you running? Some versions of 2.0.x might occasionally fail to propagate schema changes in a timely fashion (though they would fix themselves eventually - in the order of a few minutes) On Jul 2, 2015, at 9:37 PM, John Wong gokoproj...@gmail.com wrote: Hi. Here is

Re: Slow performance because of used-up Waste in AtomicBTreeColumns

2015-07-23 Thread Graham Sanderson
Multiple writes to a single partition key are guaranteed to be atomic. Therefore there has to be some protection. First rule of thumb, don’t write at insanely high rates to the same partition key concurrently (you can probably avoid this, but hints as currently implemented suffer because the

Re: Bulk loading performance

2015-07-13 Thread Graham Sanderson
Ironically in my experience the fastest ways to get data into C* are considered “anti-patterns” by most (but I have no problem saturating multiple gigabit network links if I really feel like inserting fast) It’s been a while since I tried some of the newer approaches though (my fast load code

Re: High cpu usage when the cluster is idle

2015-10-24 Thread Graham Sanderson
I would imagine you are running on fairly slow machines (given the CPU usage), but 2.0.12 and 2.1 use a fairly old version of the yammer/codahale metrics library. It is waking up every 5 seconds, and updating Meters… there are a bunch of these Meters per table (embedded in Timers), so your

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
distributed database technology, > delivering Apache Cassandra to the world’s most innovative enterprises. > Datastax is built to be agile, always-on, and predictably scalable to any > size. With more than 500 customers in 45 countries, DataStax is the database > technology and

Re: compression cpu overhead

2015-11-03 Thread Graham Sanderson
On read or write? https://issues.apache.org/jira/browse/CASSANDRA-7039 and friends in 2.2 should make some difference, I didn’t immediately find perf numbers though. > On Nov 3, 2015, at 5:42 PM, Dan Kinder wrote: >

Re: why cassanra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Agreed too. It also matters what you are inserting… if you are inserting to the same (or small set of) partition key(s) you will be limited because writes to the same partition key on a single node are atomic and isolated. > On Nov 5, 2015, at 8:49 PM, Venkatesh Arivazhagan

Re: why cassanra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Also it sounds like you are reading the data from a single file - the problem could easily be with your load tool; try (as someone suggested) using cassandra-stress > On Nov 5, 2015, at 9:06 PM, Graham Sanderson <gra...@vast.com> wrote: > > Agreed too. It also matters what yo

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
you didn’t say what you upgraded from, but if it is 2.0.x, then look at CASSANDRA-9504 If so and you use commitlog_sync: batch Then you probably want to set commitlog_sync_batch_window_in_ms: 1 (or 2) Note I’m only slightly convinced this is the cause because of your READ_REPAIR issues (though

BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
If you had Cassandra 2.0.x (possibly before) and upgraded to Cassandra 2.1, you may have had commitlog_sync: batch commitlog_sync_batch_window_in_ms: 25 in your cassandra.yaml It turned out that this was pretty much broken in 2.0 (i.e. fsyncs just happened immediately), but fixed in 2.1,
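A quick way to spot the setting described above is to check the parsed configuration before upgrading. This is only a sketch: the function name is hypothetical, it assumes cassandra.yaml has already been loaded into a plain dict, and the 2 ms threshold is taken from the values suggested elsewhere in these threads, not from any official tool.

```python
def risky_batch_window(config, max_window_ms=2):
    """Return True if batch commitlog sync is configured with a window above max_window_ms."""
    if config.get("commitlog_sync") != "batch":
        return False
    window = config.get("commitlog_sync_batch_window_in_ms")
    return window is not None and window > max_window_ms

# Settings quoted in the message, as they would appear once parsed.
old_settings = {"commitlog_sync": "batch", "commitlog_sync_batch_window_in_ms": 25}
flagged = risky_batch_window(old_settings)
```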

Re: BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
an issue > On Oct 19, 2015, at 11:37 AM, Graham Sanderson <gra...@vast.com> wrote: > > - commitlog_sync_batch_window_in_ms behavior has changed from the > maximum time to wait between fsync to the minimum time. We are > working on making this more user-friendl

Re: unusual GC log

2015-10-20 Thread Graham Sanderson
What version of C* are you running? any special settings in cassandra.yaml; are you running with stock GC settings in cassandra-env.sh? what JDK/OS? > On Oct 19, 2015, at 11:40 PM, 曹志富 wrote: > > INFO [Service Thread] 2015-10-20 10:42:47,854 GCInspector.java:252 - ParNew

Re: BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
from starving. The suggested default is now 2ms. was added retroactively to NEWS.txt in 2.1.6 which is why it is not obvious > On Oct 19, 2015, at 11:03 AM, Michael Shuler <mich...@pbandjelly.org> wrote: > > On 10/19/2015 10:55 AM, Graham Sanderson wrote: >> If you had Cass

Re: Realtime data and (C)AP

2015-10-10 Thread Graham Sanderson
> I've used the Java driver's DowngradingConsistencyRetryPolicy for that in > cases where it makes sense. > > Ref: > http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DowngradingConsistencyRetryPolicy.html > > Steve > > > >> On Fri, Oct

Re: Realtime data and (C)AP

2015-10-11 Thread Graham Sanderson
me other cool > things like integrate Zipkin tracing at a driver level, and add other utility > like token aware batches, and concurrent token aware batch selects. > > On Sat, Oct 10, 2015 at 2:49 PM Graham Sanderson <gra...@vast.com > <mailto:gra...@vast.com>>

Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
Most of our writes are not user facing so local_quorum is good... We also read at local_quorum because we prefer guaranteed consistency... But we very quickly fall back to local_one in the cases where some data fast is better than a failure. Currently we do that on a per read basis but we could
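The per-read fallback described above might be sketched like this. It is driver-agnostic and hypothetical throughout: `ReadFailed`, `read_with_fallback`, and the `execute(query, level)` callable are illustrative names, not part of any Cassandra driver, and a real implementation would catch the driver's own timeout and unavailable errors instead.

```python
class ReadFailed(Exception):
    """Stand-in for whatever error a driver raises when a read cannot be satisfied."""

def read_with_fallback(execute, query, levels=("LOCAL_QUORUM", "LOCAL_ONE")):
    """Try each consistency level in order; return (level used, result) for the first success."""
    last_error = None
    for level in levels:
        try:
            return level, execute(query, level)
        except ReadFailed as error:
            last_error = error  # remember the failure and try the next, weaker level
    raise last_error

def flaky_execute(query, level):
    """Fake executor that simulates a quorum failure, for demonstration only."""
    if level == "LOCAL_QUORUM":
        raise ReadFailed("not enough replicas")
    return ["row-for-" + query]

used_level, rows = read_with_fallback(flaky_execute, "q1")
```

Returning the level that actually served the read lets the caller know when the result may be stale.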

Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
> On Oct 9, 2015, at 8:02 PM, Graham Sanderson <gra...@vast.com> wrote: > > Most of our writes are not user facing so local_quorum is good... We also > read at local_quorum because we prefer guaranteed consistency... But we very > quickly fall back to local_one in the cases
