Timeout reading row from CF with collections
I'm running into a problem trying to read data from a column family that includes a number of collections.

Cluster details:
- 4 nodes running 1.2.6 on VMs with 4 CPUs and 7 GB of RAM
- RAID 0 striped across 4 disks for the data and logs
- each node has about 500 MB of data currently loaded

Here is the schema:

create table user_scores (
    user_id varchar,
    post_type varchar,
    score double,
    team_to_score_map map<varchar, double>,
    affiliation_to_score_map map<varchar, double>,
    campaign_to_score_map map<varchar, double>,
    person_to_score_map map<varchar, double>,
    primary key(user_id, post_type)
) with compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 10 };

I used the leveled compaction strategy as I thought it would help with read latency…

Here is a trace of a simple select against the cluster while nothing else was reading or writing (CPU was 2%):

 activity                                                            | timestamp    | source         | source_elapsed
---------------------------------------------------------------------+--------------+----------------+---------------
 execute_cql3_query                                                  | 05:51:34,557 | 100.69.176.51  |              0
 Message received from /100.69.176.51                                | 05:51:34,195 | 100.69.184.134 |            102
 Executing single-partition query on user_scores                     | 05:51:34,199 | 100.69.184.134 |           3512
 Acquiring sstable references                                        | 05:51:34,199 | 100.69.184.134 |           3741
 Merging memtable tombstones                                         | 05:51:34,199 | 100.69.184.134 |           3890
 Key cache hit for sstable 5                                         | 05:51:34,199 | 100.69.184.134 |           4040
 Seeking to partition beginning in data file                         | 05:51:34,199 | 100.69.184.134 |           4059
 Merging data from memtables and 1 sstables                          | 05:51:34,200 | 100.69.184.134 |           4412
 Parsing select * from user_scores where user_id='26257166' LIMIT 1; | 05:51:34,558 | 100.69.176.51  |             91
 Peparing statement                                                  | 05:51:34,558 | 100.69.176.51  |            238
 Enqueuing data request to /100.69.184.134                           | 05:51:34,558 | 100.69.176.51  |            567
 Sending message to /100.69.184.134                                  | 05:51:34,558 | 100.69.176.51  |            979
 Request complete                                                    | 05:51:54,562 | 100.69.176.51  |       20005209

You can see that I increased the timeout and it still fails.
This seems to happen with rows that have maps with a larger number of entries. It is very reproducible with my current data set. Any ideas on why I can't query for a row? Thanks! Paul
Cassandra-CQL-Csharp-driver-sample
Hi, I created a very simple CRUD sample using the Cassandra CQL C# driver. If anybody is interested, please try it out; feedback and comments are welcome. https://github.com/muralidharand/cassandra-CQL-csharp-driver-sample -- Thanks, Murali
Re: Node tokens / data move
> Can he not specify all 256 tokens in the YAML of the new cluster and then copy sstables? I know it is a bit ugly but should work.

You can pass a comma separated list of tokens to the -Dcassandra.replace_token JVM param. AFAIK it's not possible to provide the list in the yaml file.

Cheers
A
-
Aaron Morton
Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 5:07 AM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote:

I copied the sstables and then ran a repair. It worked. Looks like export and import may have been much faster given that we had very little data. Thanks everyone.

On Tue, Jul 9, 2013 at 1:34 PM, sankalp kohli kohlisank...@gmail.com wrote:

Hi Aaron, Can he not specify all 256 tokens in the YAML of the new cluster and then copy sstables? I know it is a bit ugly but should work. Sankalp

On Tue, Jul 9, 2013 at 3:19 AM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote:

Thanks Aaron

On 7/9/13, aaron morton aa...@thelastpickle.com wrote:

> Can I just copy data files for the required keyspaces, create schema manually and run repair?

If you have something like RF 3 and 3 nodes then yes, you can copy the data from one node in the source cluster to all nodes in the dest cluster and use cleanup to remove the unneeded data. Because each node in the source cluster has a full copy of the data.

If that's not the case you cannot copy the data files, even if they have the same number of nodes, because the nodes in the dest cluster will have different tokens. AFAIK you need to export the full data set from the source DC and then import it into the dest system.

The Bulk Load utility may be of help: http://www.datastax.com/docs/1.2/references/bulkloader . You could copy the SSTables from every node in the source system and bulk load them into the dest system. That process will ensure rows are sent to nodes that are replicas.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 9/07/2013, at 12:45 PM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote: We have two clusters used by two different groups with vnodes enabled. Now there is a need to move some of the keyspaces from cluster 1 to cluster 2. Can I just copy data files for the required keyspaces, create schema manually and run repair? Anything else required? Please help. -- Thanks, Baskar Duraikannu
Re: how to determine RF on the fly ?
It's available on the Thrift API call describe_keyspaces() https://github.com/apache/cassandra/blob/trunk/interface/cassandra.thrift#L730 Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 11/07/2013, at 7:04 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Jul 10, 2013 at 12:58 AM, Илья Шипицин chipits...@gmail.com wrote: is there easy way to determine current RF, for instance, via mx4j ? The methods which show keyspace or schema (from CLI or cqlsh) show the replication factor, as the replication factor is a keyspace property. I don't believe it's available via JMX, but there's no reason it couldn't be... =Rob
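[Editor's note] For CQL3 users on 1.2 the same information also sits in the system.schema_keyspaces table, where strategy_options is stored as a JSON text blob. A sketch of pulling the RF out of that blob; the column layout assumed here is the 1.2 one, so adjust for other versions:

```python
import json

def replication_factor(strategy_options_json):
    """Extract the RF from the strategy_options column of
    system.schema_keyspaces (stored as a JSON text blob in 1.2)."""
    opts = json.loads(strategy_options_json)
    # SimpleStrategy keyspaces have a single replication_factor entry;
    # NetworkTopologyStrategy keyspaces have one entry per data center.
    if "replication_factor" in opts:
        return int(opts["replication_factor"])
    return {dc: int(rf) for dc, rf in opts.items()}

# e.g. SELECT strategy_options FROM system.schema_keyspaces WHERE keyspace_name='ks';
print(replication_factor('{"replication_factor":"3"}'))
print(replication_factor('{"DC1":"3","DC2":"2"}'))
```

The hypothetical inputs above are what a SimpleStrategy and an NTS keyspace would return, respectively.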
Re: Quorum reads and response time
> But when I run the same query with consistency level as Quorum, it is taking ~2.3 seconds. It feels as if the querying of the nodes is in sequence.

No. As Sankalp says, look for GC issues. If none, then take a look at how much data you are pulling back, and tell us what sort of query you are using.

Cheers
-
Aaron Morton
Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 7:10 AM, sankalp kohli kohlisank...@gmail.com wrote:

The coordinator node has to merge the results from 2 nodes, and the request is done in parallel. I have seen a lot of GC pressure with range queries because of tombstones. Can you check the logs to see if there is a lot of GC going on? Also try to have GC logging enabled.

On Wed, Jul 10, 2013 at 9:57 AM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote:

Just adding a few other details to my question:
- We are using RandomPartitioner
- 256 virtual nodes configured

On Wed, Jul 10, 2013 at 12:54 PM, Baskar Duraikannu baskar.duraikannu...@gmail.com wrote:

I have a 3 node cluster with RF=3. All nodes are running. I have a table with 39 rows and ~44,000 columns evenly spread across the 39 rows. When I do a range slice query on this table with a consistency level of one, it returns the data back in about ~600 ms. I tried the same from all 3 nodes; no matter which node I ran it from, queries were answered in 600 ms at consistency level one. But when I run the same query with consistency level Quorum, it takes ~2.3 seconds. It feels as if the querying of the nodes is in sequence. Is this normal? -- Regards, Baskar Duraikannu
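[Editor's note] For reference, the replicas are indeed queried in parallel at QUORUM; the coordinator just waits for floor(RF/2)+1 responses instead of 1. A quick sketch of that arithmetic, illustrative only:

```python
def quorum(rf):
    # number of replicas that must respond for a QUORUM read or write
    return rf // 2 + 1

# With RF=3 a QUORUM read needs 2 of 3 replicas, contacted in parallel,
# so it should cost roughly the latency of the slower of two nodes,
# not RF times the latency of a CL.ONE read.
for rf in (1, 2, 3, 5):
    print("RF=%d -> QUORUM=%d" % (rf, quorum(rf)))
```

A ~4x jump from ONE to QUORUM therefore points at something pathological (GC pauses, tombstone scanning) rather than serial querying.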
Re: Timeout reading row from CF with collections
My bet is that you're hitting https://issues.apache.org/jira/browse/CASSANDRA-5677.

--
Sylvain

On Fri, Jul 12, 2013 at 8:17 AM, Paul Ingalls paulinga...@gmail.com wrote:

> I'm running into a problem trying to read data from a column family that includes a number of collections. [...] This seems to happen with rows that have maps with a larger number of entries. It is very reproducible with my current data set. Any ideas on why I can't query for a row? Thanks! Paul
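[Editor's note] Some context on why large maps are painful here, independent of the bug above: internally each collection entry is stored as its own cell in the partition, so one logical row can fan out into tens of thousands of cells that all have to be read and merged. A rough, purely illustrative estimate:

```python
def cells_per_row(map_entry_counts, scalar_columns=1):
    """Rough cell count for one CQL3 row: one cell per scalar column
    plus one cell per collection entry."""
    return scalar_columns + sum(map_entry_counts)

# Hypothetical: the four maps in user_scores each holding 10,000 entries
# means a single logical row is ~40,001 cells on disk.
print(cells_per_row([10000, 10000, 10000, 10000]))
```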
Re: temporarily running a cassandra side by side in production
> We are starting to think we are going to try to run a side by side cassandra instance in production while we map/reduce from one cassandra into the new instance.

What do you mean by side-by-side?

> Can I assume a cassandra instance will not only bind to the new ports when I change these values but will talk to the other cassandra nodes on those same ports as well such that this cassandra instance is completely independent of my other cassandra instance?

Not sure what you mean, but all nodes in the same cluster must be configured with the same storage port. The best way to ensure clusters do not interfere with each other is to have different seed lists and different cluster names.

> Are there other gotchas that I have to be aware of?

I'm not sure what you are attempting to do.

Cheers
-
Aaron Morton
Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 11:37 AM, Hiller, Dean dean.hil...@nrel.gov wrote:

We have a 12 node production cluster and a 4 node QA cluster. We are starting to think we are going to try to run a side by side cassandra instance in production while we map/reduce from one cassandra into the new instance. We are intending to do something like this: Modify all ports in cassandra.yaml and the jmx port in cassandra-env.sh, 7000, 7001, 9160, 9042, and cassandra-env 7199. Can I assume a cassandra instance will not only bind to the new ports when I change these values but will talk to the other cassandra nodes on those same ports as well such that this cassandra instance is completely independent of my other cassandra instance? Are there other gotchas that I have to be aware of? (we are refactoring our model into a new faster model that we tested in QA with live data as well as moving randompartitioner to murmur) Thanks, Dean
Re: manually removing sstable
That sounds sane to me. Couple of caveats: * Remember that Expiring Columns turn into Tombstones and can only be purged after TTL and gc_grace. * Tombstones will only be purged if all fragments of a row are in the SStable(s) being compacted. Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 11/07/2013, at 10:17 PM, Theo Hultberg t...@iconara.net wrote: a colleague of mine came up with an alternative solution that also seems to work, and I'd just like your opinion on if it's sound. we run find to list all old sstables, and then use cmdline-jmxclient to run the forceUserDefinedCompaction function on each of them, this is roughly what we do (but with find and xargs to orchestrate it) java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction=the_keyspace,db_file_name the downside is that c* needs to read the file and do disk io, but the upside is that it doesn't require a restart. c* does a little more work, but we can schedule that during off-peak hours. another upside is that it feels like we're pretty safe from screwups, we won't accidentally remove an sstable with live data, the worst case is that we ask c* to compact an sstable with live data and end up with an identical sstable. if anyone else wants to do the same thing, this is the full cron command: 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name '*-Data.db' -mtime +8 -printf forceUserDefinedCompaction=the_keyspace_name,\%P\n | xargs -t --no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 org.apache.cassandra.db:type=CompactionManager just change the keyspace name and the path to the data directory. T# On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg t...@iconara.net wrote: thanks a lot. I can confirm that it solved our problem too. looks like the C* 2.0 feature is perfect for us. 
T#

On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson krum...@gmail.com wrote:

yep that works, you need to remove all components of the sstable though, not just -Data.db. and, in 2.0 there is this: https://issues.apache.org/jira/browse/CASSANDRA-5228 /Marcus

On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote:

Hi, I think I remember reading that if you have sstables that you know contain only data whose ttl has expired, it's safe to remove them manually by stopping c*, removing the *-Data.db files and then starting up c* again. is this correct? we have a cluster where everything is written with a ttl, and sometimes c* needs to compact over 100 gb of sstables where we know everything has expired, and we'd rather just manually get rid of those. T#
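[Editor's note] The find/xargs cron job above boils down to "select *-Data.db files untouched for more than 8 days and emit one forceUserDefinedCompaction argument per file". The same selection logic sketched in Python; the keyspace and file names are hypothetical:

```python
import os
import time

def expired_data_files(directory, days=8, now=None):
    """Return *-Data.db files in `directory` not modified for `days` days,
    mirroring: find <dir> -maxdepth 1 -type f -name '*-Data.db' -mtime +8"""
    now = now if now is not None else time.time()
    cutoff = now - days * 86400
    return sorted(
        name for name in os.listdir(directory)
        if name.endswith("-Data.db")
        and os.path.getmtime(os.path.join(directory, name)) < cutoff
    )

def jmx_args(keyspace, files):
    # one forceUserDefinedCompaction=<keyspace>,<file> argument per sstable
    return ["forceUserDefinedCompaction=%s,%s" % (keyspace, f) for f in files]
```

These arguments would then be handed to the JMX client exactly as in the cron line above.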
Re: IllegalArgumentException on query with AbstractCompositeType
> The “ALLOW FILTERING” clause also has no effect.

You only need that when the WHERE clause contains predicates for columns that are not part of the primary key.

> CREATE INDEX ON conv_msgdata_by_participant_cql(msgReadFlag);

In general this is a bad idea in Cassandra (also in a relational DB IMHO). You will get poor performance from it.

> Caused by: java.lang.IllegalArgumentException
>     at java.nio.Buffer.limit(Buffer.java:247)
>     at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
>     at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
>     at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
>     at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$BlockFetcher.isColumnBeforeSliceFinish(IndexedSliceReader.java:216)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$SimpleBlockFetcher.<init>(IndexedSliceReader.java:450)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:85)
>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:68)

This looks like an error in the on-disk data, or maybe in passing the value for the messageId, but I doubt it. What version are you using? Can you reproduce this outside of your unit tests?

Cheers
-
Aaron Morton
Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 12/07/2013, at 12:40 AM, Pruner, Anne (Anne) pru...@avaya.com wrote:

Hi, I’ve been tearing my hair out trying to figure out why this query fails. In fact, it only fails on machines with slower CPUs and after having previously run some other junit tests. I’m running junits against an embedded Cassandra server, which works well in pretty much all other cases, but this one is flaky.
I’ve tried to rule out timing issues by placing a 10 second delay just before this query, just in case somehow the data isn’t getting into the db in a timely manner, but that doesn’t have any effect. I’ve also tried removing the “ORDER BY” clause, which seems to be the place in the code it’s getting hung up on, but that also doesn’t have any effect. The “ALLOW FILTERING” clause also has no effect.

DEBUG [Native-Transport-Requests:16] 2013-07-10 16:28:21,993 Message.java (line 277) Received: QUERY SELECT * FROM conv_msgdata_by_participant_cql WHERE entityConversationId='bulktestfromus...@test.cacontact_811b5efc-b621-4361-9dc9-2e4755be7d89' AND messageId < '2013-07-10T20:29:09.773Zzz' ORDER BY messageId DESC LIMIT 15 ALLOW FILTERING;

ERROR [ReadStage:34] 2013-07-10 16:28:21,995 CassandraDaemon.java (line 132) Exception in thread Thread[ReadStage:34,5,main]
java.lang.RuntimeException: java.lang.IllegalArgumentException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:247)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$BlockFetcher.isColumnBeforeSliceFinish(IndexedSliceReader.java:216)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$SimpleBlockFetcher.<init>(IndexedSliceReader.java:450)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:85)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:68)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:44)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:101)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:275)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
    at
Re: manually removing sstable
thanks aaron, the second point I had not considered, and it could explain why the sstables don't always disappear completely; sometimes a small file (megabytes instead of gigabytes) is left behind. T#

On Fri, Jul 12, 2013 at 10:25 AM, aaron morton aa...@thelastpickle.com wrote:

> That sounds sane to me. Couple of caveats: * Remember that Expiring Columns turn into Tombstones and can only be purged after TTL and gc_grace. * Tombstones will only be purged if all fragments of a row are in the SStable(s) being compacted. [...]
Extract meta-data using cql 3
Hi experts, how do I extract the metadata of a table or a keyspace using CQL 3.0? -- Thanks, Murali
Re: Extract meta-data using cql 3
The raw answer is that you should query the system tables. The schema is stored in the following three tables: system.schema_keyspaces, system.schema_columnfamilies and system.schema_columns. Unfortunately, the information stored there is, for various reasons, not in a form that makes a lot of sense from a CQL3 point of view. So in practice, you should probably rely on your client driver, which may provide the same information in a more usable way. For instance, with cqlsh, you have the DESCRIBE command. Or if you, say, use the DataStax Java driver, you can access all of that metadata through cluster.getMetadata().getKeyspaces(), etc.

On Fri, Jul 12, 2013 at 10:52 AM, Murali muralidharan@gmail.com wrote:

Hi experts, How to extract meta-data of a table or a keyspace using CQL 3.0? -- Thanks, Murali
Re: Extract meta-data using cql 3
there's a keyspace called system which has a few tables that contain the metadata. for example schema_keyspaces that contain keyspace metadata, and schema_columnfamilies that contain table metadata. there are more, just fire up cqlsh and do a describe keyspace in the system keyspace to find them. T# On Fri, Jul 12, 2013 at 10:52 AM, Murali muralidharan@gmail.com wrote: Hi experts, How to extract meta-data of a table or a keyspace using CQL 3.0? -- Thanks, Murali
Re: Alternate major compaction
With very little work (less than 10 KB of code) it is possible to have an online sstable splitter and export this functionality over JMX.
Error: Main method not found in class org.apache.cassandra.service.CassandraDaemon
Earlier, everything was working fine, but now I am getting this strange error. Initially I was working from a tarball installation, and then I installed a Cassandra rpm package. Since then, I am getting "Error: Main method not found in class org.apache.cassandra.service.CassandraDaemon, please define the main method as: public static void main(String[] args)" when running from the tarball installation. I tried setting CASSANDRA_HOME as CASSANDRA_HOME=/home/impadmin/software/apache-cassandra-1.2.4/ but no luck. This error is quite confusing: how can a user define a main method within the Cassandra source code?? -Vivek
[BETA RELEASE] Apache Cassandra 2.0.0-beta1 released
The Cassandra team is pleased to announce the release of the first beta for the future Apache Cassandra 2.0.0. Let me first stress that this is beta software and as such is *not* ready for production use. The goal of this release is to give a preview of what will become Cassandra 2.0 and to get wider testing before the final release. As such, it is likely not bug free but all help in testing this beta would be greatly appreciated and will help make 2.0 a solid release. So please report any problem you may encounter[3,4] with this release and have a look at the change log[1] and release notes[2] to see where Cassandra 2.0 differs from the previous series. Apache Cassandra 2.0.0-beta1[5] is available as usual from the cassandra website (http://cassandra.apache.org/download/) and a debian package is available using the 20x branch (see http://wiki.apache.org/cassandra/DebianPackaging). Thank you for your help in testing and have fun with it. [1]: http://goo.gl/TjQGd (CHANGES.txt) [2]: http://goo.gl/K4QsX (NEWS.txt) [3]: https://issues.apache.org/jira/browse/CASSANDRA [4]: user@cassandra.apache.org [5]: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/cassandra-2.0.0-beta1
Re: temporarily running a cassandra side by side in production
Heh, oops, yes. We have 12 nodes and are trying to run 2 instances of cassandra on those 12 nodes. So far, in QA this appears to be working. I like the cluster name change idea as a just-in-case, so I will definitely be doing that one. Thanks, Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Friday, July 12, 2013 2:20 AM
To: user@cassandra.apache.org
Subject: Re: temporarily running a cassandra side by side in production

> We are starting to think we are going to try to run a side by side cassandra instance in production while we map/reduce from one cassandra into the new instance.

What do you mean by side-by-side?

> Can I assume a cassandra instance will not only bind to the new ports when I change these values but will talk to the other cassandra nodes on those same ports as well such that this cassandra instance is completely independent of my other cassandra instance?

Not sure what you mean, but all nodes in the same cluster must be configured with the same storage port. The best way to ensure clusters do not interfere with each other is to have different seed lists and different cluster names.

> Are there other gotchas that I have to be aware of?

I'm not sure what you are attempting to do.

Cheers
-
Aaron Morton
Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 11:37 AM, Hiller, Dean dean.hil...@nrel.gov wrote:

We have a 12 node production cluster and a 4 node QA cluster. We are starting to think we are going to try to run a side by side cassandra instance in production while we map/reduce from one cassandra into the new instance.
We are intending to do something like this Modify all ports in cassandra.yaml and the jmx port in cassandra-env.sh, 7000, 7001, 9160, 9042, and cassandra-env 7199. Can I assume a cassandra instance will not only bind to the new ports when I change these values but will talk to the other cassandra nodes on those same ports as well such that this cassandra instance is completely independent of my other cassandra instance? Are there other gotchas that I have to be aware of? (we are refactoring our model into a new faster model that we tested in QA with live data as well as moving randompartitioner to murmur) Thanks, Dean
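[Editor's note] A cheap sanity check before starting the second instance is making sure the two configurations share no port. The first set below uses the defaults Dean lists; the second instance's values are made up for illustration:

```python
# Default ports for the first instance (as listed in the thread).
DEFAULT_PORTS = {"storage_port": 7000, "ssl_storage_port": 7001,
                 "rpc_port": 9160, "native_transport_port": 9042,
                 "jmx_port": 7199}

# Hypothetical choices for the side-by-side instance.
SECOND_INSTANCE = {"storage_port": 7100, "ssl_storage_port": 7101,
                   "rpc_port": 9260, "native_transport_port": 9142,
                   "jmx_port": 7299}

def colliding_ports(a, b):
    """Return port numbers used by both instances (should be empty)."""
    return sorted(set(a.values()) & set(b.values()))

print(colliding_ports(DEFAULT_PORTS, SECOND_INSTANCE))
```

An empty result means the two instances can at least bind their sockets independently; it says nothing about seed lists or cluster names, which still must differ.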
Compression ratio
Hi All, can anyone explain the compression ratio? Is it compressed data / original, or original / compressed, or something else? Thanks a lot. Best Regards, Cem
Representation of dynamically added columns in table (column family) schema using cqlsh
A basic question, and it seems that I have a gap in my understanding. I have a simple table in Cassandra with multiple column families. I add new columns to each of these column families on the fly. When I view the schema of a particular column family (using the 'DESCRIBE table' command), I see only one entry for a column (bolded below). What is the reason for that? The columns that I am adding have string names and byte values, written using Hector 1.1-3 (HFactory.createColumn(...) method).

CREATE TABLE mytable (
  key text,
  *column1* ascii,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh 3.0.2 | Cassandra 1.2.5 | CQL spec 3.0.0 | Thrift protocol 19.36.0

Given this, I can also only query on this one column1 or value using the 'SELECT' statement. The OpsCenter, on the other hand, displays multiple columns as expected; there the demarcation of multiple columns is clearer. Thanks a lot. Regards, Shahab
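[Editor's note] What DESCRIBE shows here is the CQL3 view of a Thrift wide row: with COMPACT STORAGE, every dynamically added column surfaces as one CQL row of (key, column1, value). A sketch of that mapping with made-up data; the column names follow the DESCRIBE output above:

```python
def thrift_row_to_cql_rows(row_key, columns):
    """Map one Thrift wide row {column_name: value} to its CQL3
    COMPACT STORAGE view: one (key, column1, value) row per column."""
    return [
        {"key": row_key, "column1": name, "value": value}
        for name, value in sorted(columns.items())  # comparator order
    ]

# Hypothetical Hector-written columns on row "user42":
for r in thrift_row_to_cql_rows("user42", {"age": b"\x21", "city": b"Oslo"}):
    print(r)
```

This is why only column1 and value are queryable from CQL: they are the generic names standing in for every dynamic column.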
Re: node tool ring displays 33.33% owns on 3 node cluster with replication
Not sure if it's the best/intended behavior, but you should see it go back to 100% if you run: nodetool -h 127.0.0.1 -p 8080 ring keyspace. I think the rationale for showing 33% is that different keyspaces might have different RFs, so it's unclear what to show for ownership. However, if you include the keyspace as part of your query, you'll get it weighted by the RF of that keyspace. I believe the same logic applies for nodetool status. Andrew On Thu, Jul 11, 2013 at 12:58 PM, Jason Tyler jaty...@yahoo-inc.com wrote: Thanks Rob! I was able to confirm with getendpoints. Cheers, ~Jason From: Robert Coli rc...@eventbrite.com Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Wednesday, July 10, 2013 4:09 PM To: user@cassandra.apache.org user@cassandra.apache.org Cc: Francois Richard frich...@yahoo-inc.com Subject: Re: node tool ring displays 33.33% owns on 3 node cluster with replication On Wed, Jul 10, 2013 at 4:04 PM, Jason Tyler jaty...@yahoo-inc.comwrote: Is this simply a display issue, or have I lost replication? Almost certainly just a display issue. Do nodetool -h localhost getendpoints keyspace columnfamily 0, which will tell you the endpoints for the non-transformed key 0. It should give you 3 endpoints. You could also do this test with a known existing key and then go to those nodes and verify that they have that data on disk via sstable2json. (FWIW, it is an odd display issue/bug if it is one. Because it has reverted to pre-1.1 behavior...) =Rob
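[Editor's note] Andrew's weighting can be written down directly: raw ownership with evenly distributed tokens is 1/N per node, and passing a keyspace scales it by that keyspace's RF, capped at 100%. An illustrative sketch with the thread's numbers (3 nodes, RF=3):

```python
def effective_ownership(num_nodes, rf):
    """Per-node ownership as nodetool reports it: without a keyspace,
    raw token ownership (1/N); with one, weighted by the keyspace's RF."""
    raw = 1.0 / num_nodes
    return min(1.0, raw * rf)

print("no keyspace:   %.2f%%" % (100 * effective_ownership(3, 1)))
print("RF=3 keyspace: %.0f%%" % (100 * effective_ownership(3, 3)))
```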
Re: Compression ratio
it's compressed/original. https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 On Fri, Jul 12, 2013 at 10:02 AM, cem cayiro...@gmail.com wrote: Hi All, Can anyone explain the compression ratio? Is it the compressed data / original or original/ compressed ? Or something else. thanks a lot. Best Regards, Cem -- Yuki Morishita t:yukim (http://twitter.com/yukim)
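[Editor's note] So a reported ratio of 0.25 would mean the sstable compressed down to a quarter of its original size; smaller is better. Trivially:

```python
def compression_ratio(compressed_bytes, original_bytes):
    # as in SSTableMetadata: compressed length divided by uncompressed length
    return compressed_bytes / float(original_bytes)

# Hypothetical sstable: 25 MB on disk from 100 MB of uncompressed data.
print(compression_ratio(25, 100))
```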
Re: Compression ratio
Thank you very much! On Fri, Jul 12, 2013 at 5:59 PM, Yuki Morishita mor.y...@gmail.com wrote: it's compressed/original. https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 On Fri, Jul 12, 2013 at 10:02 AM, cem cayiro...@gmail.com wrote: Hi All, Can anyone explain the compression ratio? Is it the compressed data / original or original/ compressed ? Or something else. thanks a lot. Best Regards, Cem -- Yuki Morishita t:yukim (http://twitter.com/yukim)
Re: How many DCs can you have in a cluster?
More than the number of DCs, I think you will be bound by the number of replicas. I don't know how it will work in the case of a 10-20 replication factor, especially for range queries.

On Thu, Jul 11, 2013 at 7:14 PM, Blair Zajac bl...@orcaware.com wrote:

In this C* Summit 2013 talk titled "A Deep Dive Into How Cassandra Resolves Inconsistent Data" [1], Jason Brown of Netflix mentions that they have 5 data centers in the same cluster: two in the US, one in Europe, one in Brazil and one in Asia (I'm going from memory now since I don't want to watch the video again). Is there a practical limit on how many different data centers one can have in a single cluster?

Thanks,
Blair

[1] http://www.youtube.com/watch?v=VRZk-NhfX18&list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU&index=57
AUTO : Samuel CARRIERE is out of the office (retour 07/08/2013)
I am out of the office until 07/08/2013. Note: this is an automatic reply to your message "Compression ratio" sent on 12/07/2013 17:02:11. This is the only notification you will receive while this person is away.
Re: How many DCs can you have in a cluster?
Yes, there are going to be a lot of replicas in total, but the replication factor will be 3 in each DC. Will it still be an issue?

Blair

On Jul 12, 2013, at 10:58 AM, sankalp kohli kohlisank...@gmail.com wrote:

More than the number of DCs, I think you will be bound by the number of replicas. I don't know how it will work in the case of a 10-20 replication factor, especially for range queries.

On Thu, Jul 11, 2013 at 7:14 PM, Blair Zajac bl...@orcaware.com wrote:

In this C* Summit 2013 talk titled "A Deep Dive Into How Cassandra Resolves Inconsistent Data" [1], Jason Brown of Netflix mentions that they have 5 data centers in the same cluster: two in the US, one in Europe, one in Brazil and one in Asia (I'm going from memory now since I don't want to watch the video again). Is there a practical limit on how many different data centers one can have in a single cluster?

Thanks,
Blair

[1] http://www.youtube.com/watch?v=VRZk-NhfX18&list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU&index=57
Re: Timeout reading row from CF with collections
Yep, that was it. I built from the cassandra-1.2 branch and no more timeouts. Thanks for getting that fix into 1.2!

Paul

On Jul 12, 2013, at 1:20 AM, Sylvain Lebresne sylv...@datastax.com wrote:

My bet is that you're hitting https://issues.apache.org/jira/browse/CASSANDRA-5677.

--
Sylvain

On Fri, Jul 12, 2013 at 8:17 AM, Paul Ingalls paulinga...@gmail.com wrote:

I'm running into a problem trying to read data from a column family that includes a number of collections.

Cluster details:
- 4 nodes running 1.2.6 on VMs with 4 CPUs and 7 GB of RAM
- RAID 0 striped across 4 disks for the data and logs
- each node has about 500 MB of data currently loaded

Here is the schema:

create table user_scores (
  user_id varchar,
  post_type varchar,
  score double,
  team_to_score_map map<varchar, double>,
  affiliation_to_score_map map<varchar, double>,
  campaign_to_score_map map<varchar, double>,
  person_to_score_map map<varchar, double>,
  primary key(user_id, post_type)
) with compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 10 };

I used the leveled compaction strategy as I thought it would help with read latency...

Here is a trace of a simple select against the cluster when nothing else was reading or writing (CPU was 2%):

 activity                                                            | timestamp    | source         | source_elapsed
---------------------------------------------------------------------+--------------+----------------+----------------
 execute_cql3_query                                                  | 05:51:34,557 | 100.69.176.51  |              0
 Message received from /100.69.176.51                                | 05:51:34,195 | 100.69.184.134 |            102
 Executing single-partition query on user_scores                     | 05:51:34,199 | 100.69.184.134 |           3512
 Acquiring sstable references                                        | 05:51:34,199 | 100.69.184.134 |           3741
 Merging memtable tombstones                                         | 05:51:34,199 | 100.69.184.134 |           3890
 Key cache hit for sstable 5                                         | 05:51:34,199 | 100.69.184.134 |           4040
 Seeking to partition beginning in data file                         | 05:51:34,199 | 100.69.184.134 |           4059
 Merging data from memtables and 1 sstables                          | 05:51:34,200 | 100.69.184.134 |           4412
 Parsing select * from user_scores where user_id='26257166' LIMIT 1; | 05:51:34,558 | 100.69.176.51  |             91
 Preparing statement                                                 | 05:51:34,558 | 100.69.176.51  |            238
 Enqueuing data request to /100.69.184.134                           | 05:51:34,558 | 100.69.176.51  |            567
 Sending message to /100.69.184.134                                  | 05:51:34,558 | 100.69.176.51  |            979
 Request complete                                                    | 05:51:54,562 | 100.69.176.51  |       20005209

You can see that I increased the timeout and it still fails. This seems to happen with rows that have maps with a larger number of entries. It is very reproducible with my current data set. Any ideas on why I can't query for a row?

Thanks!

Paul
hot sstables evicted from page cache on compaction causing high latency
Having a real issue where, at the completion of large compactions, Cassandra evicts hot sstables from the kernel page cache, causing huge read latency while the cache is backfilled.

https://dl.dropboxusercontent.com/s/149h7ssru0dapkg/Screen%20Shot%202013-07-12%20at%201.46.19%20PM.png

- Blue line: page cache
- Green line: disk read latency (ms)
- Red line: CF read latency (ms)

The beginning of both high-latency plateaus corresponds with the completion of a compaction.

Seems like applying/enabling this will help? https://issues.apache.org/jira/browse/CASSANDRA-4937

- C* 1.2.6
- 3 nodes
- 24G RAM (8G heap)
- (2) 3TB 7.2k disks using the JBOD feature of C*
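For context, the page-cache hints that CASSANDRA-4937 discusses are built on the posix_fadvise(2) syscall: a process doing a large sequential read (like a compaction) can tell the kernel it won't reuse those pages, so they can be dropped instead of evicting hot data. A minimal standalone sketch of the syscall itself (not Cassandra's implementation; the file and sizes are made up for illustration):

```python
# Sketch of the kernel hint underlying CASSANDRA-4937: advise the kernel
# that a sequentially-read file's pages won't be needed again, so the read
# doesn't push hot pages (e.g. frequently-read SSTables) out of the cache.
import os
import tempfile

# Stand-in for a compaction input/output file (hypothetical 1 MiB of data).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * (1 << 20))
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    # posix_fadvise is only available on POSIX platforms (Python 3.3+).
    if hasattr(os, "posix_fadvise"):
        # Hint: we'll read sequentially, and won't need the pages afterwards.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
finally:
    os.close(fd)
    os.unlink(path)
```

The idea in the ticket is for the compaction path to issue DONTNEED on its own streams so the hot read path keeps its cached pages; whether that fully removes the latency plateaus above would need testing against this workload.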
Re: Alternate major compaction
On Thu, Jul 11, 2013 at 9:43 PM, Takenori Sato ts...@cloudian.com wrote:

I made the repository public. Now you can check it out from here: https://github.com/cloudian/support-tools

checksstablegarbage is the tool. Enjoy, and any feedback is welcome.

Thanks very much, useful tool!

Out of curiosity, what does writesstablekeys do that the upstream tool sstablekeys does not?

=Rob
Re: Representation of dynamically added columns in table (column family) schema using cqlsh
If you're creating dynamic columns via the Thrift interface, they will not be reflected in the CQL3 schema. I would recommend not mixing paradigms like that: either stick with CQL3, or with Thrift / cassandra-cli. WITH COMPACT STORAGE creates column families which can be interacted with meaningfully via Thrift, but you'll be lacking any metadata on those columns to interact with them via CQL.

On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus shahab.yu...@gmail.com wrote:

A basic question, and it seems that I have a gap in my understanding. I have a simple table in Cassandra with multiple column families. I add new columns to each of these column families on the fly. When I view (using the 'DESCRIBE table' command) the schema of a particular column family, I see only one entry for a column (bolded below). What is the reason for that? The columns that I am adding have string names and byte values, written using Hector 1.1-3 (the HFactory.createColumn(...) method).

CREATE TABLE mytable (
  key text,
  *column1* ascii,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh 3.0.2 | Cassandra 1.2.5 | CQL spec 3.0.0 | Thrift protocol 19.36.0

Given this, I can also only query on this one column1 or value using the 'SELECT' statement. OpsCenter, on the other hand, displays multiple columns as expected; basically the demarcation of multiple columns is clearer.

Thanks a lot.

Regards,
Shahab
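The mapping Eric describes can be sketched with a toy example (plain Python, no Cassandra involved; row key and column names are hypothetical): under COMPACT STORAGE, each dynamic Thrift column surfaces as one CQL row of (key, column1, value), so the dynamic column *names* show up as data in column1 rather than as schema entries.

```python
# Toy illustration of how a Thrift "wide row" surfaces through the single
# (key, column1, value) CQL3 schema shown above. The three dynamic columns
# become three CQL rows; CQL sees data, not per-column schema metadata.
thrift_wide_row = {
    "rowkey1": {"colA": b"v1", "colB": b"v2", "colC": b"v3"},
}

cql_rows = [
    (key, column_name, value)
    for key, columns in thrift_wide_row.items()
    for column_name, value in sorted(columns.items())  # comparator order
]

for row in cql_rows:
    print(row)

assert len(cql_rows) == 3  # one CQL row per dynamic Thrift column
```

This is why DESCRIBE shows only one column entry while OpsCenter (which reads via Thrift) shows the individual columns: in CQL they are rows, not columns.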
Minimum CPU and RAM for Cassandra and Hadoop Cluster
Dear Cassandra experts,

I have an HP ProLiant ML350 G8 server, and I want to put virtual servers on it. I would like to put the maximum number of nodes on it for a Cassandra + Hadoop cluster. I was wondering: what is the minimum RAM and memory per node that I need for Cassandra + Hadoop before the performance decrease stops being worth the extra nodes? Also, what is the suggested typical number of CPU cores per node? Would it make sense to have 1 core per node? Less than that?

Any insight is appreciated! Thanks very much for your time!

Martin
Re: Representation of dynamically added columns in table (column family) schema using cqlsh
Thanks Eric for the explanation.

Regards,
Shahab

On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus shahab.yu...@gmail.com wrote:

A basic question, and it seems that I have a gap in my understanding. I have a simple table in Cassandra with multiple column families. I add new columns to each of these column families on the fly. When I view (using the 'DESCRIBE table' command) the schema of a particular column family, I see only one entry for a column (bolded below). What is the reason for that? The columns that I am adding have string names and byte values, written using Hector 1.1-3 (the HFactory.createColumn(...) method).

CREATE TABLE mytable (
  key text,
  *column1* ascii,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh 3.0.2 | Cassandra 1.2.5 | CQL spec 3.0.0 | Thrift protocol 19.36.0

Given this, I can also only query on this one column1 or value using the 'SELECT' statement. OpsCenter, on the other hand, displays multiple columns as expected; basically the demarcation of multiple columns is clearer.

Thanks a lot.

Regards,
Shahab
Re: Node tokens / data move
Is it possible to change num_tokens on a node with data? I changed it and restarted the node, but it still shows the same amount in nodetool status.
Re: Alternate major compaction
It's light. Without the -v option, you can even run it against just an SSTable file, without needing the whole Cassandra installation.

- Takenori

On Sat, Jul 13, 2013 at 6:18 AM, Robert Coli rc...@eventbrite.com wrote:

On Thu, Jul 11, 2013 at 9:43 PM, Takenori Sato ts...@cloudian.com wrote:

I made the repository public. Now you can check it out from here: https://github.com/cloudian/support-tools

checksstablegarbage is the tool. Enjoy, and any feedback is welcome.

Thanks very much, useful tool! Out of curiosity, what does writesstablekeys do that the upstream tool sstablekeys does not?

=Rob
Re: Rhombus - A time-series object store for Cassandra
Hello Rob,

Thanks for the pointer. I have a couple of queries:

- How does this project compare to the KairosDB project on GitHub? (For one, I see that Rhombus supports multi-column queries, which is cool, whereas the KairosDB/OpenTSDB time-series databases do not seem to have such a feature, although we can use tags to achieve something similar?)
- Are there any roll-ups performed automatically by Rhombus?
- Can we control the TTL of the data being inserted?

I am looking at some of the time-series projects for production use, preferably running on top of Cassandra, and was wondering if Rhombus can be seen as a pure time-series-optimized schema or something more than that?

Regards,
Ananth

On 7/12/13 7:15 AM, Rob Righter rob.righ...@pardot.com wrote:

Hello,

Just wanted to share a project that we have been working on. It's a time-series object store for Cassandra. We tried to generalize the common use cases for storing time-series data in Cassandra and automatically handle the denormalization, indexing, and wide-row sharding. It currently exists as a Java library. We have it deployed as a web service in a Dropwizard app server with a REST-style interface. The plan is to eventually release that Dropwizard app too.

The project and explanation are available on GitHub at: https://github.com/Pardot/Rhombus

I would love to hear feedback.

Many Thanks,
Rob