Re: CQL performance inserting multiple cluster keys under same partition key
On Tue, Aug 26, 2014 at 6:50 PM, Jaydeep Chovatia chovatia.jayd...@gmail.com wrote:

Hi, I have a question on inserting multiple cluster keys under the same partition key. Ex:

CREATE TABLE Employee (
  deptId int,
  empId int,
  name varchar,
  address varchar,
  salary int,
  PRIMARY KEY(deptId, empId)
);

BEGIN UNLOGGED BATCH
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 10, 'testNameA', 'testAddressA', 2);
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 20, 'testNameB', 'testAddressB', 3);
APPLY BATCH;

Here we are inserting two cluster keys (10 and 20) under the same partition key (1).

Q1) Is this batch transaction atomic and isolated? If yes, is there any performance overhead with this syntax?

As long as the updates are under the same partition key (and I insist, only in that condition), logged (the one without the UNLOGGED keyword) and unlogged batches behave *exactly* the same way. So yes, in that case the batch is atomic and isolated (though on the isolation, you may want to be aware that while technically isolated, the usual timestamp rules still apply, and so you might not get the behavior you expect if 2 batches have the same timestamp: see CASSANDRA-6123 https://issues.apache.org/jira/browse/CASSANDRA-6123). There is also no performance overhead (assuming you meant over logged batches).

Q2) Can this CQL syntax be considered equivalent to Thrift batch_mutate?

It is equivalent; both (the CQL syntax and Thrift batch_mutate) resolve to the same operation internally.

-- Sylvain
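For comparison, the logged variant is the same statement without the UNLOGGED keyword; per the answer above, within a single partition (deptId = 1) the two behave identically. A sketch:

```sql
-- Logged form of the same batch; identical behavior to UNLOGGED here
-- because every row targets the same partition key (deptId = 1).
BEGIN BATCH
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 10, 'testNameA', 'testAddressA', 2);
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 20, 'testNameB', 'testAddressB', 3);
APPLY BATCH;
```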
Can't Add AWS Node due to /mnt/cassandra/data directory
I already have a 3-node m3.large DSE cluster, but I can't seem to add another m3.large node. I'm using the ubuntu-trusty-14.04-amd64-server-20140607.1 (ami-a7fdfee2) AMI (instance-store backed, PV) on AWS. I install Java 7 and JNA, then I go into OpsCenter to add a node. Things look good for 3 or 4 green circles, until I get either this error:

Start Errored: Timed out waiting for Cassandra to start.

or this error:

Agent Connection Errored: Timed out waiting for agent to connect.

I check system.log and output.log, and they both say:

INFO [main] 2014-08-27 08:17:24,642 CLibrary.java (line 121) JNA mlockall successful
ERROR [main] 2014-08-27 08:17:24,644 CassandraDaemon.java (line 235) Directory /mnt/cassandra/data doesn't exist
ERROR [main] 2014-08-27 08:17:24,645 CassandraDaemon.java (line 239) Has no permission to create /mnt/cassandra/data directory
INFO [Thread-1] 2014-08-27 08:17:24,646 DseDaemon.java (line 477) DSE shutting down...
ERROR [Thread-1] 2014-08-27 08:17:24,725 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-1,5,main]
java.lang.AssertionError
  at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1263)
  at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
  at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:478)
  at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:384)

My agent.log file repeats these two lines every 20 seconds:

INFO [Initialization] 2014-08-27 08:40:57,848 Node is still provisioning, not attempting to determine ip.
INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before trying to determine IP over JMX again

I feel like I'm missing something easy with the mount, so if you could point me in the right direction, I would really appreciate it!

-- Stephen Portanova (480) 495-2634
Re: Can't Add AWS Node due to /mnt/cassandra/data directory
Hi Stephen,

I have never added a node via OpsCenter, so this may be a shortcoming of that process. However, in non-OpsCenter installs you would have to create the data directories first:

sudo mkdir -p /mnt/cassandra/commitlog
sudo mkdir -p /mnt/cassandra/data
sudo mkdir -p /mnt/cassandra/saved_caches

And then give the cassandra user ownership of those directories:

sudo chown -R cassandra:cassandra /mnt/cassandra

Once this is done, Cassandra will have the correct directories and permission to start up.

Mark
Re: Can't Add AWS Node due to /mnt/cassandra/data directory
Worked great! Thanks Mark!

-- Stephen Portanova (480) 495-2634
Bulk load in cassandra
Hi,

I installed Cassandra on one node successfully. Using the CLI I am able to add a table to the keyspace as well as retrieve data from the table. My question: if I have a text file on my local file system and I want to load it onto the Cassandra cluster (a bulk load), how can I achieve that? Please help me out.

Regards,
Malay Nilabh
BIDW BU / Big Data CoE
LT Infotech Ltd, Hinjewadi, Pune
Email: malay.nil...@lntinfotech.com
Re: Bulk load in cassandra
Hi Malay,

Yesterday I answered your question, but you didn't reply back whether it worked for you or not. Anyway, if you mean importing a text file into Cassandra, you can do that the following way:

COPY keyspace.columnfamily (column1, column2, ...) FROM 'temp.csv' (location of file);

For directly executing the above command, your file has to be in the cassandra/bin location.

Thanks,
Umang Shah
Pentaho BI-ETL Developer
shahuma...@gmail.com
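COPY expects a delimited file, so the text file usually needs to be shaped into a CSV first. A small Python sketch (the keyspace, table, and column names below are hypothetical) that writes such a file:

```python
import csv

# Hypothetical rows to load; the column order must match the
# column list given to COPY in cqlsh.
rows = [
    (1, "alice", "pune"),
    (2, "bob", "mumbai"),
]

with open("temp.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

# Then, from cqlsh (keyspace/table names are placeholders):
#   COPY mykeyspace.mytable (id, name, city) FROM 'temp.csv';
```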
Re: Bulk load in cassandra
Please try the COPY command via the CQL shell if it is a delimited file.

Regards,
Baskar Duraikannu
Re: Installing Cassandra Multinode on CentOs coming up with exception
Hey Patricia,

Thanks for your kind response. I will surely take care of that regarding the use of virtual nodes. Thanks again!

On Tue, Aug 26, 2014 at 10:42 PM, Patricia Gorla patri...@thelastpickle.com wrote:

Vineet,

One more thing -- you have initial_token and num_tokens both set. If you are trying to use virtual nodes, you should comment out initial_token, as this setting overrides num_tokens.

Cheers,

On Tue, Aug 26, 2014 at 5:39 AM, Vineet Mishra clearmido...@gmail.com wrote:

Thanks Vivek! It was indeed a formatting issue in the yaml; got it to work!

On Tue, Aug 26, 2014 at 6:06 PM, Vivek Mishra mishra.v...@gmail.com wrote:

Please read http://www.yaml.org/start.html. It looks like a formatting issue; you might be missing or adding incorrect spaces. Validate your YAML file, this should help you out: http://yamllint.com/

-Vivek

On Tue, Aug 26, 2014 at 4:20 PM, Vineet Mishra clearmido...@gmail.com wrote:

Hi Mark,

Yes, I was generating my own cassandra.yaml with the configuration mentioned below:

cluster_name: 'node'
initial_token: 0
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: 192.168.1.32
listen_address: 192.168.1.32
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Similarly for the second node:

cluster_name: 'node'
initial_token: 2305843009213693952
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: 192.168.1.32
listen_address: 192.168.1.36
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

and so on. But even if I use the default yaml with the necessary configuration changes, I am getting the following error:

INFO 16:13:38,225 Loading settings from file:/home/cluster/cassandra/conf/cassandra.yaml
ERROR 16:13:38,301 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml
  at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:100)
  at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:135)
  at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:156)
  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
  at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
Caused by: while parsing a block mapping
 in 'reader', line 10, column 2:
    cluster_name: 'node'
    ^
expected <block end>, but found BlockMappingStart
 in 'reader', line 30, column 3:
    initial_token: 0
    ^
  at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingKey.produce(ParserImpl.java:570)
  at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:158)
  at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:143)
  at org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:230)
  at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:159)
  at org.yaml.snakeyaml.composer.Composer.composeDocument(Composer.java:122)
  at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:105)
  at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:120)
  at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481)
  at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:475)
  at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:93)
  ... 5 more
Invalid yaml

Could you figure out what's making the yaml invalid?

Thanks!

On Tue, Aug 26, 2014 at 4:06 PM, Mark Reddy mark.l.re...@gmail.com wrote:

You are missing commitlog_sync in your cassandra.yaml. Are you generating your own cassandra.yaml or editing the package default? If you are generating your own, there are several configuration options that are required and, if not present, Cassandra will fail to start.

Regards,
Mark

On 26 August 2014 11:14, Vineet Mishra clearmido...@gmail.com wrote:

Thanks Mark,

That was indeed a yaml formatting issue. Moreover, I am getting the underlying error now:

INFO 15:33:43,770 Loading settings from file:/home/cluster/cassandra/conf/cassandra.yaml
INFO 15:33:44,100 Data files directories: [/var/lib/cassandra/data]
INFO 15:33:44,101 Commit log directory: /var/lib/cassandra/commitlog
ERROR 15:33:44,103 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Missing required directive CommitLogSync
  at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:147)
  at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
  at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:156)
  at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
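Since YAML is indentation-sensitive, it may help to see a fragment with the spacing the parser expects. A sketch using the poster's values, with the commitlog_sync directives Mark mentions added (the periodic values here are assumptions, not taken from the thread):

```yaml
cluster_name: 'node'
num_tokens: 256
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.32"
listen_address: 192.168.1.32
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
```

Note the nested list items under seed_provider: a wrong number of leading spaces on any of these lines produces exactly the kind of "expected <block end>" error shown above.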
[RELEASE] Apache Cassandra 2.0.10 released
I looked for the newest release, but only see release candidates, not a stable release. http://archive.apache.org/dist/cassandra/2.1.0/
Re: are dynamic columns supported at all in CQL 3?
Using the post's example, consider the query of get all readings for sensor 1. With dynamic columns, the query is just select * from data where sensor_id=1. In CQL, not only does this take N different queries (one per sample) but you have to explicitly know the collected_at values to query for. Right?

This does work in CQL (v3.1.1, tried on Cassandra 2.0.4):

cqlsh:playlist> CREATE TABLE data (
            ...   sensor_id int,
            ...   collected_at timestamp,
            ...   volts float,
            ...   PRIMARY KEY (sensor_id, collected_at)
            ... ) WITH COMPACT STORAGE;
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-01 00:00:00',1.2);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-02 00:00:00',1.3);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-03 00:00:00',1.4);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (2,'2014-05-03 00:00:00',2.4);
cqlsh:playlist> select * from data;

 sensor_id | collected_at                              | volts
-----------+-------------------------------------------+-------
         1 | 2014-05-01 00:00:00 Pacific Daylight Time |   1.2
         1 | 2014-05-02 00:00:00 Pacific Daylight Time |   1.3
         1 | 2014-05-03 00:00:00 Pacific Daylight Time |   1.4
         2 | 2014-05-03 00:00:00 Pacific Daylight Time |   2.4

(4 rows)

cqlsh:playlist> select * from data where sensor_id=1;

 sensor_id | collected_at                              | volts
-----------+-------------------------------------------+-------
         1 | 2014-05-01 00:00:00 Pacific Daylight Time |   1.2
         1 | 2014-05-02 00:00:00 Pacific Daylight Time |   1.3
         1 | 2014-05-03 00:00:00 Pacific Daylight Time |   1.4

(3 rows)

On Tue, Aug 26, 2014 at 12:33 PM, Ian Rose ianr...@fullstory.com wrote:

Unfortunately, no. I've read that and the solution presented only works in limited scenarios. Using the post's example, consider the query of get all readings for sensor 1. With dynamic columns, the query is just select * from data where sensor_id=1. In CQL, not only does this take N different queries (one per sample) but you have to explicitly know the collected_at values to query for. Right?

The other suggestion, to use collections (such as a map), again works in some circumstances, but not all. In particular, each item in a collection is limited to 64k bytes, which is not something we want to be limited to (we are storing byte arrays that occasionally exceed this size).

On Tue, Aug 26, 2014 at 3:14 PM, Shane Hansen shanemhan...@gmail.com wrote:

Does this answer your question, Ian? http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

On Tue, Aug 26, 2014 at 1:12 PM, Ian Rose ianr...@fullstory.com wrote:

Is it possible in CQL to create a table that supports dynamic column names? I am using C* v2.0.9, which I assume implies CQL version 3. This page appears to show that this was supported in CQL 2 with the 'with comparator' and 'with default_validation' options, but that CQL 3 does not support this: http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

Am I understanding that right? If so, what is my best course of action? Create the table using the cassandra-cli tool?

Thanks,
- Ian
Re: [RELEASE] Apache Cassandra 2.0.10 released
This release is for 2.0.10, not the 2.1.x line. If you want this release it is at http://archive.apache.org/dist/cassandra/2.0.10/

The 2.1.x line is not stable yet.
Re: are dynamic columns supported at all in CQL 3?
Deepak - Yes, you are indeed right. I must admit I am still trying to learn what queries can and cannot be performed in Cassandra, and I didn't realize that you could query on a non-fully-specified primary key, as long as you *do* fully qualify the partition key.

Cheers,
Ian
Re: CQL performance inserting multiple cluster keys under same partition key
This clarifies my doubt. Thank you, Sylvain, for your help.
How often are JMX Cassandra metrics reset?
I'm using JMX to retrieve Cassandra metrics. I notice that Max and Count are cumulative and aren't reset. How often are the stats for Mean, 99thPercentile, etc. reset back to zero? For example, 99thPercentile shows as 1.5 ms. Over how many minutes?

ClientRequest/Read/Latency:
  LatencyUnit = MICROSECONDS
  FiveMinuteRate = 1.12
  FifteenMinuteRate = 1.11
  RateUnit = SECONDS
  MeanRate = 1.65
  OneMinuteRate = 1.13
  EventType = calls
  Max = 237,373.37
  Count = 961,312
  50thPercentile = 383.2
  Mean = 908.46
  Min = 95.64
  StdDev = 3,034.62
  75thPercentile = 626.34
  95thPercentile = 954.31
  98thPercentile = 1,443.11
  99thPercentile = 1,472.4
  999thPercentile = 1,858.1

Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866 | C: (206) 819-5965
dona...@audiencescience.com
Re: How often are JMX Cassandra metrics reset?
On Wed, Aug 27, 2014 at 12:38 PM, Donald Smith donald.sm...@audiencescience.com wrote:

How often are the stats for Mean, 99thPercentile, etc. reset back to zero?

If they're like the old latency numbers, they are from node startup time and are never reset.

=Rob
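The rate attributes in the listing above (OneMinuteRate, etc.) behave similarly: they decay toward recent activity rather than resetting on a schedule. A toy Python sketch of an exponentially weighted moving average in that style; this is illustrative only, not the actual code behind those MBeans:

```python
import math

class OneMinuteRate:
    """Toy EWMA sampled every 5 seconds, in the style of a one-minute
    rate. Illustrative only; not the real metrics implementation."""
    TICK_SECONDS = 5.0

    def __init__(self):
        # Weight for each tick so that the average has a one-minute horizon.
        self.alpha = 1.0 - math.exp(-self.TICK_SECONDS / 60.0)
        self.rate = None          # events per second; never "reset"
        self.uncounted = 0

    def mark(self, n=1):
        self.uncounted += n

    def tick(self):               # called once per TICK_SECONDS
        instant = self.uncounted / self.TICK_SECONDS
        self.uncounted = 0
        if self.rate is None:
            self.rate = instant   # first tick seeds the average
        else:
            self.rate += self.alpha * (instant - self.rate)

r = OneMinuteRate()
r.mark(50)      # burst of 50 events in the first 5 seconds
r.tick()        # rate is now 10.0 events/sec
for _ in range(12):
    r.tick()    # a quiet minute: the rate decays but never snaps to zero
```

The point for the question above: between ticks the value only decays geometrically, so there is no interval after which the statistic is zeroed.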
Re: Bulk load in cassandra
On Wed, Aug 27, 2014 at 5:13 AM, Malay Nilabh malay.nil...@lntinfotech.com wrote:

If I have a text file on my local file system and I want to load it onto the Cassandra cluster (a bulk load), how can I achieve that?

http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

or cqlsh COPY, but beware that COPY is capable of timing out in the current implementation.

=Rob
Re: Too many SSTables after rebalancing cluster (LCS)
Try turning down 'tombstone_threshold' to something like '0.05' from its default of '0.2'. This will cause the SSTable to be considered for tombstone-only compactions more frequently (if 5% of the columns are tombstones instead of 20%). For a bit more info, see: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/compactSubprop.html

On Tue, Aug 26, 2014 at 1:38 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:

Hey folks,

After adding more nodes and moving tokens of old nodes to rebalance the ring, I noticed that the old nodes had significantly more data than the newly bootstrapped nodes, even after cleanup. The old nodes had a much larger number of SSTables on LCS CFs, with most of them located in the last level:

Node N-1 (old node): [1, 10, 102/100, 173, 2403, 0, 0, 0, 0] (total: 2695)
Node N (new node):   [1, 10, 108/100, 214, 0, 0, 0, 0, 0] (total: 339)
Node N+1 (old node): [1, 10, 87, 113, 1076, 0, 0, 0, 0] (total: 1287)

Since these SSTables have a lot of tombstones, and they're not updated frequently, they remain in the last level forever and are never cleaned. What is the solution here? The good old change to STCS and then back to LCS, or is there something less brute-force?

Environment: Cassandra 1.2.16, non-vnodes

Any help would be very much appreciated. Cheers,

-- Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200

--
Nate McCall
Austin, TX
@zznate
Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
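The suggested tuning is a compaction subproperty, so it can be applied per table; a sketch with a hypothetical keyspace and table name:

```sql
-- Hypothetical keyspace/table; lowers the tombstone compaction
-- threshold from the 0.2 default to 0.05, as suggested above.
ALTER TABLE mykeyspace.mytable
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'tombstone_threshold': '0.05'};
```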
Re: Too many SSTables after rebalancing cluster (LCS)
Great idea, will try that (right now it is 10%, but being more aggressive should hopefully work). Cheers!

-- Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200
Re: Too many SSTables after rebalancing cluster (LCS)
Another option to force things: deleting the JSON metadata file for that table will cause LCS to put all SSTables in level 0 and begin recompacting them.

On Wed, Aug 27, 2014 at 5:15 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote: [quoted thread elided; see above]

-- Nate McCall Austin, TX @zznate Co-Founder & Sr. Technical Consultant Apache Cassandra Consulting http://www.thelastpickle.com
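A hedged sketch of what that looks like on a 1.2-era install (the node must be stopped first; the data path and keyspace/table names are placeholder assumptions about your layout):

```shell
# With Cassandra stopped on the node. In 1.2 the LCS manifest sits
# beside the SSTables; path and names below are placeholders.
sudo rm /var/lib/cassandra/data/my_keyspace/my_table/my_table.json
# On restart, all SSTables re-enter at level 0 and recompact.
```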
Re: Too many SSTables after rebalancing cluster (LCS)
On Wed, Aug 27, 2014 at 3:27 PM, Nate McCall n...@thelastpickle.com wrote: Another option to force things - deleting the json metadata file for that table will cause LCS to put all SSTables in level 0 and begin recompacting them.

That's possible in versions where the level is stored in a JSON manifest file, i.e. versions before 2.0. In 2.0+ there is a bundled tool for the same purpose.

https://issues.apache.org/jira/browse/CASSANDRA-5271 (Fixed; 2.0 beta 1): Create tool to drop sstables to level 0

=Rob
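For concreteness, the tool created by CASSANDRA-5271 ships as an offline utility under tools/bin rather than as a nodetool subcommand; a hedged sketch of its use (run with the node down; keyspace and table names are placeholders):

```shell
# Offline utility bundled with 2.0+ (tools/bin); run with the node stopped.
# --really-reset is its safety flag; keyspace/table names are placeholders.
tools/bin/sstablelevelreset --really-reset my_keyspace my_table
```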
Re: Can't Add AWS Node due to /mnt/cassandra/data directory
Make sure you have also set up the ephemeral drives as a RAID device (use mdadm) and mounted it under /mnt/cassandra; otherwise your data dir is on the OS partition, which is usually very small.

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 27 Aug 2014, at 8:21 pm, Stephen Portanova sport...@gmail.com wrote: Worked great! Thanks Mark!

On Wed, Aug 27, 2014 at 2:00 AM, Mark Reddy mark.l.re...@gmail.com wrote: Hi Stephen, I have never added a node via OpsCenter, so this may be a shortcoming of that process. However, in non-OpsCenter installs you would have to create the data directories first:

sudo mkdir -p /mnt/cassandra/commitlog
sudo mkdir -p /mnt/cassandra/data
sudo mkdir -p /mnt/cassandra/saved_caches

And then give the cassandra user ownership of those directories:

sudo chown -R cassandra:cassandra /mnt/cassandra

Once this is done Cassandra will have the correct directories and permission to start up. Mark

On 27 August 2014 09:50, Stephen Portanova sport...@gmail.com wrote: [original report and error logs elided; quoted earlier in the thread]

-- Stephen Portanova (480) 495-2634
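A hedged sketch of the ephemeral-disk setup Ben describes (device names are assumptions, so confirm with lsblk first; note an m3.large typically exposes a single ephemeral SSD, in which case mdadm can be skipped):

```shell
# Device names below are assumptions - confirm with lsblk first.
# With two or more ephemeral drives, stripe them with mdadm:
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /mnt/cassandra
sudo mount /dev/md0 /mnt/cassandra
# With a single ephemeral drive, format and mount it directly instead:
#   sudo mkfs.ext4 /dev/xvdb && sudo mount /dev/xvdb /mnt/cassandra
```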