Re: CQL performance inserting multiple cluster keys under same partition key

2014-08-27 Thread Sylvain Lebresne
On Tue, Aug 26, 2014 at 6:50 PM, Jaydeep Chovatia 
chovatia.jayd...@gmail.com wrote:

 Hi,

 I have a question about inserting multiple cluster keys under the same
 partition key.

 Ex:

 CREATE TABLE Employee (
   deptId int,
   empId int,
   name   varchar,
   address varchar,
   salary int,
   PRIMARY KEY(deptId, empId)
 );

 BEGIN UNLOGGED BATCH
   INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1,
 10, 'testNameA', 'testAddressA', 2);
   INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1,
 20, 'testNameB', 'testAddressB', 3);
 APPLY BATCH;

 Here we are inserting two cluster keys (10 and 20) under same partition
 key (1).
 Q1) Is this batch transaction atomic and isolated? If yes then is there
 any performance overhead with this syntax?


As long as the updates are under the same partition key (and I insist, only
in that condition), logged (the one without the UNLOGGED keyword) and
unlogged batches behave *exactly* the same way. So yes, in that case the
batch is atomic and isolated (though on the isolation, you may want to be
aware that while technically isolated, the usual timestamp rules still
apply, so you might not get the behavior you expect if 2 batches have the
same timestamp: see CASSANDRA-6123
https://issues.apache.org/jira/browse/CASSANDRA-6123). There is also
no performance overhead (assuming you meant over logged batches).
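If the ordering between two such batches matters, one way to sidestep the
timestamp-tie issue from CASSANDRA-6123 is to assign the write timestamp
explicitly per batch. A sketch (the timestamp is a client-chosen
microseconds-since-epoch value, not a real one from the thread):

```
BEGIN UNLOGGED BATCH USING TIMESTAMP 1409150000000000
  INSERT INTO Employee (deptId, empId, name, address, salary)
  VALUES (1, 10, 'testNameA', 'testAddressA', 2);
  INSERT INTO Employee (deptId, empId, name, address, salary)
  VALUES (1, 20, 'testNameB', 'testAddressB', 3);
APPLY BATCH;
```

Note that when the batch carries USING TIMESTAMP, the individual statements
inside it must not specify their own timestamps.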

Q2) Can this CQL syntax be considered equivalent to Thrift
 batch_mutate?


It is equivalent, both (the CQL syntax and Thrift batch_mutate) resolve
to the same operation internally.

--
Sylvain


Can't Add AWS Node due to /mnt/cassandra/data directory

2014-08-27 Thread Stephen Portanova
I already have a 3-node m3.large DSE cluster, but I can't seem to add
another m3.large node. I'm using the
ubuntu-trusty-14.04-amd64-server-20140607.1
(ami-a7fdfee2) AMI (instance-store backed, PV) on AWS. I install Java 7 and
JNA, then I go into OpsCenter to add a node. Things look good for 3 or
4 green circles, until I get either this error: "Start Errored: Timed out
waiting for Cassandra to start." or this one: "Agent Connection Errored:
Timed out waiting for agent to connect."

I check the system.log and output.log, and they both say:
INFO [main] 2014-08-27 08:17:24,642 CLibrary.java (line 121) JNA mlockall
successful
ERROR [main] 2014-08-27 08:17:24,644 CassandraDaemon.java (line 235) Directory
/mnt/cassandra/data doesn't exist
ERROR [main] 2014-08-27 08:17:24,645 CassandraDaemon.java (line 239) Has
no permission to create /mnt/cassandra/data directory
 INFO [Thread-1] 2014-08-27 08:17:24,646 DseDaemon.java (line 477) DSE
shutting down...
ERROR [Thread-1] 2014-08-27 08:17:24,725 CassandraDaemon.java (line 199)
Exception in thread Thread[Thread-1,5,main]
java.lang.AssertionError
at
org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1263)
at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:478)
at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:384)

My agent.log file says:

Node is still provisioning, not attempting to determine ip.

 INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before
trying to determine IP over JMX again

 INFO [Initialization] 2014-08-27 08:41:17,849 Node is still provisioning,
not attempting to determine ip.

 INFO [Initialization] 2014-08-27 08:41:17,849 Sleeping for 20s before
trying to determine IP over JMX again

 INFO [Initialization] 2014-08-27 08:41:37,849 Node is still provisioning,
not attempting to determine ip.

 INFO [Initialization] 2014-08-27 08:41:37,850 Sleeping for 20s before
trying to determine IP over JMX again

 INFO [Initialization] 2014-08-27 08:41:57,850 Node is still provisioning,
not attempting to determine ip.


I feel like I'm missing something easy with the mount, so if you could
point me in the right direction, I would really appreciate it!

-- 
Stephen Portanova
(480) 495-2634


Re: Can't Add AWS Node due to /mnt/cassandra/data directory

2014-08-27 Thread Mark Reddy
Hi Stephen,

I have never added a node via OpsCenter, so this may be a shortcoming of
that process. However, in non-OpsCenter installs you would have to create
the data directories first:

sudo mkdir -p /mnt/cassandra/commitlog
sudo mkdir -p /mnt/cassandra/data
sudo mkdir -p /mnt/cassandra/saved_caches

And then give the cassandra user ownership of those directories:

sudo chown -R cassandra:cassandra /mnt/cassandra

Once this is done Cassandra will have the correct directories and
permission to start up.
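The same steps condensed into a sketch, parameterized so it can be dry-run
anywhere (CASS_ROOT stands in for /mnt/cassandra; on a real node the mkdir
and chown would run under sudo):

```shell
# Create the Cassandra directories and, where possible, hand them to
# the cassandra user. CASS_ROOT defaults to a temp dir so the sketch
# is safe to dry-run; set CASS_ROOT=/mnt/cassandra on a real node.
CASS_ROOT="${CASS_ROOT:-$(mktemp -d)}"
for d in commitlog data saved_caches; do
  mkdir -p "$CASS_ROOT/$d"
done
# chown only if the cassandra user actually exists on this machine
if id cassandra >/dev/null 2>&1; then
  chown -R cassandra:cassandra "$CASS_ROOT"
fi
ls -ld "$CASS_ROOT"/*
```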


Mark


On 27 August 2014 09:50, Stephen Portanova sport...@gmail.com wrote:

 I already have a 3node m3.large DSE cluster, but I can't seem to add
 another m3.large node. I'm using the 
 ubuntu-trusty-14.04-amd64-server-20140607.1
 (ami-a7fdfee2) AMI (instance-store backed, PV) on AWS, I install java 7
 and the JNA, then I go into opscenter to add a node. Things look good for 3
 or 4 green circles, until I either get this error: Start Errored: Timed
 out waiting for Cassandra to start. or this error: Agent Connection
 Errored: Timed out waiting for agent to connect.

 I check the system.log and output.log, and they both say:
 INFO [main] 2014-08-27 08:17:24,642 CLibrary.java (line 121) JNA mlockall
 successful
 ERROR [main] 2014-08-27 08:17:24,644 CassandraDaemon.java (line 235) 
 *Directory
 /mnt/cassandra/data doesn't exist*
 *ERROR [main] 2014-08-27 08:17:24,645 CassandraDaemon.java (line 239) Has
 no permission to create /mnt/cassandra/data directory*
  INFO [Thread-1] 2014-08-27 08:17:24,646 DseDaemon.java (line 477) DSE
 shutting down...
 ERROR [Thread-1] 2014-08-27 08:17:24,725 CassandraDaemon.java (line 199)
 Exception in thread Thread[Thread-1,5,main]
 java.lang.AssertionError
 at
 org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1263)
 at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
 at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:478)
 at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:384)

 My agent.log file says:

 Node is still provisioning, not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:17,849 Node is still provisioning,
 not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:41:17,849 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:37,849 Node is still provisioning,
 not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:41:37,850 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:57,850 Node is still provisioning,
 not attempting to determine ip.


 I feel like I'm missing something easy with the mount, so if you could
 point me in the right direction, I would really appreciate it!

 --
 Stephen Portanova
 (480) 495-2634



Re: Can't Add AWS Node due to /mnt/cassandra/data directory

2014-08-27 Thread Stephen Portanova
Worked great! Thanks Mark!


On Wed, Aug 27, 2014 at 2:00 AM, Mark Reddy mark.l.re...@gmail.com wrote:

 Hi stephen,

 I have never added a node via OpsCenter, so this may be a short coming of
 that process. However in non OpsCenter installs you would have to create
 the data directories first:

 sudo mkdir -p /mnt/cassandra/commitlog
 sudo mkdir -p /mnt/cassandra/data
 sudo mkdir -p /mnt/cassandra/saved_caches

 And then give the cassandra user ownership of those directories:

 sudo chown -R cassandra:cassandra /mnt/cassandra

 Once this is done Cassandra will have the correct directories and
 permission to start up.


 Mark


 On 27 August 2014 09:50, Stephen Portanova sport...@gmail.com wrote:

 I already have a 3node m3.large DSE cluster, but I can't seem to add
 another m3.large node. I'm using the 
 ubuntu-trusty-14.04-amd64-server-20140607.1
 (ami-a7fdfee2) AMI (instance-store backed, PV) on AWS, I install java 7
 and the JNA, then I go into opscenter to add a node. Things look good for 3
 or 4 green circles, until I either get this error: Start Errored: Timed
 out waiting for Cassandra to start. or this error: Agent Connection
 Errored: Timed out waiting for agent to connect.

 I check the system.log and output.log, and they both say:
 INFO [main] 2014-08-27 08:17:24,642 CLibrary.java (line 121) JNA mlockall
 successful
 ERROR [main] 2014-08-27 08:17:24,644 CassandraDaemon.java (line 235) 
 *Directory
 /mnt/cassandra/data doesn't exist*
 *ERROR [main] 2014-08-27 08:17:24,645 CassandraDaemon.java (line 239) Has
 no permission to create /mnt/cassandra/data directory*
  INFO [Thread-1] 2014-08-27 08:17:24,646 DseDaemon.java (line 477) DSE
 shutting down...
 ERROR [Thread-1] 2014-08-27 08:17:24,725 CassandraDaemon.java (line 199)
 Exception in thread Thread[Thread-1,5,main]
 java.lang.AssertionError
 at
 org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1263)
 at
 com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
 at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:478)
 at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:384)

 My agent.log file says:

 Node is still provisioning, not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:17,849 Node is still
 provisioning, not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:41:17,849 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:37,849 Node is still
 provisioning, not attempting to determine ip.

  INFO [Initialization] 2014-08-27 08:41:37,850 Sleeping for 20s before
 trying to determine IP over JMX again

  INFO [Initialization] 2014-08-27 08:41:57,850 Node is still
 provisioning, not attempting to determine ip.


 I feel like I'm missing something easy with the mount, so if you could
 point me in the right direction, I would really appreciate it!

 --
 Stephen Portanova
 (480) 495-2634





-- 
Stephen Portanova
(480) 495-2634


Bulk load in cassandra

2014-08-27 Thread Malay Nilabh
Hi
I installed Cassandra on one node successfully. Using the CLI I am able to add a
table to the keyspace as well as retrieve data from the table. My question is:
if I have a text file on my local file system, how do I load it into the
Cassandra cluster (i.e. bulk load)? Please help me out.

Regards
Malay Nilabh
BIDW BU/ Big Data CoE
L&T Infotech Ltd, Hinjewadi, Pune
Tel: +91-20-66571746 | Mobile: +91-73-879-00727
Email: malay.nil...@lntinfotech.com
|| Save Paper - Save Trees ||



The contents of this e-mail and any attachment(s) may contain confidential or 
privileged information for the intended recipient(s). Unintended recipients are 
prohibited from taking action on the basis of information in this e-mail and 
using or disseminating the information, and must notify the sender and delete 
it from their system. LT Infotech will not accept responsibility or liability 
for the accuracy or completeness of, or the presence of any virus or disabling 
code in this e-mail


Re: Bulk load in cassandra

2014-08-27 Thread Umang Shah
Hi Malay,

Yesterday i answered for your question but you didn't replied back whether
it worked for you or not.

Anyways you mean by importing text file into cassandra.

you can do that by following way.

COPY keyspace.columnfamily (column1, column2,...) FROM 'temp.csv' (location
of file);

for directly executing above command your file has to be in cassandra/bin
location.
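As a concrete sketch (the keyspace, table, and file path below are made up;
adjust them to your schema), a cqlsh session would look like:

```
CREATE TABLE demo.users (id int PRIMARY KEY, name text);
-- /tmp/temp.csv contains lines like: 1,alice
COPY demo.users (id, name) FROM '/tmp/temp.csv';
```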

Thanks,
Umang Shah
Pentaho BI-ETL Developer
shahuma...@gmail.com


On Wed, Aug 27, 2014 at 12:13 PM, Malay Nilabh malay.nil...@lntinfotech.com
 wrote:

  Hi

 I installed Cassandra on one node successfully using CLI I am able to add
 a table to the keyspace as well as  retrieve the data from the table. My
 query is if I have text file on my local file system and I want to load on
 Cassandra cluster or you can say bulk load. How can I achieve that. Please
 help me out.



 Regards

 *Malay Nilabh*

 BIDW BU/ Big Data CoE

 LT Infotech Ltd, Hinjewadi,Pune


 Email: malay.nil...@lntinfotech.com

 *|| Save Paper - Save Trees || *



 --




-- 
Regards,
Umang V.Shah
+919886829019


Re: Bulk load in cassandra

2014-08-27 Thread baskar.duraikannu
Please try the COPY command via the CQL shell if it is a delimited file.

Regards,
Baskar Duraikannu

-Original Message-
From: Malay Nilabh malay.nil...@lntinfotech.com
Date: Wed, 27 Aug 2014 17:43:21 
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Subject: Bulk load in cassandra

Hi
I installed Cassandra on one node successfully using CLI I am able to add a 
table to the keyspace as well as  retrieve the data from the table. My query is 
if I have text file on my local file system and I want to load on Cassandra 
cluster or you can say bulk load. How can I achieve that. Please help me out.

Regards
Malay Nilabh
BIDW BU/ Big Data CoE
LT Infotech Ltd, Hinjewadi,Pune
Email: malay.nil...@lntinfotech.com
|| Save Paper - Save Trees ||






Re: Installing Cassandra Multinode on CentOs coming up with exception

2014-08-27 Thread Vineet Mishra
Hey Patricia,

Thanks for your kind response. I will surely take care of that when using
virtual nodes.

Thanks again!


On Tue, Aug 26, 2014 at 10:42 PM, Patricia Gorla patri...@thelastpickle.com
 wrote:

 Vineet,

 One more thing -- you have initial_token and num_tokens both set. If you
 are trying to use virtual nodes, you should comment out initial_token as
 this setting overrides num_tokens.

 Cheers,


 On Tue, Aug 26, 2014 at 5:39 AM, Vineet Mishra clearmido...@gmail.com
 wrote:

 Thanks Vivek!

 It was indeed a formatting issue in yaml, got it work!


 On Tue, Aug 26, 2014 at 6:06 PM, Vivek Mishra mishra.v...@gmail.com
 wrote:

 Please read http://www.yaml.org/start.html.
 Looks like a formatting issue. You might be missing spaces or adding
 incorrect ones.

 Validate your YAML file; this should help you out:
 http://yamllint.com/

 -Vivek


 On Tue, Aug 26, 2014 at 4:20 PM, Vineet Mishra clearmido...@gmail.com
 wrote:

 Hi Mark,

 Yes I was generating my own cassandra.yaml with the configuration
 mentioned below,

 cluster_name: 'node'
 initial_token: 0
 num_tokens: 256
 seed_provider:
 - class_name: org.apache.cassandra.locator.SimpleSeedProvider
 parameters:
 - seeds: 192.168.1.32
 listen_address: 192.168.1.32
 rpc_address: 0.0.0.0
 endpoint_snitch: RackInferringSnitch

 Similarly for second node

 cluster_name: 'node'
 initial_token: 2305843009213693952
 num_tokens: 256
 seed_provider:
 - class_name: org.apache.cassandra.locator.SimpleSeedProvider
 parameters:
 - seeds: 192.168.1.32
 listen_address: 192.168.1.36
 rpc_address: 0.0.0.0
 endpoint_snitch: RackInferringSnitch

 and so on. . .



 But even if I use the default yaml with the necessary configuration
 changes I am getting the following error.

  INFO 16:13:38,225 Loading settings from
 file:/home/cluster/cassandra/conf/cassandra.yaml
 ERROR 16:13:38,301 Fatal configuration error
 org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml
  at
 org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:100)
 at
 org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:135)
  at
 org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:156)
  at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
 at
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
 Caused by: while parsing a block mapping
  in 'reader', line 10, column 2:
  cluster_name: 'node'
  ^
 expected block end, but found BlockMappingStart
  in 'reader', line 30, column 3:
   initial_token: 0
   ^

 at
 org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingKey.produce(ParserImpl.java:570)
  at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:158)
 at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:143)
  at
 org.yaml.snakeyaml.composer.Composer.composeMappingNode(Composer.java:230)
 at org.yaml.snakeyaml.composer.Composer.composeNode(Composer.java:159)
  at
 org.yaml.snakeyaml.composer.Composer.composeDocument(Composer.java:122)
 at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:105)
  at
 org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:120)
 at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481)
  at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:475)
 at
 org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:93)
  ... 5 more
 Invalid yaml

 Could you figure out what's making the YAML invalid?

 Thanks!


 On Tue, Aug 26, 2014 at 4:06 PM, Mark Reddy mark.l.re...@gmail.com
 wrote:

 You are missing commitlog_sync in your cassandra.yaml.

 Are you generating your own cassandra.yaml or editing the package
 default? If you are generating your own there are several configuration
 options that are required and if not present, Cassandra will fail to
 start.
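For reference, a minimal cassandra.yaml fragment that both parses and
supplies the required commitlog directive might look like the following
(values taken from the thread where available; the partitioner line is an
assumption based on the defaults of that era, and the indentation is
exactly what tends to get lost when pasting into email):

```
cluster_name: 'node'
num_tokens: 256            # comment out initial_token when using vnodes
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.32"
listen_address: 192.168.1.32
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
```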


 Regards,
 Mark


 On 26 August 2014 11:14, Vineet Mishra clearmido...@gmail.com wrote:

 Thanks Mark,
 That was indeed a yaml formatting issue.
 However, I am now getting the error below:

 INFO 15:33:43,770 Loading settings from
 file:/home/cluster/cassandra/conf/cassandra.yaml
  INFO 15:33:44,100 Data files directories: [/var/lib/cassandra/data]
  INFO 15:33:44,101 Commit log directory: /var/lib/cassandra/commitlog
 ERROR 15:33:44,103 Fatal configuration error
 org.apache.cassandra.exceptions.ConfigurationException: Missing
 required directive CommitLogSync
  at
 org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:147)
 at
 org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:111)
  at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:156)
 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
  at
 

[RELEASE] Apache Cassandra 2.0.10 released

2014-08-27 Thread Razi Khaja
I looked for the newest release, but only see release candidates, not a
stable release.
http://archive.apache.org/dist/cassandra/2.1.0/


Re: are dynamic columns supported at all in CQL 3?

2014-08-27 Thread Deepak Shetty
 Using the post's example, consider the query of get all readings for
sensor 1.  With dynamic columns, the query is just select * from data
where sensor_id=1.  In CQL, not only does this take N different queries
(one per sample) but you have to explicitly know the collected_at values to
query for.  Right?
This does work in CQL (v3.1.1, tried on Cassandra 2.0.4):

cqlsh:playlist> CREATE TABLE data (
            ...   sensor_id int,
            ...   collected_at timestamp,
            ...   volts float,
            ...   PRIMARY KEY (sensor_id, collected_at)
            ... ) WITH COMPACT STORAGE;
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-01 00:00:00',1.2);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-02 00:00:00',1.3);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (1,'2014-05-03 00:00:00',1.4);
cqlsh:playlist> insert into data(sensor_id,collected_at,volts) values (2,'2014-05-03 00:00:00',2.4);
cqlsh:playlist> select * from data;

 sensor_id | collected_at                              | volts
-----------+-------------------------------------------+-------
         1 | 2014-05-01 00:00:00 Pacific Daylight Time |   1.2
         1 | 2014-05-02 00:00:00 Pacific Daylight Time |   1.3
         1 | 2014-05-03 00:00:00 Pacific Daylight Time |   1.4
         2 | 2014-05-03 00:00:00 Pacific Daylight Time |   2.4

(4 rows)

cqlsh:playlist> select * from data where sensor_id=1;

 sensor_id | collected_at                              | volts
-----------+-------------------------------------------+-------
         1 | 2014-05-01 00:00:00 Pacific Daylight Time |   1.2
         1 | 2014-05-02 00:00:00 Pacific Daylight Time |   1.3
         1 | 2014-05-03 00:00:00 Pacific Daylight Time |   1.4

(3 rows)

cqlsh:playlist>
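Since collected_at is the clustering column in this layout, you can also
slice on it without enumerating exact values. A sketch against the table
above:

```
SELECT * FROM data WHERE sensor_id = 1 AND collected_at >= '2014-05-02 00:00:00';
```

which returns only the readings for sensor 1 from that date onward.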






On Tue, Aug 26, 2014 at 12:33 PM, Ian Rose ianr...@fullstory.com wrote:

 Unfortunately, no.  I've read that and the solution presented only works
 in limited scenarios.  Using the post's example, consider the query of get
 all readings for sensor 1.  With dynamic columns, the query is just
 select * from data where sensor_id=1.  In CQL, not only does this take N
 different queries (one per sample) but you have to explicitly know the
 collected_at values to query for.  Right?

 The other suggestion, to use collections (such as a map), again works in
 some circumstances, but not all.  In particular, each item in a collection
 is limited to 64k bytes which is not something we want to be limited to (we
 are storing byte arrays that occasionally exceed this size).



 On Tue, Aug 26, 2014 at 3:14 PM, Shane Hansen shanemhan...@gmail.com
 wrote:

 Does this answer your question Ian?

 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows



 On Tue, Aug 26, 2014 at 1:12 PM, Ian Rose ianr...@fullstory.com wrote:

 Is it possible in CQL to create a table that supports dynamic column
 names?  I am using C* v2.0.9, which I assume implies CQL version 3.

 This page appears to show that this was supported in CQL 2 with the
 'with comparator' and 'with default_validation' options but that CQL 3 does
 not support this: http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

 Am I understanding that right?  If so, what is my best course of action?
  Create the table using the cassandra-cli tool?

 Thanks,
 - Ian






Re: [RELEASE] Apache Cassandra 2.0.10 released

2014-08-27 Thread Russ Bradberry
This release is for 2.0.10, not the 2.1.x line.  If you want this release it is 
at http://archive.apache.org/dist/cassandra/2.0.10/ The 2.1.x line is not 
stable yet.


On August 27, 2014 at 11:29:46 AM, Razi Khaja (razi.kh...@gmail.com) wrote:

I looked for the newest release, but only see release candidates, not a stable 
release.
http://archive.apache.org/dist/cassandra/2.1.0/


Re: are dynamic columns supported at all in CQL 3?

2014-08-27 Thread Ian Rose
Deepak -

Yes, you are indeed right.  I must admit I am still trying to learn what
queries can and cannot be performed in Cassandra and I didn't realize that
you could query on a non-fully-specified primary key, as long as you *do* fully
qualify the partition key.

Cheers,
Ian



On Wed, Aug 27, 2014 at 11:31 AM, Deepak Shetty shet...@gmail.com wrote:

  Using the post's example, consider the query of get all readings for
 sensor 1.  With dynamic columns, the query is just select * from data
 where sensor_id=1.  In CQL, not only does this take N different queries
 (one per sample) but you have to explicitly know the collected_at values to
 query for.  Right?
 This does work in CQL (v3.1.1 , tried on Cassandra 2.0.4)

 cqlsh:playlist CREATE TABLE data (
 ...   sensor_id int,
 ...   collected_at timestamp,
 ...   volts float,
 ...   PRIMARY KEY (sensor_id, collected_at)
 ... ) WITH COMPACT STORAGE;
 cqlsh:playlist insert into data(sensor_id,collected_at,volts) values
 (1,'2014-0
 5-01 00:00:00',1.2);
 cqlsh:playlist insert into data(sensor_id,collected_at,volts) values
 (1,'2014-0
 5-02 00:00:00',1.3);
 cqlsh:playlist insert into data(sensor_id,collected_at,volts) values
 (1,'2014-0
 5-03 00:00:00',1.4);
 cqlsh:playlist insert into data(sensor_id,collected_at,volts) values
 (2,'2014-0
 5-03 00:00:00',2.4);
 cqlsh:playlist select * from data;

  sensor_id | collected_at | volts
 ---+--+---
  1 | 2014-05-01 00:00:00Pacific Daylight Time |   1.2
  1 | 2014-05-02 00:00:00Pacific Daylight Time |   1.3
  1 | 2014-05-03 00:00:00Pacific Daylight Time |   1.4
  2 | 2014-05-03 00:00:00Pacific Daylight Time |   2.4

 (4 rows)

 cqlsh:playlist select * from data where sensor_id=1;

  sensor_id | collected_at | volts
 ---+--+---
  1 | 2014-05-01 00:00:00Pacific Daylight Time |   1.2
  1 | 2014-05-02 00:00:00Pacific Daylight Time |   1.3
  1 | 2014-05-03 00:00:00Pacific Daylight Time |   1.4

 (3 rows)

 cqlsh:playlist






 On Tue, Aug 26, 2014 at 12:33 PM, Ian Rose ianr...@fullstory.com wrote:

 Unfortunately, no.  I've read that and the solution presented only works
 in limited scenarios.  Using the post's example, consider the query of get
 all readings for sensor 1.  With dynamic columns, the query is just
 select * from data where sensor_id=1.  In CQL, not only does this take N
 different queries (one per sample) but you have to explicitly know the
 collected_at values to query for.  Right?

 The other suggestion, to use collections (such as a map), again works in
 some circumstances, but not all.  In particular, each item in a collection
 is limited to 64k bytes which is not something we want to be limited to (we
 are storing byte arrays that occasionally exceed this size).



 On Tue, Aug 26, 2014 at 3:14 PM, Shane Hansen shanemhan...@gmail.com
 wrote:

 Does this answer your question Ian?

 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows



 On Tue, Aug 26, 2014 at 1:12 PM, Ian Rose ianr...@fullstory.com wrote:

 Is it possible in CQL to create a table that supports dynamic column
 names?  I am using C* v2.0.9, which I assume implies CQL version 3.

 This page appears to show that this was supported in CQL 2 with the
 'with comparator' and 'with default_validation' options but that CQL 3 does
 not support this: http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

 Am I understanding that right?  If so, what is my best course of
 action?  Create the table using the cassandra-cli tool?

 Thanks,
 - Ian







Re: CQL performance inserting multiple cluster keys under same partition key

2014-08-27 Thread Jaydeep Chovatia
This clarifies my doubt.
Thank you, Sylvain, for your help.


On Tue, Aug 26, 2014 at 11:59 PM, Sylvain Lebresne sylv...@datastax.com
wrote:

 On Tue, Aug 26, 2014 at 6:50 PM, Jaydeep Chovatia 
 chovatia.jayd...@gmail.com wrote:

 Hi,

 I have question on inserting multiple cluster keys under same partition
 key.

 Ex:

 CREATE TABLE Employee (
   deptId int,
   empId int,
   name   varchar,
   address varchar,
   salary int,
   PRIMARY KEY(deptId, empId)
 );

 BEGIN UNLOGGED BATCH
   INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1,
 10, 'testNameA', 'testAddressA', 2);
   INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1,
 20, 'testNameB', 'testAddressB', 3);
 APPLY BATCH;

 Here we are inserting two cluster keys (10 and 20) under same partition
 key (1).
 Q1) Is this batch transaction atomic and isolated? If yes then is there
 any performance overhead with this syntax?


 As long as the updates are under the same partition key (and I insist, only
 in that condition), logged (the one without the UNLOGGED keyword) and
 unlogged batches behave *exactly* the same way. So yes, in that case the
 batch is atomic and isolated (though on the isolation, you may want to be
 aware that while technically isolated, the usual timestamp rules still
 apply, so you might not get the behavior you expect if 2 batches have the
 same timestamp: see CASSANDRA-6123
 https://issues.apache.org/jira/browse/CASSANDRA-6123). There is also
 no performance overhead (assuming you meant over logged batches).

 Q2) Can this CQL syntax be considered equivalent to Thrift
 batch_mutate?


 It is equivalent, both (the CQL syntax and Thrift batch_mutate) resolve
 to the same operation internally.

 --
 Sylvain



How often are JMX Cassandra metrics reset?

2014-08-27 Thread Donald Smith
I'm using JMX to retrieve Cassandra metrics. I notice that Max and Count are
cumulative and aren't reset. How often are the stats for Mean,
99thPercentile, etc. reset back to zero?

For example, 99thPercentile shows as 1.5 ms. Over how many minutes?

ClientRequest/Read/Latency:
LatencyUnit = MICROSECONDS
FiveMinuteRate = 1.12
FifteenMinuteRate = 1.11
RateUnit = SECONDS
MeanRate = 1.65
OneMinuteRate = 1.13
EventType = calls
   Max = 237,373.37
Count = 961,312
50thPercentile = 383.2
Mean = 908.46
Min = 95.64
StdDev = 3,034.62
75thPercentile = 626.34
95thPercentile = 954.31
98thPercentile = 1,443.11
99thPercentile = 1,472.4
999thPercentile = 1,858.1

Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com




Re: How often are JMX Cassandra metrics reset?

2014-08-27 Thread Robert Coli
On Wed, Aug 27, 2014 at 12:38 PM, Donald Smith 
donald.sm...@audiencescience.com wrote:

  I’m using JMX to retrieve Cassandra metrics.   I notice that  Max and
 Count are cumulative and aren’t reset.How often are the stats for Mean,
 99tthPercentile, etc reset back to zero?


If they're like the old latency numbers, they are from node startup time
and are never reset.

=Rob
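Given that the totals are cumulative since startup, one common workaround
(a sketch with made-up numbers, not part of any Cassandra API) is to poll
the cumulative values periodically and compute the deltas yourself:

```python
# Sketch: derive a per-interval mean latency from cumulative JMX
# readings (total latency in microseconds, operation count), sampled
# at two points in time. The numbers below are illustrative only.
def interval_mean_latency(prev, curr):
    """prev and curr are (total_latency_us, op_count) tuples."""
    d_latency = curr[0] - prev[0]
    d_count = curr[1] - prev[1]
    # Guard against an idle interval where no operations happened
    return d_latency / d_count if d_count else 0.0

prev_sample = (873_000_000, 960_000)   # poll at time T
curr_sample = (874_200_000, 961_312)   # poll at time T + interval
print(interval_mean_latency(prev_sample, curr_sample))
```

The printed value is the mean latency in microseconds over just that
polling interval, rather than since node startup.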


Re: Bulk load in cassandra

2014-08-27 Thread Robert Coli
On Wed, Aug 27, 2014 at 5:13 AM, Malay Nilabh malay.nil...@lntinfotech.com
wrote:

  I installed Cassandra on one node successfully using CLI I am able to
 add a table to the keyspace as well as  retrieve the data from the table.
 My query is if I have text file on my local file system and I want to load
 on Cassandra cluster or you can say bulk load. How can I achieve that.
 Please help me out.


http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
or
cqlsh COPY, but beware that COPY is capable of timing out in the current
implementation.
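For the sstableloader route described in the linked post, the invocation is
roughly the following (the host and path are placeholders; the data
directory must be laid out as keyspace/table/ containing the SSTable
files):

```shell
# Sketch: stream pre-built SSTables into a live cluster.
HOST="${HOST:-192.0.2.10}"                            # any live node
DATA_DIR="${DATA_DIR:-/tmp/bulk/mykeyspace/mytable}"  # keyspace/table layout
if command -v sstableloader >/dev/null 2>&1; then
  sstableloader -d "$HOST" "$DATA_DIR"
else
  # On a machine without Cassandra installed, just report the situation
  echo "sstableloader not on PATH; run from a Cassandra installation's bin/"
fi
```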

=Rob


Re: Too many SSTables after rebalancing cluster (LCS)

2014-08-27 Thread Nate McCall
Try turning down 'tombstone_threshold' to something like '0.05' from its
default of '0.2'. This will cause the SSTable to be considered for
tombstone-only compactions more frequently (if 5% of the columns are
tombstones instead of 20%).

For a bit more info, see:
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/compactSubprop.html
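Applied to an existing LCS table, that would look roughly like this (the
keyspace and table names are placeholders):

```
ALTER TABLE mykeyspace.mytable
  WITH compaction = { 'class': 'LeveledCompactionStrategy',
                      'tombstone_threshold': '0.05' };
```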


On Tue, Aug 26, 2014 at 1:38 PM, Paulo Ricardo Motta Gomes 
paulo.mo...@chaordicsystems.com wrote:

 Hey folks,

 After adding more nodes and moving tokens of old nodes to rebalance the
 ring, I noticed that the old nodes had significantly more data than the
 newly bootstrapped nodes, even after cleanup.

 I noticed that the old nodes had a much larger number of SSTables on LCS
 CFs, and most of them located on the last level:

 Node N-1 (old node): [1, 10, 102/100, 173, 2403, 0, 0, 0, 0] (total: 2695)
 Node N   (new node): [1, 10, 108/100, 214, 0, 0, 0, 0, 0] (total: 339)
 Node N+1 (old node): [1, 10, 87, 113, 1076, 0, 0, 0, 0] (total: 1287)

 Since these sstables have a lot of tombstones, and they're not updated
 frequently, they remain in the last level forever, and are never cleaned.

 What is the solution here? The good old change to STCS and then back to
 LCS, or is there something less brute force?

 Environment: Cassandra 1.2.16 - non-vnodes

 Any help would be very much appreciated.

 Cheers,

 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200




-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Too many SSTables after rebalancing cluster (LCS)

2014-08-27 Thread Paulo Ricardo Motta Gomes
Great idea, will try that (right now it is 10%, but being more aggressive
should hopefully work).

Cheers!


On Wed, Aug 27, 2014 at 7:02 PM, Nate McCall n...@thelastpickle.com wrote:

 Try turning down 'tombstone_threshold' to something like '0.05' from its
 default of '0.2'. This will cause the SSTable to be considered for
 tombstone-only compactions more frequently (if 5% of the columns are
 tombstones instead of 20%).

 For a bit more info, see:

 http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/compactSubprop.html


 On Tue, Aug 26, 2014 at 1:38 PM, Paulo Ricardo Motta Gomes 
 paulo.mo...@chaordicsystems.com wrote:

 Hey folks,

 After adding more nodes and moving tokens of old nodes to rebalance the
 ring, I noticed that the old nodes had significantly more data than the
 newly bootstrapped nodes, even after cleanup.

 I noticed that the old nodes had a much larger number of SSTables on LCS
 CFs, and most of them located on the last level:

 Node N-1 (old node): [1, 10, 102/100, 173, 2403, 0, 0, 0, 0] (total: 2695)
 Node N   (new node): [1, 10, 108/100, 214, 0, 0, 0, 0, 0] (total: 339)
 Node N+1 (old node): [1, 10, 87, 113, 1076, 0, 0, 0, 0] (total: 1287)

 Since these sstables have a lot of tombstones, and they're not updated
 frequently, they remain in the last level forever, and are never cleaned.

 What is the solution here? The good old change to STCS and then back to
 LCS, or is there something less brute force?

 Environment: Cassandra 1.2.16 - non-vnodes

 Any help would be very much appreciated.

 Cheers,

 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200




 --
 -
 Nate McCall
 Austin, TX
 @zznate

 Co-Founder & Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com




-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br http://www.chaordic.com.br/*
+55 48 3232.3200


Re: Too many SSTables after rebalancing cluster (LCS)

2014-08-27 Thread Nate McCall
Another option to force things - deleting the JSON metadata file for that
table will cause LCS to put all SSTables in level 0 and begin recompacting
them.


On Wed, Aug 27, 2014 at 5:15 PM, Paulo Ricardo Motta Gomes 
paulo.mo...@chaordicsystems.com wrote:

 Great idea, will try that (right now it is 10%, but being more aggressive
 should hopefully work).

 Cheers!


 On Wed, Aug 27, 2014 at 7:02 PM, Nate McCall n...@thelastpickle.com
 wrote:

 Try turning down 'tombstone_threshold' to something like '0.05' from its
 default of '0.2'. This will cause the SSTable to be considered for
 tombstone-only compactions more frequently (if 5% of the columns are
 tombstones instead of 20%).

 For a bit more info, see:

 http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/compactSubprop.html


 On Tue, Aug 26, 2014 at 1:38 PM, Paulo Ricardo Motta Gomes 
 paulo.mo...@chaordicsystems.com wrote:

 Hey folks,

 After adding more nodes and moving tokens of old nodes to rebalance
 the ring, I noticed that the old nodes had significantly more data than the
 newly bootstrapped nodes, even after cleanup.

 I noticed that the old nodes had a much larger number of SSTables on LCS
 CFs, and most of them located on the last level:

 Node N-1 (old node): [1, 10, 102/100, 173, 2403, 0, 0, 0, 0] (total: 2695)
 Node N   (new node): [1, 10, 108/100, 214, 0, 0, 0, 0, 0] (total: 339)
 Node N+1 (old node): [1, 10, 87, 113, 1076, 0, 0, 0, 0] (total: 1287)

 Since these sstables have a lot of tombstones, and they're not updated
 frequently, they remain in the last level forever, and are never cleaned.

 What is the solution here? The good old change to STCS and then back to
 LCS, or is there something less brute force?

 Environment: Cassandra 1.2.16 - non-vnodes

 Any help would be very much appreciated.

 Cheers,

 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200




 --
 -
 Nate McCall
 Austin, TX
 @zznate

 Co-Founder & Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com




 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200




-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Too many SSTables after rebalancing cluster (LCS)

2014-08-27 Thread Robert Coli
On Wed, Aug 27, 2014 at 3:27 PM, Nate McCall n...@thelastpickle.com wrote:

 Another option to force things - deleting the json metadata file for that
 table will cause LCS to put all SSTables in level 0 and begin recompacting
 them.


That's possible in versions where the level is stored in a JSON manifest,
i.e. versions before 2.0. In 2.0+ there is a dedicated tool for the same
purpose.

https://issues.apache.org/jira/browse/CASSANDRA-5271 (Fixed; 2.0 beta 1):
Create tool to drop sstables to level 0
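For reference, the tool added by CASSANDRA-5271 is the offline
sstablelevelreset utility shipped under tools/bin; a hedged sketch of its use
follows (the flag and keyspace/table arguments are from memory — check
tools/bin in your install, and note the node must be stopped first since the
tool rewrites SSTable metadata offline):

```
# Stop the node, then reset all of the table's SSTables to level 0:
tools/bin/sstablelevelreset --really-reset mykeyspace mytable
# Restart the node; LCS will then recompact everything up from level 0.
```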

=Rob


Re: Can't Add AWS Node due to /mnt/cassandra/data directory

2014-08-27 Thread Ben Bromhead
Make sure you have also set up the ephemeral drives as a RAID device (use mdadm)
and mounted it under /mnt/cassandra; otherwise your data dir is on the OS partition,
which is usually very small.
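A hedged sketch of that setup (the device names are assumptions and vary by
instance type; an m3.large exposes a single ephemeral SSD, in which case you
can skip mdadm and simply format and mount that one device):

```
# Combine the ephemeral disks into one RAID0 device (multi-disk instances):
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
sudo mkfs.ext4 /dev/md0
# Mount it where Cassandra expects its data directories:
sudo mkdir -p /mnt/cassandra
sudo mount /dev/md0 /mnt/cassandra
```

After mounting, create the commitlog/data/saved_caches subdirectories and
chown them to the cassandra user as described elsewhere in this thread.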

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 27 Aug 2014, at 8:21 pm, Stephen Portanova sport...@gmail.com wrote:

 Worked great! Thanks Mark!
 
 
 On Wed, Aug 27, 2014 at 2:00 AM, Mark Reddy mark.l.re...@gmail.com wrote:
 Hi stephen,
 
 I have never added a node via OpsCenter, so this may be a shortcoming of
 that process. However, in non-OpsCenter installs you would have to create the
 data directories first:
 
 sudo mkdir -p /mnt/cassandra/commitlog
 sudo mkdir -p /mnt/cassandra/data
 sudo mkdir -p /mnt/cassandra/saved_caches
 
 And then give the cassandra user ownership of those directories:
 
 sudo chown -R cassandra:cassandra /mnt/cassandra 
 
 Once this is done Cassandra will have the correct directories and permission 
 to start up.
 
 
 Mark
 
 
 On 27 August 2014 09:50, Stephen Portanova sport...@gmail.com wrote:
 I already have a 3-node m3.large DSE cluster, but I can't seem to add another
 m3.large node. I'm using the ubuntu-trusty-14.04-amd64-server-20140607.1
 (ami-a7fdfee2) AMI (instance-store backed, PV) on AWS. I install Java 7 and
 the JNA, then I go into OpsCenter to add a node. Things look good for 3 or 4
 green circles, until I get either this error: Start Errored: Timed out
 waiting for Cassandra to start. or this error: Agent Connection Errored:
 Timed out waiting for agent to connect.
 
 I check the system.log and output.log, and they both say:
 INFO [main] 2014-08-27 08:17:24,642 CLibrary.java (line 121) JNA mlockall 
 successful
 ERROR [main] 2014-08-27 08:17:24,644 CassandraDaemon.java (line 235) 
 Directory /mnt/cassandra/data doesn't exist
 ERROR [main] 2014-08-27 08:17:24,645 CassandraDaemon.java (line 239) Has no 
 permission to create /mnt/cassandra/data directory
  INFO [Thread-1] 2014-08-27 08:17:24,646 DseDaemon.java (line 477) DSE 
 shutting down...
 ERROR [Thread-1] 2014-08-27 08:17:24,725 CassandraDaemon.java (line 199) 
 Exception in thread Thread[Thread-1,5,main]
 java.lang.AssertionError
 at 
 org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1263)
 at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171)
 at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:478)
 at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:384)
 
 My agent.log file says:
 Node is still provisioning, not attempting to determine ip.
 
  INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before trying 
 to determine IP over JMX again
 
  INFO [Initialization] 2014-08-27 08:41:17,849 Node is still provisioning, 
 not attempting to determine ip.
 
  INFO [Initialization] 2014-08-27 08:41:17,849 Sleeping for 20s before trying 
 to determine IP over JMX again
 
  INFO [Initialization] 2014-08-27 08:41:37,849 Node is still provisioning, 
 not attempting to determine ip.
 
  INFO [Initialization] 2014-08-27 08:41:37,850 Sleeping for 20s before trying 
 to determine IP over JMX again
 
  INFO [Initialization] 2014-08-27 08:41:57,850 Node is still provisioning, 
 not attempting to determine ip.
 
 
 
 I feel like I'm missing something easy with the mount, so if you could point 
 me in the right direction, I would really appreciate it!
 
 -- 
 Stephen Portanova
 (480) 495-2634
 
 
 
 
 -- 
 Stephen Portanova
 (480) 495-2634