Re: CQL Thrift

2013-08-30 Thread Jon Haddad
If you're going to work with CQL, work with CQL.  If you're going to work with 
Thrift, work with Thrift.  Don't mix.

On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 If I create a table with CQL3 as
 
 create table user(user_id text PRIMARY KEY, first_name text, last_name text, 
 emailid text);
 
 and create index as:
 create index on user(first_name);
 
 then inserted some data as:
 insert into user(user_id,first_name,last_name,emailId) 
 values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in');
 
 
 Then if I update the same column family using cassandra-cli as:
 
 update column family user with key_validation_class='UTF8Type' and 
 column_metadata=[{column_name:last_name, validation_class:'UTF8Type', 
 index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', 
 index_type:KEYS}];
 
 
 Now if I connect via cqlsh and inspect the user table, I can see that the
 columns first_name and last_name are no longer part of the table structure.
 Here is the output:
 
 CREATE TABLE user (
   key text PRIMARY KEY
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};
 
 cqlsh:cql3usage> select * from user;
 
  user_id
 ---------
  @mevivs
 
 
 
 
 
 I understand that CQL3 and Thrift interoperability is an issue, but this
 looks to me like a very basic scenario.
 
 Any suggestions? Or can anybody explain the reason behind this?
 
 -Vivek
 
 
 
 



Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
And surprisingly, if I alter the table as:

alter table user add first_name text;
alter table user add last_name text;

It gives me back the columns with their values, but still no indexes.

Thrift and CQL3 depend on the same storage engine. Do they really maintain
different metadata for the same column family?

-Vivek
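
One way to picture the behaviour reported above, as a toy sketch only: the stored cells are left untouched, and what changes is the set of column definitions used to decide which cells a select shows. The names `stored_cells` and `select_all` are hypothetical, the values are taken from the insert in the original post, and this is an assumption about the observed effect, not a description of how Cassandra's schema code is organized.

```python
# Cells written by the original insert; in this toy model they live
# independently of any column definitions.
stored_cells = {
    "@mevivs": {"first_name": "vivek", "last_name": "mishra"},
}


def select_all(declared_columns):
    """Return one dict per row, showing only cells whose column is declared."""
    rows = []
    for key, cells in stored_cells.items():
        visible = {name: val for name, val in cells.items()
                   if name in declared_columns}
        rows.append({"user_id": key, **visible})
    return rows


# Definitions as created through CQL3.
before = select_all({"first_name", "last_name"})

# Assumed effect of the cassandra-cli update: the definitions CQL3 relies on
# are gone, so only the key column is shown.
after_cli_update = select_all(set())

# Re-declaring the columns (alter table ... add) makes the old cells visible.
after_alter = select_all({"first_name", "last_name"})
```

In this sketch `after_cli_update` contains only `user_id`, while `after_alter` matches `before`, which mirrors the select output and the effect of the two alter statements described in this message. It says nothing about the missing indexes.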










Re: CQL Thrift

2013-08-30 Thread Peter Lin
In my case, I built a temporal database on top of Cassandra, so it's
absolutely key.

Dynamic columns are super powerful, and relational databases have no
equivalent. For me, that is one of the top 3 reasons for using Cassandra.



On Fri, Aug 30, 2013 at 2:03 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 If you mean the comparator, then yes, that's a valid point, and it is not
 possible with CQL3.

 -Vivek














Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
Hi,
I understand that, but I want to understand the reason behind
such behavior. Is it because different metadata objects are maintained for
CQL3 and Thrift?

Any suggestions?

-Vivek


On Fri, Aug 30, 2013 at 11:15 PM, Jon Haddad j...@jonhaddad.com wrote:

 If you're going to work with CQL, work with CQL.  If you're going to work
 with Thrift, work with Thrift.  Don't mix.









Re: CQL Thrift

2013-08-30 Thread Jon Haddad
Could you please give a more concrete example?  

On Aug 30, 2013, at 11:10 AM, Peter Lin wool...@gmail.com wrote:

 
 In my case, I built a temporal database on top of Cassandra, so it's
 absolutely key.
 
 Dynamic columns are super powerful, and relational databases have no
 equivalent. For me, that is one of the top 3 reasons for using Cassandra.
 
 
 
 
 
 
 
 
 
 
 
 
 



Re: CQL Thrift

2013-08-30 Thread Jonathan Ellis
http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows


On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote:


 From my biased perspective, the sweet spot is Thrift for insert/update and
 CQL for select queries.

 CQL is too limiting and negates the power of storing arbitrary data types
 in dynamic columns.











-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: CQL Thrift

2013-08-30 Thread Jon Haddad
Just curious - what do you need to do that requires Thrift?  We've built our
entire platform using CQL3 and we haven't hit any issues.

On Aug 30, 2013, at 10:53 AM, Peter Lin wool...@gmail.com wrote:

 
 From my biased perspective, the sweet spot is Thrift for insert/update and
 CQL for select queries.
 
 CQL is too limiting and negates the power of storing arbitrary data types in
 dynamic columns.
 
 
 
 
 
 
 
 



Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
"CQL is too limiting and negates the power of storing arbitrary data types
in dynamic columns."

I agree, but only partly. You can always create a column family with key,
column, and value columns, and store any number of arbitrary columns by
putting the column name in column and its corresponding value in value. I
find that much easier.

Coming back to the original question, I think the differentiator is how
column metadata is treated in Thrift and CQL3. What I do not understand is
this: if two sets of metadata objects (CqlMetadata, CFDef) are maintained for
the same column family, why would updating either one cause trouble for the
other?

-Vivek











Re: CQL Thrift

2013-08-30 Thread Peter Lin
I use dynamic columns all the time, and they vary in type.

With CQL you can define a default type, but you can't insert specific types
of data for the column name and value. It forces you to use all bytes or all
strings, which then have to be converted to other types.

Thrift is much more powerful in that respect.

Not everyone needs to take advantage of the full power of dynamic columns.
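
To make the conversion burden mentioned above concrete, here is a minimal sketch of what a client might do if every name and value has to be stored as bytes. The one-byte tags and the `encode`/`decode` helpers are hypothetical and are not part of Cassandra, CQL, or Thrift. They only illustrate that the client, not the database, has to remember each column's type.

```python
import struct

# Hypothetical one-byte type tags; they are not part of Cassandra, CQL, or Thrift.
TAG_TEXT = b"t"
TAG_DOUBLE = b"d"
TAG_LONG = b"l"


def encode(value):
    """Turn a str, float, or int into tagged bytes suitable for a blob column."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, str):
        return TAG_TEXT + value.encode("utf-8")
    if isinstance(value, float):
        return TAG_DOUBLE + struct.pack(">d", value)
    if isinstance(value, int):
        return TAG_LONG + struct.pack(">q", value)
    raise TypeError("unsupported type: %r" % type(value))


def decode(blob):
    """Inverse of encode: read the tag, then interpret the remaining bytes."""
    tag, payload = blob[:1], blob[1:]
    if tag == TAG_TEXT:
        return payload.decode("utf-8")
    if tag == TAG_DOUBLE:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_LONG:
        return struct.unpack(">q", payload)[0]
    raise ValueError("unknown type tag: %r" % tag)


# A double used as a column name and a string used as its value.
name_blob = encode(102.211)
value_blob = encode("some string")
```

Every reader of the data has to agree on the same tags, which is the kind of extra conversion step this message objects to.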


On Fri, Aug 30, 2013 at 1:58 PM, Jon Haddad j...@jonhaddad.com wrote:

 Just curious - what do you need to do that requires Thrift?  We've built
 our entire platform using CQL3 and we haven't hit any issues.











Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
True for newly built platform(s), but what about existing apps built using
Thrift? As per http://www.datastax.com/dev/blog/thrift-to-cql3 it should be
easy.

I am just curious to understand the real reason behind such behavior.

-Vivek



On Fri, Aug 30, 2013 at 11:28 PM, Jon Haddad j...@jonhaddad.com wrote:

 Just curious - what do you need to do that requires Thrift?  We've built
 our entire platform using CQL3 and we haven't hit any issues.











Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
If you mean the comparator, then yes, that's a valid point, and it is not
possible with CQL3.

-Vivek


On Fri, Aug 30, 2013 at 11:31 PM, Peter Lin wool...@gmail.com wrote:


 I use dynamic columns all the time, and they vary in type.

 With CQL you can define a default type, but you can't insert specific
 types of data for the column name and value. It forces you to use all bytes
 or all strings, which then have to be converted to other types.

 Thrift is much more powerful in that respect.

 Not everyone needs to take advantage of the full power of dynamic columns.













Re: CQL Thrift

2013-08-30 Thread Peter Lin
In the interest of education and discussion:

I didn't mean to say CQL3 doesn't support dynamic columns. The example from
the page shows a default type defined in the create statement.

create column family data
with key_validation_class=Int32Type
 and comparator=DateType
 and default_validation_class=FloatType;


If I try to insert a dynamic column that uses a double for the column name
and a string for the column value, it will throw an error. The kind of use
case I'm talking about defines a minimum number of static columns. Most of
the columns that are added at runtime have different name and value types.
This is specific to my use case.

Having said that, I believe it would be possible to provide that kind of
feature in CQL, but the trade-off is that it deviates from SQL. The grammar
would have to allow type declarations in the column list and functions in
the values. Something like

insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values
('abc123', some string, double(102.211))

doubleType(newcol1) and string(newcol2) are dynamic columns.

I know many people find Thrift hard to grok and struggle with it, but I'm a
firm believer in taking time to learn. Every developer should take time to
read the Cassandra source code and the source code for the driver they're using.



On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote:

 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows





Re: CQL Thrift

2013-08-30 Thread Jon Haddad
It sounds like you want this:

create table data ( pk int, colname blob, value blob, primary key (pk, 
colname));

That gives you arbitrary columns (cleverly labeled colname) in a single row,
where each column's value is stored in value.

If you don't want the overhead of storing colname in every row, try with
compact storage.

Does this solve the problem, or am I missing something?
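
As a rough illustration of the table suggested above, the following sketch models one partition per pk whose cells are keyed by the raw bytes of colname. It is an in-memory toy under stated assumptions, not driver code: `table`, `insert`, and `select` are hypothetical names, and ordering by plain byte comparison is assumed here to mimic a blob clustering column.

```python
import struct
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy stand-in for:
#   create table data (pk int, colname blob, value blob, primary key (pk, colname));
# Each partition (pk) maps colname bytes to value bytes.
table: Dict[int, Dict[bytes, bytes]] = defaultdict(dict)


def insert(pk: int, colname: bytes, value: bytes) -> None:
    """Upsert one cell; writing the same (pk, colname) again overwrites it."""
    table[pk][colname] = value


def select(pk: int) -> List[Tuple[bytes, bytes]]:
    """Return the partition's cells ordered by colname bytes."""
    return sorted(table[pk].items())


# The client picks the encoding for every name and value.
double_name = struct.pack(">d", 102.211)      # 8-byte big-endian double as a name
text_name = "first_name".encode("utf-8")      # UTF-8 string as a name

insert(1, double_name, "some string".encode("utf-8"))
insert(1, text_name, "vivek".encode("utf-8"))

rows = select(1)
```

Both "columns" land in the same partition and come back from a single read, which is the point of the suggestion. The trade-off is that the names and values are opaque bytes, so the client has to know how each one was encoded.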

On Aug 30, 2013, at 11:45 AM, Peter Lin wool...@gmail.com wrote:

 
 You could dynamically create new tables at runtime and insert rows into the
 new table, but is that better than using Thrift and putting the data into a
 regular dynamic column with the exact name type and value type?
 
 That would mean that if there are 20 dynamic columns of different types,
 you'd have to execute 21 queries to rebuild the data. That's basically the
 same as using EAV tables in relational databases.
 
 Having used that approach in the past to build temporal databases, I found
 it doesn't scale well.
 
 
 

Re: CQL Thrift

2013-08-30 Thread Les Hazlewood
On Fri, Aug 30, 2013 at 10:58 AM, Jon Haddad j...@jonhaddad.com wrote:

 Just curious - what do you need to do that requires thrift?  We've built
 our entire platform using CQL3 and we haven't hit any issues.


Here's one thing: If you're using wide rows and you want to do anything
other than just append individual columns to the row, then CQL3 (as it
functions currently) is way too slow.

I just created the following Jira issue 5 minutes ago because we've been
fighting with this issue for the last 2 days. Our workaround was to swap
out CQL3 + DataStax Java Driver in favor of Astyanax for this particular
use case:

https://issues.apache.org/jira/browse/CASSANDRA-5959

Cheers,

--
Les Hazlewood | @lhazlewood
CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282


Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
Did you try to explore CQL3 collection support for the same? You can
definitely save on the number of rows with that.

The point I am trying to make is that you can achieve it via CQL3 (see
Jonathan's blog:
http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows).

I agree with you that Thrift may still have some valid points in its favor, but
considering the latest development around new Cassandra features, I think CQL3
is the path to follow.


-Vivek


On Sat, Aug 31, 2013 at 12:15 AM, Peter Lin wool...@gmail.com wrote:


 You could dynamically create new tables at runtime and insert rows into
 the new table, but is that better than using Thrift and putting it into a
 regular dynamic column with the exact name type and value type?

 That would mean that if there are 20 dynamic columns of different types, you'd
 have to execute 21 queries to rebuild the data. That's basically the same
 as using EAV tables in relational databases.

 Having used that approach in the past to build temporal databases, I can say it
 doesn't scale well.




Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
@lhazlewood

https://issues.apache.org/jira/browse/CASSANDRA-5959

Begin batch

 multiple insert statements.

apply batch

It doesn't work for you?
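
For reference, a sketch of the batch form being suggested here, written against a hypothetical wide-row table. The table and column names are assumptions for illustration and are not taken from CASSANDRA-5959.

```cql
-- Hypothetical wide-row table: one storage row per pk, one column per colname.
CREATE TABLE wide_row (
  pk text,
  colname text,
  value text,
  PRIMARY KEY (pk, colname)
);

-- All inserts share the same partition key, so they target a single wide row.
BEGIN BATCH
  INSERT INTO wide_row (pk, colname, value) VALUES ('row1', 'col00001', 'a');
  INSERT INTO wide_row (pk, colname, value) VALUES ('row1', 'col00002', 'b');
  INSERT INTO wide_row (pk, colname, value) VALUES ('row1', 'col00003', 'c');
APPLY BATCH;
```

This only shows the statement shape; it says nothing about how such a batch performs with tens of thousands of columns.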

-Vivek
On Sat, Aug 31, 2013 at 12:21 AM, Les Hazlewood lhazlew...@apache.org wrote:

 On Fri, Aug 30, 2013 at 10:58 AM, Jon Haddad j...@jonhaddad.com wrote:

 Just curious - what do you need to do that requires thrift?  We've built
 our entire platform using CQL3 and we haven't hit any issues.


 Here's one thing: If you're using wide rows and you want to do anything
 other than just append individual columns to the row, then CQL3 (as it
 functions currently) is way too slow.

 I just created the following Jira issue 5 minutes ago because we've been
 fighting with this issue for the last 2 days. Our workaround was to swap
 out CQL3 + DataStax Java Driver in favor of Astyanax for this particular
 use case:

 https://issues.apache.org/jira/browse/CASSANDRA-5959

 Cheers,

 --
 Les Hazlewood | @lhazlewood
 CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282



CQL Thrift

2013-08-30 Thread Vivek Mishra
Hi,
I created a table with CQL3 as:

create table user(user_id text PRIMARY KEY, first_name text, last_name
text, emailid text);

and created an index as:
create index on user(first_name);

then inserted some data as:
insert into user(user_id,first_name,last_name,emailId)
values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in');

Then I updated the same column family using cassandra-cli as:

update column family user with key_validation_class='UTF8Type' and
column_metadata=[{column_name:last_name, validation_class:'UTF8Type',
index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type',
index_type:KEYS}];

Now if I connect via cqlsh and explore the user table, I can see that the columns
first_name and last_name are no longer part of the table structure. Here is the
output:

CREATE TABLE user (
  key text PRIMARY KEY
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh:cql3usage> select * from user;

 user_id
---------
 @mevivs

I understand that CQL3 and Thrift interoperability is an issue, but this
looks to me like a very basic scenario.

Any suggestions? Or can anybody explain the reason behind this?

-Vivek


Re: CQL Thrift

2013-08-30 Thread Peter Lin
CQL3 collections are meant to store data that is naturally a list, set, or map.
Plus, collections currently do not support secondary indexes.
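
To make the collection approach under discussion concrete, here is a minimal sketch. The table and column names are hypothetical.

```cql
-- Hypothetical table holding ad hoc attributes in a map collection.
CREATE TABLE user_attrs (
  user_id text PRIMARY KEY,
  attrs map<text, text>
);

-- Adds or overwrites a single map entry without rewriting the whole map.
UPDATE user_attrs SET attrs['nickname'] = 'mevivs' WHERE user_id = '@mevivs';

-- The map is read back as a whole.
SELECT attrs FROM user_attrs WHERE user_id = '@mevivs';
```

Every entry must use the key and value types declared for the map, which is the constraint at issue when names and values of differing types are only known at runtime.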

The point is often you don't know what columns are needed at design time.
If you know what's needed, use static columns.

Using a list, set or map to store data you don't know and can't predict in
the future feels like a hammer solution. Cassandra has this super
powerful and useful feature that developers can use via thrift.

The last time I looked, DataStax's official statement was that Thrift isn't
going away, so I take them at their word.



On Fri, Aug 30, 2013 at 2:51 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Did you try to explore CQL3 collection support for the same? You can
 definitely save on the number of rows with that.

 The point I am trying to make is that you can achieve it via CQL3 (see
 Jonathan's blog:
 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows).

 I agree with you that Thrift may still have some valid points in its favor,
 but considering the latest development around new Cassandra features, I think
 CQL3 is the path to follow.


 -Vivek



Re: CQL Thrift

2013-08-30 Thread Alex Popescu
On Fri, Aug 30, 2013 at 11:56 AM, Vivek Mishra mishra.v...@gmail.com wrote:

 @lhazlewood

 https://issues.apache.org/jira/browse/CASSANDRA-5959

 Begin batch

  multiple insert statements.

 apply batch

 It doesn't work for you?

 -Vivek


According to the OP, batching inserts is slow. The SO thread [1] mentions
that in their environment a BATCH takes 1.5 min, while the Thrift-based
approach takes around 235 ms.

[1]
http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque
-- 

:- a)


Alex Popescu
Sen. Product Manager @ DataStax
@al3xandru


Re: CQL Thrift

2013-08-30 Thread Jon Haddad
It seems really strange to me that you'd create a table with specific types 
and then try to deviate from them.  Why not just use the blob type, so that you can 
store whatever you want in there?

The whole point of adding strong typing is to adhere to it.  I wouldn't 
consider it a fault of the database that it does what you asked it to.

On Aug 30, 2013, at 11:33 AM, Peter Lin wool...@gmail.com wrote:

 
 In the interest of education and discussion.
 
 I didn't mean to say CQL3 doesn't support dynamic columns. The example from 
 the page shows default type defined in the create statement.
 create column family data 
 with key_validation_class=Int32Type 
  and comparator=DateType 
  and default_validation_class=FloatType;
 
 
 If I try to insert a dynamic column that uses double for column name and 
 string for column value, it will throw an error. The kind of use case I'm 
 talking about defines a minimum number of static columns. Most of the columns 
 that are added at runtime are different name and value type. This is specific 
 to my use case.
 
 Having said that, I believe it would be possible to provide that kind of 
 feature in CQL, but the trade off is it deviates from SQL. The grammar would 
 have to allow type declaration in the columns list and functions in the 
 values. Something like
 
 insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values 
 ('abc123', some string, double(102.211))
 
 doubleType(newcol1) and string(newcol2) are dynamic columns.
 
 I know many people find thrift hard to grok and struggle with it, but I'm a 
 firm believer in taking time to learn. Every developer should take time to 
 read cassandra source code and the source code for the driver they're using.
 
 
 
 On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote:
 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
 
 
 



Re: CQL Thrift

2013-08-30 Thread Vivek Mishra
Create a column family as:

create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob,
primary key (key, nameAsDouble));

insert into dynamicTable(key, nameAsDouble, valueAsBlob) values (
'key', 102.211,
textAsBlob('valueInBytes'));

Do you think it will work when the column names are doubles?
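
As a sketch of how double-valued column names could be handled in CQL3, assuming a version that provides textAsBlob. The table and column names below are hypothetical and chosen only for illustration.

```cql
-- Hypothetical table: row_key is the partition key and col_name is the
-- clustering column, which plays the role of the dynamic column name.
CREATE TABLE dynamic_by_double (
  row_key text,
  col_name double,
  col_value blob,
  PRIMARY KEY (row_key, col_name)
);

-- A double literal is written directly; no double(...) wrapper is needed.
INSERT INTO dynamic_by_double (row_key, col_name, col_value)
VALUES ('row1', 102.211, textAsBlob('valueInBytes'));

-- Columns are ordered by col_name within the row, so range slices work.
SELECT col_name, col_value FROM dynamic_by_double
WHERE row_key = 'row1' AND col_name >= 100.0 AND col_name < 200.0;
```

In this layout every column name in the table is a double, so it does not cover names of mixed types within one row.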

-Vivek


On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote:


 In the interest of education and discussion.

 I didn't mean to say CQL3 doesn't support dynamic columns. The example
 from the page shows default type defined in the create statement.

 create column family data
 with key_validation_class=Int32Type
  and comparator=DateType
  and default_validation_class=FloatType;


 If I try to insert a dynamic column that uses double for column name and
 string for column value, it will throw an error. The kind of use case I'm
 talking about defines a minimum number of static columns. Most of the
 columns that are added at runtime are different name and value type. This
 is specific to my use case.

 Having said that, I believe it would be possible to provide that kind of
 feature in CQL, but the trade off is it deviates from SQL. The grammar
 would have to allow type declaration in the columns list and functions in
 the values. Something like

 insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values
 ('abc123', some string, double(102.211))

 doubleType(newcol1) and string(newcol2) are dynamic columns.

 I know many people find thrift hard to grok and struggle with it, but I'm
 a firm believer in taking time to learn. Every developer should take time
 to read cassandra source code and the source code for the driver they're
 using.



 On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote:


 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows







Re: CQL Thrift

2013-08-30 Thread Les Hazlewood
Yes, that's correct - and that's a scaled number.  In practice:

On the local dev machine, CQL3 inserting 10,000 columns (for 1 row) in a
BATCH took 1.5 minutes.  50,000 columns (the desired amount) in a BATCH
took 7.5 minutes.  The same Thrift functionality took _235 milliseconds_.
That's almost 2,000 times faster (3 orders of magnitude difference)!

However, according to Aleksey Yeschenko, this performance problem has been
addressed in 2.0 beta 1 via
https://issues.apache.org/jira/browse/CASSANDRA-4693.

I'll reserve judgement until I can performance-test 2.0 beta 1 ;)

Cheers,

--
Les Hazlewood | @lhazlewood
CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282

On Fri, Aug 30, 2013 at 12:50 PM, Alex Popescu al...@datastax.com wrote:

 On Fri, Aug 30, 2013 at 11:56 AM, Vivek Mishra mishra.v...@gmail.com wrote:

 @lhazlewood

 https://issues.apache.org/jira/browse/CASSANDRA-5959

 Begin batch

  multiple insert statements.

 apply batch

 It doesn't work for you?

 -Vivek


 According to the OP, batching inserts is slow. The SO thread [1] mentions
 that in their environment a BATCH takes 1.5 min, while the Thrift-based
 approach takes around 235 ms.

 [1]
 http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque
 --

 :- a)


 Alex Popescu
 Sen. Product Manager @ DataStax
 @al3xandru



Re: CQL Thrift

2013-08-30 Thread Peter Lin
This has nothing to do with compact storage.

Cassandra supports arbitrary dynamic columns of different name/value types
today. If people are happy with the SQL metaphor, then CQL is fine.

Then again, if the SQL metaphor were good for temporal databases, there wouldn't
be so many failed temporal databases built on RDBs. I've built over 4
bi-temporal databases on RDBs over the last 12 years, so this is not a
position I arrived at lightly.

It came from years of pain. I won't bore others with the challenges of
building temporal databases.




On Fri, Aug 30, 2013 at 2:51 PM, Jon Haddad j...@jonhaddad.com wrote:

 It sounds like you want this:

 create table data ( pk int, colname blob, value blob, primary key (pk,
 colname));

 that gives you arbitrary columns (cleverly labeled colname) in a single
 row, where the value is value.

 If you don't want the overhead of storing colname in every row, try with
 compact storage.

 Does this solve the problem, or am I missing something?

 On Aug 30, 2013, at 11:45 AM, Peter Lin wool...@gmail.com wrote:


 you could dynamically create new tables at runtime and insert rows into
 the new table, but is that better than using thrift and putting it into a
 regular dynamic column with the exact name type and value type?

 that would mean if there's 20 dynamic columns of different types, you'd
 have to execute 21 queries to rebuild the data. That's basically the same
 as using EVA tables in relational databases.

 Having used that approach in the past to build temporal databases, it
 doesn't scale well.



 On Fri, Aug 30, 2013 at 2:40 PM, Vivek Mishra mishra.v...@gmail.comwrote:

 create a column family as:

 create table dynamicTable(key text, nameAsDouble double, valueAsBlob
 blob);

 insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, 
 double(102.211),
 textAsBlob('valueInBytes').

 Do you think, it will work in case column name are double?

 -Vivek


 On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote:


 In the interest of education and discussion.

 I didn't mean to say CQL3 doesn't support dynamic columns. The example
 from the page shows default type defined in the create statement.

 create column family data
 with key_validation_class=Int32Type
  and comparator=DateType
  and default_validation_class=FloatType;


 If I try to insert a dynamic column that uses double for column name and
 string for column value, it will throw an error. The kind of use case I'm
 talking about defines a minimum number of static columns. Most of the
 columns that are added at runtime are different name and value type. This
 is specific to my use case.

 Having said that, I believe it would be possible to provide that kind
 of feature in CQL, but the trade off is it deviates from SQL. The grammar
 would have to allow type declaration in the columns list and functions in
 the values. Something like

 insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values
 ('abc123', some string, double(102.211))

 doubleType(newcol1) and string(newcol2) are dynamic columns.

 I know many people find thrift hard to grok and struggle with it, but
 I'm a firm believer in taking time to learn. Every developer should take
 time to read cassandra source code and the source code for the driver
 they're using.



 On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.comwrote:


 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows


 On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote:


 my bias perspective, I find the sweet spot is thrift for insert/update
 and CQL for select queries.

 CQL is too limiting and negates the power of storing arbitrary data
 types in dynamic columns.


 On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote:

 If you're going to work with CQL, work with CQL.  If you're going to
 work with Thrift, work with Thrift.  Don't mix.

 On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com
 wrote:

 Hi,
 If i a create a table with CQL3 as

 create table user(user_id text PRIMARY KEY, first_name text,
 last_name text, emailid text);

 and create index as:
 create index on user(first_name);

 then inserted some data as:
 insert into user(user_id,first_name,last_name,emailId)
 values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in');


 Then if I update the same column family using cassandra-cli as:

 update column family user with key_validation_class='UTF8Type' and
 column_metadata=[{column_name:last_name, validation_class:'UTF8Type',
 index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type',
 index_type:KEYS}];


 Now if I connect via cqlsh and explore the user table, I can see that the columns
 first_name and last_name are no longer part of the table structure. Here is the
 output:

 CREATE TABLE user (
   key text PRIMARY KEY
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   

Re: CQL Thrift

2013-08-30 Thread Peter Lin
From my biased perspective, I find the sweet spot is thrift for insert/update and
CQL for select queries.

CQL is too limiting and negates the power of storing arbitrary data types
in dynamic columns.


On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote:

 If you're going to work with CQL, work with CQL.  If you're going to work
 with Thrift, work with Thrift.  Don't mix.

 On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 If I create a table with CQL3 as

 create table user(user_id text PRIMARY KEY, first_name text, last_name
 text, emailid text);

 and create index as:
 create index on user(first_name);

 then inserted some data as:
 insert into user(user_id,first_name,last_name,emailId)
 values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in');


 Then if I update the same column family using cassandra-cli as:

 update column family user with key_validation_class='UTF8Type' and
 column_metadata=[{column_name:last_name, validation_class:'UTF8Type',
 index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type',
 index_type:KEYS}];


 Now if I connect via cqlsh and explore the user table, I can see that the columns
 first_name and last_name are no longer part of the table structure. Here is the
 output:

 CREATE TABLE user (
   key text PRIMARY KEY
 ) WITH
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};

 cqlsh:cql3usage> select * from user;

  user_id
 ---------
  @mevivs





 I understand that CQL3 and thrift interoperability is an issue, but this
 looks to me like a very basic scenario.



 Any suggestions? Or can anybody explain the reason behind this?

 -Vivek








Re: SQL Injection C* (via CQL Thrift)

2013-06-20 Thread aaron morton
 As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty 
 way to inject something?

The only thing I've ever heard of coming close was a thrift bug that allowed a 
malformed request to crash the server. But that was a while ago 
https://issues.apache.org/jira/browse/CASSANDRA-475

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/06/2013, at 1:46 AM, Brian O'Neill b...@alumni.brown.edu wrote:

 
 Perfect.  Thanks Sylvain.  That is exactly the input I was looking for, and I 
 agree completely.
 (It's easy enough to protect against)
 
 As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty 
 way to inject something?
 
 At first glance, it doesn't appear possible, but I'm not 100% confident 
 making that assertion. 
 
 -brian
 
 ---
 Brian O'Neill
 Lead Architect, Software Development
 Health Market Science
 The Science of Better Results
 2700 Horizon Drive • King of Prussia, PA • 19406
 M: 215.588.6024 • @boneill42  •  
 healthmarketscience.com
 
 This information transmitted in this email message is for the intended 
 recipient only and may contain confidential and/or privileged material. If 
 you received this email in error and are not the intended recipient, or the 
 person responsible to deliver it to the intended recipient, please contact 
 the sender at the email above and delete this email and any attachments and 
 destroy any copies thereof. Any review, retransmission, dissemination, 
 copying or other use of, or taking any action in reliance upon, this 
 information by persons or entities other than the intended recipient is 
 strictly prohibited.
  
 
 
 From: Sylvain Lebresne sylv...@datastax.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, June 18, 2013 8:51 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: SQL Injection C* (via CQL  Thrift)
 
 If you're not careful, then CQL injection is possible.
 
 Say you naively build your query with
   "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"
 then if user_input is foo', col2='bar, your user will have overwritten a 
 column they shouldn't have been able to. And something equivalent in a BATCH 
 statement could allow them to overwrite/delete some random row in some random 
 table.
 
 Now CQL being much more restricted than SQL (no subqueries, no generic 
 transaction, ...), the extent of what you can do with a CQL injection is way 
 smaller than in SQL. But you do have to be careful.
 
 As far as the Datastax java driver is concerned, you can fairly easily 
 protect yourself by using either:
 1) prepared statements: if the user input is a prepared variable, there is 
 nothing the user can do (it's equivalent to the thrift situation).
 2) using the query builder: it will escape quotes in the strings you 
 provided, thus avoiding injection.
 
 So I would say that injections are definitely possible if you concatenate 
 strings too naively, but I don't think preventing them is very hard.
 
 --
 Sylvain
 
 
 On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 
 Mostly for fun, I wanted to throw this out there...
 
 We are undergoing a security audit for our platform (C* + Elastic Search + 
 Storm).  One component of that audit is susceptibility to SQL injection.  I 
 was wondering if anyone has attempted to construct a SQL injection attack 
 against Cassandra?  Is it even possible?
 
 I know the code paths fairly well, but...
 Does there exist a path in the code whereby user data gets interpreted, 
 which could be exploited to perform user operations?
 
 From the Thrift side of things, I've always felt safe.  Data is opaque.  
 Serializers are used to convert it to Bytes, and C* doesn't ever really do 
 anything with the data.
 
 In examining the CQL java-driver, it looks like there might be a bit more 
 exposure to injection.  (or even CQL over Thrift)  I haven't dug into the 
 code yet, but dependent on which flavor of the API you are using, you may be 
 including user data in your statements.  
 
 Does anyone know if the CQL java-driver does anything to protect against 
 injection?  Or is it possible to say that the syntax is strict enough that 
 any embedded operations in data would not parse?
 
 just some food for thought...
 I'll be digging into this over the next couple weeks.  If people are 
 interested, I can throw a blog post out there with the findings.
 
 -brian
 
 -- 
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42
 



Re: SQL Injection C* (via CQL Thrift)

2013-06-20 Thread Robert Coli
On Thu, Jun 20, 2013 at 2:15 AM, aaron morton aa...@thelastpickle.com wrote:
 As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty 
 way to inject something?

 The only thing I've ever heard of coming close was a thrift bug that allowed 
 a malformed request to crash the server. But that was a while ago 
 https://issues.apache.org/jira/browse/CASSANDRA-475

Oh, that brings me back. Literally my first interaction with a
cassandra server :

- start cassandra
- telnet localhost 9160
- asdasdasdasdsa
- Connection reset by peer
- notice server has crashed

Not *really* a Cassandra bug, but hilarious nonetheless. :)

=Rob


Re: SQL Injection C* (via CQL Thrift)

2013-06-20 Thread Edward Capriolo
My first interaction with cassandra: ../nodeprobe -p 9160 ...
Hmm, I can't seem to reach it :) Oh, it's no longer running...

You've come a long way, baby.


On Thu, Jun 20, 2013 at 12:59 PM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Jun 20, 2013 at 2:15 AM, aaron morton aa...@thelastpickle.com
 wrote:
  As for the thrift side (i.e. using Hector or Astyanax), anyone have a
 crafty way to inject something?
 
  The only thing I've ever heard of coming close was a thrift bug that
 allowed a malformed request to crash the server. But that was a while ago
 https://issues.apache.org/jira/browse/CASSANDRA-475

 Oh, that brings me back. Literally my first interaction with a
 cassandra server :

 - start cassandra
 - telnet localhost 9160
 - asdasdasdasdsa
 - Connection reset by peer
 - notice server has crashed

 Not *really* a Cassandra bug, but hilarious nonetheless. :)

 =Rob



SQL Injection C* (via CQL Thrift)

2013-06-18 Thread Brian O'Neill
Mostly for fun, I wanted to throw this out there...

We are undergoing a security audit for our platform (C* + Elastic Search +
Storm).  One component of that audit is susceptibility to SQL injection.  I
was wondering if anyone has attempted to construct a SQL injection attack
against Cassandra?  Is it even possible?

I know the code paths fairly well, but...
Does there exist a path in the code whereby user data gets interpreted,
which could be exploited to perform user operations?

From the Thrift side of things, I've always felt safe.  Data is opaque.
 Serializers are used to convert it to Bytes, and C* doesn't ever really do
anything with the data.
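
The "data is opaque" point can be illustrated with a minimal Python sketch. It uses no Cassandra client at all; the `user_input` value is a made-up payload, and plain UTF-8 encoding stands in for whatever serializer a Thrift client such as Hector or Astyanax would apply.

```python
# Hypothetical input that would be dangerous if spliced into a query string.
user_input = "foo', col2='bar"

# A Thrift-style client serializes the value to bytes before sending it.
# Nothing parses these bytes as query text, so the quotes carry no meaning.
column_value = user_input.encode("utf-8")

# Reading the value back yields exactly what was stored, quotes included.
round_tripped = column_value.decode("utf-8")
```

Because the value travels as bytes rather than as part of a statement, there is no parser for the embedded quotes to confuse.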

In examining the CQL java-driver, it looks like there might be a bit more
exposure to injection.  (or even CQL over Thrift)  I haven't dug into the
code yet, but dependent on which flavor of the API you are using, you may
be including user data in your statements.

Does anyone know if the CQL java-driver does anything to protect against
injection?  Or is it possible to say that the syntax is strict enough that
any embedded operations in data would not parse?

just some food for thought...
I'll be digging into this over the next couple weeks.  If people are
interested, I can throw a blog post out there with the findings.

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: SQL Injection C* (via CQL Thrift)

2013-06-18 Thread Sylvain Lebresne
If you're not careful, then CQL injection is possible.

Say you naively build your query with
  "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"
then if user_input is foo', col2='bar, your user will have overwritten
a column they shouldn't have been able to. And something equivalent in a
BATCH statement could allow them to overwrite/delete some random row in some
random table.

Now CQL being much more restricted than SQL (no subqueries, no generic
transaction, ...), the extent of what you can do with a CQL injection is
way smaller than in SQL. But you do have to be careful.

As far as the Datastax java driver is concerned, you can fairly easily
protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is
nothing the user can do (it's equivalent to the thrift situation).
2) using the query builder: it will escape quotes in the strings you
provided, thus avoiding injection.

So I would say that injections are definitely possible if you concatenate
strings too naively, but I don't think preventing them is very hard.
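
As a rough illustration of the difference between naive concatenation and quote escaping, here is a small Python sketch. It only builds strings and talks to no cluster; the function names are hypothetical, and the payload uses a comma because CQL separates SET assignments with commas. The escaping rule it relies on is that a single quote inside a CQL string literal is written as two single quotes.

```python
def naive_update(user_input):
    # Unsafe: user input is spliced straight into the statement text.
    return "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"


def quote_cql_string(value):
    # CQL string literals escape a single quote by doubling it.
    return "'" + value.replace("'", "''") + "'"


def escaped_update(user_input):
    # Safer: the input stays inside one string literal.
    return "UPDATE foo SET col=" + quote_cql_string(user_input) + " WHERE key = 'k'"


payload = "foo', col2='bar"

injected = naive_update(payload)
safe = escaped_update(payload)

print(injected)
print(safe)
```

The naive statement ends up assigning two columns, while the escaped one assigns a single string that happens to contain quotes. With a real driver, a prepared statement with bound variables avoids building the string at all and is usually the simpler choice.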

--
Sylvain


On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill b...@alumni.brown.edu wrote:


 Mostly for fun, I wanted to throw this out there...

 We are undergoing a security audit for our platform (C* + Elastic Search +
 Storm).  One component of that audit is susceptibility to SQL injection.  I
 was wondering if anyone has attempted to construct a SQL injection attack
 against Cassandra?  Is it even possible?

 I know the code paths fairly well, but...
 Does there exist a path in the code whereby user data gets interpreted,
 which could be exploited to perform user operations?

 From the Thrift side of things, I've always felt safe.  Data is opaque.
  Serializers are used to convert it to Bytes, and C* doesn't ever really do
 anything with the data.

 In examining the CQL java-driver, it looks like there might be a bit more
 exposure to injection.  (or even CQL over Thrift)  I haven't dug into the
 code yet, but dependent on which flavor of the API you are using, you may
 be including user data in your statements.

 Does anyone know if the CQL java-driver does anything to protect against
 injection?  Or is it possible to say that the syntax is strict enough that
 any embedded operations in data would not parse?

 just some food for thought...
 I'll be digging into this over the next couple weeks.  If people are
 interested, I can throw a blog post out there with the findings.

 -brian

 --
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42



Re: SQL Injection C* (via CQL Thrift)

2013-06-18 Thread Brian O'Neill

Perfect.  Thanks Sylvain.  That is exactly the input I was looking for, and
I agree completely.
(It's easy enough to protect against)

As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty
way to inject something?

At first glance, it doesn't appear possible, but I'm not 100% confident
making that assertion.

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42   •
healthmarketscience.com




From:  Sylvain Lebresne sylv...@datastax.com
Reply-To:  user@cassandra.apache.org
Date:  Tuesday, June 18, 2013 8:51 AM
To:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Re: SQL Injection C* (via CQL  Thrift)

If you're not careful, then CQL injection is possible.

Say you naively build your query with
  "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"
then if user_input is foo', col2='bar, your user will have overwritten
a column they shouldn't have been able to. And something equivalent in a BATCH
statement could allow them to overwrite/delete some random row in some random
table.

Now CQL being much more restricted than SQL (no subqueries, no generic
transaction, ...), the extent of what you can do with a CQL injection is way
smaller than in SQL. But you do have to be careful.

As far as the Datastax java driver is concerned, you can fairly easily
protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is
nothing the user can do (it's equivalent to the thrift situation).
2) using the query builder: it will escape quotes in the strings you
provided, thus avoiding injection.

So I would say that injections are definitely possible if you concatenate
strings too naively, but I don't think preventing them is very hard.

--
Sylvain


On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill b...@alumni.brown.edu
wrote:
 
 Mostly for fun, I wanted to throw this out there...
 
 We are undergoing a security audit for our platform (C* + Elastic Search +
 Storm).  One component of that audit is susceptibility to SQL injection.  I
 was wondering if anyone has attempted to construct a SQL injection attack
 against Cassandra?  Is it even possible?
 
 I know the code paths fairly well, but...
 Does there exist a path in the code whereby user data gets interpreted, which
 could be exploited to perform user operations?
 
 From the Thrift side of things, I've always felt safe.  Data is opaque.
 Serializers are used to convert it to Bytes, and C* doesn't ever really do
 anything with the data.
 
 In examining the CQL java-driver, it looks like there might be a bit more
 exposure to injection.  (or even CQL over Thrift)  I haven't dug into the code
 yet, but dependent on which flavor of the API you are using, you may be
 including user data in your statements.
 
 Does anyone know if the CQL java-driver does anything to protect against
 injection?  Or is it possible to say that the syntax is strict enough that any
 embedded operations in data would not parse?
 
 just some food for thought...
 I'll be digging into this over the next couple weeks.  If people are
 interested, I can throw a blog post out there with the findings.
 
 -brian
 
 -- 
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42





Re: Cassandra, CQL, Thrift Deprecation?? and Erlang

2011-09-02 Thread Jonathan Ellis
The Thrift API is not going anywhere any time soon.

I'm not aware of anyone working on an erlang CQL client.

On Fri, Sep 2, 2011 at 7:39 AM, J T jt4websi...@googlemail.com wrote:
 Hi,

 I'm a fan of erlang, and have been using successive cassandra versions via
 the erlang thrift interface for a couple of years now.

 I see that cassandra seems to be moving to using CQL instead and so I was
 wondering if that means the thrift api will be deprecated and if so is there
 any effort underway by anyone to create (whatever would be necessary) to
 use cassandra via cql from erlang ?

 JT




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra, CQL, Thrift Deprecation?? and Erlang

2011-09-02 Thread J T
OK, that's good to know.

If push came to shove I could probably write such a client myself after
doing the necessary research but I'd prefer to save myself the hassle.

Thanks.

On Fri, Sep 2, 2011 at 1:59 PM, Jonathan Ellis jbel...@gmail.com wrote:

 The Thrift API is not going anywhere any time soon.

 I'm not aware of anyone working on an erlang CQL client.

 On Fri, Sep 2, 2011 at 7:39 AM, J T jt4websi...@googlemail.com wrote:
  Hi,
 
  I'm a fan of erlang, and have been using successive cassandra versions via
  the erlang thrift interface for a couple of years now.
 
  I see that cassandra seems to be moving to using CQL instead and so I was
  wondering if that means the thrift api will be deprecated and if so is there
  any effort underway by anyone to create (whatever would be necessary) to
  use cassandra via cql from erlang ?
 
  JT
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com