Re: CQL Thrift
If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix.

On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote:

Hi,

If I create a table with CQL3 as:

create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text);

and create an index as:

create index on user(first_name);

then insert some data as:

insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in');

and then update the same column family using cassandra-cli as:

update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}];

Now if I connect via cqlsh and explore the user table, I can see that the columns first_name and last_name are no longer part of the table structure. Here is the output:

CREATE TABLE user (
  key text PRIMARY KEY
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

cqlsh:cql3usage> select * from user;

 user_id
---------
 @mevivs

I understand that CQL3 and Thrift interoperability is an issue, but this looks to me like a very basic scenario. Any suggestions? Or can anybody explain the reason behind this?

-Vivek
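For the scenario quoted in this message, the "don't mix" advice could be followed by doing the whole thing from cqlsh. This is only a sketch: it assumes the cassandra-cli update was meant to do nothing more than add a KEYS index on last_name, which the thread does not state explicitly.

```cql
-- Same table and first index as in the quoted message.
CREATE TABLE user (
    user_id text PRIMARY KEY,
    first_name text,
    last_name text,
    emailid text
);
CREATE INDEX ON user (first_name);

-- Assumption: the cassandra-cli update was only meant to index last_name.
CREATE INDEX ON user (last_name);

-- Both indexed columns can then be queried without leaving CQL3.
SELECT * FROM user WHERE last_name = 'mishra';
```

Because the schema is only ever changed through CQL3 here, the table definition shown by cqlsh should keep its first_name and last_name columns.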
Re: CQL Thrift
And surprisingly, if I alter the table as:

alter table user add first_name text;
alter table user add last_name text;

it gives me back the columns with their values, but still no indexes. Thrift and CQL3 depend on the same storage engine. Do they really maintain different metadata for the same column family?

-Vivek
Re: CQL Thrift
In my case, I built a temporal database on top of Cassandra, so it's absolutely key. Dynamic columns are super powerful, and relational databases have no equivalent. For me, that is one of the top 3 reasons for using Cassandra.

On Fri, Aug 30, 2013 at 2:03 PM, Vivek Mishra mishra.v...@gmail.com wrote:

If you talk about comparator. Yes, that's a valid point and not possible with CQL3.

-Vivek
Re: CQL Thrift
Hi,

I understand that, but I want to understand the reason behind such behavior. Is it because different metadata objects are maintained for CQL3 and Thrift? Any suggestions?

-Vivek
Re: CQL Thrift
Could you please give a more concrete example?

On Aug 30, 2013, at 11:10 AM, Peter Lin wool...@gmail.com wrote:

in my case, I built a temporal database on top of Cassandra, so it's absolutely key. Dynamic columns are super powerful, which relational database have no equivalent. For me, that is one of the top 3 reasons for using Cassandra.
Re: CQL Thrift
http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote:

My biased perspective: I find the sweet spot is Thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced
Re: CQL Thrift
Just curious - what do you need to do that requires Thrift? We've built our entire platform using CQL3 and we haven't hit any issues.
Re: CQL Thrift
"CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns."

I agree, but only partly. You can always create a column family with key, column and value columns, and store any number of arbitrary columns by putting each column name in column and its corresponding value in value. I find it much easier.

Coming back to the original question, I think the differentiator is how column metadata is treated in Thrift and CQL3. What I do not understand is this: if two sets of metadata objects (CqlMetadata, CFDef) are maintained for the same column family, why would updating either one cause trouble for the other?

-Vivek
Re: CQL Thrift
I use dynamic columns all the time and they vary in type. With CQL you can define a default type, but you can't insert specific types of data for the column name and value. It forces you to use all bytes or all strings, which would require converting them to other types. Thrift is much more powerful in that respect. Not everyone needs to take advantage of the full power of dynamic columns.
Re: CQL Thrift
True for newly built platform(s), but what about existing apps built using Thrift? As per http://www.datastax.com/dev/blog/thrift-to-cql3 it should be easy. I am just curious to understand the real reason behind such behavior.

-Vivek
Re: CQL Thrift
If you are talking about the comparator: yes, that's a valid point, and it is not possible with CQL3.

-Vivek
Re: CQL Thrift
In the interest of education and discussion: I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows a default type defined in the create statement:

create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType;

If I try to insert a dynamic column that uses a double for the column name and a string for the column value, it will throw an error. The kind of use case I'm talking about defines a minimal number of static columns. Most of the columns that are added at runtime have different name and value types. This is specific to my use case.

Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade-off is that it deviates from SQL. The grammar would have to allow type declarations in the column list and functions in the values. Something like:

insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211))

Here doubleType(newcol1) and string(newcol2) are dynamic columns.

I know many people find Thrift hard to grok and struggle with it, but I'm a firm believer in taking the time to learn. Every developer should take time to read the Cassandra source code and the source code for the driver they're using.
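For reference, the Thrift definition in this message (Int32Type keys, DateType column names, FloatType values) could be viewed from CQL3 roughly as follows. This is a sketch rather than the exact schema anyone in the thread used: the names sensor_id, collected_at and volts are illustrative labels, and it assumes a Cassandra version that still accepts WITH COMPACT STORAGE.

```cql
-- One wide row per sensor_id; each (collected_at, volts) pair is one
-- dynamic column. The three names are illustrative, not from the thread.
CREATE TABLE data (
    sensor_id int,
    collected_at timestamp,
    volts float,
    PRIMARY KEY (sensor_id, collected_at)
) WITH COMPACT STORAGE;

-- Each insert adds one dynamic column to the wide row for sensor 1.
INSERT INTO data (sensor_id, collected_at, volts) VALUES (1, '2013-08-30 10:00:00', 3.1);
INSERT INTO data (sensor_id, collected_at, volts) VALUES (1, '2013-08-30 10:01:00', 3.2);

-- Read the whole wide row back in column-name order.
SELECT collected_at, volts FROM data WHERE sensor_id = 1;
```

In this layout every dynamic column name has to be a timestamp and every value a float, which is the single-default-type restriction the message describes.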
Re: CQL Thrift
It sounds like you want this: create table data ( pk int, colname blob, value blob, primary key (pk, colname)); that gives you arbitrary columns (cleverly labeled colname) in a single row, where the value is value. If you don't want the overhead of storing colname in every row, try with compact storage. Does this solve the problem, or am I missing something? On Aug 30, 2013, at 11:45 AM, Peter Lin wool...@gmail.com wrote: you could dynamically create new tables at runtime and insert rows into the new table, but is that better than using thrift and putting it into a regular dynamic column with the exact name type and value type? that would mean if there's 20 dynamic columns of different types, you'd have to execute 21 queries to rebuild the data. That's basically the same as using EVA tables in relational databases. Having used that approach in the past to build temporal databases, it doesn't scale well. On Fri, Aug 30, 2013 at 2:40 PM, Vivek Mishra mishra.v...@gmail.com wrote: create a column family as: create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob); insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, double(102.211), textAsBlob('valueInBytes'). Do you think, it will work in case column name are double? -Vivek On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. Most of the columns that are added at runtime are different name and value type. This is specific to my use case. 
Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. 
On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that, CQL3 and thrift interoperability is an issue. But this looks to me a very basic scenario. Any suggestions? Or If anybody can explain a reason behind this? -Vivek -- Jonathan
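To make the `create table data (pk int, colname blob, value blob, primary key (pk, colname))` suggestion quoted above more concrete, here is a toy in-memory model of how such a table groups arbitrary columns under one partition key. It is only a sketch: the class name `WideRowTable` and its methods are hypothetical, and it says nothing about how Cassandra actually stores data. It just assumes that cells within a partition are ordered by the bytes of `colname`.

```python
from collections import defaultdict


class WideRowTable:
    """Toy model of PRIMARY KEY (pk, colname); not Cassandra's storage engine."""

    def __init__(self):
        # partition key -> {colname bytes -> value bytes}
        self._partitions = defaultdict(dict)

    def insert(self, pk: int, colname: bytes, value: bytes) -> None:
        # Writing an existing (pk, colname) pair overwrites the value,
        # mirroring upsert behaviour on a primary key.
        self._partitions[pk][colname] = value

    def select(self, pk: int):
        # Return all "dynamic columns" of one partition, ordered by name bytes.
        row = self._partitions.get(pk, {})
        return sorted(row.items())


table = WideRowTable()
table.insert(1, b"temp", b"\x01")
table.insert(1, b"name", b"abc")
print(table.select(1))
```

Because both the name and the value are opaque blobs in this model, the application is responsible for encoding and decoding each column's type, which is the trade-off being debated in the thread.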
Re: CQL Thrift
On Fri, Aug 30, 2013 at 10:58 AM, Jon Haddad j...@jonhaddad.com wrote: Just curious - what do you need to do that requires thrift? We've built our entire platform using CQL3 and we haven't hit any issues. Here's one thing: If you're using wide rows and you want to do anything other than just append individual columns to the row, then CQL3 (as it functions currently) is way too slow. I just created the following Jira issue 5 minutes ago because we've been fighting with this issue for the last 2 days. Our workaround was to swap out CQL3 + DataStax Java Driver in favor of Astyanax for this particular use case: https://issues.apache.org/jira/browse/CASSANDRA-5959 Cheers, -- Les Hazlewood | @lhazlewood CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282
Re: CQL Thrift
Did you try to explore CQL3 collection support for the same? You can definitely save on number of rows with that. Point which i am trying to make out is, you can achieve it via CQL3 ( Jonathan's blog : http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows) I agree with you that still thrift may have some valid points to prove, but considering latest development around new Cassandra features, i think CQL3 is the path to follow. -Vivek On Sat, Aug 31, 2013 at 12:15 AM, Peter Lin wool...@gmail.com wrote: you could dynamically create new tables at runtime and insert rows into the new table, but is that better than using thrift and putting it into a regular dynamic column with the exact name type and value type? that would mean if there's 20 dynamic columns of different types, you'd have to execute 21 queries to rebuild the data. That's basically the same as using EVA tables in relational databases. Having used that approach in the past to build temporal databases, it doesn't scale well. On Fri, Aug 30, 2013 at 2:40 PM, Vivek Mishra mishra.v...@gmail.comwrote: create a column family as: create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob); insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, double(102.211), textAsBlob('valueInBytes'). Do you think, it will work in case column name are double? -Vivek On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. 
Most of the columns that are added at runtime are different name and value type. This is specific to my use case. Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.comwrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. 
On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that, CQL3 and thrift interoperability is an issue. But this looks to me a very basic scenario. Any suggestions? Or If anybody can explain a reason behind this? -Vivek
Re: CQL Thrift
@lhazlewood https://issues.apache.org/jira/browse/CASSANDRA-5959 Begin batch, multiple insert statements, apply batch: doesn't that work for you? -Vivek On Sat, Aug 31, 2013 at 12:21 AM, Les Hazlewood lhazlew...@apache.org wrote: On Fri, Aug 30, 2013 at 10:58 AM, Jon Haddad j...@jonhaddad.com wrote: Just curious - what do you need to do that requires thrift? We've built our entire platform using CQL3 and we haven't hit any issues. Here's one thing: If you're using wide rows and you want to do anything other than just append individual columns to the row, then CQL3 (as it functions currently) is way too slow. I just created the following Jira issue 5 minutes ago because we've been fighting with this issue for the last 2 days. Our workaround was to swap out CQL3 + DataStax Java Driver in favor of Astyanax for this particular use case: https://issues.apache.org/jira/browse/CASSANDRA-5959 Cheers, -- Les Hazlewood | @lhazlewood CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282
CQL Thrift
Hi, If I create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create an index as: create index on user(first_name); then insert some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); and then update the same column family using cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if I connect via cqlsh and explore the user table, I can see that the columns first_name and last_name are no longer part of the table structure. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that CQL3 and Thrift interoperability is an issue, but this looks to me like a very basic scenario. Any suggestions? Or can anybody explain the reason behind this? -Vivek
Re: CQL Thrift
CQL3 collections are meant to store things that are lists, sets, or maps. Plus, collections currently do not support secondary indexes. The point is that often you don't know what columns are needed at design time. If you know what's needed, use static columns. Using a list, set or map to store data you don't know and can't predict in the future feels like a hammer solution. Cassandra has this super powerful and useful feature that developers can use via thrift. The last time I looked, DataStax's official statement was that thrift isn't going away, so I take them at their word. On Fri, Aug 30, 2013 at 2:51 PM, Vivek Mishra mishra.v...@gmail.com wrote: Did you try to explore CQL3 collection support for the same? You can definitely save on number of rows with that. Point which i am trying to make out is, you can achieve it via CQL3 ( Jonathan's blog : http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows ) I agree with you that still thrift may have some valid points to prove, but considering latest development around new Cassandra features, i think CQL3 is the path to follow. -Vivek On Sat, Aug 31, 2013 at 12:15 AM, Peter Lin wool...@gmail.com wrote: you could dynamically create new tables at runtime and insert rows into the new table, but is that better than using thrift and putting it into a regular dynamic column with the exact name type and value type? that would mean if there's 20 dynamic columns of different types, you'd have to execute 21 queries to rebuild the data. That's basically the same as using EVA tables in relational databases. Having used that approach in the past to build temporal databases, it doesn't scale well. On Fri, Aug 30, 2013 at 2:40 PM, Vivek Mishra mishra.v...@gmail.com wrote: create a column family as: create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob); insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, double(102.211), textAsBlob('valueInBytes'). 
Do you think, it will work in case column name are double? -Vivek On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. Most of the columns that are added at runtime are different name and value type. This is specific to my use case. Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.comwrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.comwrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. 
Don't mix. On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY
Re: CQL Thrift
On Fri, Aug 30, 2013 at 11:56 AM, Vivek Mishra mishra.v...@gmail.com wrote: @lhazlewood https://issues.apache.org/jira/browse/CASSANDRA-5959 Begin batch multiple insert statements. apply batch It doesn't work for you? -Vivek According to the OP, batching inserts is slow. The SO thread [1] mentions that in their environment BATCH takes 1.5 min, while the Thrift-based approach takes around 235 ms. [1] http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque -- :- a) Alex Popescu Sen. Product Manager @ DataStax @al3xandru
Re: CQL Thrift
It seems really strange to me that you'd create a table with specific types and then try to deviate from them. Why not just use the blob type? Then you can store whatever you want in there. The whole point of adding strong typing is to adhere to it. I wouldn't consider it a fault of the database that it does what you asked it to. On Aug 30, 2013, at 11:33 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. Most of the columns that are added at runtime are different name and value type. This is specific to my use case. Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. 
CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that, CQL3 and thrift interoperability is an issue. But this looks to me a very basic scenario. Any suggestions? Or If anybody can explain a reason behind this? -Vivek -- Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced
Re: CQL Thrift
create a column family as: create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob); insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, double(102.211), textAsBlob('valueInBytes'). Do you think, it will work in case column name are double? -Vivek On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. Most of the columns that are added at runtime are different name and value type. This is specific to my use case. Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.com wrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. 
CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that, CQL3 and thrift interoperability is an issue. But this looks to me a very basic scenario. Any suggestions? Or If anybody can explain a reason behind this? -Vivek -- Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced
Re: CQL Thrift
Yes, that's correct - and that's a scaled number. In practice: On the local dev machine, CQL3 inserting 10,000 columns (for 1 row) in a BATCH took 1.5 minutes. 50,000 columns (the desired amount) in a BATCH took 7.5 minutes. The same Thrift functionality took _235 milliseconds_. That's almost 2,000 times faster (3 orders of magnitude difference)! However, according to Aleksey Yeschenko, this performance problem has been addressed in 2.0 beta 1 via https://issues.apache.org/jira/browse/CASSANDRA-4693. I'll reserve judgement until I can performance-test 2.0 beta 1 ;) Cheers, -- Les Hazlewood | @lhazlewood CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282 On Fri, Aug 30, 2013 at 12:50 PM, Alex Popescu al...@datastax.com wrote: On Fri, Aug 30, 2013 at 11:56 AM, Vivek Mishra mishra.v...@gmail.com wrote: @lhazlewood https://issues.apache.org/jira/browse/CASSANDRA-5959 Begin batch multiple insert statements. apply batch It doesn't work for you? -Vivek According to the OP, batching inserts is slow. The SO thread [1] mentions that in their environment BATCH takes 1.5 min, while the Thrift-based approach takes around 235 ms. [1] http://stackoverflow.com/questions/18522191/using-cassandra-and-cql3-how-do-you-insert-an-entire-wide-row-in-a-single-reque -- :- a) Alex Popescu Sen. Product Manager @ DataStax @al3xandru
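For readers checking the "almost 2,000 times faster" figure in the message above, the arithmetic can be sketched as follows. The timings are the ones quoted in the thread. The message does not say which column count the 235 ms Thrift run used, so comparing it against both BATCH timings is an assumption made here for illustration. The helper name `speedup` is hypothetical.

```python
def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Ratio of the slower duration to the faster one."""
    return slow_seconds / fast_seconds


thrift_seconds = 0.235          # 235 milliseconds, as quoted
batch_10k_seconds = 1.5 * 60    # 10,000 columns in a CQL3 BATCH
batch_50k_seconds = 7.5 * 60    # 50,000 columns in a CQL3 BATCH

print(round(speedup(batch_10k_seconds, thrift_seconds)))  # 383
print(round(speedup(batch_50k_seconds, thrift_seconds)))  # 1915
```

The 50,000-column comparison gives roughly 1,900x, which matches the quoted "almost 2,000 times"; against the 10,000-column timing the ratio would be closer to 380x.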
Re: CQL Thrift
This has nothing to do with compact storage. Cassandra supports arbitrary dynamic columns of different name/value type today. If people are happy with SQL metaphor, then CQL is fine. Then again, if SQL metaphor was good for temporal databases, there wouldn't be so many failed temporal databases built on RDB. I've built over 4 bi-temporal databases on RDB over the last 12 years, so it's not something that was done lightly. it was from years of pain. I won't bore others about the challenges of building temporal databases. On Fri, Aug 30, 2013 at 2:51 PM, Jon Haddad j...@jonhaddad.com wrote: It sounds like you want this: create table data ( pk int, colname blob, value blob, primary key (pk, colname)); that gives you arbitrary columns (cleverly labeled colname) in a single row, where the value is value. If you don't want the overhead of storing colname in every row, try with compact storage. Does this solve the problem, or am I missing something? On Aug 30, 2013, at 11:45 AM, Peter Lin wool...@gmail.com wrote: you could dynamically create new tables at runtime and insert rows into the new table, but is that better than using thrift and putting it into a regular dynamic column with the exact name type and value type? that would mean if there's 20 dynamic columns of different types, you'd have to execute 21 queries to rebuild the data. That's basically the same as using EVA tables in relational databases. Having used that approach in the past to build temporal databases, it doesn't scale well. On Fri, Aug 30, 2013 at 2:40 PM, Vivek Mishra mishra.v...@gmail.comwrote: create a column family as: create table dynamicTable(key text, nameAsDouble double, valueAsBlob blob); insert into dynamicTable(key, nameAsDouble, valueAsBlob) values ( key, double(102.211), textAsBlob('valueInBytes'). Do you think, it will work in case column name are double? -Vivek On Sat, Aug 31, 2013 at 12:03 AM, Peter Lin wool...@gmail.com wrote: In the interest of education and discussion. 
I didn't mean to say CQL3 doesn't support dynamic columns. The example from the page shows default type defined in the create statement. create column family data with key_validation_class=Int32Type and comparator=DateType and default_validation_class=FloatType; If I try to insert a dynamic column that uses double for column name and string for column value, it will throw an error. The kind of use case I'm talking about defines a minimum number of static columns. Most of the columns that are added at runtime are different name and value type. This is specific to my use case. Having said that, I believe it would be possible to provide that kind of feature in CQL, but the trade off is it deviates from SQL. The grammar would have to allow type declaration in the columns list and functions in the values. Something like insert into mytable (KEY, doubleType(newcol1), string(newcol2)) values ('abc123', some string, double(102.211)) doubleType(newcol1) and string(newcol2) are dynamic columns. I know many people find thrift hard to grok and struggle with it, but I'm a firm believer in taking time to learn. Every developer should take time to read cassandra source code and the source code for the driver they're using. On Fri, Aug 30, 2013 at 2:18 PM, Jonathan Ellis jbel...@gmail.comwrote: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Fri, Aug 30, 2013 at 12:53 PM, Peter Lin wool...@gmail.com wrote: my bias perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. 
On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND
Re: CQL Thrift
From my biased perspective, I find the sweet spot is thrift for insert/update and CQL for select queries. CQL is too limiting and negates the power of storing arbitrary data types in dynamic columns. On Fri, Aug 30, 2013 at 1:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're going to work with CQL, work with CQL. If you're going to work with Thrift, work with Thrift. Don't mix. On Aug 30, 2013, at 10:38 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, If i a create a table with CQL3 as create table user(user_id text PRIMARY KEY, first_name text, last_name text, emailid text); and create index as: create index on user(first_name); then inserted some data as: insert into user(user_id,first_name,last_name,emailId) values('@mevivs','vivek','mishra','vivek.mis...@impetus.co.in'); Then if update same column family using Cassandra-cli as: update column family user with key_validation_class='UTF8Type' and column_metadata=[{column_name:last_name, validation_class:'UTF8Type', index_type:KEYS},{column_name:first_name, validation_class:'UTF8Type', index_type:KEYS}]; Now if i connect via cqlsh and explore user table, i can see column first_name,last_name are not part of table structure anymore. Here is the output: CREATE TABLE user ( key text PRIMARY KEY ) WITH bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.00 AND gc_grace_seconds=864000 AND read_repair_chance=0.10 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'SnappyCompressor'}; cqlsh:cql3usage select * from user; user_id - @mevivs I understand that, CQL3 and thrift interoperability is an issue. But this looks to me a very basic scenario. Any suggestions? Or If anybody can explain a reason behind this? -Vivek
Re: SQL Injection C* (via CQL Thrift)
As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty way to inject something? The only thing I've ever heard of coming close was a thrift bug that allowed a malformed request to crash the server. But that was a while ago https://issues.apache.org/jira/browse/CASSANDRA-475 Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 19/06/2013, at 1:46 AM, Brian O'Neill b...@alumni.brown.edu wrote: Perfect. Thanks Sylvain. That is exactly the input I was looking for, and I agree completely. (It's easy enough to protect against) As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty way to inject something? At first glance, it doesn't appear possible, but I'm not 100% confident making that assertion. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 • healthmarketscience.com From: Sylvain Lebresne sylv...@datastax.com Reply-To: user@cassandra.apache.org Date: Tuesday, June 18, 2013 8:51 AM To: user@cassandra.apache.org user@cassandra.apache.org Subject: Re: SQL Injection C* (via CQL Thrift) If you're not careful, then CQL injection is possible. 
Say you naively build you query with UPDATE foo SET col=' + user_input + ' WHERE key = 'k' then if user_input is foo' AND col2='bar, your user will have overwritten a column it shouldn't have been able to. And something equivalent in a BATCH statement could allow to overwrite/delete some random row in some random table. Now CQL being much more restricted than SQL (no subqueries, no generic transaction, ...), the extent of what you can do with a CQL injection is way smaller than in SQL. But you do have to be careful. As far as the Datastax java driver is concerned, you can fairly easily protect yourself by using either: 1) prepared statements: if the user input is a prepared variable, there is nothing the user can do (it's equivalent to the thrift situation). 2) using the query builder: it will escape quotes in the strings you provided, thuse avoiding injection. So I would say that injections are definitively possible if you concatenate strings too naively, but I don't think preventing them is very hard. -- Sylvain On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill b...@alumni.brown.edu wrote: Mostly for fun, I wanted to throw this out there... We are undergoing a security audit for our platform (C* + Elastic Search + Storm). One component of that audit is susceptibility to SQL injection. I was wondering if anyone has attempted to construct a SQL injection attack against Cassandra? Is it even possible? I know the code paths fairly well, but... Does there exists a path in the code whereby user data gets interpreted, which could be exploited to perform user operations? From the Thrift side of things, I've always felt safe. Data is opaque. Serializers are used to convert it to Bytes, and C* doesn't ever really do anything with the data. In examining the CQL java-driver, it looks like there might be a bit more exposure to injection. 
(or even CQL over Thrift) I haven't dug into the code yet, but dependent on which flavor of the API you are using, you may be including user data in your statements. Does anyone know if the CQL java-driver does anything to protect against injection? Or is it possible to say that the syntax is strict enough that any embedded operations in data would not parse? just some food for thought... I'll be digging into this over the next couple weeks. If people are interested, I can throw a blog post out there with the findings. -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://brianoneill.blogspot.com/ twitter: @boneill42
Re: SQL Injection C* (via CQL Thrift)
On Thu, Jun 20, 2013 at 2:15 AM, aaron morton aa...@thelastpickle.com wrote: As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty way to inject something? The only thing I've ever heard of coming close was a thrift bug that allowed a malformed request to crash the server. But that was a while ago https://issues.apache.org/jira/browse/CASSANDRA-475
Oh, that brings me back. Literally my first interaction with a Cassandra server:
- start cassandra
- telnet localhost 9160
- asdasdasdasdsa
- Connection reset by peer
- notice server has crashed
Not *really* a Cassandra bug, but hilarious nonetheless. :) =Rob
Re: SQL Injection C* (via CQL Thrift)
My first interaction with Cassandra: ../nodeprobe -p 9160 ... Hmm, I can't seem to reach it :) Oh, it's no longer running... You've come a long way, baby. On Thu, Jun 20, 2013 at 12:59 PM, Robert Coli rc...@eventbrite.com wrote: Oh, that brings me back. Literally my first interaction with a Cassandra server:
- start cassandra
- telnet localhost 9160
- asdasdasdasdsa
- Connection reset by peer
- notice server has crashed
Not *really* a Cassandra bug, but hilarious nonetheless. :) =Rob
SQL Injection C* (via CQL Thrift)
Mostly for fun, I wanted to throw this out there... We are undergoing a security audit for our platform (C* + Elastic Search + Storm). One component of that audit is susceptibility to SQL injection. I was wondering if anyone has attempted to construct a SQL injection attack against Cassandra? Is it even possible? I know the code paths fairly well, but... Does there exist a path in the code whereby user data gets interpreted, which could be exploited to perform user operations? From the Thrift side of things, I've always felt safe. Data is opaque. Serializers are used to convert it to bytes, and C* doesn't ever really do anything with the data. In examining the CQL java-driver (or even CQL over Thrift), it looks like there might be a bit more exposure to injection. I haven't dug into the code yet, but depending on which flavor of the API you are using, you may be including user data in your statements. Does anyone know if the CQL java-driver does anything to protect against injection? Or is it possible to say that the syntax is strict enough that any embedded operations in data would not parse? Just some food for thought... I'll be digging into this over the next couple of weeks. If people are interested, I can throw a blog post out there with the findings. -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://brianoneill.blogspot.com/ twitter: @boneill42
Re: SQL Injection C* (via CQL Thrift)
If you're not careful, then CQL injection is possible. Say you naively build your query with UPDATE foo SET col=' + user_input + ' WHERE key = 'k'; then if user_input is foo', col2='bar, your user will have overwritten a column they shouldn't have been able to. And something equivalent in a BATCH statement could allow overwriting or deleting some random row in some random table. Now, CQL being much more restricted than SQL (no subqueries, no generic transactions, ...), the extent of what you can do with a CQL injection is way smaller than in SQL. But you do have to be careful. As far as the DataStax Java driver is concerned, you can fairly easily protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is nothing the user can do (it's equivalent to the Thrift situation).
2) the query builder: it will escape quotes in the strings you provide, thus avoiding injection.
So I would say that injections are definitely possible if you concatenate strings too naively, but I don't think preventing them is very hard. -- Sylvain On Tue, Jun 18, 2013 at 2:02 PM, Brian O'Neill b...@alumni.brown.edu wrote: Mostly for fun, I wanted to throw this out there... We are undergoing a security audit for our platform (C* + Elastic Search + Storm). One component of that audit is susceptibility to SQL injection. I was wondering if anyone has attempted to construct a SQL injection attack against Cassandra? Is it even possible? I know the code paths fairly well, but... Does there exist a path in the code whereby user data gets interpreted, which could be exploited to perform user operations? From the Thrift side of things, I've always felt safe. Data is opaque. Serializers are used to convert it to bytes, and C* doesn't ever really do anything with the data. In examining the CQL java-driver (or even CQL over Thrift), it looks like there might be a bit more exposure to injection. I haven't dug into the code yet, but depending on which flavor of the API you are using, you may be including user data in your statements. Does anyone know if the CQL java-driver does anything to protect against injection? Or is it possible to say that the syntax is strict enough that any embedded operations in data would not parse? Just some food for thought... I'll be digging into this over the next couple of weeks. If people are interested, I can throw a blog post out there with the findings. -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://brianoneill.blogspot.com/ twitter: @boneill42
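To make the concatenation problem described in this message concrete, here is a minimal sketch in plain Python, with no Cassandra driver involved. The function names (naive_update, escape_cql_string, escaped_update) are hypothetical and exist only for illustration; the table foo and columns col and col2 come from the example in the message. The payload uses a comma, since CQL separates SET assignments with commas. The escaping shown (doubling single quotes) is how CQL string literals represent an embedded quote; it is only meant to illustrate the idea behind quote escaping, not to reproduce what any particular driver does internally.

```python
def naive_update(user_input: str) -> str:
    """Vulnerable: splices user input straight into the statement text."""
    return "UPDATE foo SET col='" + user_input + "' WHERE key = 'k'"


def escape_cql_string(value: str) -> str:
    """Escape a value for use inside a CQL string literal.

    CQL represents a single quote inside a string literal by doubling it.
    """
    return value.replace("'", "''")


def escaped_update(user_input: str) -> str:
    """Same statement, but the input can no longer close the literal."""
    return "UPDATE foo SET col='" + escape_cql_string(user_input) + "' WHERE key = 'k'"


if __name__ == "__main__":
    malicious = "foo', col2='bar"
    # The naive version now assigns to col2 as well as col.
    print(naive_update(malicious))
    # The escaped version keeps the whole input inside the col literal.
    print(escaped_update(malicious))
```

With a real driver, prepared statements with bound variables are generally the safer choice, because the user input is then never parsed as part of the statement at all.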
Re: SQL Injection C* (via CQL Thrift)
Perfect. Thanks Sylvain. That is exactly the input I was looking for, and I agree completely. (It's easy enough to protect against.) As for the thrift side (i.e. using Hector or Astyanax), anyone have a crafty way to inject something? At first glance, it doesn't appear possible, but I'm not 100% confident making that assertion. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive King of Prussia, PA 19406 M: 215.588.6024 @boneill42 http://www.twitter.com/boneill42 healthmarketscience.com This information transmitted in this email message is for the intended recipient only and may contain confidential and/or privileged material. If you received this email in error and are not the intended recipient, or the person responsible to deliver it to the intended recipient, please contact the sender at the email above and delete this email and any attachments and destroy any copies thereof. Any review, retransmission, dissemination, copying or other use of, or taking any action in reliance upon, this information by persons or entities other than the intended recipient is strictly prohibited. From: Sylvain Lebresne sylv...@datastax.com Reply-To: user@cassandra.apache.org Date: Tuesday, June 18, 2013 8:51 AM To: user@cassandra.apache.org user@cassandra.apache.org Subject: Re: SQL Injection C* (via CQL Thrift) If you're not careful, then CQL injection is possible. Say you naively build your query with UPDATE foo SET col=' + user_input + ' WHERE key = 'k'; then if user_input is foo', col2='bar, your user will have overwritten a column they shouldn't have been able to. And something equivalent in a BATCH statement could allow overwriting or deleting some random row in some random table. Now, CQL being much more restricted than SQL (no subqueries, no generic transactions, ...), the extent of what you can do with a CQL injection is way smaller than in SQL. But you do have to be careful. As far as the DataStax Java driver is concerned, you can fairly easily protect yourself by using either:
1) prepared statements: if the user input is a prepared variable, there is nothing the user can do (it's equivalent to the Thrift situation).
2) the query builder: it will escape quotes in the strings you provide, thus avoiding injection.
So I would say that injections are definitely possible if you concatenate strings too naively, but I don't think preventing them is very hard. -- Sylvain
Re: Cassandra, CQL, Thrift Deprecation?? and Erlang
The Thrift API is not going anywhere any time soon. I'm not aware of anyone working on an Erlang CQL client. On Fri, Sep 2, 2011 at 7:39 AM, J T jt4websi...@googlemail.com wrote: Hi, I'm a fan of Erlang, and have been using successive Cassandra versions via the Erlang Thrift interface for a couple of years now. I see that Cassandra seems to be moving to using CQL instead, so I was wondering if that means the Thrift API will be deprecated and, if so, whether there is any effort underway by anyone to create whatever would be necessary to use Cassandra via CQL from Erlang? JT -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Cassandra, CQL, Thrift Deprecation?? and Erlang
OK, that's good to know. If push came to shove I could probably write such a client myself after doing the necessary research, but I'd prefer to save myself the hassle. Thanks. On Fri, Sep 2, 2011 at 1:59 PM, Jonathan Ellis jbel...@gmail.com wrote: The Thrift API is not going anywhere any time soon. I'm not aware of anyone working on an Erlang CQL client. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com