Why Cassandra 2.1.2 couldn't populate row cache in between

2015-01-20 Thread nitin padalia
Hi,

If I've enabled the row cache for a column family and I request rows
that are not at the beginning of the partition, Cassandra doesn't
populate the row cache.

Why is that? For older versions I think it was because the row cache
held the complete merged partition, so an incomplete partition
couldn't reside in the row cache.

However, since the new version lets us resize the cache, why can't we
populate it from somewhere other than the start?

Nitin Padalia


Comparison of multiple ways to query cassandra

2015-01-20 Thread Parth Setya
Hi,

Could someone please shed some light on which is the more efficient way to
retrieve data from Cassandra: a range slice query (I'm using Hector)
or filtering with secondary indexes?

best
Parth


How to know disk utilization by each row on a node

2015-01-20 Thread Edson Marquezani Filho
Hello, everybody.

Does anyone know a way to list, for an arbitrary column family, all
the rows owned (including replicas) by a given node and the data size
(real size or disk occupation) of each one of them on that node?

I would like to do that because I have data on one of my nodes growing
faster than the others, although rows (and replicas) seem evenly
distributed across the cluster. So, I would like to verify if I have
some specific rows growing too much.

Thank you.


Re: Dynamic Columns

2015-01-20 Thread Jonathan Lacefield
Hello,

  Have you looked at solving this challenge with clustering columns?  Also,
please describe the problem set details for more specific advice from this
group.

  Starting new projects on Thrift isn't the recommended approach.

Jonathan

Jonathan Lacefield

Solution Architect | (404) 822 3487 | jlacefi...@datastax.com


Dynamic Columns

2015-01-20 Thread chetan verma
Hi,

I am starting a new project with Cassandra as the database.
I have unstructured data, so I need dynamic columns.
In CQL3 we can achieve this with collections, but there are some
downsides:
1. Collections are meant to store small amounts of data.
2. The maximum size of an item in a collection is 64K.
3. Cassandra reads a collection in its entirety.
4. The number of items in a collection is limited to 64,000.

Also, there is no support for getting a single column by map key, which is
possible via cassandra-cli.
Please suggest whether I should use CQL3 or Thrift, and which driver is best.

-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*


[RELEASE] Apache Cassandra 2.0.12 released

2015-01-20 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.0.12.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.0 series. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter
any problems.

Enjoy!

[1]: http://goo.gl/ZeeTfs (CHANGES.txt)
[2]: http://goo.gl/1zEijH (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: Dynamic Columns

2015-01-20 Thread chetan verma
Could you please explain how we can achieve dynamic column behavior with
clustering columns?

-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*


Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi,

I am creating a review system. For instance, let's assume the following are the
attributes of the system:

Review{
id bigint,
product_id bigint,
created_at timestamp,
summary text,
description text,
pros set<text>,
cons set<text>,
feature_rating map<text, int>
etc
}
I made product_id the partition key (so that all the reviews for a given
product reside on the same node)
and created_at and id (DESC) the clustering key, so that reviews are
sorted by time.

I may have more columns, and I want to meet that requirement with dynamic
columns, but they have the limitations explained above.
Could you please let me know the best way?
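
A minimal CQL sketch of the model described above. The message gives the columns and the key roles but not the DDL, so the table name review, the descending order on both clustering columns, and the sample product_id are assumptions:

```cql
-- Sketch only: table name and clustering order are assumptions.
CREATE TABLE review (
    product_id     bigint,
    created_at     timestamp,
    id             bigint,
    summary        text,
    description    text,
    pros           set<text>,
    cons           set<text>,
    feature_rating map<text, int>,
    PRIMARY KEY (product_id, created_at, id)
) WITH CLUSTERING ORDER BY (created_at DESC, id DESC);

-- All reviews for one product live in one partition; newest come back first.
SELECT id, created_at, summary
FROM review
WHERE product_id = 42
LIMIT 10;
```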

-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*


Re: number of replicas per data center?

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 8:50 PM, Kevin Burton bur...@spinn3r.com wrote:

 Ah.. six replicas.  At least it's super inexpensive that way (sarcasm!)


People with larger numbers of data centers do tend to reduce their
replication factor per DC. It's all about how much consistency you want to
risk, how much you want to rebuild over the WAN, etc.

=Rob
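
For illustration, the replica count is set per data center through NetworkTopologyStrategy. In this sketch the keyspace and data center names are hypothetical; real data center names must match what the snitch reports:

```cql
-- Hypothetical keyspace and data center names.
-- Two data centers at RF 3 each: six replicas of every row.
CREATE KEYSPACE example_ks
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      'dc_east': 3, 'dc_west': 3};

-- With more data centers, the per-DC factor can be lowered,
-- e.g. three data centers at RF 2 each (still six replicas in total).
ALTER KEYSPACE example_ks
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      'dc_east': 2, 'dc_west': 2, 'dc_central': 2};
```

A lower per-DC factor means fewer local replicas to satisfy LOCAL_QUORUM and a higher chance that a lost node has to be rebuilt from another data center.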


Re: How do replica become out of sync

2015-01-20 Thread Robert Coli
On Mon, Jan 19, 2015 at 5:44 PM, Flavien Charlon flavien.char...@gmail.com
wrote:

 Thanks Andi. The reason I was asking is that even though my nodes have
 been 100% available and no write has been rejected, when running an
 incremental repair, the logs still indicate that some ranges are out of
 sync (which then results in large amounts of compaction), how can this be
 possible?


This is most likely, as you conjecture, due to slight differences between
nodes at the time of Merkle Tree calculation.

How many rows differ?

=Rob


Re: Why Cassandra 2.1.2 couldn't populate row cache in between

2015-01-20 Thread Robert Coli
On Mon, Jan 19, 2015 at 11:57 PM, nitin padalia padalia.ni...@gmail.com
wrote:

 If I've enabled the row cache for a column family and I request rows
 that are not at the beginning of the partition, Cassandra doesn't
 populate the row cache.

 Why is that? For older versions I think it was because the row cache
 held the complete merged partition, so an incomplete partition
 couldn't reside in the row cache.

 However, since the new version lets us resize the cache, why can't we
 populate it from somewhere other than the start?


https://issues.apache.org/jira/browse/CASSANDRA-5357

That ticket has the details of the new version of the row cache.

=Rob
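
For reference, a sketch of how the 2.1 row cache is sized per table. The keyspace and table names are hypothetical, and the reading of CASSANDRA-5357 here (the cache holds the first N rows of a partition, which would explain why reads from the middle do not populate it) is an interpretation rather than something stated in this thread:

```cql
-- Hypothetical keyspace/table; Cassandra 2.1 caching syntax.
-- Cache up to the first 100 rows of each partition.
ALTER TABLE example_ks.events
  WITH caching = {'keys': 'ALL', 'rows_per_partition': '100'};
```

The row cache also has to be given a non-zero size in cassandra.yaml (row_cache_size_in_mb) before this setting has any effect.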


Re: Compaction failing to trigger

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 6:06 PM, Flavien Charlon flavien.char...@gmail.com
wrote:

 It's set on all the tables, as I'm using the default for all the tables.
 But for that particular table there are 41 SSTables between 60MB and 85MB,
 it should only take 4 for the compaction to kick in.


What version of Cassandra are you running?

Are they all live? Are there pending compactions, or exceptions regarding
compactions in your logs?


 As this is probably a bug and going back in the mailing list archive, it
 seems it's already been reported:


This is a weird statement. Are you saying that you've found it in the
mailing list archives? If so, why not paste the threads so those of us who
might remember can refer to them?


 - Will it be fixed in 2.1.3?

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/


=Rob


Re: keyspace not exists?

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 8:55 PM, Jason Wee peich...@gmail.com wrote:

 two nodes running cassandra 2.1.2 and one running cassandra 2.1.1


For the record, this is an unsupported persistent configuration. You are
only supposed to have split minor versions during an upgrade.

I have no idea if it is causing the problem you are having.

=Rob


Re: Compaction failing to trigger

2015-01-20 Thread Eric Stevens
@Rob - he's probably referring to the thread titled "Reasons for nodes not
compacting?", where Tyler speculates that the tables are falling below the
cold read threshold for compaction and that it may be a bug.  At the
same time, in a different thread, Roland had a similar problem, and Tyler's
proposed workaround seemed to work for him.
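
The workaround itself is not quoted in this message. Assuming it refers to the cold_reads_to_omit sub-option of size-tiered compaction, disabling that threshold on a hypothetical table might look like this:

```cql
-- Assumption: the table uses SizeTieredCompactionStrategy and the
-- workaround is to disable the cold-read threshold.
-- Keyspace and table names are hypothetical.
ALTER TABLE example_ks.events
  WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                     'min_threshold': '4',
                     'cold_reads_to_omit': '0.0'};
```

With the threshold at 0.0, rarely read SSTables are no longer left out of compaction candidates, so four similarly sized SSTables should be enough to trigger a compaction again.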


Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
Maybe this is the closest thing to dynamic columns in CQL 3.


create table review (
product_id bigint,
created_at timestamp,
data_key text,
data_tvalue text,
data_ivalue int,
primary key ((product_id, created_at), data_key)
);


data_tvalue and data_ivalue are optional.
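
A short usage sketch for this table, assuming it is created as review with the partition key column spelled product_id. The attribute names and values are made up for illustration:

```cql
-- One row per "dynamic column"; only the value column of the right type is set.
INSERT INTO review (product_id, created_at, data_key, data_tvalue)
VALUES (42, '2015-01-20 12:00:00+0000', 'color', 'red');

INSERT INTO review (product_id, created_at, data_key, data_ivalue)
VALUES (42, '2015-01-20 12:00:00+0000', 'battery_rating', 4);

-- Fetch a single "dynamic column" by name.
SELECT data_tvalue, data_ivalue
FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000'
  AND data_key = 'color';

-- Fetch a slice of "dynamic columns" by name range.
SELECT data_key, data_tvalue, data_ivalue
FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000'
  AND data_key >= 'a' AND data_key < 'd';
```

Because data_key is a clustering column, both the single-name lookup and the name-range slice are served from one partition without reading the other attributes.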


At 2015-01-21 04:44:07, chetan verma chetanverm...@gmail.com wrote:

Hi,


Adding to my previous mail, here is an example. We have a column family named review
(with some arbitrary data in maps).


CREATE TABLE review(
product_id bigint,
created_at timestamp,
data_int map<text, int>,
data_text map<text, text>,
PRIMARY KEY (product_id, created_at)
);


Assume that I use these two maps to store arbitrary data (data_int for int values
and data_text for text values).
In cassandra-cli, each map entry shows up within the partition as a column named
clustering_key:data_int:map_key, with the map value as the column value.
Suppose I need to get one such value: I couldn't do that with CQL3, but in Thrift
it's possible. Any solution?
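
To make the asymmetry concrete, here is a sketch against this table, assuming the maps are declared as map<text, int> and map<text, text>. The key names and values are made up. To my understanding of the 2.x releases discussed here, a single map entry can be written or deleted by key, but a read returns the whole map:

```cql
-- Writing or deleting one map entry by key works in CQL3.
UPDATE review SET data_int['battery_rating'] = 4
WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000';

DELETE data_int['battery_rating'] FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000';

-- Reading returns the whole map; the client picks out the key it needs.
SELECT data_int FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000';
```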


On Wed, Jan 21, 2015 at 1:06 AM, chetan verma chetanverm...@gmail.com wrote:

Hi,


Most of the time I will be querying on product_id and created_at, but for
analytics I need to query on almost all columns.
The multiple collections idea is good, but the only problem is that Cassandra reads a
collection in its entirety. What if I need a slice of it, I mean the
columns for certain keys, which is possible with Thrift? Please suggest.


On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield jlacefi...@datastax.com 
wrote:

Hello,


There are probably lots of options to this challenge.  The more details around
your use case that you can provide, the easier it will be for this group to
offer advice.


A few follow-up questions:
  - How will you query this data?
  - Do your queries require filtering on specific columns other than product_id
and created_at, i.e. the dynamic columns?


Depending on the answers to these questions, you have several options, of which
here are a few:

- Cassandra efficiently stores sparse data, so you could create columns and not
populate them, without much of a penalty
- Could use a clustering column to store a column's type and another col
(potentially clustering) to store the value
   - i.e. CREATE TABLE foo (col1 int, attname text, attvalue text, col4...n,
   PRIMARY KEY (col1, attname, attvalue));
   - where attname stores the name of the attribute/column and attvalue stores
   the value of that attribute
   - have seen users use this model and create a main attribute row within a
   partition that stores the values associated with col4...n
- Could store multiple collections
- Others probably have ideas as well

You may want to look in the archives for a similar discussion topic.  Believe
this item was asked a few months ago as well.
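
A concrete sketch of the attname/attvalue option, with the col4...n columns left out. The sample attribute names and values are made up for illustration:

```cql
-- Concrete version of the foo sketch; col4...n omitted.
CREATE TABLE foo (
    col1     int,
    attname  text,
    attvalue text,
    PRIMARY KEY (col1, attname, attvalue)
);

INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'color', 'red');
INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'color', 'blue');
INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'size', 'large');

-- All values of one attribute in the partition.
SELECT attvalue FROM foo WHERE col1 = 1 AND attname = 'color';

-- A slice of attributes by name.
SELECT attname, attvalue FROM foo
WHERE col1 = 1 AND attname >= 'color' AND attname < 'size';
```

Keeping attvalue in the primary key lets one attribute hold several values; if each attribute should hold exactly one value, attvalue can be a regular column instead.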




Re: Dynamic Columns

2015-01-20 Thread Peter Lin
I think that table example misses the point of Chetan's functional
requirement. He actually needs dynamic columns.


Re:Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
I approximate dynamic columns with the data_key and data_value columns.
Is there a better way to get dynamic columns in CQL 3?


Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
The original dynamic column idea in the Google BigTable paper is a mapping of:


(row key, raw bytes) -> raw bytes


The restriction imposed by CQL is, as far as I understand, that you need to have a
type for each column.


If the value types involved in the schema are limited, e.g. text or int or
timestamp, we can approximate the raw bytes mapping by setting up a few value
columns of explicit type.
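
For comparison, the untyped mapping can also be sketched directly with the blob type. This is a different approach from the typed value columns described above, and the table name and sample keys are hypothetical:

```cql
-- Hypothetical table: every name and value is untyped bytes.
CREATE TABLE raw_columns (
    row_key     blob,
    column_name blob,
    value       blob,
    PRIMARY KEY (row_key, column_name)
);

-- textAsBlob/intAsBlob/blobAsInt are built-in CQL conversion functions.
INSERT INTO raw_columns (row_key, column_name, value)
VALUES (textAsBlob('product:42'), textAsBlob('rating'), intAsBlob(4));

SELECT blobAsInt(value) FROM raw_columns
WHERE row_key = textAsBlob('product:42')
  AND column_name = textAsBlob('rating');
```

The trade-off is that the application has to remember which type each value was written with, which is exactly what the explicitly typed value columns avoid.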





At 2015-01-21 10:46:27, Peter Lin wool...@gmail.com wrote:



The thing is, CQL only handles some types of dynamic column use cases. There are
plenty of examples on datastax.com that show how to do CQL-style dynamic
columns.


Based on what Chetan described, I don't feel CQL3 is a perfect fit for
what he wants to do. To use CQL3, he'd have to change his approach.

In my temporal database, I use both Thrift and CQL. They complement each other
very nicely. I don't understand why people have to put down Thrift or pretend CQL
supports 100% of the use cases. Lots of people started using Cassandra before
CQL and had no problems using Thrift. Yes, you have to understand more and the
learning curve is steeper, but taking time to learn the internals of Cassandra
is a good thing.


Using CQL3 lists or maps forces the query to load the entire
collection, but that is by design. To get the full power of the old style of
dynamic columns, Thrift is a better fit. I hope CQL continues to improve so
that it supports 100% of the existing use cases.






Re: Re: Dynamic Columns

2015-01-20 Thread Peter Lin
The thing is, CQL only handles some types of dynamic column use cases.
There are plenty of examples on datastax.com that show how to do CQL-style
dynamic columns.

Based on what Chetan described, I don't feel CQL3 is a perfect fit
for what he wants to do. To use CQL3, he'd have to change his approach.

In my temporal database, I use both Thrift and CQL. They complement each
other very nicely. I don't understand why people have to put down Thrift or
pretend CQL supports 100% of the use cases. Lots of people started using
Cassandra before CQL and had no problems using Thrift. Yes, you have to
understand more and the learning curve is steeper, but taking time to learn
the internals of Cassandra is a good thing.

Using CQL3 lists or maps would force the query to load the entire
collection, but that is by design. To get the full power of the old style
of dynamic columns, Thrift is a better fit. I hope CQL continues to improve
so that it supports 100% of the existing use cases.



On Tue, Jan 20, 2015 at 8:50 PM, Xu Zhongxing xu_zhong_x...@163.com wrote:

 I approximate dynamic columns by data_key and data_value columns.
 Is there a better way to get dynamic columns in CQL 3?


Re: Should one expect to see hints being stored/delivered occasionally?

2015-01-20 Thread Robert Coli
On Sat, Jan 17, 2015 at 3:32 PM, Vasileios Vlachos 
vasileiosvlac...@gmail.com wrote:

 Is there any other occasion that hints are stored and then being sent in a
 cluster, other than network or other temporary or permanent failure? Could
 it be that the client responsible for establishing a connection is causing
 this? We use the Datastax C# driver for connecting to the cluster and we
 run C* 1.2.18 on Ubuntu 12.04.


Other than restarting nodes manually (which I consider a temporary
failure for the purposes of this question), no. Seeing hints being stored
and delivered outside of this context is a warning sign that something may
be wrong with your cluster.

What is probably happening is that you have stop-the-world GC pauses long
enough for writes to time out during them, which triggers the queueing of hints.

=Rob


Versioning in cassandra while indexing ?

2015-01-20 Thread Pandian R
Hi,

I just wanted to know if there is any kind of versioning system in
Cassandra when indexing new data (like the one in Elasticsearch,
for example).

For example, I have a series of payloads, each coming with an id and an
'updatedAt' timestamp. I want to maintain only the latest state of the
payload for each id, i.e., index the data only if the current payload has a
greater 'updatedAt' than the previously stored timestamp. I can do this
with one additional self-lookup, but is there a way to achieve this without
the overhead of an additional lookup?

Thanks!

-- 
Regards,
Pandian
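
One commonly used approach, offered here as a sketch and not as a definitive answer, relies on the fact that Cassandra resolves conflicting writes to the same cell by keeping the one with the highest write timestamp. If the payload's 'updatedAt' is supplied as the write timestamp (in microseconds since the epoch), a stale payload that arrives late does not replace newer data, and no read is needed first. The table and column names below are hypothetical.

```cql
-- Hypothetical table holding the latest state per id.
CREATE TABLE payload_state (
    id text PRIMARY KEY,
    body text,
    updated_at timestamp
);

-- Payload with updatedAt = 2015-01-20 10:00:00 UTC.
INSERT INTO payload_state (id, body, updated_at)
VALUES ('p1', 'newer state', '2015-01-20 10:00:00+0000')
USING TIMESTAMP 1421748000000000;

-- A late-arriving payload with updatedAt = 2015-01-20 09:00:00 UTC.
-- Its write timestamp is lower, so the cells written above win.
INSERT INTO payload_state (id, body, updated_at)
VALUES ('p1', 'older state', '2015-01-20 09:00:00+0000')
USING TIMESTAMP 1421744400000000;
```

Note that this makes the client-supplied timestamp authoritative for every column of the row, so deletes and later updates to the same table would need to use timestamps from the same source.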


Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi,

Adding to my previous mail, here is an example. We have a column family
named review (with some arbitrary data in maps).

CREATE TABLE review(
product_id bigint,
created_at timestamp,
data_int map<text, int>,
data_text map<text, text>,
PRIMARY KEY (product_id, created_at)
);

Assume that I use these two maps to store arbitrary data (i.e. data_int and
data_text for int and text values).
When we look at the output in cassandra-cli, each map entry appears within a
partition with clustering_key:data_int:map_key as the column name and the map
value as the column value.
Suppose I need to get this single value: I couldn't do that with CQL3, but in
Thrift it's possible. Any solution?
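
To make the limitation concrete, here is a minimal sketch against the review table above. The key name and values are hypothetical. CQL3 can write or delete a single map entry, but, as noted in this thread, a read returns the map column as a whole:

```cql
-- Set one entry of the map without touching the others.
UPDATE review SET data_int['battery_rating'] = 4
WHERE product_id = 42 AND created_at = '2015-01-20 13:40:00+0000';

-- Remove one entry.
DELETE data_int['battery_rating'] FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 13:40:00+0000';

-- Reading returns the entire data_int map for the row;
-- the client has to pick out the key it wants.
SELECT data_int FROM review
WHERE product_id = 42 AND created_at = '2015-01-20 13:40:00+0000';
```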


Re: How to know disk utilization by each row on a node

2015-01-20 Thread Jens Rantil
Hi,

DataStax comes with sstablekeys, which lists the row keys in an SSTable. You
could also use the sstable2json script to find keys.

Cheers,
Jens

On Tue, Jan 20, 2015 at 2:53 PM, Edson Marquezani Filho
edsonmarquez...@gmail.com wrote:

 Hello, everybody.
 Does anyone know a way to list, for an arbitrary column family, all
 the rows owned (including replicas) by a given node and the data size
 (real size or disk occupation) of each one of them on that node?
 I would like to do that because I have data on one of my nodes growing
 faster than the others, although rows (and replicas) seem evenly
 distributed across the cluster. So, I would like to verify if I have
 some specific rows growing too much.
 Thank you.

Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi,

Most of the time I will be querying on product_id and created_at, but for
analytics I need to query on almost all columns.
The multiple-collections idea is good, but the only problem is that Cassandra
reads a collection entirely. What if I need a slice of it, I mean
columns for certain keys, which is possible with Thrift? Please suggest.


-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*


Re: Dynamic Columns

2015-01-20 Thread Jonathan Lacefield
Hello,

There are probably lots of options to this challenge.  The more details
around your use case that you can provide, the easier it will be for this
group to offer advice.

A few follow-up questions:
  - How will you query this data?
  - Do your queries require filtering on specific columns other than
product_id and created_at, i.e. the dynamic columns?

Depending on the answers to these questions, you have several options, of
which here are a few:

   - Cassandra efficiently stores sparse data, so you could create columns
   and not populate them, without much of a penalty
   - You could use a clustering column to store a column's type and another
   col (potentially clustering) to store the value
      - e.g. CREATE TABLE foo (col1 int, attname text, attvalue text,
      col4...n, PRIMARY KEY (col1, attname, attvalue));
      - where attname stores the name of the attribute/column and attvalue
      stores the value of that attribute
      - I have seen users use this model and create a main attribute row
      within a partition that stores the values associated with col4...n
   - You could store multiple collections
   - Others probably have ideas as well

You may want to look in the archives for a similar discussion topic.
I believe this was asked a few months ago as well.
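
The clustering-column option from the list above can be sketched as follows. This is only an illustration: the col4...n placeholder is left out, and the sample values are invented.

```cql
-- col1 is the partition key; attname and attvalue are clustering columns.
CREATE TABLE foo (
    col1 int,
    attname text,
    attvalue text,
    PRIMARY KEY (col1, attname, attvalue)
);

INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'color', 'red');
INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'size', 'large');

-- Fetch one attribute of one partition.
SELECT attname, attvalue FROM foo WHERE col1 = 1 AND attname = 'color';

-- Fetch all attributes of the partition, ordered by attname.
SELECT attname, attvalue FROM foo WHERE col1 = 1;
```

Because attvalue is part of the primary key in this layout, writing a new value for the same attname adds a row instead of replacing the old one. Keeping attvalue as a regular column would give overwrite semantics.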

Jonathan Lacefield

Solution Architect | (404) 822 3487 | jlacefi...@datastax.com


On Tue, Jan 20, 2015 at 1:40 PM, chetan verma chetanverm...@gmail.com
wrote:

 Hi,

 I am creating a review system. For instance, let's assume the following are
 the attributes of the system:

 Review{
 id bigint,
 product_id bigint,
 created_at timestamp,
 summary text,
 description text,
 pros set<text>,
 cons set<text>,
 feature_rating map<text, int>
 etc
 }
 I created the partition key as product_id (so that all the reviews for a given
 product will reside on the same node)
 and the clustering key as created_at and id (DESC) so that reviews will be
 sorted by time.

 I may have more columns, and I want to fulfil that requirement with dynamic
 columns, but they have the limitations I explained earlier.
 Could you please let me know the best way?
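
The key layout described in this message could be written in CQL3 roughly as below. This is a sketch that assumes the attribute list is taken literally as columns; the "etc" columns are omitted and the product id in the query is invented.

```cql
CREATE TABLE review (
    product_id bigint,
    created_at timestamp,
    id bigint,
    summary text,
    description text,
    pros set<text>,
    cons set<text>,
    feature_rating map<text, int>,
    PRIMARY KEY (product_id, created_at, id)
) WITH CLUSTERING ORDER BY (created_at DESC, id DESC);

-- Latest ten reviews for one product, newest first.
SELECT id, created_at, summary FROM review
WHERE product_id = 42 LIMIT 10;
```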

 On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield 
 jlacefi...@datastax.com wrote:

 Hello,

   Have you looked at solving this challenge with clustering columns?
 Also, please describe the problem set details for more specific advice from
 this group.

   Starting new projects on Thrift isn't the recommended approach.

 Jonathan

 Jonathan Lacefield

 Solution Architect | (404) 822 3487 | jlacefi...@datastax.com


 On Tue, Jan 20, 2015 at 1:24 PM, chetan verma chetanverm...@gmail.com
 wrote:

 Hi,

 I am starting a new project with Cassandra as the database.
 I have unstructured data, so I need dynamic columns.
 In CQL3 we can achieve this via collections, but there are some
 downsides:
 1. Collections are meant to store small amounts of data.
 2. The maximum size of an item in a collection is 64K.
 3. Cassandra reads a collection in its entirety.
 4. The number of items in a collection is limited to 64,000.

 There is also no support for getting a single column by map key, which is
 possible via cassandra-cli.
 Please suggest whether I should use CQL3 or Thrift, and which driver is
 best.

 --
 *Regards,*
 *Chetan Verma*
 *+91 99860 86634*





 --
 *Regards,*
 *Chetan Verma*
 *+91 99860 86634*