Hi,

Most of the time I will be querying on product_id and created_at, but for analytics I need to query on almost all columns. The multiple-collections idea is good, but the only problem is that Cassandra reads a collection in its entirety. What if I need a slice of it, i.e. the columns for certain keys, which is possible with Thrift? Please suggest.
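To make the slice question concrete, here is a minimal sketch of what it could look like in CQL if the dynamic attributes were stored as clustering rows (along the lines of the attname/attvalue model quoted below) instead of in a map. The table name review_attributes, the choice of attname as the last clustering column, and all literal values are assumptions for illustration only, and this has not been run against a cluster.

```cql
-- Hypothetical table: one row per (review, attribute) pair.
-- product_id is the partition key, as in the Review layout in this thread.
CREATE TABLE review_attributes (
    product_id bigint,
    created_at timestamp,
    id         bigint,
    attname    text,   -- name of the dynamic attribute
    attvalue   text,   -- value of the dynamic attribute
    PRIMARY KEY (product_id, created_at, id, attname)
) WITH CLUSTERING ORDER BY (created_at DESC, id DESC, attname ASC);

-- Slice by key: fetch only the named attributes of one review.
SELECT attname, attvalue
  FROM review_attributes
 WHERE product_id = 42
   AND created_at = '2015-01-20 13:40:00+0000'
   AND id = 1001
   AND attname IN ('battery', 'screen');

-- Range slice: every attribute whose name sorts in [a, c).
SELECT attname, attvalue
  FROM review_attributes
 WHERE product_id = 42
   AND created_at = '2015-01-20 13:40:00+0000'
   AND id = 1001
   AND attname >= 'a' AND attname < 'c';
```

Because attname is a clustering column here, each attribute is its own row inside the partition, so a query can read a subset of attributes without loading the rest. This would not help with the analytic queries on arbitrary columns, which still cannot be filtered this way.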
On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield <jlacefi...@datastax.com> wrote:

> Hello,
>
> There are probably lots of options to this challenge. The more details
> around your use case that you can provide, the easier it will be for this
> group to offer advice.
>
> A few follow-up questions:
>   - How will you query this data?
>   - Do your queries require filtering on specific columns other than
>     product_id and created_at, i.e. the dynamic columns?
>
> Depending on the answers to these questions, you have several options, of
> which here are a few:
>
>   - Cassandra efficiently stores sparse data, so you could create
>     columns and not populate them, without much of a penalty
>   - Could use a clustering column to store a column's type and another
>     col (potentially clustering) to store the value
>       - e.g. CREATE TABLE foo (col1 int, attname text, attvalue text,
>         col4...n, PRIMARY KEY (col1, attname, attvalue));
>       - where attname stores the name of the attribute/column and
>         attvalue stores the value of that attribute
>       - have seen users use this model and create a "main" attribute row
>         within a partition that stores the values associated with col4...n
>   - Could store multiple collections
>   - Others probably have ideas as well
>
> You may want to look in the archives for a similar discussion topic.
> Believe this item was asked a few months ago as well.
>
> Jonathan Lacefield
> Solution Architect | (404) 822 3487 | jlacefi...@datastax.com
>
> On Tue, Jan 20, 2015 at 1:40 PM, chetan verma <chetanverm...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am creating a review system.
>> For instance, let's assume the following are the attributes of the system:
>>
>> Review {
>>   id bigint,
>>   product_id bigint,
>>   created_at timestamp,
>>   summary text,
>>   description text,
>>   pros set<text>,
>>   cons set<text>,
>>   feature_rating map<text, int>
>>   etc....
>> }
>>
>> I created the partition key as product_id (so that all the reviews for a
>> given product will reside on the same node) and the clustering key as
>> created_at and id (DESC) so that reviews will be sorted by time.
>>
>> I can have more columns, and that requirement I want to fulfil with dynamic
>> columns, but there are limitations to them, as explained above.
>> Could you please let me know the best way?
>>
>> On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield <
>> jlacefi...@datastax.com> wrote:
>>
>>> Hello,
>>>
>>> Have you looked at solving this challenge with clustering columns?
>>> Also, please describe the problem set details for more specific advice from
>>> this group.
>>>
>>> Starting new projects on Thrift isn't the recommended approach.
>>>
>>> Jonathan
>>>
>>> On Tue, Jan 20, 2015 at 1:24 PM, chetan verma <chetanverm...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am starting a new project with Cassandra as the database.
>>>> I have unstructured data, so I need dynamic columns.
>>>> In CQL3 we can achieve this via collections, but there are some
>>>> downsides to it:
>>>> 1. Collections are meant to store small amounts of data.
>>>> 2. The maximum size of an item in a collection is 64K.
>>>> 3. Cassandra reads a collection in its entirety.
>>>> 4. The number of items in a collection is limited to 64,000.
>>>>
>>>> There is also no support for getting a single column by map key, which is
>>>> possible via cassandra-cli.
>>>> Please suggest whether I should use CQL3 or Thrift, and which driver is
>>>> best.
>>>>
>>>> --
>>>> *Regards,*
>>>> *Chetan Verma*
>>>> *+91 99860 86634*
>>>>
>>>
>>
>

--
*Regards,*
*Chetan Verma*
*+91 99860 86634*