Re: Dynamic Columns

chetan verma Tue, 20 Jan 2015 12:45:06 -0800

Hi,

Adding to my previous mail, here is an example. We have a column family
named review (with some arbitrary data stored in maps).


CREATE TABLE review(
product_id bigint,
created_at timestamp,
data_int map<text, int>,
data_text map<text, text>,
PRIMARY KEY (product_id, created_at)
);

Assume that I use these two maps to store arbitrary data (data_int for int
values and data_text for text values).
When we look at the output in cassandra-cli, each map entry appears within a
partition as a column whose name is <clustering_key>:data_int:map_key and
whose value is the map value.
Suppose I need to get just this one value. I couldn't do that with CQL3, but
in Thrift it's possible. Is there any solution?
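
One way to get single-key and slice reads in CQL3 is the clustering-column
model suggested further down the thread: move the map key into the primary
key so that each attribute becomes its own CQL row. The following is only a
sketch under assumptions. The table name review_attr_int, the columns attname
and attvalue, and the sample values are hypothetical and not part of the
original schema.

```cql
-- Hypothetical remodel of data_int: one CQL row per attribute instead of one map.
CREATE TABLE review_attr_int (
    product_id bigint,
    created_at timestamp,
    attname text,      -- plays the role of the map key
    attvalue int,      -- plays the role of the map value
    PRIMARY KEY (product_id, created_at, attname)
);

-- Write one attribute (sample values are made up).
INSERT INTO review_attr_int (product_id, created_at, attname, attvalue)
VALUES (42, '2015-01-20 12:00:00+0000', 'battery', 4);

-- Read a single attribute by its key.
SELECT attvalue FROM review_attr_int
WHERE product_id = 42
  AND created_at = '2015-01-20 12:00:00+0000'
  AND attname = 'battery';

-- Read a slice of attributes by key range.
SELECT attname, attvalue FROM review_attr_int
WHERE product_id = 42
  AND created_at = '2015-01-20 12:00:00+0000'
  AND attname >= 'a' AND attname < 'c';
```

Because attname is a clustering column here, both reads touch only the
requested cells within a single partition, and nothing has to be loaded as a
whole collection. A second table with attvalue declared as text would cover
data_text. The cost is that the fixed review columns and the attributes no
longer live in the same CQL row.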

On Wed, Jan 21, 2015 at 1:06 AM, chetan verma <chetanverm...@gmail.com>
wrote:

> Hi,
>
> Most of the time I will be querying on product_id and created_at, but for
> analytics I need to query on almost all columns.
> The multiple-collections idea is good, but the only problem is that
> Cassandra reads a collection entirely. What if I need a slice of it, I mean
> the columns for certain keys, which is possible with Thrift? Please suggest.
>
> On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield <
> jlacefi...@datastax.com> wrote:
>
>> Hello,
>>
>> There are probably lots of options to this challenge.  The more details
>> around your use case that you can provide, the easier it will be for this
>> group to offer advice.
>>
>> A few follow-up questions:
>>   - How will you query this data?
>>   - Do your queries require filtering on specific columns other than
>> product_id and created_at, i.e. the dynamic columns?
>>
>> Depending on the answers to these questions, you have several options, of
>> which here are a few:
>>
>>    - Cassandra efficiently stores sparse data, so you could create
>>    columns and not populate them, without much of a penalty
>>    - Could use a clustering column to store a column's type and another
>>    col (potentially clustering) to store the value
>>       - i.e. CREATE TABLE foo (col1 int, attname text, attvalue text,
>>       col4...n, PRIMARY KEY (col1, attname, attvalue));
>>       - where attname stores the name of the attribute/column and
>>       attvalue stores the value of that attribute
>>       - have seen users use this model and create a "main" attribute row
>>       within a partition that stores the values associated with col4...n
>>    - Could store multiple collections
>>    - Others probably have ideas as well
>>
>> You may want to look in the archives for a similar discussion topic.
>> Believe this item was asked a few months ago as well.
>>
>> Jonathan Lacefield
>>
>> Solution Architect | (404) 822 3487 | jlacefi...@datastax.com
>>
>> On Tue, Jan 20, 2015 at 1:40 PM, chetan verma <chetanverm...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am creating a review system. For instance, let's assume the following
>>> are the attributes of the system:
>>>
>>> Review{
>>> id bigint,
>>> product_id bigint,
>>> created_at timestamp,
>>> summary text,
>>> description text,
>>> pros set<text>,
>>> cons set<text>,
>>> feature_rating map<text, int>
>>> etc....
>>> }
>>> I created the partition key as product_id (so that all the reviews for a
>>> given product will reside on the same node)
>>> and the clustering key as created_at and id (DESC), so that reviews will be
>>> sorted by time.
>>>
>>> I may have more columns, and I want to meet that requirement with dynamic
>>> columns, but that approach has the limitations I explained earlier.
>>> Could you please let me know the best way?
>>>
>>> On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield <
>>> jlacefi...@datastax.com> wrote:
>>>
>>>> Hello,
>>>>
>>>>   Have you looked at solving this challenge with clustering columns?
>>>> Also, please describe the problem set details for more specific advice from
>>>> this group.
>>>>
>>>>   Starting new projects on Thrift isn't the recommended approach.
>>>>
>>>> Jonathan
>>>>
>>>> Jonathan Lacefield
>>>>
>>>> Solution Architect | (404) 822 3487 | jlacefi...@datastax.com
>>>>
>>>> On Tue, Jan 20, 2015 at 1:24 PM, chetan verma <chetanverm...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am starting a new project with Cassandra as the database.
>>>>> I have unstructured data, so I need dynamic columns.
>>>>> In CQL3 we can achieve this via collections, but there are some
>>>>> downsides to it:
>>>>> 1. Collections are meant to store small amounts of data.
>>>>> 2. The maximum size of an item in a collection is 64K.
>>>>> 3. Cassandra reads a collection in its entirety.
>>>>> 4. The number of items in a collection is limited to 64,000.
>>>>>
>>>>> There is also no support for getting a single column by map key, which is
>>>>> possible via cassandra-cli.
>>>>> Please suggest whether I should use CQL3 or Thrift, and which driver is
>>>>> best.
>>>>>
>>>>> --
>>>>> *Regards,*
>>>>> *Chetan Verma*
>>>>> *+91 99860 86634*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> *Regards,*
>>> *Chetan Verma*
>>> *+91 99860 86634*
>>>
>>
>>
>
>
> --
> *Regards,*
> *Chetan Verma*
> *+91 99860 86634*
>



-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*
