Re: DataModel Question

2013-02-07 Thread Edward Capriolo
Go with day / phone instead of phone / day; this way you won't have a row key
growing forever.

A compromise would be month / phone as the row key, and then use the date-time
as the first part of a composite column.
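
A minimal sketch of the month / phone compromise, for illustration only. The helper names (`month_row_key`, `composite_column`) and the `YYYYMM:phone` key format are assumptions, not something specified in this thread, and a plain datetime stands in for whatever time type the column comparator would actually use.

```python
from datetime import datetime, timezone


def month_row_key(phone_number: str, sent_at: datetime) -> str:
    """Hypothetical row key of the form 'YYYYMM:phone' (one row per phone per month)."""
    return f"{sent_at:%Y%m}:{phone_number}"


def composite_column(sent_at: datetime, message_id: int, message_type: str) -> tuple:
    """Hypothetical composite column name with the date-time as its first part,
    so that columns inside the monthly row sort chronologically."""
    return (sent_at, message_id, message_type)


sent_at = datetime(2013, 2, 5, 14, 30, tzinfo=timezone.utc)
row_key = month_row_key("19876543456", sent_at)
column = composite_column(sent_at, 42, "SMS")
```

With this layout a phone number produces at most twelve row keys per year, at the cost of wider rows than a per-day key.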

Re: DataModel Question

2013-02-07 Thread aaron morton
> Go day / phone instead of phone / day this way you won't have a rk growing
> forever.
Not sure I understand.

+1 for month partition.

> When I go offline and come online again, I need to retrieve all pending
> messages from all my conversations.
You need some sort of token that includes the last timestamp seen by
the client. Then make as many queries as necessary to fetch the missing data.
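
As a rough sketch of the "as many queries as necessary" idea, the code below lists the row keys a client would need to read after being offline, assuming the phone_number : day row key and the DDMMYYYY day format used elsewhere in this thread. The function names are hypothetical, and the actual read against Cassandra is left out.

```python
from datetime import date, timedelta
from typing import List


def day_buckets(last_seen: date, today: date) -> List[str]:
    """Day buckets (DDMMYYYY) from the client's last-seen day up to and including today."""
    days = (today - last_seen).days
    return [f"{last_seen + timedelta(days=i):%d%m%Y}" for i in range(days + 1)]


def pending_row_keys(phone_number: str, last_seen: date, today: date) -> List[str]:
    """One row key per day bucket; each key corresponds to one query."""
    return [f"{phone_number}:{bucket}" for bucket in day_buckets(last_seen, today)]


keys = pending_row_keys("19876543456", date(2013, 2, 5), date(2013, 2, 7))
```

For the first bucket the client would additionally skip columns at or before its last-seen timestamp; the later buckets can be read in full.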

> I guess this makes the data model span across many CFs ?
Yes.
Sorry, I have not considered conversations.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com



RE: DataModel Question

2013-02-06 Thread Kanwar Sangha
1)  Version is 1.2.

2)  DynamicComposites: I read somewhere that they are not recommended?

3)  Good point. I need to think about that one.




Re: DataModel Question

2013-02-06 Thread aaron morton
> 2)  DynamicComposites : I read somewhere that they are not recommended ?
You probably won't need them.

Your current model will not sort messages by the time they arrive within a day.
The sort order will be based on the message type and the message ID.

I'm assuming you want to order messages by time, so put the time UUID at the
start of the composite columns. If you often want to get the most recent
messages, use a reverse comparator.
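
To illustrate why the component order matters, here is a small sketch that sorts the same three messages under both layouts. Python tuples stand in for composite column names and a plain datetime stands in for the time UUID; both are simplifications for illustration, not how Cassandra stores them.

```python
from datetime import datetime

# (message_type, message_id, sent_at): the originally proposed component order.
messages = [
    ("SMS", 12, datetime(2013, 2, 5, 9, 0)),
    ("MMS", 7, datetime(2013, 2, 5, 8, 0)),
    ("SMS", 3, datetime(2013, 2, 5, 10, 0)),
]

# Type-first composite: ordered by type, then ID; arrival time plays no role.
type_first = sorted(messages)

# Time-first composite: (sent_at, message_id, message_type) orders by arrival time.
time_first = sorted((sent, mid, typ) for typ, mid, sent in messages)

# A reverse comparator would return the newest message first.
newest_first = sorted(time_first, reverse=True)

type_first_ids = [m[1] for m in type_first]
time_first_ids = [m[1] for m in time_first]
```

The type-first order interleaves messages regardless of when they arrived, while the time-first order is chronological and the reversed order starts with the most recent message.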

You could probably also have wider rows if you want to, not sure how many 
messages kids send a day but you may get by with weekly partitions. 

The CLI model could be:
row_key: phone_number : day
column: time_uuid : message_id : message_type 

You could also pack extra data using JSON, Protocol Buffers, etc., and store
more than just the message in the column value.
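
A minimal sketch of the JSON option, assuming the column value is stored as UTF-8 bytes. The field names (`body`, `sender`, `attachments`) are made up for illustration and are not part of the model discussed in this thread.

```python
import json


def pack_value(body: str, sender: str, attachments: list) -> bytes:
    """Serialize the message plus extra fields into a single column value."""
    payload = {"body": body, "sender": sender, "attachments": attachments}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def unpack_value(raw: bytes) -> dict:
    """Decode a column value produced by pack_value."""
    return json.loads(raw.decode("utf-8"))


raw = pack_value("see you at 5", "19876543456", ["photo.png"])
decoded = unpack_value(raw)
```

The trade-off is that the packed fields are opaque to Cassandra, so they cannot be sorted on or queried individually.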

If you are using CQL 3, consider this:

create table messages (
    phone_number     text,
    day              timestamp,
    message_sequence timeuuid,  -- your timestamp
    message_id       int,
    message_type     text,
    message_body     text,
    PRIMARY KEY ((phone_number, day), message_sequence, message_id)
);

(phone_number, day) is the partition key, the same as the Thrift row key.

message_sequence and message_id are the grouping (clustering) columns; all rows
in a partition will be grouped / ordered by these columns.
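
The partition / grouping split can be pictured with a small in-memory stand-in. This is only a sketch of the idea and not a Cassandra API: the `MessageStore` class and its methods are hypothetical, and a datetime replaces the timeuuid. Rows sharing (phone_number, day) live together, stay sorted by (message_sequence, message_id), and a "messages after T" read becomes a slice of that sorted list.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class MessageStore:
    """In-memory illustration of a partition key plus ordered grouping columns."""

    def __init__(self):
        # (phone_number, day) -> sorted list of (message_sequence, message_id, body)
        self._partitions = defaultdict(list)

    def insert(self, phone_number, day, message_sequence, message_id, body):
        row = (message_sequence, message_id, body)
        bisect.insort(self._partitions[(phone_number, day)], row)

    def since(self, phone_number, day, after):
        """Rows in one partition whose message_sequence is strictly after `after`."""
        rows = self._partitions.get((phone_number, day), [])
        # (after, inf) sorts behind every row that has message_sequence == after.
        start = bisect.bisect_right(rows, (after, float("inf")))
        return rows[start:]


store = MessageStore()
store.insert("19876543456", "2013-02-05", datetime(2013, 2, 5, 9, 0), 2, "b")
store.insert("19876543456", "2013-02-05", datetime(2013, 2, 5, 8, 0), 1, "a")
store.insert("19876543456", "2013-02-05", datetime(2013, 2, 5, 10, 0), 3, "c")
store.insert("19870000000", "2013-02-05", datetime(2013, 2, 5, 9, 30), 4, "other")
recent = store.since("19876543456", "2013-02-05", datetime(2013, 2, 5, 8, 0))
```

Because the time component comes first in the grouping columns, the slice only touches one partition and never has to scan or re-sort it.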

Hope that helps. 



-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com




RE: DataModel Question

2013-02-06 Thread Kanwar Sangha
Thanks, Aaron!

My use case is modeled like Skype, which stores IM + SMS + MMS in one
conversation.

I need to have the following functionality:

* When I go offline and come online again, I need to retrieve all
  pending messages from all my conversations.

* I should be able to select a contact and view the 'history' of the
  messages (last 7 days, last 14 days, last 21 days...).

* If I log in to a different device, I should be able to synch at least
  a few days of messages.

* One conversation can have multiple participants.

* Support full synch or delta synch based on number of messages/history.

I guess this makes the data model span across many CFs?









DataModel Question

2013-02-05 Thread Kanwar Sangha
Hi - We are designing Cassandra-based storage for the following use cases:

* Store SMS messages

* Store MMS messages

* Store chat history

What would be the ideal way to design the data model for this kind of
application? I am thinking along these lines:

Row-Key: Composite key [PhoneNum : Day]

* Example: 19876543456:05022013

Dynamic Column Families

* Composite column key for SMS [SMS:MessageId:TimeUUID]

* Composite column key for MMS [MMS:MessageId:TimeUUID]

* Composite column key for the user I am chatting with [UserId:198765432345]
  - This can have multiple values, since each chat conversation can have many
  messages. Should this be a super column?


198:05022013
SMS::ttt
SMS:xxx12:ttt
MMS::ttt
:19
198:05022013

1987888:05022013

Thanks,
Kanwar




RE: DataModel Question

2013-02-05 Thread Rishabh Agrawal
Hello,

Composite keys are always good, and the model looks clean to me. Run a pilot
with around 10 GB or more of data, compare it with an RDBMS, and make changes
accordingly.

Thanks and Regards
Rishabh Agrawal



Re: DataModel Question

2013-02-05 Thread Vivek Mishra
Avoid super columns. If you need sorted, wide rows, then go for composite
columns.

-Vivek




Re: DataModel Question

2013-02-05 Thread Tamar Fraenkel
Hi!
I have a couple of questions regarding your model:

 1. What Cassandra version are you using? I am still working with 1.0, and
this seems to make sense, but I think 1.2 gives you much more power.
 2. Maybe I don't understand your model, but I think you need
DynamicComposite columns, as the user columns differ in number of
components and maybe in type.
 3. How do you associate the SMS or MMS with the user you are
chatting with? Is it done in a separate CF?

Thanks,
Tamar


Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956



