Re: DataModel Question
Go day / phone instead of phone / day; this way you won't have a row key growing forever. A compromise would be month / phone as the row key, and then use the date-time as the first part of a composite column.

On Thursday, February 7, 2013, Kanwar Sangha kan...@mavenir.com wrote:
> Thanks Aaron! My use case is modeled like "skype", which stores IM + SMS + MMS in one conversation. [...]
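Edward's month / phone compromise can be sketched in plain Python, without a Cassandra client. This is only an illustration under assumptions: the `YYYYMM:phone` key format and the helper names `month_row_key` and `column_name` are hypothetical and do not come from the thread.

```python
from datetime import datetime, timezone


def month_row_key(phone_number: str, sent_at: datetime) -> str:
    """Build a row key bucketed by month (assumed format: 'YYYYMM:phone')."""
    return f"{sent_at:%Y%m}:{phone_number}"


def column_name(sent_at: datetime, message_id: str) -> tuple:
    """Composite column name with the date-time first, so columns sort by time."""
    return (sent_at, message_id)


sent = datetime(2013, 2, 7, 15, 4, tzinfo=timezone.utc)
key = month_row_key("19876543456", sent)
col = column_name(sent, "xxx12")
print(key, col)
```

With this layout each phone number gets one row per month, so a single row stops growing once the month ends.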
Re: DataModel Question
On 8/02/2013, at 3:04 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
> Go day / phone instead of phone / day this way you won't have a rk growing forever.

Not sure I understand. +1 for month partition.

On Thursday, February 7, 2013, Kanwar Sangha kan...@mavenir.com wrote:
> When I go offline and come online again, I need to retrieve all pending messages from all my conversations.

You need to have some sort of token that includes the last timestamp seen by the client. Then make as many queries as necessary to get the missing data.

> I guess this makes the data model span across many CFs ?

Yes. Sorry, I have not considered conversations.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
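The "last timestamp seen" token Aaron describes could work roughly as follows. This is a sketch under assumptions: the in-memory `store` dict stands in for Cassandra partitions keyed by (phone_number, day), and `days_between` and `pending_messages` are hypothetical helper names. One lookup per day partition corresponds to "as many queries as necessary".

```python
from datetime import date, datetime, timedelta


def days_between(start: date, end: date):
    """Yield every calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def pending_messages(store, phone_number, last_seen, now):
    """Collect messages newer than the client's last-seen timestamp.

    One lookup is made per day partition between last_seen and now.
    """
    pending = []
    for day in days_between(last_seen.date(), now.date()):
        for sent_at, body in store.get((phone_number, day), []):
            if sent_at > last_seen:
                pending.append((sent_at, body))
    return sorted(pending)


# Stand-in for the day-partitioned message rows (illustrative data only).
store = {
    ("19876543456", date(2013, 2, 6)): [
        (datetime(2013, 2, 6, 9, 0), "seen already"),
        (datetime(2013, 2, 6, 23, 30), "late message"),
    ],
    ("19876543456", date(2013, 2, 7)): [
        (datetime(2013, 2, 7, 8, 15), "morning message"),
    ],
}

last_seen = datetime(2013, 2, 6, 12, 0)
missed = pending_messages(store, "19876543456", last_seen, datetime(2013, 2, 7, 10, 0))
print([body for _, body in missed])
```

The client would then store the timestamp of the newest message it received as its new token.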
RE: DataModel Question
1) Version is 1.2.
2) DynamicComposites: I read somewhere that they are not recommended?
3) Good point. I need to think about that one.

From: Tamar Fraenkel [mailto:ta...@tok-media.com]
Sent: 06 February 2013 00:50
To: user@cassandra.apache.org
Subject: Re: DataModel Question

> 1. What Cassandra version are you using? I am still working with 1.0 and this seems to make sense, but 1.2 gives you much more power I think.
> 2. Maybe I don't understand your model, but I think you need DynamicComposite columns, as user columns are different in number of components and maybe type.
> 3. How do you associate between the SMS or MMS and the user you are chatting with. Is it done by a separate CF?
Re: DataModel Question
On 7/02/2013, at 1:47 AM, Kanwar Sangha kan...@mavenir.com wrote:
> 2) DynamicComposites : I read somewhere that they are not recommended ?

You probably won't need them.

Your current model will not sort messages by the time they arrive in a day. The sort order will be based on the message type and then the message ID.

I'm assuming you want to order messages, so put the time UUID at the start of the composite columns. If you often want to get the most recent messages, use a reverse comparator.

You could probably also have wider rows if you want to. I'm not sure how many messages kids send a day, but you may get by with weekly partitions.

The CLI model could be:

    row_key: phone_number : day
    column: time_uuid : message_id : message_type

You could also pack in extra data using JSON, Protocol Buffers, etc., and store more than just the message in the column value.

If you are using CQL 3, consider this:

    create table messages (
        phone_number text,
        day timestamp,
        message_sequence timeuuid, -- your timestamp
        message_id int,
        message_type text,
        message_body text,
        PRIMARY KEY ( (phone_number, day), message_sequence, message_id )
    );

(phone_number, day) is the partition key, the same as the Thrift row key.

message_sequence and message_id are the clustering columns; all rows in a partition are grouped / ordered by these columns.

Hope that helps.

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
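The difference between the two composite column layouts discussed here can be seen with a small pure-Python sketch. It assumes plain integers as stand-ins for the TimeUUID component and uses tuple sorting to imitate how composite column names compare component by component; the sample messages are invented for illustration.

```python
# Each message: (time component, message_id, message_type).
# Integers stand in for TimeUUIDs; the values are illustrative only.
messages = [
    (1, "m1", "SMS"),
    (2, "m2", "MMS"),
    (3, "m3", "SMS"),
]

# Original layout [type : message_id : time] sorts by type first,
# so arrival order is lost.
type_first = sorted((mtype, mid, ts) for ts, mid, mtype in messages)

# Suggested layout [time : message_id : type] sorts by arrival time.
time_first = sorted(messages)

# A reverse comparator returns the newest messages first.
newest_first = sorted(messages, reverse=True)

print([ts for _, _, ts in type_first])
print([ts for ts, _, _ in time_first])
print([ts for ts, _, _ in newest_first])
```

With the type-first layout, the MMS sent second sorts ahead of the SMS sent first, which is why the time component needs to lead if messages should be read in arrival order.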
RE: DataModel Question
Thanks Aaron!

My use case is modeled like "skype", which stores IM + SMS + MMS in one conversation. I need to have the following functionality:

* When I go offline and come online again, I need to retrieve all pending messages from all my conversations.
* I should be able to select a contact and view the 'history' of the messages (last 7 days, last 14 days, last 21 days...).
* If I log in to a different device, I should be able to synch at least a "few days" of messages.
* One conversation can have multiple participants.
* Support full synch or delta synch based on number of messages/history.

I guess this makes the data model span across many CFs?

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: 06 February 2013 22:20
To: user@cassandra.apache.org
Subject: Re: DataModel Question

> I'm assuming you want to order messages, so put the time uuid at the start of the composite columns. [...]
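With a PhoneNum : Day row key, the history requirement (last 7, 14, or 21 days) translates into one row key per day in the window. A minimal sketch, assuming the example key 19876543456:05022013 uses a DDMMYYYY day format; `history_row_keys` is a hypothetical helper name.

```python
from datetime import date, timedelta


def history_row_keys(phone_number: str, today: date, days: int) -> list:
    """Return the row keys covering the last `days` days, newest first.

    Assumes the 'PhoneNum:DDMMYYYY' format inferred from the example key.
    """
    keys = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        keys.append(f"{phone_number}:{day:%d%m%Y}")
    return keys


keys = history_row_keys("19876543456", date(2013, 2, 7), 7)
print(keys)
```

Each key can then be fetched with its own query, or several at once with a multi-get; weekly or monthly buckets would reduce the number of keys per window.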
DataModel Question
Hi -

We are designing Cassandra-based storage for the following use cases:

* Store SMS messages
* Store MMS messages
* Store chat history

What would be the ideal way to design the data model for this kind of application? I am thinking along these lines:

Row-Key: Composite key [PhoneNum : Day]

* Example: 19876543456:05022013

Dynamic Column Families

* Composite column key for SMS [SMS:MessageId:TimeUUID]
* Composite column key for MMS [MMS:MessageId:TimeUUID]
* Composite column key for the user I am chatting with [UserId:198765432345] - This can have multiple values, since each chat conversation can have many messages. Should this be a super column?

198:05022013 SMS::ttt SMS:xxx12:ttt MMS::ttt :19 198:05022013 1987888:05022013

Thanks,
Kanwar
RE: DataModel Question
Hello,

Composite keys are always good, and the model looks clean to me. Run a pilot with around 10 GB or more of data, compare it with an RDBMS, and make changes accordingly.

Thanks and Regards
Rishabh Agrawal

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: Wednesday, February 06, 2013 7:10 AM
To: user@cassandra.apache.org
Subject: DataModel Question

> Hi - We are designing a Cassandra based storage for the following use cases [...]
Re: DataModel Question
Avoid super columns. If you need sorted, wide rows, then go for composite columns.

-Vivek

On Wed, Feb 6, 2013 at 7:09 AM, Kanwar Sangha kan...@mavenir.com wrote:
> [...] This can have multiple values since each chat conv can have many messages. Should this be a super column ?
Re: DataModel Question
Hi!

I have a couple of questions regarding your model:

1. What Cassandra version are you using? I am still working with 1.0 and this seems to make sense, but 1.2 gives you much more power, I think.
2. Maybe I don't understand your model, but I think you need DynamicComposite columns, as user columns differ in number of components and maybe type.
3. How do you associate the SMS or MMS with the user you are chatting with? Is it done by a separate CF?

Thanks,
Tamar

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956

On Wed, Feb 6, 2013 at 8:23 AM, Vivek Mishra mishra.v...@gmail.com wrote:
> Avoid super columns. If you need Sorted, wide rows then go for Composite columns.