Re: Denormalization leads to terrible, rather than better, Cassandra performance -- I am really puzzled

2015-05-04 Thread Steve Robenalt
 in
 the normalized case, or Query 3 in the denormalized case. All queries are
 issued at the LOCAL_QUORUM consistency level.

 Then I created 1 or more instances of the program to simultaneously
 retrieve the SAME set of 1 million events stored in Cassandra. Each test
 runs for 5 minutes, and the results are shown below.



                | 1 instance | 5 instances | 10 instances
   Normalized   |     89     |     315     |      417
   Denormalized |    100     |    *43*     |      *3*

 Note that the unit of measure is the number of completed operations (each
 operation retrieves 2,000 events). So in the normalized case, the program
 ran 89 times and retrieved 178K events with a single instance, 315 times
 and 630K events with 5 instances (each instance got about 126K events),
 and 417 times and 834K events with 10 simultaneous instances (each
 instance got about 83.4K events).
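
 Those totals follow directly from the operation counts, since each
 completed operation retrieves 2,000 events; a quick sketch of the
 arithmetic (the figures are the ones reported above):

```python
# Each benchmark operation retrieves 2,000 events; the values below are
# the completed-operation counts reported for the 5-minute runs.
EVENTS_PER_OP = 2000

results = {
    ("normalized", 1): 89,
    ("normalized", 5): 315,
    ("normalized", 10): 417,
    ("denormalized", 1): 100,
    ("denormalized", 5): 43,
    ("denormalized", 10): 3,
}

for (schema, instances), ops in sorted(results.items()):
    total = ops * EVENTS_PER_OP
    print(f"{schema:>12} x{instances:>2}: {ops:>3} ops = {total:>7,} events "
          f"(~{total / instances:,.0f} per instance)")
```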

 For the de-normalized case, the performance is a little better in the
 single-instance case, in which the program ran 100 times and retrieved
 200K events. However, it turns sharply south with multiple simultaneous
 instances. All 5 instances together completed only 43 operations, and all
 10 instances together completed only 3 operations. In the latter case, the
 log showed that 3 instances each retrieved 2,000 events successfully,
 while the other 7 retrieved none.

 In the de-normalized case, the program reported a lot of exceptions like
 below:

 com.datastax.driver.core.exceptions.ReadTimeoutException, Cassandra
 timeout during read query at consistency LOCAL_QUORUM (2 responses were
 required but only 1 replica responded)

 com.datastax.driver.core.exceptions.NoHostAvailableException, All host(s)
 tried for query failed (no host was tried)
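
 The quorum arithmetic behind the first error is worth spelling out: with
 a replication factor of 3, LOCAL_QUORUM requires floor(RF/2) + 1 = 2
 replicas to answer, so a single slow or overloaded replica is enough to
 fail the read. A small sketch (plain Python, not driver code):

```python
def local_quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a LOCAL_QUORUM operation."""
    return replication_factor // 2 + 1

# With RF=3 (typical for a 3-node cluster), 2 of 3 replicas must answer,
# which matches the "2 responses were required but only 1 replica
# responded" message above.
print(local_quorum(3))  # -> 2
```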

 I repeated the two cases back and forth several times, and the results
 remained the same.

 I also observed CPU usage on the 3 Cassandra servers, and they were all
 much higher for the de-normalized case.



                | 1 instance        | 5 instances      | 10 instances
   Normalized   | 7% usr, 2% sys    | 30% usr, 8% sys  | 40% usr, 10% sys
   Denormalized | 44% usr, 0.3% sys | 65% usr, 1% sys  | 70% usr, 2% sys

 *Questions*

 This is really not what I expected, and I am puzzled and have not figured
 out a good explanation.

- Why are there so many exceptions in the de-normalized case? I would
think Cassandra should be able to handle simultaneous accesses to the same
data. And why are there NO exceptions in the normalized case, given that
the environments for the two cases are basically the same?
- Are (internally) wide rows only good for small amounts of data under
each column name?
- Or is it an issue with the Java Driver?
- Or did I do something wrong?


 --
 View this message in context: Denormalization leads to terrible, rather
 than better, Cassandra performance -- I am really puzzled
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Denormalization-leads-to-terrible-rather-than-better-Cassandra-performance-I-am-really-puzzled-tp7600561.html
 Sent from the cassandra-u...@incubator.apache.org mailing list archive
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/ at
 Nabble.com.





Re: Denormalization leads to terrible, rather than better, Cassandra performance -- I am really puzzled

2015-05-03 Thread Erick Ramirez



Denormalization leads to terrible, rather than better, Cassandra performance -- I am really puzzled

2015-04-28 Thread dlu66061
, and I am puzzled and have not figured out
a good explanation.
Why are there so many exceptions in the de-normalized case? I would think
Cassandra should be able to handle simultaneous accesses to the same data.
Why are there NO exceptions for the normalized case? I meant that the
environments for the two cases are basically the same.
Are (internally) wide rows only good for small amounts of data under each
column name?
Or is it an issue with the Java Driver?
Or did I do something wrong?




--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Denormalization-leads-to-terrible-rather-than-better-Cassandra-performance-I-am-really-puzzled-tp7600561.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.

Re: Denormalization

2013-01-28 Thread chandra Varahala
In my experience, we can design main column families and lookup column
families. A main column family holds all the denormalized data; a lookup
column family holds the row key of the corresponding main column family
row.

For example, the Users column family holds all of a user's denormalized
data, with a lookup column family named userByEmail. A request to
userByEmail first returns the unique key, which is the row key into the
Users column family; a second call to the Users column family then returns
all the data. The same applies to the other lookup column families.
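
A sketch of that two-step lookup, with in-memory dictionaries standing in
for the two column families (names like userByEmail follow the example
above; the data is illustrative):

```python
# Stand-ins for the two column families: a lookup CF keyed by email that
# stores only the row key, and a main CF holding the denormalized data.
user_by_email = {"alice@example.com": "user-123"}                      # lookup CF
users = {"user-123": {"name": "Alice", "email": "alice@example.com"}}  # main CF

def get_user(email):
    # Step 1: the lookup CF returns the unique row key for the main CF.
    row_key = user_by_email.get(email)
    if row_key is None:
        return None
    # Step 2: a second read against the main CF returns all the data.
    return users[row_key]

print(get_user("alice@example.com")["name"])  # -> Alice
```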

-
Chandra



On Sun, Jan 27, 2013 at 8:53 PM, Hiller, Dean dean.hil...@nrel.gov wrote:

 Agreed, was just making sure others knew ;).

 Dean

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Sunday, January 27, 2013 6:51 PM
 To: user@cassandra.apache.org
 Subject: Re: Denormalization

 When I said that writes were cheap, I was speaking of a normal case where
 people are making 2-10 inserts for what in a relational database might be
 one. 30K inserts is certainly not cheap.

 Your use case with 30,000 inserts is probably a special case. Most
 directory services that I am aware of (OpenLDAP, Active Directory, Sun
 Directory Server) do eventually-consistent master/slave and multi-master
 replication. So no worries about having to background something. You just
 want the replication to be fast enough so that when you call the employee
 about to be fired into the office, by the time he leaves and gets home he
 cannot VPN in and rm -rf / your main file server :)


 On Sun, Jan 27, 2013 at 7:57 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
 Sometimes this is true, sometimes not. We have a use case with an admin
 tool where we chose to do this denorm for ACLs on permission checks, to
 make permission checks extremely fast.  That said, we have one issue with
 one object that has too many children (30,000), so when someone gives a
 user access to this one object with 30,000 children, we end up with a bad
 60-second wait, and users ended up getting frustrated and trying to
 cancel.  (Our plan, since admin activity hardly ever happens, is to do it
 on a background thread and return immediately to the user, telling him his
 changes will take effect in 1 minute.)  After all, admin changes are
 infrequent anyway.  This example demonstrates how sometimes it can almost
 burn you.

 I guess my real point is it really depends on your use cases ;).  In a lot
 of cases denorm can work but in some cases it burns you so you have to
 balance it all.  In 90% of our cases our denorm is working great and for
 this one case, we need to background the permission change as we still LOVE
 the performance of our ACL checks.

 P.S. 30,000 writes in Cassandra is not cheap when done from one server ;)
 but in general parallelized writes are very fast for something like 500.

 Later,
 Dean

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Sunday, January 27, 2013 5:50 PM
 To: user@cassandra.apache.org
 Subject: Re: Denormalization

 One technique is, on the client side, to build a tool that takes the event
 and produces N mutations. In C* writes are cheap, so essentially you
 re-write everything on all changes.

 On Sun, Jan 27, 2013 at 4:03 PM, Fredrik Stigbäck
 fredrik.l.stigb...@sitevision.se wrote:
 Hi.
 Since denormalized data is a first-class citizen in Cassandra, how should
 one handle updating denormalized data?
 E.g., if we have a USER CF with name, email, etc., denormalize the user
 data into many other CFs, and then update the information about a user
 (name, email, ...), what is the best way to handle updating those
 properties, which might be spread out over many CFs and many rows?

 Regards
 /Fredrik





Denormalization

2013-01-27 Thread Fredrik Stigbäck
Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik


Re: Denormalization

2013-01-27 Thread Hiller, Dean
There is really a mix of denormalization and normalization.  It really
depends on the specific use case.  To get better help on the email list, a
more specific use case may be appropriate.

Dean

On 1/27/13 2:03 PM, Fredrik Stigbäck fredrik.l.stigb...@sitevision.se
wrote:

Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik



Re: Denormalization

2013-01-27 Thread Fredrik Stigbäck
I don't have a current use case. I was just curious how applications
handle it and how to think when modelling, since I guess denormalization
might increase the complexity of the application.

Fredrik

2013/1/27 Hiller, Dean dean.hil...@nrel.gov:
 There is really a mix of denormalization and normalization.  It really
 depends on the specific use case.  To get better help on the email list, a
 more specific use case may be appropriate.

 Dean

 On 1/27/13 2:03 PM, Fredrik Stigbäck fredrik.l.stigb...@sitevision.se
 wrote:

Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik




-- 
Fredrik Larsson Stigbäck
SiteVision AB Vasagatan 10, 107 10 Örebro
019-17 30 30


Re: Denormalization

2013-01-27 Thread Adam Venturella
In my experience, if you foresee needing to do a lot of updates where a
master record would need to propagate its changes to other
records, then in general a non-sql based data store may be the wrong fit
for your data.

If you have a lot of data that doesn't really change, or is not linked in
some way to other rows (in Cassandra's case), then a non-SQL-based data
store could be a great fit.

Yes, you can do some fancy stuff to force things like Cassandra to behave
like an RDBMS, but it's at the cost of application complexity; more code,
more bugs.

I often end up mixing the data stores sql/non-sql to play to their
respective strengths.

If I start seeing a lot of related data, relational databases are really
good at solving that problem.


On Sunday, January 27, 2013, Fredrik Stigbäck wrote:

 I don't have a current use case. I was just curious how applications
 handle it and how to think when modelling, since I guess denormalization
 might increase the complexity of the application.

 Fredrik

 2013/1/27 Hiller, Dean dean.hil...@nrel.gov:
  There is really a mix of denormalization and normalization.  It really
  depends on the specific use case.  To get better help on the email list, a
  more specific use case may be appropriate.
 
  Dean
 
  On 1/27/13 2:03 PM, Fredrik Stigbäck fredrik.l.stigb...@sitevision.se
  wrote:
 
 Hi.
 Since denormalized data is a first-class citizen in Cassandra, how should
 one handle updating denormalized data?
 E.g., if we have a USER CF with name, email, etc., denormalize the user
 data into many other CFs, and then update the information about a user
 (name, email, ...), what is the best way to handle updating those
 properties, which might be spread out over many CFs and many rows?

 Regards
 /Fredrik
 



 --
 Fredrik Larsson Stigbäck
 SiteVision AB Vasagatan 10, 107 10 Örebro
 019-17 30 30



Re: Denormalization

2013-01-27 Thread Hiller, Dean
Things like PlayOrm exist to help you mix denormalized and normalized data.
There are more and more patterns out there for denormalization and
normalization that still allow for scalability.  Here is one patterns page:

https://github.com/deanhiller/playorm/wiki/Patterns-Page

Dean

From: Adam Venturella aventure...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Sunday, January 27, 2013 3:44 PM
To: user@cassandra.apache.org
Subject: Re: Denormalization

In my experience, if you foresee needing to do a lot of updates where a 
master record would need to propagate its changes to other records, then in 
general a non-sql based data store may be the wrong fit for your data.

If you have a lot of data that doesn't really change, or is not linked in
some way to other rows (in Cassandra's case), then a non-SQL-based data
store could be a great fit.

Yes, you can do some fancy stuff to force things like Cassandra to behave like 
an RDBMS, but it's at the cost of application complexity; more code, more bugs.

I often end up mixing the data stores sql/non-sql to play to their respective 
strengths.

If I start seeing a lot of related data, relational databases are really good 
at solving that problem.


On Sunday, January 27, 2013, Fredrik Stigbäck wrote:
I don't have a current use case. I was just curious how applications
handle it and how to think when modelling, since I guess denormalization
might increase the complexity of the application.

Fredrik

2013/1/27 Hiller, Dean dean.hil...@nrel.gov:
 There is really a mix of denormalization and normalization.  It really
 depends on the specific use case.  To get better help on the email list, a
 more specific use case may be appropriate.

 Dean

 On 1/27/13 2:03 PM, Fredrik Stigbäck fredrik.l.stigb...@sitevision.se
 wrote:

Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik




--
Fredrik Larsson Stigbäck
SiteVision AB Vasagatan 10, 107 10 Örebro
019-17 30 30


Re: Denormalization

2013-01-27 Thread Hiller, Dean
Oh, and check out the last pattern, "Scalable equals-only index", which can
allow you to still keep normalized data. The pattern does just enough
denormalization that you can

 1.  Update just two pieces of info (the user's email, for instance, and
the xref table email as well).
 2.  Allow everyone else to hold foreign references into that piece.
(Everyone references the guid, not the email, while the xref table maps
email to guid for your use. This can be quite a common pattern when you are
having issues denormalizing.)
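
A minimal sketch of that pattern, with hypothetical names: the xref table
maps email to guid, everything else references the guid, so changing an
email touches only the canonical row and the xref entry:

```python
users = {}          # guid -> canonical user row
email_to_guid = {}  # xref table: email -> guid

def add_user(guid, email, name):
    users[guid] = {"email": email, "name": name}
    email_to_guid[email] = guid

def change_email(guid, new_email):
    # Just two pieces of info change: the canonical row and the xref
    # entry. Rows elsewhere reference the guid, so they never move.
    old_email = users[guid]["email"]
    del email_to_guid[old_email]
    email_to_guid[new_email] = guid
    users[guid]["email"] = new_email

add_user("guid-1", "old@example.com", "Bob")
change_email("guid-1", "new@example.com")
print(email_to_guid)  # -> {'new@example.com': 'guid-1'}
```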

Dean

From: Adam Venturella aventure...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Sunday, January 27, 2013 3:44 PM
To: user@cassandra.apache.org
Subject: Re: Denormalization

In my experience, if you foresee needing to do a lot of updates where a 
master record would need to propagate its changes to other records, then in 
general a non-sql based data store may be the wrong fit for your data.

If you have a lot of data that doesn't really change, or is not linked in
some way to other rows (in Cassandra's case), then a non-SQL-based data
store could be a great fit.

Yes, you can do some fancy stuff to force things like Cassandra to behave like 
an RDBMS, but it's at the cost of application complexity; more code, more bugs.

I often end up mixing the data stores sql/non-sql to play to their respective 
strengths.

If I start seeing a lot of related data, relational databases are really good 
at solving that problem.


On Sunday, January 27, 2013, Fredrik Stigbäck wrote:
I don't have a current use case. I was just curious how applications
handle it and how to think when modelling, since I guess denormalization
might increase the complexity of the application.

Fredrik

2013/1/27 Hiller, Dean dean.hil...@nrel.gov:
 There is really a mix of denormalization and normalization.  It really
 depends on the specific use case.  To get better help on the email list, a
 more specific use case may be appropriate.

 Dean

 On 1/27/13 2:03 PM, Fredrik Stigbäck fredrik.l.stigb...@sitevision.se
 wrote:

Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik




--
Fredrik Larsson Stigbäck
SiteVision AB Vasagatan 10, 107 10 Örebro
019-17 30 30


Re: Denormalization

2013-01-27 Thread Edward Capriolo
One technique is, on the client side, to build a tool that takes the event
and produces N mutations. In C* writes are cheap, so essentially you
re-write everything on all changes.
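
That client-side fan-out can be sketched as a helper that expands one
logical event into N mutations, one per denormalized view (the view names
here are illustrative, not from the thread):

```python
def mutations_for(event):
    """Expand one logical user update into one mutation per
    denormalized view, as (table, key, columns) triples."""
    user_id = event["user_id"]
    fields = event["fields"]
    return [
        ("users", user_id, fields),
        ("users_by_email", fields["email"], {"user_id": user_id}),
        ("user_activity", user_id, fields),
    ]

event = {"user_id": "u1", "fields": {"email": "ann@example.com", "name": "Ann"}}
for table, key, columns in mutations_for(event):
    print(table, key, columns)
```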

On Sun, Jan 27, 2013 at 4:03 PM, Fredrik Stigbäck 
fredrik.l.stigb...@sitevision.se wrote:

 Hi.
 Since denormalized data is a first-class citizen in Cassandra, how should
 one handle updating denormalized data?
 E.g., if we have a USER CF with name, email, etc., denormalize the user
 data into many other CFs, and then update the information about a user
 (name, email, ...), what is the best way to handle updating those
 properties, which might be spread out over many CFs and many rows?

 Regards
 /Fredrik



Re: Denormalization

2013-01-27 Thread Edward Capriolo
When I said that writes were cheap, I was speaking of a normal case where
people are making 2-10 inserts for what in a relational database might be
one. 30K inserts is certainly not cheap.

Your use case with 30,000 inserts is probably a special case. Most
directory services that I am aware of (OpenLDAP, Active Directory, Sun
Directory Server) do eventually-consistent master/slave and multi-master
replication. So no worries about having to background something. You just
want the replication to be fast enough so that when you call the employee
about to be fired into the office, by the time he leaves and gets home he
cannot VPN in and rm -rf / your main file server :)


On Sun, Jan 27, 2013 at 7:57 PM, Hiller, Dean dean.hil...@nrel.gov wrote:

 Sometimes this is true, sometimes not. We have a use case with an admin
 tool where we chose to do this denorm for ACLs on permission checks, to
 make permission checks extremely fast.  That said, we have one issue with
 one object that has too many children (30,000), so when someone gives a
 user access to this one object with 30,000 children, we end up with a bad
 60-second wait, and users ended up getting frustrated and trying to
 cancel.  (Our plan, since admin activity hardly ever happens, is to do it
 on a background thread and return immediately to the user, telling him his
 changes will take effect in 1 minute.)  After all, admin changes are
 infrequent anyway.  This example demonstrates how sometimes it can almost
 burn you.

 I guess my real point is it really depends on your use cases ;).  In a lot
 of cases denorm can work but in some cases it burns you so you have to
 balance it all.  In 90% of our cases our denorm is working great and for
 this one case, we need to background the permission change as we still LOVE
 the performance of our ACL checks.

 P.S. 30,000 writes in Cassandra is not cheap when done from one server ;)
 but in general parallelized writes are very fast for something like 500.

 Later,
 Dean

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Sunday, January 27, 2013 5:50 PM
 To: user@cassandra.apache.org
 Subject: Re: Denormalization

 One technique is, on the client side, to build a tool that takes the event
 and produces N mutations. In C* writes are cheap, so essentially you
 re-write everything on all changes.

 On Sun, Jan 27, 2013 at 4:03 PM, Fredrik Stigbäck
 fredrik.l.stigb...@sitevision.se wrote:
 Hi.
 Since denormalized data is a first-class citizen in Cassandra, how should
 one handle updating denormalized data?
 E.g., if we have a USER CF with name, email, etc., denormalize the user
 data into many other CFs, and then update the information about a user
 (name, email, ...), what is the best way to handle updating those
 properties, which might be spread out over many CFs and many rows?

 Regards
 /Fredrik




Re: Denormalization

2013-01-27 Thread Hiller, Dean
Agreed, was just making sure others knew ;).

Dean

From: Edward Capriolo edlinuxg...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Sunday, January 27, 2013 6:51 PM
To: user@cassandra.apache.org
Subject: Re: Denormalization

When I said that writes were cheap, I was speaking of a normal case where
people are making 2-10 inserts for what in a relational database might be
one. 30K inserts is certainly not cheap.

Your use case with 30,000 inserts is probably a special case. Most
directory services that I am aware of (OpenLDAP, Active Directory, Sun
Directory Server) do eventually-consistent master/slave and multi-master
replication. So no worries about having to background something. You just
want the replication to be fast enough so that when you call the employee
about to be fired into the office, by the time he leaves and gets home he
cannot VPN in and rm -rf / your main file server :)


On Sun, Jan 27, 2013 at 7:57 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
Sometimes this is true, sometimes not. We have a use case with an admin
tool where we chose to do this denorm for ACLs on permission checks, to
make permission checks extremely fast.  That said, we have one issue with
one object that has too many children (30,000), so when someone gives a
user access to this one object with 30,000 children, we end up with a bad
60-second wait, and users ended up getting frustrated and trying to
cancel.  (Our plan, since admin activity hardly ever happens, is to do it
on a background thread and return immediately to the user, telling him his
changes will take effect in 1 minute.)  After all, admin changes are
infrequent anyway.  This example demonstrates how sometimes it can almost
burn you.

I guess my real point is it really depends on your use cases ;).  In a lot of 
cases denorm can work but in some cases it burns you so you have to balance it 
all.  In 90% of our cases our denorm is working great and for this one case, we 
need to background the permission change as we still LOVE the performance of 
our ACL checks.

P.S. 30,000 writes in Cassandra is not cheap when done from one server ;)
but in general parallelized writes are very fast for something like 500.

Later,
Dean

From: Edward Capriolo edlinuxg...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Sunday, January 27, 2013 5:50 PM
To: user@cassandra.apache.org
Subject: Re: Denormalization

One technique is, on the client side, to build a tool that takes the event
and produces N mutations. In C* writes are cheap, so essentially you
re-write everything on all changes.

On Sun, Jan 27, 2013 at 4:03 PM, Fredrik Stigbäck
fredrik.l.stigb...@sitevision.se wrote:
Hi.
Since denormalized data is a first-class citizen in Cassandra, how should
one handle updating denormalized data?
E.g., if we have a USER CF with name, email, etc., denormalize the user
data into many other CFs, and then update the information about a user
(name, email, ...), what is the best way to handle updating those
properties, which might be spread out over many CFs and many rows?

Regards
/Fredrik