Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Pankaj,

There aren't plans to include substantial changes to the materialized
views implementation in C* 4.0, and I'm not aware of project contributors
who plan major work on MVs post-4.0 at present.

– Scott

From: Pankaj Gajjar
Sent: Tuesday, September 3, 2019 5:47 AM
To: dev@cassandra.apache.org
Subject: Re: Stability of MaterializedView in 3.11.x | 4.0

Hi Team,

Thanks, but that is not the point. My question remains: is there any plan
to fix these MV issues in an upcoming Cassandra release, such as 4.0? If
so, it would be worth waiting.

Or is there any plugin or workaround that resolves this issue on a
Cassandra setup?

--
Regards
Pankaj G.
Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Team,

Thanks, but that is not the point. My question remains: is there any plan
to fix these MV issues in an upcoming Cassandra release, such as 4.0? If
so, it would be worth waiting.

Or is there any plugin or workaround that resolves this issue on a
Cassandra setup?

--
Regards
Pankaj G.

On 31/08/19, 00:33, "Jon Haddad" wrote:

> If you don't have any intent on running across multiple nodes,
> Cassandra is probably the wrong DB for you. Postgres will give you a
> better feature set for a single node.
>
> On Fri, Aug 30, 2019 at 5:23 AM Pankaj Gajjar wrote:
>
> > Understood. How about Cassandra running on a single node? We don't
> > have a cluster setup (i.e. 3+ nodes).
> >
> > Do MVs perform well on a single-node machine?
> >
> > Note: I know about HA, so let's set it aside for now; it's only
> > possible when we have a cluster setup.
> >
> > On 29/08/19, 06:21, "Dor Laor" wrote:
> >
> > > On Wed, Aug 28, 2019 at 5:43 PM Jon Haddad wrote:
> > >
> > > > > Arguably, the other alternative to server-side denormalization
> > > > > is to do the denormalization client-side which comes with the
> > > > > same axes of costs and complexity, just with more of each.
> > > >
> > > > That's not completely true. You can write to any number of tables
> > > > without doing a read, and the cost of reading data off disk is
> > > > significantly greater than an insert alone. You can crush a
> > > > cluster with a write-heavy workload and MVs when that cluster
> > > > would otherwise be completely fine doing all the writes.
> > > >
> > > > The other issue with MVs is that you still need to understand the
> > > > fundamentals of data modeling; MVs don't magically solve the
> > > > problem of enormous partitions. One of the reasons I've had to
> > > > un-MV a lot of clusters is because people have put an MV on a
> > > > table with a low-cardinality field and found themselves with a
> > > > 10GB partition nightmare, so they need to go back and remodel the
> > > > view as something more complex anyway. In this case, the MV was
> > > > extremely high cost, since they've not only pushed out a poor
> > > > implementation to begin with but now also have the cost of a
> > > > migration as well as a rewrite.
> > >
> > > +1
> > >
> > > Moreover, the hard part is that an update to the base table means
> > > that the original data needs to be read, and the database (or the
> > > poor developer who implements the denormalized model) needs to
> > > delete the data in the view and then write the new data. All of
> > > this of course needs to be resilient to all types of errors and
> > > failures. Had it been simple, there would be no need for a
> > > database MV.
> > >
> > > > On Wed, Aug 28, 2019 at 9:58 AM Joshua McKenzie
> > > > <jmcken...@apache.org> wrote:
> > > >
> > > > > > so we need to start migration from MVs to manually querying
> > > > > > the base table?
> > > > >
> > > > > Arguably, the other alternative to server-side denormalization
> > > > > is to do the denormalization client-side which comes with the
> > > > > same axes of costs and complexity, just with more of each.
> > > > >
> > > > > Jeff's spot on when he discusses the risk appetite vs.
> > > > > mitigation aspect of it. There's a reason banks do end-of-day
> > > > > close-out validation analysis and have redundant systems for
> > > > > things like this.
> > > > >
> > > > > On Wed, Aug 28, 2019 at 11:49 AM Jon Haddad wrote:
> > > > >
> > > > > > I've helped a lot of teams (a dozen to two dozen maybe)
> > > > > > migrate away from MVs due to inconsistencies, issues with
> > > > > > streaming (have you added or removed nodes yet?), and massive
> > > > > > performance issues to the point of cluster failure under
> > > > > > (what I consider) trivial load. I haven't gone too deep into
> > > > > > analyzing their issues; folks are usually fine with "move off
> > > > > > them", vs having me do a ton of analysis.
> > > > > >
> > > > > > tlp-stress has a materialized view workload built in, and you
> > > > > > can add arbitrary CQL via the --cql flag to add an MV to any
> > > > > > existing workload such as KeyValue or BasicTimeSeries.
> > > > > >
> > > > > > On Wed, Aug 28, 2019 at 8:11 AM Jeff Jirsa wrote:
> > > > > >
> > > > > > > There have been people who have had operational issues
> > > > > > > related to MVs (many of them around running repair), but
> > > > > > > the biggest concern is correctness.
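
The read-before-write cost raised in the quoted thread (an update to the base table requires reading the old row, deleting the stale view row, then writing the new one, whereas insert-only denormalized writes need no read) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: it is not how Cassandra implements materialized views, it ignores replication, failures, and concurrency entirely, and the names `BaseWithView`, `upsert`, and `blind_write` are hypothetical.

```python
# Toy in-memory model; NOT Cassandra's implementation.
# All class and function names here are hypothetical.


class BaseWithView:
    """A base table keyed by pk, plus one view keyed by (view column value, pk)."""

    def __init__(self, view_col):
        self.view_col = view_col
        self.base = {}   # pk -> row dict
        self.view = {}   # (row[view_col], pk) -> row dict
        self.reads = 0   # number of read-before-write lookups on the base table

    def upsert(self, pk, row):
        # Read-before-write: the old row is needed to find the stale view entry.
        self.reads += 1
        old = self.base.get(pk)
        if old is not None and old[self.view_col] != row[self.view_col]:
            # The view key changed, so the old view row must be deleted.
            del self.view[(old[self.view_col], pk)]
        self.base[pk] = dict(row)
        self.view[(row[self.view_col], pk)] = dict(row)


def blind_write(tables, pk, row):
    """Insert-only client-side denormalization: one write per table, no read."""
    for key_col, table in tables.items():
        table[(row[key_col], pk)] = dict(row)


t = BaseWithView(view_col="city")
t.upsert(1, {"name": "ann", "city": "Oslo"})
t.upsert(1, {"name": "ann", "city": "Rome"})  # moves the row within the view

tables = {"city": {}, "name": {}}
blind_write(tables, 1, {"name": "ann", "city": "Oslo"})
```

In this model every `upsert` pays one base-table read, while `blind_write` pays none; the latter is only safe when rows are never updated in a way that changes a denormalized key, which is the case the thread describes as writing to any number of tables without doing a read.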