Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Pankaj,

There aren't plans to include substantial changes to the materialized views implementation in C* 4.0, and I'm not aware of project contributors who plan major work on MVs post-4.0 at present.

- Scott

From: Pankaj Gajjar
Sent: Tuesday, September 3, 2019 5:47 AM
To: dev@cassandra.apache.org
Subject: Re: Stability of MaterializedView in 3.11.x | 4.0

> Hi Team,
>
> Thanks, but that is not the point. My question remains: is there any plan to fix these MV issues in an upcoming Cassandra release, such as 4.0? If so, it would be worth waiting. Otherwise, is there any plugin or workaround that resolves these issues on a Cassandra setup?
>
> --
> Regards
> Pankaj G.
Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Team,

Thanks, but that is not the point. My question remains: is there any plan to fix these MV issues in an upcoming Cassandra release, such as 4.0? If so, it would be worth waiting. Otherwise, is there any plugin or workaround that resolves these issues on a Cassandra setup?

--
Regards
Pankaj G.

On 31/08/19, 00:33, "Jon Haddad" wrote:

> If you don't have any intent on running across multiple nodes, Cassandra is probably the wrong DB for you. Postgres will give you a better feature set for a single node.
Re: Stability of MaterializedView in 3.11.x | 4.0
If you don't have any intent on running across multiple nodes, Cassandra is probably the wrong DB for you. Postgres will give you a better feature set for a single node.

On Fri, Aug 30, 2019 at 5:23 AM Pankaj Gajjar wrote:

> Understood. How about Cassandra running on a single node? We don't have a cluster setup (i.e. 3+ nodes).
>
> Do MVs perform well on a single-node machine?
>
> Note: I know about HA, so let's set that aside for now; it's only possible when we have a cluster setup.
Re: Stability of MaterializedView in 3.11.x | 4.0
A single node indeed doesn't need repair, so it's easier. There is an admission control issue with MVs since they can incur huge amplification: a single change in the base table can trigger thousands of operations in the view, and they run asynchronously*. Hinted handoff for the MV helps as well but isn't needed for your single node.

* In Scylla we have a back-pressure mechanism that automatically slows down the client in such cases (it doesn't yet cover 100% of the use cases, but it is much better). We also shared (as an NGCC proposal) a solution for repairs that we haven't implemented yet; if there is interest, we can post it here.

On Fri, Aug 30, 2019 at 5:23 AM Pankaj Gajjar wrote:

> Understood. How about Cassandra running on a single node? We don't have a cluster setup (i.e. 3+ nodes).
>
> Do MVs perform well on a single-node machine?
>
> Note: I know about HA, so let's set that aside for now; it's only possible when we have a cluster setup.
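To make the amplification point in the message above concrete, here is a minimal Python sketch. It is not Cassandra or Scylla code: the row layout `(tenant, item_id, category)`, the view key, and both function names are assumptions made up for illustration. It only counts how many view-row deletes a single base-partition delete would imply when the view is partitioned by a different column.

```python
from typing import List, Tuple

# Hypothetical base row: (tenant, item_id, category). Base partition key: tenant.
# Hypothetical view key: (category, tenant, item_id), partitioned by category.
BaseRow = Tuple[str, int, str]
ViewKey = Tuple[str, str, int]


def view_deletes_for_partition_delete(rows: List[BaseRow], tenant: str) -> List[ViewKey]:
    """List the view rows that must be removed when one base partition is deleted."""
    return [(category, t, item_id) for (t, item_id, category) in rows if t == tenant]


def amplification(base_ops: int, view_ops: int) -> float:
    """View operations generated per client-visible base operation."""
    return view_ops / base_ops


# One large base partition ("acme") plus a small one ("other").
rows: List[BaseRow] = [("acme", i, f"cat{i % 50}") for i in range(5000)]
rows += [("other", i, "cat0") for i in range(10)]

# A single client request (delete partition "acme") fans out to one
# view delete per row, spread over many view partitions.
view_ops = view_deletes_for_partition_delete(rows, "acme")
print(len(view_ops), amplification(1, len(view_ops)))
```

The client sees one cheap operation while the server has to queue thousands, which is why some form of admission control or back pressure matters here.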
Re: Stability of MaterializedView in 3.11.x | 4.0
Understood. How about Cassandra running on a single node? We don't have a cluster setup (i.e. 3+ nodes).

Do MVs perform well on a single-node machine?

Note: I know about HA, so let's set that aside for now; it's only possible when we have a cluster setup.

On 29/08/19, 06:21, "Dor Laor" wrote:

> +1
>
> Moreover, the hard part is that an update to the base table means that the original data needs to be read, and the database (or the poor developer who implements the denormalized model) needs to delete the old data in the view and then write the new data. All of this of course needs to be resilient to all types of errors and failures. Had it been simple, there would be no need for a database MV.
Re: Stability of MaterializedView in 3.11.x | 4.0
On Wed, Aug 28, 2019 at 5:43 PM Jon Haddad wrote:

> > Arguably, the other alternative to server-side denormalization is to do the denormalization client-side which comes with the same axes of costs and complexity, just with more of each.
>
> That's not completely true. You can write to any number of tables without doing a read, and the cost of reading data off disk is significantly greater than an insert alone. You can crush a cluster with a write-heavy workload and MVs, where the same cluster would be completely fine if it only had to do the writes.
>
> The other issue with MVs is that you still need to understand the fundamentals of data modeling; MVs don't magically solve the problem of enormous partitions. One of the reasons I've had to un-MV a lot of clusters is that people have put an MV on a table with a low-cardinality field and found themselves with a 10GB partition nightmare, so they need to go back and remodel the view as something more complex anyway. In this case, the MV was extremely high cost, since they've not only pushed out a poor implementation to begin with but now also have the cost of a migration as well as a rewrite.

+1

Moreover, the hard part is that an update to the base table means that the original data needs to be read, and the database (or the poor developer who implements the denormalized model) needs to delete the old data in the view and then write the new data. All of this of course needs to be resilient to all types of errors and failures. Had it been simple, there would be no need for a database MV.
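The read, delete, then write sequence described in the message above can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions, not how Cassandra implements views: the class `BaseWithView`, the `user_id`/`email` columns, and the method names are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class BaseWithView:
    """Toy base table (user_id -> email) with a view keyed by (email, user_id)."""

    def __init__(self) -> None:
        self.base: Dict[int, str] = {}
        self.view: Dict[Tuple[str, int], int] = {}
        self.reads = 0  # counts read-before-write lookups on the base table

    def upsert(self, user_id: int, email: str) -> None:
        # 1. Read the current base row; without it we cannot know
        #    which view row (if any) has become stale.
        self.reads += 1
        old_email: Optional[str] = self.base.get(user_id)
        # 2. Delete the stale view row when the view key changed.
        if old_email is not None and old_email != email:
            del self.view[(old_email, user_id)]
        # 3. Write the new base row and the new view row.
        self.base[user_id] = email
        self.view[(email, user_id)] = user_id

    def find_by_email(self, email: str) -> List[int]:
        """Query the view: all user ids currently stored under this email."""
        return sorted(uid for (e, uid) in self.view if e == email)


table = BaseWithView()
table.upsert(1, "a@example.com")
table.upsert(1, "b@example.com")  # the view row under the old email is removed
print(table.find_by_email("a@example.com"), table.find_by_email("b@example.com"))
```

In this toy the three steps run in one process and cannot fail halfway. In a distributed database each step can fail or be reordered independently, and a failure between steps 2 and 3 leaves the view out of sync with the base, which is the hard part being pointed out.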
Re: Stability of MaterializedView in 3.11.x | 4.0
> Arguably, the other alternative to server-side denormalization is to do the denormalization client-side which comes with the same axes of costs and complexity, just with more of each.

That's not completely true. You can write to any number of tables without doing a read, and the cost of reading data off disk is significantly greater than an insert alone. You can crush a cluster with a write-heavy workload and MVs, where the same cluster would be completely fine if it only had to do the writes.

The other issue with MVs is that you still need to understand the fundamentals of data modeling; MVs don't magically solve the problem of enormous partitions. One of the reasons I've had to un-MV a lot of clusters is that people have put an MV on a table with a low-cardinality field and found themselves with a 10GB partition nightmare, so they need to go back and remodel the view as something more complex anyway. In this case, the MV was extremely high cost, since they've not only pushed out a poor implementation to begin with but now also have the cost of a migration as well as a rewrite.

On Wed, Aug 28, 2019 at 9:58 AM Joshua McKenzie wrote:

> > so do we need to start migration from MVs to manual query base table?
>
> Arguably, the other alternative to server-side denormalization is to do the denormalization client-side which comes with the same axes of costs and complexity, just with more of each.
>
> Jeff's spot on when he discusses the risk appetite vs. mitigation aspect of it. There's a reason banks do end-of-day close-out validation analysis and have redundant systems for things like this.
>
> On Wed, Aug 28, 2019 at 11:49 AM Jon Haddad wrote:
>
> > I've helped a lot of teams (a dozen to two dozen maybe) migrate away from MVs due to inconsistencies, issues with streaming (have you added or removed nodes yet?), and massive performance issues to the point of cluster failure under (what I consider) trivial load. I haven't gone too deep into analyzing their issues; folks are usually fine with "move off them" vs. having me do a ton of analysis.
> >
> > tlp-stress has a materialized view workload built in, and you can add arbitrary CQL via the --cql flag to add an MV to any existing workload such as KeyValue or BasicTimeSeries.
> >
> > On Wed, Aug 28, 2019 at 8:11 AM Jeff Jirsa wrote:
> >
> > > There have been people who have had operational issues related to MVs (many of them around running repair), but the biggest concern is correctness.
> > >
> > > It probably ultimately depends on what type of database you're running. If you're running some sort of IoT / analytics workload and you just want another way to SELECT the data, but you won't notice one of a billion records going missing, using MVs may be fine. If you're a bank, and one of a billion records going missing means you lose someone's bank account, I would avoid using MVs.
> > >
> > > It's all just risk management.
> > >
> > > On Wed, Aug 28, 2019 at 7:18 AM Pankaj Gajjar <pankaj.gaj...@contentserv.com> wrote:
> > >
> > > > Hi Michael,
> > > >
> > > > Thanks for the very useful information: "Users of MVs *must* determine for themselves, through thorough testing and understanding, if they wish to use them." I conclude from this that if any issue occurs in the future, the only solution is to rebuild the MVs, since Cassandra is not able to keep them consistently in sync.
> > > >
> > > > Also, we are using 10+ MVs in practice and, as of now, we have not faced any issue. So my question to all community members: has anyone faced any critical issues? So do we need to start migration from MVs to manual query base table?
> > > >
> > > > Also, I understand now that MVs are experimental and not ready for production, so if possible it is best to simply avoid them, right?
> > > >
> > > > Thanks
> > > > Pankaj
> > > >
> > > > On 27/08/19, 19:03, "Michael Shuler" (on behalf of mich...@pbandjelly.org) wrote:
> > > >
> > > > > It appears that you found the first message of the chain. I suggest reading the linked JIRA and the complete dev@ thread that arrived at this conclusion; there are loads of well-formed opinions and information. Users of MVs *must* determine for themselves, through thorough testing and understanding, if they wish to use them.
> > > > >
> > > > > Linkage:
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-13959
> > > > > (sub-linkage..)
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-13595
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-13911
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-13880
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-12872
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-13747
> > > > >
> > > > > Very
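The low-cardinality partition problem mentioned in the message above is easy to see with a small count. The following Python sketch is only an illustration: the `orders` rows, the `status` column with three values, and the helper `partition_sizes` are assumptions, and row counts stand in for on-disk partition size.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def partition_sizes(rows: Iterable[Dict[str, Hashable]], partition_key: str) -> Counter:
    """Count how many rows land in each partition for a given partition key."""
    return Counter(row[partition_key] for row in rows)


# Hypothetical data: 30,000 orders, each with one of three status values.
STATUSES = ["open", "shipped", "closed"]
orders = [{"order_id": i, "status": STATUSES[i % 3]} for i in range(30_000)]

# Base table partitioned by order_id: many small partitions.
by_order_id = partition_sizes(orders, "order_id")
# View partitioned by status: three partitions holding everything.
by_status = partition_sizes(orders, "status")

print(len(by_order_id), max(by_order_id.values()))
print(len(by_status), max(by_status.values()))
```

Every new order makes one of the three `status` partitions larger, so the view's partitions grow without bound however well the base table is modeled. A remodel usually means adding a higher-cardinality component (for example a time bucket) to the view's partition key.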
Re: Stability of MaterializedView in 3.11.x | 4.0
> > so we need to start migration from MVs to manual query base table ?

Arguably, the other alternative to server-side denormalization is to do the denormalization client-side which comes with the same axes of costs and complexity, just with more of each.

Jeff's spot on when he discusses the risk appetite vs. mitigation aspect of it. There's a reason banks do end-of-day close-out validation analysis and have redundant systems for things like this.
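To make the client-side alternative concrete, here is a minimal sketch of what an application takes on once it maintains its own query table instead of relying on an MV. The table names `users_by_id` and `users_by_email`, the columns, and the helper `denormalized_writes` are all hypothetical; the function only builds the statements and does not talk to a cluster.

```python
from typing import List, Optional, Tuple

# A statement is a CQL string plus its bind parameters.
Statement = Tuple[str, tuple]


def denormalized_writes(user_id: str, email: str, name: str,
                        old_email: Optional[str] = None) -> List[Statement]:
    """Build the writes that keep two hypothetical tables in step.

    users_by_id is the base table; users_by_email is the hand-maintained
    lookup table that an MV would otherwise provide.
    """
    statements: List[Statement] = [
        ("INSERT INTO users_by_id (user_id, email, name) VALUES (%s, %s, %s)",
         (user_id, email, name)),
        ("INSERT INTO users_by_email (email, user_id, name) VALUES (%s, %s, %s)",
         (email, user_id, name)),
    ]
    # When the lookup key changes, the old row has to be removed explicitly.
    # The caller must already know the previous value (or read it first).
    if old_email is not None and old_email != email:
        statements.append(
            ("DELETE FROM users_by_email WHERE email = %s", (old_email,)))
    return statements
```

The extra `DELETE` is where the added cost and complexity shows up: the application has to track the previous key value itself, and it has to decide how to apply the statements together (for example in a batch) so the two tables do not drift apart.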
Re: Stability of MaterializedView in 3.11.x | 4.0
I've helped a lot of teams (a dozen to two dozen maybe) migrate away from MVs due to inconsistencies, issues with streaming (have you added or removed nodes yet?), and massive performance issues to the point of cluster failure under (what I consider) trivial load. I haven't gone too deep into analyzing their issues; folks are usually fine with "move off them", vs having me do a ton of analysis.

tlp-stress has a materialized view workload built in, and you can add arbitrary CQL via the --cql flag to add an MV to any existing workload such as KeyValue or BasicTimeSeries.
Re: Stability of MaterializedView in 3.11.x | 4.0
There have been people who have had operational issues related to MVs (many of them around running repair), but the biggest concern is correctness.

It probably ultimately depends on what type of database you're running. If you're running some sort of IOT / analytics workload and you just want another way to SELECT the data, but you won't notice one of a billion records going missing, using MVs may be fine. If you're a bank, and one of a billion records going missing means you lose someone's bank account, I would avoid using MVs.

It's all just risk management.
Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Michael,

Thanks for the very useful information: "Users of MVs *must* determine for themselves, through thorough testing and understanding, if they wish to use them." From this I conclude that if any issue occurs in the future, the only solution is to rebuild the MVs, since Cassandra is not able to keep them consistently in sync.

Also, we are currently using 10+ MVs and, as of now, we have not faced any issues. So my question to all community members: has anyone faced any critical issues? Do we need to start migrating from MVs to manually maintained query tables?

Also, I understand now that MVs are experimental and not ready for production, so, if possible, the advice is simply to avoid them, right?

Thanks
Pankaj
Re: Stability of MaterializedView in 3.11.x | 4.0
It appears that you found the first message of the chain. I suggest reading the linked JIRA and the complete dev@ thread that arrived at this conclusion; there are loads of well formed opinions and information. Users of MVs *must* determine for themselves, through thorough testing and understanding, if they wish to use them.

Linkage:
https://issues.apache.org/jira/browse/CASSANDRA-13959
(sub-linkage..)
https://issues.apache.org/jira/browse/CASSANDRA-13595
https://issues.apache.org/jira/browse/CASSANDRA-13911
https://issues.apache.org/jira/browse/CASSANDRA-13880
https://issues.apache.org/jira/browse/CASSANDRA-12872
https://issues.apache.org/jira/browse/CASSANDRA-13747

Very much worth reading the complete thread:
part1:
https://lists.apache.org/thread.html/d81a61da48e1b872d7599df4edfa8e244d34cbd591a18539f724796f@
part2:
https://lists.apache.org/thread.html/19b7fcfd3b47f1526d6e993b3bb97f6c43e5ce204bc976ec0701cdd3@

Quick JQL for open tickets with "mv":
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20text%20~%20mv%20AND%20status%20!%3D%20Resolved

--
Michael
Re: Stability of MaterializedView in 3.11.x | 4.0
Hi Pankaj,

The main issues are described in the link you posted, and they are synchronization issues:

* There's no way to determine if a view is out of sync with the base table.
* If you do determine that a view is out of sync, the only way to fix it is to drop and rebuild the view.

Even in the happy path, there isn't an upper bound on how long it will take for updates to be reflected in the view.
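Since there is no built-in way to tell whether a view has drifted, any check has to be done by hand. As a rough illustration only, the sketch below compares two sets of rows that have already been read out of a base table and its view. The function name `diff_view` and the row layout are assumptions, and it simplifies heavily: it assumes the view selects every column and every row of the base table, and that nothing is written while the rows are being read.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def diff_view(base_rows: Iterable[Row], view_rows: Iterable[Row],
              key_cols: Tuple[str, ...]) -> Tuple[List[Row], List[Row]]:
    """Compare rows read from a base table with rows read from its view.

    Returns (base rows missing or stale in the view,
             view rows with no identical base row).
    """
    def index(rows: Iterable[Row]) -> Dict[tuple, Row]:
        # Key each row by the base table's primary key columns.
        return {tuple(row[col] for col in key_cols): row for row in rows}

    base, view = index(base_rows), index(view_rows)
    missing_or_stale = [row for key, row in base.items() if view.get(key) != row]
    orphaned = [row for key, row in view.items() if base.get(key) != row]
    return missing_or_stale, orphaned
```

On a real cluster this means scanning both tables in full, which is expensive, and concurrent writes will show up as false positives. A non-empty result also only says that something is off; as noted above, the remedy is still to drop and rebuild the view.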
Stability of MaterializedView in 3.11.x | 4.0
Hello,

I have a concern about Materialized Views (MVs) in Cassandra. Unfortunately, starting with version 3.11, MVs are officially considered experimental and not ready for production use, as you can read here:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201710.mbox/%3cetpan.59f24f38.438f4e99.7...@apple.com%3E

Could someone please give some productive feedback on this? It would help us with further implementation around MVs in Cassandra.

Is anyone facing critical issues, data loss, or synchronization issues?

Regards
Pankaj.