Re: Materialized View inconsistency issue

Regis Le Bretonnic Fri, 18 Aug 2023 00:47:46 -0700

Hi Stefan

Happy to see that our use case interest you :-)
I'm not sure that I explained well what we want.


Imagine that sequence of events :
- Julia visits Joe at t1
- Julia visits Joe at t2
- Karen visits Joe at t3
- Silvia visits Joe at t4
- Karen visits Joe at t5
- Karen visits Joe at t6
- Julia visits Joe at t7

We want to provide to Joe a webpage listing visits he received in that
order :
- Juiia at t7  (the more recent)
- Karen at t6
- Silvia at t4

The 2 tables you propose Stefan can not natively order rows by time (they
will be ordered by visitor_id), excepted if you sort rows after the select.

Keep in mind that some people can received a loooooot on visits in 6 months
(200,000 or 300,000 deduplicated visits, and much more if you keep
duplicate visits) and ordering such volume of rows by code is not easy (in
fact impossible because we use PHP and we can't do that in a FPM memory...)
... and of course, because we can not provide in a single page of 200,000
or 300,000 members stickers in one shot, the webpage requires pagination
(with lot of 100 profiles per page). If you decide that sorting should be
made on the code side, the pagination becomes awful to manage.

PS 1 : when we decide to do this, MV were not yet back to experimental
PS 2 : the code to manage a visit received is very easy... we just do a
insert in the master table without doing any select before... we just don't
care of what happened in past...
PS 3 : the pagination is very easy... we just do a
- select * from visits_received_by_date where receiver_id=111 and
visit_date<now limit 100
and on the next page :
- select * from visits_received_by_date where receiver_id=111 and
visit_date<$last_date_of_previous_page limit 100

PS 4 : I simplify reality
- we have additionnal columns in our table (and not only 3) that take space
in memory...
- the query of pagination is not exactly this one because we can received 2
or 3 visits at the same second (and we manage this correctly).

#simple is very important for us. Our master table + MV simplify a lot of
things on the code side.
0,0007% of error is acceptable for us. In true life, very few people
detects that Karen appears in the page 1 of visits received, and also in
page 7 (because most of the time people don't scroll a lot of pages).

*But if someone has a better proposal, I take it* (we already discussed
with datastax about our need with no better proposal considering our use
case 😎)

Le jeu. 17 août 2023 à 21:37, Miklosovic, Stefan <
stefan.mikloso...@netapp.com> a écrit :

> Why can't you do it like this? You would have two tables:
>
> create table visits (user_id bigint, visitor_id bigint, visit_date
> timestamp, primary key ((user_id, visitor_id), visit_date)) order by
> visit_date desc
>
> create table visitors_by_user_id (user_id bigint, visitor_id bigint,
> primary key ((user_id), visitor_id))
>
> The logic behind the second table, visitors_by_user_id, is that you do not
> care if a user visited you twice, because it is primary key + clustering
> column, if same user visits you twice, the second time it would basically
> do nothing, because such entry is already there.
>
> For example:
>
> user_id | visitor_id
> joe | karen
> joe | julia
>
> If Karen visits me again, nothing happens as that entry is already there.
>
> Then if Karen visits me, I put into the second table
>
> joe | karen | tuesday
> joe | karen | monday
> joe | karen | last friday
> joe | julia | today
>
> So to know who visited me recently, I do
>
> select visitor_id from visitors_by_user_id where user_id = Joe;
>
> So I get Karen and Julia
>
> And then for each such visitor I do
>
> select visit_date from visits where user_id = Joe and visitor_id = Julia
> limit 1
>
> ________________________________________
> From: Regis Le Bretonnic <r.lebreton...@meetic-corp.com>
> Sent: Tuesday, August 15, 2023 17:49
> To: user@cassandra.apache.org
> Subject: Re: Materialized View inconsistency issue
>
> You don't often get email from r.lebreton...@meetic-corp.com. Learn why
> this is important<https://aka.ms/LearnAboutSenderIdentification>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> Hi Josh...
>
> A long (and almost private) message to explain how we fix materialized
> views.
>
> Let me first explain our use case... I work for an european dating website.
> Users can received visits from other users (typically when someone looks
> at a member profile page), and we want to inform them for each visit
> received (sorted from the most recent one to the oldest one).
> But imagine that Karen goes several times on my profile page... I don't
> want to see all her visits but only the last one. So, we want to
> deduplicate rows (see only once Karen), and ordered the rows (showing Julia
> that visit me 1 minute ago, Sophia that visit me 3 minutes ago, Karen that
> visit me 10 minutes ago, and so on).
>
> You can not do that in cassandra. If you want to deduplicate rows by pairs
> of users, the "visit timestamp" can not be in the primary key... and if you
> want to order rows by the "visit timestamp", this field must be in the
> clustering columns and consequently in the primary key. That is just not
> possible !
>
> Waht we do is :
> - a master table like this :
>
> CREATE TABLE visits_received (
>     receiver_id bigint,
>     sender_id bigint,
>     visit_date timestamp,
>     PRIMARY KEY ((receiver_id), sender_id)
> ) WITH CLUSTERING ORDER BY (sender_id ASC);
>
> - and a materialized view like this :
>
> CREATE MATERIALIZED VIEW visits_received_by_date as
>     SELECT receiver_id, sender_id, visit_date
>     FROM visits_received
>     WHERE receiver_id IS NOT NULL AND sender_id IS NOT NULL AND visit_date
> IS NOT NULL
>     PRIMARY KEY ((receiver_id), visit_date, sender_id)
>     WITH CLUSTERING ORDER BY (visit_date DESC, sender_id ASC);
>
> With this the master table deduplicates, and the MV sorts rows the way we
> want.
>
>
> Problems we have are most of the time having rows that should not exist in
> the MV...
> Let's say that I have this row in the master table :
> - 111, 222, t3
> and that because of materialized view unconsistency, I have 3 rows in the
> MV :
> - 111, 222, t3
> - 111, 222, t2
> - 111, 222, t1
>
> then to remove the 2 wrong rows in the MV, we do a double insert on the
> master table :
> insert (111, 222, t1) + insert (111, 222, t3) -> this remove the row "111,
> 222, t1"
> insert (111, 222, t2) + insert (111, 222, t3) -> this remove the row "111,
> 222, t2"
>
> We can very, very rarely have other cases (rows in master and not in MV),
> but these are also very easy to fix by just re-inserting the master rows.
>
>
> Now about our spark script :
> - we download sequentially the master table and the MV
> - we compare them to find ... "potential inconsistencies" (because the
> tables are not download at the same time and data can have change, we can
> find false positive errors)
> - we loop on all the "potential inconsistencies" and force a new read on
> the table and the MV to check if there is truly inconsistency when reads
> are made in few milliseconds
> - if it is a true inconsistency, we force inserts on the master table to
> fix the MV as describe below
>
>
> Now, about the volume of inconsistency :
> - on a master table with 1.7 B-rows
> - we have ~ 12.5 K-rows that are unconsistent (0,0007%) after 2 years...
> clearly better than what our developpers will do by managing inserts and
> deletes by themshelves (and acceptable for our use case)
>
>
> Le lun. 14 août 2023 à 16:36, Josh McKenzie <jmcken...@apache.org<mailto:
> jmcken...@apache.org>> a écrit :
> When it comes to denormalization in Cassandra today your options are to
> either do it yourself in your application layer or rely on Materialized
> Views to do it for you at the server layer. Neither are production-ready
> approaches out of the box (which is one of the biggest flaws in the
> "provide it server side as a feature" approach); both implementations will
> need you as a user to:
>
>   1.  Deal with failure cases (data loss in base table, consistency
> violations between base and view due to failures during write /
> anti-entropy vs. gc_grace, etc) and
>   2.  Manage the storage implications of a given base write and the
> denormalized writes that it spawns. This is arguably worse with MV's as you
> have less visibility into the fanout and they're easier to create; it was
> common to see folks create 5-10 views on a base table when they were first
> released and lock up tables and exhaust storage disks, not realizing the
> implications.
>
> The current inability to clearly see and rely on the state of consistency
> between a base and a view is a significant limitation that's shared by both
> the MV implementation and a user-hand-rolled version. @regis I'd be super
> interested to hear more about:
> we made a spark script downloading the master table and the MV, and
> comparing them and fixing data (as said previously we have very few errors
> and we run it maybe once a year
> Given the inclusion of the spark bulk reader and writer in the project
> ecosystem, this could prove to be something really useful for a lot of
> users.
>
> In a post-Accord world with atomic durable multi-partition transactions,
> we should be able to create a more robust, consistent implementation of
> MV's. This doesn't solve the problem of "complete data loss on a base table
> leaves you with data in a view that's orphaned; you need to rebuild the
> view." That said, a Materialized Views feature that only has that one
> caveat of "if you lose data in the base you need to recreate the views"
> would be a significant improvement. It should also be pretty trivial to
> augment the upcoming size commands to support future MV's as well
> (CASSANDRA-12367<https://issues.apache.org/jira/browse/CASSANDRA-12367>)
>
> So yeah. Denormalization is a Hard Problem. MV's were an attempt to take a
> burden off the user but we just didn't have sufficiently robust primitives
> to build on at that time to get it where it needed to go.
>
> I'm personally still on the fence between whether a skilled user should go
> with hand-rolled vs. MV's today, but for the general populace of C* users
> (i.e. people that don't have time to get into the weeds), they're probably
> best avoided still for now.
>
> On Thu, Aug 10, 2023, at 8:19 PM, MyWorld wrote:
> Hi surbhi ,
> There are 2 drawbacks associated with MV.
> 1. Inconsistent view
> 2. The lock it takes on the base table. This gets worse when you have huge
> number of clustering keys in a specific partition.
>
> It's better you re-design a seperate table and let your API do a parallel
> write on both.
>
> Regards,
> Ashish
>
> On Fri, 11 Aug, 2023, 02:03 Surbhi Gupta, <surbhi.gupt...@gmail.com
> <mailto:surbhi.gupt...@gmail.com>> wrote:
> Thanks everyone.
>
>
> On Wed, 9 Aug 2023 at 01:00, Regis Le Bretonnic
> <r.lebreton...@meetic-corp.com<mailto:r.lebreton...@meetic-corp.com>>
> wrote:
> >
> > Hi Surbhi
> >
> > We do use cassandra materialized views even if not recommended.
> > There are known issues you have to make with. Despite of them, we still
> use VM.
> > What we observe is :
> > * there are  inconsistency issues but few. Most of them are rows that
> should not exist in the MV...
> > * we made a spark script downloading the master table and the MV, and
> comparing them and fixing data (as said previously we have very few errors
> and we run it maybe once a year)
> >
> > * Things go very very very bad when you add or remove a node ! Limit
> this operation if possible and do it knowing what can happen (we isolate
> the ring/datacenter and fix data before putting it back to production. We
> did this only once in the last 4 years).
> >
> > PS : all proposals avoiding MV failed for our project. Basically
> managing a table like a MV (by deleting and inserting rows from code) is
> worse and more corrupted than what MV does...
> > The worse issue is adding and removing nodes. Maybe cassandra 4 improves
> this point (not tested yet).
> >
> > Have fun...
> >
> > Le mar. 8 août 2023 à 22:36, Surbhi Gupta <surbhi.gupt...@gmail.com
> <mailto:surbhi.gupt...@gmail.com>> a écrit :
> >>
> >> Hi,
> >>
> >> We get complaints about Materialized View inconsistency issues.
> >> We are on 3.11.5 and on 3.11.5 Materialized Views were not production
> ready.
> >> We are ok to upgrade.
> >>
> >> On which version of cassandra MVs doesnt have inconsistency issues?
> >>
> >> Thanks
> >> Surbhi
>
>

Re: Materialized View inconsistency issue

Reply via email to