The two tables you propose, Stefan, cannot natively order rows by time (they
will be ordered by visitor_id), unless you sort the rows after the select.

So what? I think this is still way better than dealing with an MV that will
eventually become inconsistent. Do you want a broken MV, or do you want to sort
on the client? Which is better?

The tables will look like this:

cqlsh> select * from dating.visits_by_visitor_id ;

 user_id | visitor_id
---------+------------
     100 |        200
     100 |        300

(2 rows)
cqlsh> select * from dating.visits;

 user_id | visitor_id | visit_date
---------+------------+------------
     100 |        300 |          3
     100 |        200 |          5
     100 |        200 |          2
     100 |        200 |          1

(4 rows)
cqlsh>

Now if you iterate over the visitors of user 100 (that is, 200 and 300) and
query each with LIMIT 1, you get the latest visit per visitor.
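For the data above, the per-visitor lookups would be something like (a sketch):

select * from dating.visits where user_id = 100 and visitor_id = 200 limit 1;
select * from dating.visits where user_id = 100 and visitor_id = 300 limit 1;

Since visits clusters by visit_date desc, LIMIT 1 returns the most recent visit
of each visitor.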

Now it may be true that the combined result is not sorted by timestamp, but
does that really matter? You can always sort it on the client.

The advantage of this approach is that you know all visitors of somebody in one
query, if that ever matters.
You also know when somebody was visited by a given user within some period of time:

select visit_date from dating.visits where user_id = 100 and visitor_id = 200 
and visit_date > 3 and visit_date < 8;

Also, I don't know your business logic in detail, but why would somebody be
interested in who visited them 6 months ago? What is that information good for
in practice? Why don't you do it like this?

INSERT INTO dating.visits (user_id , visitor_id, visit_date ) VALUES ( 100, 
300, 60) USING TTL 10;

Use a TTL of e.g. 1 month, so rows would start to disappear automatically. If
somebody visited me 2 months ago and the row then disappears, I would not care
at all. A user who visited me 2 months ago is basically equal to a user who has
never visited me.
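With a 1-month TTL that insert would be, e.g. (a sketch, taking a month as 30
days, i.e. 30 * 86400 = 2592000 seconds):

INSERT INTO dating.visits (user_id, visitor_id, visit_date) VALUES (100, 300, 60) USING TTL 2592000;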


________________________________________
From: Regis Le Bretonnic <r.lebreton...@meetic-corp.com>
Sent: Friday, August 18, 2023 9:47
To: user@cassandra.apache.org
Subject: Re: Materialized View inconsistency issue

Hi Stefan

Happy to see that our use case interests you :-)
I'm not sure I explained well what we want.

Imagine this sequence of events:
- Julia visits Joe at t1
- Julia visits Joe at t2
- Karen visits Joe at t3
- Silvia visits Joe at t4
- Karen visits Joe at t5
- Karen visits Joe at t6
- Julia visits Joe at t7

We want to show Joe a webpage listing the visits he received, in this order:
- Julia at t7 (the most recent)
- Karen at t6
- Silvia at t4

The two tables you propose, Stefan, cannot natively order rows by time (they
will be ordered by visitor_id), unless you sort the rows after the select.

Keep in mind that some people can receive a loooooot of visits in 6 months
(200,000 or 300,000 deduplicated visits, and much more if you keep duplicate
visits), and ordering such a volume of rows in code is not easy (in fact
impossible, because we use PHP and we can't do that within FPM memory...)
... and of course, because we cannot render a single page with 200,000 or
300,000 member cards in one shot, the webpage requires pagination (in batches
of 100 profiles per page). If the sorting is done on the code side, the
pagination becomes awful to manage.

PS 1: when we decided to do this, MVs had not yet been flagged back to experimental
PS 2: the code to handle a received visit is very easy... we just do an insert
into the master table without doing any select first... we just don't care
what happened in the past...
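In CQL, that insert is simply something like this (a sketch; toTimestamp(now())
stands in for however the visit date is actually set):

INSERT INTO visits_received (receiver_id, sender_id, visit_date) VALUES (111, 222, toTimestamp(now()));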
PS 3: pagination is very easy... we just do a
- select * from visits_received_by_date where receiver_id=111 and
visit_date<now limit 100
and on the next page:
- select * from visits_received_by_date where receiver_id=111 and
visit_date<$last_date_of_previous_page limit 100

PS 4: I am simplifying reality
- we have additional columns in our table (not only 3) that take space in
memory...
- the pagination query is not exactly this one, because we can receive 2 or 3
visits within the same second (and we handle this correctly; one possible way
is sketched below).
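For what it's worth, one way to handle ties (an illustration with a
hypothetical $last_sender_id_of_previous_page, not necessarily our exact
query) is to first drain the rows sharing the boundary timestamp, then resume
with the strict inequality:

- select * from visits_received_by_date where receiver_id=111 and
visit_date=$last_date_of_previous_page and sender_id>$last_sender_id_of_previous_page
- select * from visits_received_by_date where receiver_id=111 and
visit_date<$last_date_of_previous_page limit 100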

#simple is very important for us. Our master table + MV simplify a lot of
things on the code side.
0.0007% of errors is acceptable for us. In real life, very few people notice
that Karen appears on page 1 of visits received and also on page 7 (because
most of the time people don't scroll through many pages).

But if someone has a better proposal, I'll take it (we already discussed our
need with DataStax, with no better proposal considering our use case 😎)

On Thu, Aug 17, 2023 at 21:37, Miklosovic, Stefan
<stefan.mikloso...@netapp.com> wrote:
Why can't you do it like this? You would have two tables:

create table visits (user_id bigint, visitor_id bigint, visit_date timestamp,
primary key ((user_id, visitor_id), visit_date)) with clustering order by (visit_date desc);

create table visitors_by_user_id (user_id bigint, visitor_id bigint, primary
key ((user_id), visitor_id));

The logic behind the second table, visitors_by_user_id, is that you do not care
whether a user visited you twice: since the row is just a partition key plus a
clustering column, if the same user visits you twice, the second insert
basically does nothing, because such an entry is already there.

For example:

user_id | visitor_id
joe | karen
joe | julia

If Karen visits me again, nothing happens as that entry is already there.

Then each time Karen visits me, I also put a row into the first table, visits:

joe | karen | tuesday
joe | karen | monday
joe | karen | last friday
joe | julia | today

So to know who visited me recently, I do

select visitor_id from visitors_by_user_id where user_id = Joe;

So I get Karen and Julia

And then for each such visitor I do

select visit_date from visits where user_id = Joe and visitor_id = Julia limit 1

________________________________________
From: Regis Le Bretonnic <r.lebreton...@meetic-corp.com>
Sent: Tuesday, August 15, 2023 17:49
To: user@cassandra.apache.org
Subject: Re: Materialized View inconsistency issue


Hi Josh...

A long (and almost private) message to explain how we fix materialized views.

Let me first explain our use case... I work for a European dating website.
Users can receive visits from other users (typically when someone looks at a
member's profile page), and we want to inform them of each visit received
(sorted from the most recent to the oldest).
But imagine that Karen goes several times to my profile page... I don't want to
see all her visits, only the last one. So we want to deduplicate rows (see
Karen only once) and order the rows (showing Julia who visited me 1 minute
ago, Sophia who visited me 3 minutes ago, Karen who visited me 10 minutes ago,
and so on).

You cannot do that in Cassandra. If you want to deduplicate rows by pair of
users, the "visit timestamp" cannot be in the primary key... and if you want
to order rows by the "visit timestamp", this field must be in the clustering
columns and consequently in the primary key. That is just not possible!

What we do is:
- a master table like this:

CREATE TABLE visits_received (
    receiver_id bigint,
    sender_id bigint,
    visit_date timestamp,
    PRIMARY KEY ((receiver_id), sender_id)
) WITH CLUSTERING ORDER BY (sender_id ASC);

- and a materialized view like this:

CREATE MATERIALIZED VIEW visits_received_by_date as
    SELECT receiver_id, sender_id, visit_date
    FROM visits_received
    WHERE receiver_id IS NOT NULL AND sender_id IS NOT NULL AND visit_date IS 
NOT NULL
    PRIMARY KEY ((receiver_id), visit_date, sender_id)
    WITH CLUSTERING ORDER BY (visit_date DESC, sender_id ASC);

With this, the master table deduplicates and the MV sorts rows the way we want.


The problems we have are, most of the time, rows that should not exist in the
MV...
Let's say that I have this row in the master table:
- 111, 222, t3
and that, because of materialized view inconsistency, I have 3 rows in the MV:
- 111, 222, t3
- 111, 222, t2
- 111, 222, t1

then to remove the 2 wrong rows from the MV, we do a double insert on the
master table:
insert (111, 222, t1) + insert (111, 222, t3) -> this removes the row "111, 222,
t1"
insert (111, 222, t2) + insert (111, 222, t3) -> this removes the row "111, 222,
t2"

We can very, very rarely have other cases (rows in the master table but not in
the MV), but these are also very easy to fix by just re-inserting the master
rows.


Now about our spark script:
- we download the master table and the MV sequentially
- we compare them to find ... "potential inconsistencies" (because the tables
are not downloaded at the same time and data can have changed, we can find
false positives)
- we loop over all the "potential inconsistencies" and force a new read on the
table and the MV, to check whether there is truly an inconsistency when both
reads are made within a few milliseconds
- if it is a true inconsistency, we force inserts on the master table to fix
the MV as described above


Now, about the volume of inconsistency:
- on a master table with 1.7 billion rows
- we have ~12.5 K rows that are inconsistent (0.0007%) after 2 years...
clearly better than what our developers would achieve by managing inserts and
deletes themselves (and acceptable for our use case)


On Mon, Aug 14, 2023 at 16:36, Josh McKenzie <jmcken...@apache.org> wrote:
When it comes to denormalization in Cassandra today your options are to either 
do it yourself in your application layer or rely on Materialized Views to do it 
for you at the server layer. Neither are production-ready approaches out of the 
box (which is one of the biggest flaws in the "provide it server side as a 
feature" approach); both implementations will need you as a user to:

  1.  Deal with failure cases (data loss in base table, consistency violations 
between base and view due to failures during write / anti-entropy vs. gc_grace, 
etc) and
  2.  Manage the storage implications of a given base write and the 
denormalized writes that it spawns. This is arguably worse with MV's as you 
have less visibility into the fanout and they're easier to create; it was 
common to see folks create 5-10 views on a base table when they were first 
released and lock up tables and exhaust storage disks, not realizing the 
implications.

The current inability to clearly see and rely on the state of consistency 
between a base and a view is a significant limitation that's shared by both the 
MV implementation and a user-hand-rolled version. @regis I'd be super 
interested to hear more about:
we made a spark script downloading the master table and the MV, and comparing
them and fixing data (as said previously we have very few errors and we run it
maybe once a year)
Given the inclusion of the spark bulk reader and writer in the project 
ecosystem, this could prove to be something really useful for a lot of users.

In a post-Accord world with atomic durable multi-partition transactions, we 
should be able to create a more robust, consistent implementation of MV's. This 
doesn't solve the problem of "complete data loss on a base table leaves you 
with data in a view that's orphaned; you need to rebuild the view." That said, 
a Materialized Views feature that only has that one caveat of "if you lose data 
in the base you need to recreate the views" would be a significant improvement. 
It should also be pretty trivial to augment the upcoming size commands to 
support future MV's as well 
(CASSANDRA-12367 <https://issues.apache.org/jira/browse/CASSANDRA-12367>)

So yeah. Denormalization is a Hard Problem. MV's were an attempt to take a 
burden off the user but we just didn't have sufficiently robust primitives to 
build on at that time to get it where it needed to go.

I'm personally still on the fence between whether a skilled user should go with 
hand-rolled vs. MV's today, but for the general populace of C* users (i.e. 
people that don't have time to get into the weeds), they're probably best 
avoided still for now.

On Thu, Aug 10, 2023, at 8:19 PM, MyWorld wrote:
Hi Surbhi,
There are 2 drawbacks associated with MVs:
1. Inconsistent views
2. The lock they take on the base table. This gets worse when you have a huge
number of clustering keys in a specific partition.

It's better to design a separate table and let your API do parallel writes to
both.

Regards,
Ashish

On Fri, 11 Aug 2023 at 02:03, Surbhi Gupta <surbhi.gupt...@gmail.com> wrote:
Thanks everyone.


On Wed, 9 Aug 2023 at 01:00, Regis Le Bretonnic
<r.lebreton...@meetic-corp.com> wrote:
>
> Hi Surbhi
>
> We do use Cassandra materialized views even if not recommended.
> There are known issues you have to deal with. Despite them, we still use
> MVs.
> What we observe is:
> * there are inconsistency issues, but few. Most of them are rows that should
> not exist in the MV...
> * we made a spark script downloading the master table and the MV, and
> comparing them and fixing data (as said previously we have very few errors
> and we run it maybe once a year)
>
> * Things go very, very bad when you add or remove a node! Limit this
> operation if possible and do it knowing what can happen (we isolate the
> ring/datacenter and fix data before putting it back into production. We did
> this only once in the last 4 years).
>
> PS: all proposals avoiding MVs failed for our project. Basically managing a
> table like an MV (by deleting and inserting rows from code) is worse and more
> corrupted than what the MV does...
> The worst issue is adding and removing nodes. Maybe Cassandra 4 improves this
> point (not tested yet).
>
> Have fun...
>
> On Tue, Aug 8, 2023 at 22:36, Surbhi Gupta <surbhi.gupt...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> We get complaints about Materialized View inconsistency issues.
>> We are on 3.11.5, and on 3.11.5 Materialized Views were not production ready.
>> We are OK to upgrade.
>>
>> On which version of Cassandra do MVs not have inconsistency issues?
>>
>> Thanks
>> Surbhi
