[
https://issues.apache.org/jira/browse/CASSANDRA-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067734#comment-16067734
]
Krishna Dattu Koneru edited comment on CASSANDRA-13547 at 6/29/17 5:31 AM:
---------------------------------------------------------------------------
Thanks [~jasonstack] !
I will try to rework on {{1. Missing Update}} to address your comment.
About
{quote}
Using the greater timestamp from view's columns(pk+non-pk) in base row will
later shadow entire row in view if there is a normal column in base as primary
key in view.
{quote}
This looks like a nasty problem. This patch does not cause this. This is a
existing behaviour that any updates to view's pk columns will make old row
marked as tombstone (at highest timestamp of all columns in base row) and will
create a new row with updated pk.
EDIT : I just saw that https://issues.apache.org/jira/browse/CASSANDRA-11500
exists because of this exact problem.
See view row timestamps in the below example :
Existing behaviour without patch:
{code}
INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0;
test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0]
mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 5 set c = 0 WHERE a=1;
test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=0 ts=5], [d=1 ts=0]
mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=0 ts=5], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 1 set b = 0 WHERE a=1;
test : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0]
mv_test1 :
[1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0]
[1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
{code}
{code}
UPDATE test using timestamp 2 set b = 1 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0]
View (before compaction)
[1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0]
[1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707897(shadowable) ]: 0 |
[1]@30 Row[info=[ts=2] ]: 1 | [c=0 ts=5], [d=1 ts=0]
View (after compaction)
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707897(shadowable) ]: 0 |
[1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
{code}
With this patch :
{code}
INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0]
view : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 5 set c = 0 WHERE a=1;
UPDATE test using timestamp 1 set b = 0 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0]
view :
[1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because
of this patch */
[1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
{code}
{code}
UPDATE test using timestamp 2 set b = 1 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0]
View (before compaction)
[1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this
patch */
[1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498709103(shadowable) ]: 0 |
[1]@30 Row[info=[ts=5] ]: 1 | [c=0 ts=5], [d=1 ts=0] /*-- row ts=5 because of
this patch */
View (after compaction)
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498709103(shadowable) ]: 0 |
[1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
{code}
I am not sure yet how to fix this issue ... given that if live row and
tombstone have same timestamp , tombstone wins.
Another problem is that all view deletes are marked as shadowable. But then
that is a different problem and I belive it is being fixed in
https://issues.apache.org/jira/browse/CASSANDRA-13409 .
was (Author: krishna.koneru):
Thanks [~jasonstack] !
I will try to rework on {{1. Missing Update}} to address your comment.
About
{quote}
Using the greater timestamp from view's columns(pk+non-pk) in base row will
later shadow entire row in view if there is a normal column in base as primary
key in view.
{quote}
This looks like a nasty problem. This patch does not cause this. This is a
existing behaviour that any updates to view's pk columns will make old row
marked as tombstone (at highest timestamp of all columns in base row) and will
create a new row with updated pk.
See view row timestamps in the below example :
Existing behaviour without patch:
{code}
INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0;
test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0]
mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 5 set c = 0 WHERE a=1;
test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=0 ts=5], [d=1 ts=0]
mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=0 ts=5], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 1 set b = 0 WHERE a=1;
test : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0]
mv_test1 :
[1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0]
[1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
{code}
{code}
UPDATE test using timestamp 2 set b = 1 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0]
View (before compaction)
[1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0]
[1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707897(shadowable) ]: 0 |
[1]@30 Row[info=[ts=2] ]: 1 | [c=0 ts=5], [d=1 ts=0]
View (after compaction)
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707897(shadowable) ]: 0 |
[1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498707556(shadowable) ]: 1 |
{code}
With this patch :
{code}
INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0]
view : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0]
{code}
{code}
UPDATE test using timestamp 5 set c = 0 WHERE a=1;
UPDATE test using timestamp 1 set b = 0 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0]
view :
[1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because
of this patch */
[1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
{code}
{code}
UPDATE test using timestamp 2 set b = 1 WHERE a=1;
table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0]
View (before compaction)
[1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this
patch */
[1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498709103(shadowable) ]: 0 |
[1]@30 Row[info=[ts=5] ]: 1 | [c=0 ts=5], [d=1 ts=0] /*-- row ts=5 because of
this patch */
View (after compaction)
[1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498709103(shadowable) ]: 0 |
[1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5,
localDeletion=1498708886(shadowable) ]: 1 |
{code}
I am not sure yet how to fix this issue ... given that if live row and
tombstone have same timestamp , tombstone wins.
Another problem is that all view deletes are marked as shadowable. But then
that is a different problem and I belive it is being fixed in
https://issues.apache.org/jira/browse/CASSANDRA-13409 .
> Filtered materialized views missing data
> ----------------------------------------
>
> Key: CASSANDRA-13547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13547
> Project: Cassandra
> Issue Type: Bug
> Components: Materialized Views
> Environment: Official Cassandra 3.10 Docker image (ID 154b919bf8ce).
> Reporter: Craig Nicholson
> Assignee: Krishna Dattu Koneru
> Priority: Blocker
> Labels: materializedviews
> Fix For: 3.11.x
>
>
> When creating a materialized view against a base table the materialized view
> does not always reflect the correct data.
> Using the following test schema:
> {code:title=Schema|language=sql}
> DROP KEYSPACE IF EXISTS test;
> CREATE KEYSPACE test
> WITH REPLICATION = {
> 'class' : 'SimpleStrategy',
> 'replication_factor' : 1
> };
> CREATE TABLE test.table1 (
> id int,
> name text,
> enabled boolean,
> foo text,
> PRIMARY KEY (id, name));
> CREATE MATERIALIZED VIEW test.table1_mv1 AS SELECT id, name, foo
> FROM test.table1
> WHERE id IS NOT NULL
> AND name IS NOT NULL
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> CREATE MATERIALIZED VIEW test.table1_mv2 AS SELECT id, name, foo, enabled
> FROM test.table1
> WHERE id IS NOT NULL
> AND name IS NOT NULL
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> {code}
> When I insert a row into the base table the materialized views are updated
> appropriately. (+)
> {code:title=Insert row|language=sql}
> cqlsh> INSERT INTO test.table1 (id, name, enabled, foo) VALUES (1, 'One',
> TRUE, 'Bar');
> cqlsh> SELECT * FROM test.table1;
> id | name | enabled | foo
> ----+------+---------+-----
> 1 | One | True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
> name | id | foo
> ------+----+-----
> One | 1 | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
> name | id | enabled | foo
> ------+----+---------+-----
> One | 1 | True | Bar
> (1 rows)
> {code}
> Updating the record in the base table and setting enabled to FALSE will
> filter the record from both materialized views. (+)
> {code:title=Disable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = FALSE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
> id | name | enabled | foo
> ----+------+---------+-----
> 1 | One | False | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
> name | id | foo
> ------+----+-----
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
> name | id | enabled | foo
> ------+----+---------+-----
> (0 rows)
> {code}
> However a further update to the base table setting enabled to TRUE should
> include the record in both materialzed views, however only one view
> (table1_mv2) gets updated. (-)
> It appears that only the view (table1_mv2) that returns the filtered column
> (enabled) is updated. (-)
> Additionally columns that are not part of the partiion or clustering key are
> not updated. You can see that the foo column has a null value in table1_mv2.
> (-)
> {code:title=Enable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = TRUE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
> id | name | enabled | foo
> ----+------+---------+-----
> 1 | One | True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
> name | id | foo
> ------+----+-----
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
> name | id | enabled | foo
> ------+----+---------+------
> One | 1 | True | null
> (1 rows)
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]