[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2019-09-04 Thread Nadav Har'El (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922466#comment-16922466
 ] 

Nadav Har'El commented on CASSANDRA-9928:
-

This issue has recently turned 4 years old, and I'm curious how sure we are 
about the *reasons* described above for why we forbid MV with two new key 
columns - whether these reasons are correct, and whether we are sure these are 
the only reasons.

As [~fsander] asked above, while a base-view inconsistency is indeed more 
likely in the two-new-key-columns case, don't we have the same problem in the 
regular one-new-key-column case: scenarios where an unfortunate order of 
node failures causes data to appear in a view replica that doesn't appear in 
the base replica, and thus will never be deleted? I thought this was one of the 
main reasons why MV was recently downgraded to "experimental" status.

But I also wonder whether we missed a second problem, that of row liveness, 
similar to what we have in the case of unselected columns (see 
[CASSANDRA-13826|https://jira.apache.org/jira/browse/CASSANDRA-13826]): 
if we add and remove different base columns which are view keys, but the view 
row has just a *single* timestamp, we can end up being unable to add a view row 
that we previously deleted. For example, here is a scenario I thought might be 
problematic (I didn't actually test this; one would need to disable the check in 
the code forbidding multiple new MV key columns to run a test case):

Assume that x and y are regular columns in the base table, but key columns in 
the view. For brevity, we leave out other base key columns and other regular 
columns. Consider the following sequence of events on one row of the base table:
 # Add x=1 at timestamp 1. Since y is still null, no view row is created yet.
 # Add y=1 at timestamp 10. This creates a view row with key x=1, y=1. The row 
only contains a CQL row marker, and a single timestamp is chosen for it: 10.
 # Delete x at timestamp 2. This deletes x’s older (ts=1) value, and so the 
view row should be deleted. Again, a timestamp needs to be chosen for this 
deletion - it will be 10 again, and the deletion will override the creation 
with the same timestamp from the previous step, and so far everything is fine.
 # Add x=2 at timestamp 3. This overrides the deletion of x (which was in 
timestamp 2) so again, both x and y have values and a view row should be 
created with key x=2, y=1. However, this creation will again have timestamp 10 
(y’s timestamp) and not be able to shadow the deletion from step 3 (in step 3, 
deletion won over data, so here it will win again). So the view row we wanted 
to add will not be added!
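
The timestamp bookkeeping in these steps can be traced with a small simulation. This is a minimal sketch, not Cassandra's view-update code: the names `ViewRow`, `view_key_and_ts`, `apply_base_write` and `run_scenario` are hypothetical, and it assumes that the single view-row timestamp is the maximum of the x and y cell timestamps and that a deletion wins a timestamp tie. In this simplified model a tombstone only shadows a row with the same view key, so the last step re-adds x=1 rather than x=2; that is an assumption that departs from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewRow:
    marker_ts: Optional[int] = None     # timestamp of the CQL row marker
    tombstone_ts: Optional[int] = None  # timestamp of the view row deletion

    def is_live(self) -> bool:
        if self.marker_ts is None:
            return False
        # Assumed tie rule: a deletion beats a write with the same timestamp.
        return self.tombstone_ts is None or self.marker_ts > self.tombstone_ts


def view_key_and_ts(base):
    """Return ((x, y), ts) if both base columns are live, else None."""
    x, y = base.get("x"), base.get("y")
    if x is None or y is None or x[0] is None or y[0] is None:
        return None
    # Assumption: the one view-row timestamp is the larger cell timestamp.
    return (x[0], y[0]), max(x[1], y[1])


def apply_base_write(base, view, col, value, ts):
    """Apply a last-write-wins base write (value=None means delete)."""
    before = view_key_and_ts(base)
    current = base.get(col)
    if current is None or ts > current[1]:
        base[col] = (value, ts)
    after = view_key_and_ts(base)
    # Delete the old view row if its key is no longer the current one.
    if before is not None and (after is None or after[0] != before[0]):
        row = view.setdefault(before[0], ViewRow())
        row.tombstone_ts = max(row.tombstone_ts or 0, before[1])
    # Create the new view row if a key appeared or changed.
    if after is not None and (before is None or after[0] != before[0]):
        row = view.setdefault(after[0], ViewRow())
        row.marker_ts = max(row.marker_ts or 0, after[1])


def run_scenario():
    base, view = {}, {}
    apply_base_write(base, view, "x", 1, ts=1)      # step 1: no view row yet
    apply_base_write(base, view, "y", 1, ts=10)     # step 2: marker at ts 10
    apply_base_write(base, view, "x", None, ts=2)   # step 3: tombstone at ts 10
    apply_base_write(base, view, "x", 1, ts=3)      # step 4 (variant): marker at ts 10 again
    return base, view


base, view = run_scenario()
print("base view key:", view_key_and_ts(base))
print("view row live:", view[(1, 1)].is_live())
```

Under these assumptions the base row ends with x=1 and y=1 both live, while the view row keyed (1, 1) stays shadowed by its own tombstone.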

> Add Support for multiple non-primary key columns in Materialized View primary 
> keys
> --
>
> Key: CASSANDRA-9928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9928
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Materialized Views
>Reporter: T Jake Luciani
>Priority: Normal
>  Labels: materializedviews
> Fix For: 4.x
>
>
> Currently we don't allow > 1 non primary key from the base table in a MV 
> primary key.  We should remove this restriction assuming we continue 
> filtering out nulls.  With allowing nulls in the MV columns there are a lot 
> of multiplicative implications we need to think through.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2017-07-15 Thread Fridtjof Sander (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088582#comment-16088582
 ] 

Fridtjof Sander commented on CASSANDRA-9928:


I seem to be the only one who doesn't understand where the actual 
difference from the single-column case lies:
Consider {{(p=1, a=1)}} with an index-MV on {{a}} and two updates {{a=2}} and 
{{a=3}}. One base-replica receives {{a=2}}, deletes view entry {{(a=1, p=1)}} 
and inserts {{(a=2, p=1)}}, then dies. Other base-replicas get {{a=3}}, delete 
{{(a=1, p=1)}} and insert {{(a=3, p=1)}}. Now, how is {{(a=2, p=1)}} removed 
from the view replica that was paired with the dying base-node? I don't get 
what's different here. Or does my analogous case miss the point?





[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2017-04-26 Thread craig mcmillan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984667#comment-15984667
 ] 

craig mcmillan commented on CASSANDRA-9928:
---

currently i achieve this function by manually concatenating the extra keys i 
want in the MV into a single text key - it's roughly workable, but timeuuids 
can no longer be used to provide ordering, since they don't sort lexically
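
The ordering problem can be reproduced outside Cassandra. The following is a small sketch using Python's standard `uuid` module; the helper `make_timeuuid` and the two timestamps are made up for illustration. A version-1 UUID stores the low 32 bits of its timestamp first in the text form, so sorting the strings does not sort by time.

```python
import uuid


def make_timeuuid(timestamp: int) -> uuid.UUID:
    """Build a version-1 UUID from a 60-bit timestamp (fixed clock seq and node)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0x1))


earlier = make_timeuuid(0xFFFFFFFF)    # low 32 bits all set
later = make_timeuuid(0x100000000)     # one tick later, low bits wrap to zero

by_time = sorted([later, earlier], key=lambda u: u.time)
by_text = sorted([later, earlier], key=str)

print([str(u) for u in by_time])  # chronological order
print([str(u) for u in by_text])  # lexical order of the text form
```

Here `by_time` puts `earlier` first, while `by_text` puts `later` first, which is why a concatenated text key loses the time ordering.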

[~thobbs]'s solution would formalize and improve upon what i, and presumably many 
others, are already having to do?



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2016-10-19 Thread Ariel Scarpinelli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588847#comment-15588847
 ] 

Ariel Scarpinelli commented on CASSANDRA-9928:
--

Why not let people decide?
If you implement [~thobbs]'s solution then people simply get warned: "non-PK 
columns participating in MV PKs will need to be updated together". Then it 
becomes the user's responsibility to choose whether they prefer to be tied to 
that restriction, or use a single column (which is tied to that restriction 
anyway, but since you always update columns in a minimum set of 1 ... :-D). The 
way it is currently implemented, you leave people no choice other than 
fabricating a "fake" column that concatenates values or so... which effectively 
translates into having to update in tandem anyway, while also adding complexity 
and repeated data.



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2016-09-27 Thread Donovan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526654#comment-15526654
 ] 

Donovan Hsieh commented on CASSANDRA-9928:
--

Whatever the technical issues with the race conditions described above, the 
limit of just 1 non-PK column, imho, leaves MV seriously handicapped. If this 
limitation is not removed, I can't see how any serious real-world application 
can use MV effectively.



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2016-06-21 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341741#comment-15341741
 ] 

T Jake Luciani commented on CASSANDRA-9928:
---

bq.  If we can guarantee that there is a limit on the number of changes to 
those columns then we can limit the number of distinct state permutations that 
may need to be considered in the scenarios above. 

How do you propose we limit changes across the cluster and DCs? Tyler's 
suggestion is easy to guarantee without introducing some global rate limiting.



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2016-06-21 Thread Matthias Broecheler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341188#comment-15341188
 ] 

Matthias Broecheler commented on CASSANDRA-9928:


Requiring that all non-PK columns that are indexed be updated at the same time 
is too much of a limitation and will be really hard for users to understand 
imho. 

Instead, I would propose that we introduce a rate-change limit on the columns 
that participate in an MV. If we can guarantee that there is a limit on the 
number of changes to those columns then we can limit the number of distinct 
state permutations that may need to be considered in the scenarios above. In 
those cases, we would simply enumerate them and then delete all possible old MV 
states. This sounds expensive but it would only be expensive when changes happen 
in fast succession or under extraordinary operational conditions - i.e. the 
cost should amortize well.
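
As a rough illustration of the enumeration idea, the sketch below lists every candidate old view key from a bounded set of recent values per column. The function name `candidate_old_view_keys` and the shape of its input are hypothetical; nothing like this exists in Cassandra.

```python
from itertools import product


def candidate_old_view_keys(recent_values):
    """Enumerate every combination of recently held values.

    recent_values maps each view-key column to the values it may have held
    within the rate-limit window (a hypothetical input).
    """
    columns = sorted(recent_values)
    combos = product(*(recent_values[c] for c in columns))
    return [dict(zip(columns, combo)) for combo in combos]


# Two columns with 2 and 3 recent values give 2 * 3 = 6 candidate old states.
candidates = candidate_old_view_keys({"a": [1, 2], "b": [1, 2, 3]})
print(len(candidates))
```

The number of candidates is the product of the per-column value counts, so a limit on changes per column directly bounds the number of deletes that would have to be issued.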

As for the rate limit, it seems that this would be a rather arbitrary 
limitation, but if somebody changes their MV columns in rapid succession then 
they are pursuing an anti-pattern, and throwing an exception would be a better 
response than deteriorating system performance.



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2016-02-08 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136898#comment-15136898
 ] 

Aleksey Yeschenko commented on CASSANDRA-9928:
--

[~mbroecheler] Is the limitation outlined by Tyler above still compatible with 
the use case you have in mind?



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2015-10-20 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965509#comment-14965509
 ] 

Tyler Hobbs commented on CASSANDRA-9928:


One possible solution is to require that all non-PK columns that are in a view 
PK be updated simultaneously.  [~tjake] mentioned possible problems from read 
repair, but it seems like with this restriction in place, any read repairs 
would end up repairing all non-PK columns at once.
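
A check along these lines could be sketched as follows. It is only an illustration of the proposed rule: `check_view_key_update` is a hypothetical helper, not an existing Cassandra function, and it assumes the write path knows which base columns a mutation touches.

```python
def check_view_key_update(view_key_columns, updated_columns):
    """Reject a write that updates only some of the non-PK view-key columns."""
    view_key_columns = frozenset(view_key_columns)
    touched = view_key_columns & frozenset(updated_columns)
    if touched and touched != view_key_columns:
        missing = sorted(view_key_columns - touched)
        raise ValueError(
            "view key columns must be updated together; missing: " + ", ".join(missing)
        )


check_view_key_update({"a", "b"}, {"a", "b", "c"})  # ok: both key columns updated
check_view_key_update({"a", "b"}, {"c"})            # ok: no key column touched
try:
    check_view_key_update({"a", "b"}, {"a"})        # rejected: b is missing
except ValueError as err:
    print(err)
```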



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2015-08-31 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723556#comment-14723556
 ] 

T Jake Luciani commented on CASSANDRA-9928:
---

This scenario where 3 nodes won't see each other's updates can't happen if we 
use the coordinator batchlog, since we guarantee at least a quorum of nodes 
will see the updates. Mentioning this for CASSANDRA-10230 



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2015-08-31 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723703#comment-14723703
 ] 

T Jake Luciani commented on CASSANDRA-9928:
---

of course! thx.



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2015-08-31 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723695#comment-14723695
 ] 

Joel Knighton commented on CASSANDRA-9928:
--

Guaranteeing a quorum of nodes will see the updates does not solve the problem 
because supporting multiple non-primary key columns in the materialized view 
primary key introduces a sensitivity to the ordering of updates to these 
non-primary key columns.

I think this is the simplest version of Benedict's example.  Envision a cluster 
with a table with primary key P and columns A and B. Presently, all replicas 
contain an entry for P=1, A=1, B=1.

Two concurrent updates are occurring - one setting A=2, and one setting B=2. 
One replica receives the update B=2, removes the MV entry for P=1, A=1, B=1, 
creates an MV entry for P=1, A=1, B=2, and then crashes with data loss. The 
remainder of the base replicas receive the update A=2; remove the MV entry for 
P=1, A=1, B=1; create an MV entry for P=1, A=2, B=1; receive the update B=2; 
remove the MV entry for P=1, A=2, B=1; and create an MV entry for P=1, A=2, B=2.

Upon repairing the base replica that lost its data from the remaining base 
replicas, a delete for the entry P=1, A=1, B=2 in the paired view replica will 
never be created.
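
The sequence above can be replayed with a toy model. This is a simplified sketch under stated assumptions, not how Cassandra tracks view state: view entries and deletes are plain sets without timestamps, repair is modelled as a union of entries and deletes, and the names `apply_update` and `find_orphans` are hypothetical.

```python
def apply_update(base, live, tombstones, column, value):
    """Apply one column update on a base replica and maintain its paired view."""
    old_key = (base["P"], base["A"], base["B"])
    base[column] = value
    new_key = (base["P"], base["A"], base["B"])
    if new_key != old_key:
        tombstones.add(old_key)   # delete generated for the prior view entry
        live.discard(old_key)
        live.add(new_key)


def find_orphans():
    initial = {"P": 1, "A": 1, "B": 1}

    # Replica that sees only B=2 and then loses its base data.
    crashed_base = dict(initial)
    paired_live, paired_tombstones = {(1, 1, 1)}, set()
    apply_update(crashed_base, paired_live, paired_tombstones, "B", 2)

    # Remaining replicas see A=2 and then B=2.
    healthy_base = dict(initial)
    healthy_live, healthy_tombstones = {(1, 1, 1)}, set()
    apply_update(healthy_base, healthy_live, healthy_tombstones, "A", 2)
    apply_update(healthy_base, healthy_live, healthy_tombstones, "B", 2)

    # Repair: the base replica is rebuilt from the healthy ones, and the
    # paired view receives their entries and deletes (modelled as a union).
    repaired_base = dict(healthy_base)
    tombstones = paired_tombstones | healthy_tombstones
    live = (paired_live | healthy_live) - tombstones

    expected = (repaired_base["P"], repaired_base["A"], repaired_base["B"])
    return live - {expected}


print(find_orphans())
```

No replica that survived ever held the state A=1, B=2, so none of them generated a delete for it, and in this model the view entry (1, 1, 2) is left behind.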



[jira] [Commented] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2015-08-03 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651989#comment-14651989
 ] 

Carl Yeksigian commented on CASSANDRA-9928:
---

[~benedict] brought up some potential issues in [a 
comment|https://issues.apache.org/jira/browse/CASSANDRA-6477#MultipleColumns] 
on CASSANDRA-6477:
{quote}
As far as multiple columns are concerned: I think we may need to go back to the 
drawing board there. It's actually really easy to demonstrate the cluster 
getting into broken states. Say you have three columns, A B C, and you send 
three competing updates a b c to their respective columns; previously all held 
the value _. If they arrive in different orders on each base-replica we can end 
up with 6 different MV states around the cluster. If any base replica dies, you 
don't know which of those 6 intermediate states were taken (and probably 
replicated) by its MV replicas. This problem grows exponentially as you add 
competing updates (which, given split brain, can compete over arbitrarily 
long intervals).

This is where my concern about a single (base) node dependency comes in, but 
after consideration it's clear that with a single column this problem is 
avoided because it's never ambiguous what the old state was. If you encounter a 
mutation that is shadowed by your current data, you can always issue a delete 
for the correct prior state. With multiple columns that is no longer possible.

I'm pretty sure the presence of multiple columns introduces other issues with 
each of the other moving parts.
{quote}
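
The state count in the quoted comment can be checked by brute force. The sketch below takes one reading of "6 different MV states", namely the distinct intermediate states a replica can pass through before all updates have arrived; the function name `intermediate_view_states` is made up for illustration. The number of arrival orders, 3! = 6, happens to match for three columns.

```python
from itertools import permutations


def intermediate_view_states(columns):
    """Collect the states reachable before the last update has been applied."""
    states = set()
    for order in permutations(columns):
        state = {c: "_" for c in columns}  # every column starts at the value _
        for col in order[:-1]:             # the last update reaches the final state
            state[col] = col.lower()
            states.add(tuple(state[c] for c in columns))
    return states


print(len(intermediate_view_states("ABC")))   # three competing updates
print(len(intermediate_view_states("ABCD")))  # four competing updates
```

With n competing single-column updates this count is 2^n - 2 (every non-empty proper subset of applied updates), which gives 6 for three columns and 14 for four.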

When we implement this feature, we should make sure to also add jepsen tests 
for the possible problems.
