[jira] [Commented] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-17 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089988#comment-16089988
 ] 

ZhaoYang commented on CASSANDRA-13127:
--

Then a single base column timestamp in {{shadowable-liveness-info}} or 
{{shadowable-tombstone}} won't be sufficient. The timestamps of all base 
columns are required in {{shadowable-liveness-info}} or {{shadowable-tombstone}}.

> Materialized Views: View row expires too soon
> ---------------------------------------------
>
>                 Key: CASSANDRA-13127
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths, Materialized Views
>            Reporter: Duarte Nunes
>            Assignee: ZhaoYang
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.
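
To make the reporter's hypothesis concrete, here is a minimal Python sketch of how picking between two liveness candidates by TTL, rather than by expiration time, would reproduce the symptom. This is not Cassandra code: the Liveness tuple and the helpers expires_at, pick_by_ttl and pick_by_expiration are names assumed for illustration, and whether ViewUpdateGenerator actually behaves this way is exactly what the reporter is unsure about. The numbers come from the commands above (row written at t=0 with TTL 10, column v written at t=6 with TTL 8).

```python
from typing import NamedTuple


class Liveness(NamedTuple):
    """A write time and a TTL, both in seconds (hypothetical model)."""
    written_at: int
    ttl: int


def expires_at(info: Liveness) -> int:
    # Absolute expiration time: when the data actually stops being live.
    return info.written_at + info.ttl


def pick_by_ttl(a: Liveness, b: Liveness) -> Liveness:
    # Suspected comparison: the larger TTL wins, regardless of write time.
    return a if a.ttl >= b.ttl else b


def pick_by_expiration(a: Liveness, b: Liveness) -> Liveness:
    # Comparison the reporter suggests: the later expiration time wins.
    return a if expires_at(a) >= expires_at(b) else b


row = Liveness(written_at=0, ttl=10)    # INSERT ... USING TTL 10
cell_v = Liveness(written_at=6, ttl=8)  # UPDATE ... USING TTL 8, six seconds later

view_expiry_by_ttl = expires_at(pick_by_ttl(row, cell_v))
view_expiry_by_expiration = expires_at(pick_by_expiration(row, cell_v))
print(view_expiry_by_ttl, view_expiry_by_expiration)
```

Under this model, comparing TTLs keeps the row's liveness and the view row expires at t=10, while comparing expiration times keeps the view row alive until t=14, which is when the base row's last live column expires.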



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079644#comment-16079644
 ] 

ZhaoYang commented on CASSANDRA-13573:
--

Some issues are related to {{ColumnMetadata.cellValueType()}}, which currently: 
 {{a}}. if the column is a CollectionType, returns the value's type; {{b}}. 
otherwise, returns the column's own type.

It doesn't properly handle: {{1}}. frozen collection types, {{2}}. non-frozen 
UDTs, which require a cellPath to retrieve the value's type.




> sstabledump doesn't print out tombstone information for frozen set collection
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13573
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Stefano Ortolani
>            Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
>     hash blob,
>     report_id timeuuid,
>     subject_ids frozen<set<int>>,
>     PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
>     "partition" : {
>       "key" : [ "1213" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 16,
>         "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
>         "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
>         "cells" : [
>           { "name" : "subject_ids", "value" : "" }
>         ]
>       }
>     ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id                            | subject_ids
> --------+--------------------------------------+-------------
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Commented] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079565#comment-16079565
 ] 

ZhaoYang commented on CASSANDRA-13573:
--

I think we could use {{writeRawValue(String)}} to avoid double escaping. Now 
that [13592|https://issues.apache.org/jira/browse/CASSANDRA-13592] is merged, 
we should have correct JSON representations for all types, which is better than 
using {{type.toString}}, which, imo, serves a different purpose.

bq.  all UDTs were not supported initially because the information to 
deserialize the UDT requires the system schema tables to be read (which is 
dangerous and we don't want to do)

If there is a dependency on local schema tables, I agree not to support UDTs and 
to print raw bytes instead.

But AFAIK, after 8099 there should not be a dependency on schema tables when 
deserializing a UDT.
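
The double-escaping concern can be illustrated with a short Python analogy. This is only a sketch and not the sstabledump code, which is Java and writes through a JSON generator: the cell name and value below are assumed examples taken from the INSERT in the issue, and the string formatting stands in for a raw-value write.

```python
import json

# JSON text that a type-aware serializer might already have produced for the
# set value (an assumed example value, matching the INSERT in the issue).
value_json = json.dumps([1, 2, 4, 5])

# Writing that text as an ordinary string value quotes it (and would escape any
# quotes or backslashes inside it) a second time: the reader gets a string back.
double_escaped = json.dumps({"name": "subject_ids", "value": value_json})

# Writing it "raw" splices the already-valid JSON into the output unchanged,
# so the reader gets the array itself.
raw = '{"name": "subject_ids", "value": %s}' % value_json

print(double_escaped)
print(raw)
```

Parsing the first output yields the string "[1, 2, 4, 5]" for the value, while parsing the second yields a real JSON array, which is the behaviour a raw write is meant to preserve.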







[jira] [Updated] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-11 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-11500:
-
Status: Awaiting Feedback  (was: In Progress)

> Obsolete MV entry may not be properly deleted
> ---------------------------------------------
>
>                 Key: CASSANDRA-11500
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Materialized Views
>            Reporter: Sylvain Lebresne
>            Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp for 
> any columns participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with higher 
> timestamps which won't be deleted, as can easily be shown by the failure of 
> the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a problem similar to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. This means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in 
> the example above). It's doable (and not that hard really), but it does 
> require a change to the sstable and intra-node protocol, which makes this a 
> bit painful right now.
> But I'll also note that as CASSANDRA-1096 shows, the timestamp is not even 
> enough since on equal timestamps the value can be the deciding factor. So in 
> theory we'd have to ship the value of those columns (in the case of a 
> deletion at least, since we have it in the view PK for updates). That said, on 
> that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting timestamps 
> anymore, which would make this problem go away.
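
The first failing test in the description can be traced with a small Python model. This is a sketch under simplifying assumptions, not Cassandra's storage engine: a row is modelled as a liveness timestamp plus a dict of cells, a row deletion at timestamp T is assumed to remove everything written at or before T, and the helper names apply_row_deletion and is_visible are hypothetical.

```python
from typing import Dict, Optional, Tuple

Cells = Dict[str, Tuple[int, int]]  # column name -> (value, write timestamp)


def apply_row_deletion(liveness_ts: int, cells: Cells, deletion_ts: int):
    """Return the liveness timestamp and cells that survive the deletion."""
    surviving_liveness: Optional[int] = liveness_ts if liveness_ts > deletion_ts else None
    surviving_cells = {name: (value, ts) for name, (value, ts) in cells.items()
                       if ts > deletion_ts}
    return surviving_liveness, surviving_cells


def is_visible(liveness_ts: Optional[int], cells: Cells) -> bool:
    # In this model a row is visible while any part of it is still live.
    return liveness_ts is not None or bool(cells)


# Old view entry (k=1, a=1) after "UPDATE t USING TIMESTAMP 4 SET b = 2":
old_liveness_ts = 0          # max timestamp of the view PK columns
old_cells = {"b": (2, 4)}    # b was rewritten at timestamp 4

# Deletion timestamp reused from the new entry's liveness info (a was set at 2).
visible_with_pk_ts = is_visible(*apply_row_deletion(old_liveness_ts, old_cells, 2))

# Deletion timestamp taken as the biggest timestamp in the old view entry.
max_ts = max([old_liveness_ts] + [ts for _, ts in old_cells.values()])
visible_with_max_ts = is_visible(*apply_row_deletion(old_liveness_ts, old_cells, max_ts))

print(visible_with_pk_ts, visible_with_max_ts)
```

With the deletion at timestamp 2 the cell b written at timestamp 4 survives and the old entry stays visible, which matches the two entries returned by the SELECT; with the deletion at timestamp 4 nothing of the old entry survives.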






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/11/17 2:06 PM:
---

h3. *Idea*

{{ShadowableTombstone}}: deletion-time, isShadowable, and "viewKeyTs", i.e. the 
timestamp of the base column that is part of the view PK (used to reconcile 
when timestamps tie). If there is no timestamp associated with that column, 
the base PK timestamp is used instead.
{{ShadowableLivenessInfo}}: timestamp and "viewKeyTs".

When reconciling {{ShadowableTombstone}} and {{ShadowableLivenessInfo}}: 
{quote}
if deletion-time is greater than timestamp, the tombstone wins
if deletion-time is smaller than timestamp, the livenessInfo wins
when deletion-time ties with timestamp, 
 - if {{ShadowableTombstone}}'s {{viewKeyTs}} >= {{ShadowableLivenessInfo}}'s, 
then the tombstone wins
 - else the livenessInfo wins.
{quote}

When inserting into the view, always use the greatest timestamp of all base 
columns in the view, similar to how the view deletion timestamp is computed.

h3. *Example*

{quote}
CREATE TABLE t (k int PRIMARY KEY, a int, b int);
CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT 
NULL PRIMARY KEY (k, a);

{{q1}} INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
{{q2}} UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
{{q3}} UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; 
{{q4}} UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; 
{quote}


* After {{q1}}:
** in base: {{k=1@0, a=1, b=1}}  // 'k' has value '1' with timestamp '0'
** in view: 
***  sstable1: {{(k=1&a=1)@TS(0,0), b=1}}  // 'k:a' has value '1:1' with 
timestamp '0' and viewKeyTs '0' from the base's PK because column 'a' has no TS
* After {{q2}}
** in base(merged): {{k=1@0, a=1, b=2@10}} 
** in view:  
***  sstable1: {{(k=1&a=1)@TS(0,0), b=1}}
***  sstable2: {{(k=1&a=1)@TS(10,0), b=2@10}}
***  or merged: {{(k=1&a=1)@TS(10,0), b=2@10}}
* After {{q3}}
** in base(merged): {{k=1@0, a=2@2, b=2@10}}  
** in view:  
***  sstable1: {{(k=1&a=1)@TS(0,0), b=1}}
***  sstable2: {{(k=1&a=1)@TS(10,0), b=2@10}}
***  sstable3: {{(k=1&a=1)@Shadowable(10,0)}} & {{(k=1&a=2)@TS(10,2), 
b=2@10}}  // '(k=1&a=2)' has the biggest timestamp '10' and viewKeyTs '2' 
from column 'a'
***  or merged: {{(k=1&a=2)@TS(10,2), b=2@10}}
* After {{q4}}
** in base(merged): {{k=1@0, a=1@3, b=2@10}}  
** in view:  
***  sstable1: {{(k=1&a=1)@TS(0,0), b=1}}
***  sstable2: {{(k=1&a=1)@TS(10,0), b=2@10}}
***  sstable3: {{(k=1&a=1)@Shadowable(10,0)}} & {{(k=1&a=2)@TS(10,2), 
b=2@10}} 
***  sstable4: {{(k=1&a=2)@Shadowable(10,2)}} & {{(k=1&a=1)@TS(10,3), 
b=2@10}}  // '(k=1&a=1)' has the biggest timestamp '10' and viewKeyTs '3' 
from column 'a'
***  or merged: {{(k=1&a=1)@TS(10,3), b=2@10}}
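
The reconciliation rule in the Idea section can be written down as a short Python sketch and checked against the state after the last statement of the example. The two dataclasses only mirror the fields named in the comment; they and the function tombstone_wins are illustrative assumptions, not existing Cassandra classes or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowableTombstone:
    deletion_time: int
    view_key_ts: int


@dataclass(frozen=True)
class ShadowableLivenessInfo:
    timestamp: int
    view_key_ts: int


def tombstone_wins(tombstone: ShadowableTombstone,
                   liveness: ShadowableLivenessInfo) -> bool:
    """Apply the proposed rule: compare timestamps, break ties on viewKeyTs."""
    if tombstone.deletion_time > liveness.timestamp:
        return True
    if tombstone.deletion_time < liveness.timestamp:
        return False
    # Tie on timestamp: the tombstone wins only if its viewKeyTs is not older.
    return tombstone.view_key_ts >= liveness.view_key_ts


# View entry (k=1, a=1): Shadowable(10,0) from sstable3, TS(10,3) from sstable4.
a1_is_live = not tombstone_wins(ShadowableTombstone(10, 0),
                                ShadowableLivenessInfo(10, 3))

# View entry (k=1, a=2): TS(10,2) from sstable3, Shadowable(10,2) from sstable4.
a2_is_live = not tombstone_wins(ShadowableTombstone(10, 2),
                                ShadowableLivenessInfo(10, 2))

print(a1_is_live, a2_is_live)
```

Under this rule the re-inserted entry for a=1 survives the older shadowable deletion even though both carry timestamp 10, while the entry for a=2 is shadowed, which agrees with the merged view state listed in the example.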





[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/11/17 2:03 PM:
---


[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang commented on CASSANDRA-11500:
--





[jira] [Commented] (CASSANDRA-10654) Make MV streaming rebuild parallel

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081854#comment-16081854
 ] 

ZhaoYang commented on CASSANDRA-10654:
--

Is this issue still valid now that CASSANDRA-13065 is merged? When 
bootstrapping, the base data no longer goes through the write path.

> Make MV streaming rebuild parallel
> ----------------------------------
>
>                 Key: CASSANDRA-10654
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10654
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>             Fix For: 4.x
>
>
> When streaming an sstable that is a base table for one or more materialized 
> views, we force the data through the mutation path to ensure the MVs are 
> updated.
> We currently do this sequentially, so it's a bottleneck. We should do this in 
> parallel. We want to be smart so as not to saturate the mutations in the 
> non-bootstrap case.
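
As a rough illustration of "parallel, but without saturating the mutation path", the following Python sketch applies work items through a thread pool with a fixed upper bound on concurrency. It is a generic pattern only and says nothing about how Cassandra would implement this: apply_mutation, rebuild_views and the max_in_flight parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def apply_mutation(partition: str) -> str:
    # Hypothetical stand-in for pushing one streamed partition through the
    # mutation path so that its materialized views get updated.
    return f"applied:{partition}"


def rebuild_views(partitions: List[str], max_in_flight: int = 4) -> List[str]:
    # At most max_in_flight mutations run at the same time; the rest wait in
    # the executor's queue, which bounds the pressure on the write path.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(apply_mutation, partitions))


results = rebuild_views(["p1", "p2", "p3"], max_in_flight=2)
print(results)
```

A lower bound could be chosen for the non-bootstrap case, where regular writes compete for the same resources.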






[jira] [Updated] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-10 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13573:
-
Status: Patch Available  (was: Awaiting Feedback)

> sstabledump doesn't print out tombstone information for frozen set collection
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13573
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, CQL, Materialized Views, Tools
>            Reporter: Stefano Ortolani
>            Assignee: ZhaoYang
>






[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-10 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 4:16 PM:
---

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/130] | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13573] |

unit test: passed.
dtest: {{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} and 
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}} 
have both been broken for some time.

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now: {{a}}. for a non-frozen collection, returns the 
value type; {{b}}. otherwise, returns the column type.
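
The rule in change 2 can be modelled in a few lines of Python. This is only a sketch of the decision described above, not the patched Java method: ColumnType and its fields are hypothetical, and value_type stands for whatever type a collection reports for the values of its individual cells.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnType:
    name: str
    is_collection: bool = False
    is_frozen: bool = False
    # Per-cell value type; only meaningful for non-frozen collections here.
    value_type: Optional["ColumnType"] = None


def cell_value_type(column_type: ColumnType) -> ColumnType:
    # a. A non-frozen collection is stored as one cell per element, so each
    #    cell holds a value of the collection's value type.
    if column_type.is_collection and not column_type.is_frozen:
        return column_type.value_type
    # b. Everything else, including a frozen collection, is stored as a single
    #    cell whose value has the column's own type.
    return column_type


INT = ColumnType("int")
live_list = ColumnType("list<int>", is_collection=True, value_type=INT)
frozen_set = ColumnType("frozen<set<int>>", is_collection=True, is_frozen=True)

print(cell_value_type(live_list).name, cell_value_type(frozen_set).name)
```

In this model a frozen set column yields its own type, so a dump tool would decode the whole serialized set rather than treating the cell as a single element value.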



was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13573] |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  {{a}}. if non-frozen collection, return 
value type, {{b}}. otherwise, return column type.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078214#comment-16078214
 ] 

ZhaoYang edited comment on CASSANDRA-13576 at 7/10/17 3:20 AM:
---

:D  LGTM


was (Author: jasonstack):
:D

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Alex Petrov
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 16.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 32.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, 

[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 3:11 AM:
---

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now returns:  {{a}}. the value type, if the column 
is a non-frozen collection; {{b}}. otherwise, the current column type.

Going to add more dtests for sstabledump, MV and SASI.


was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  a. only if non-frozen collect

going to add more dtests on Dump, MV and SASI.

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Commented] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079873#comment-16079873
 ] 

ZhaoYang commented on CASSANDRA-13127:
--

[~tjake] Yes, it will have to read existing data more often.

The extra case is an update to base columns that are not used in the view.
If a user has multiple views on a base table, read-before-write is very likely 
to happen anyway, so the original optimization won't help much.

Another naive option is to disable TTL entirely on tables with MVs, because it 
may not make sense to denormalize short-lived data.
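
The TTL-versus-expiration comparison raised in the issue description can be illustrated with the numbers from the quoted repro. This is only a sketch of the reporter's hypothesis, not Cassandra's code: `Liveness`, `pick_by_ttl` and `pick_by_expiration` are hypothetical names, and time is counted in seconds from the first INSERT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Liveness:
    """Hypothetical stand-in for a liveness entry: write time plus TTL."""
    written_at: int  # seconds since the first INSERT in the repro
    ttl: int         # TTL in seconds supplied with the write

    @property
    def expires_at(self) -> int:
        return self.written_at + self.ttl


def pick_by_ttl(a: Liveness, b: Liveness) -> Liveness:
    # Suspected behaviour: the entry with the larger TTL wins.
    return a if a.ttl >= b.ttl else b


def pick_by_expiration(a: Liveness, b: Liveness) -> Liveness:
    # Alternative: the entry that stays live the longest wins.
    return a if a.expires_at >= b.expires_at else b


# INSERT ... USING TTL 10 at second 0, then UPDATE ... USING TTL 8 at second 6.
row_insert = Liveness(written_at=0, ttl=10)
cell_update = Liveness(written_at=6, ttl=8)

print(pick_by_ttl(row_insert, cell_update).expires_at)         # 10
print(pick_by_expiration(row_insert, cell_update).expires_at)  # 14
```

With these inputs, choosing by TTL keeps the liveness that expires at second 10, while choosing by expiration time keeps the one that lives until second 14. That is consistent with the base row still being live when the view row disappears.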



> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, ran against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.






[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-10 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079644#comment-16079644
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 3:08 AM:
---

Some issues are related to {{ColumnMetadata.cellValueType()}}, which currently: 
 {{a}}. if the column is a CollectionType, returns the value's type; {{b}}. 
otherwise, returns its own type.

It doesn't properly handle:  {{1}}. frozen collection types,  {{2}}. non-frozen 
UDTs, which require a cellPath to retrieve the value's type.

There are 3 kinds of usage of {{ColumnMetadata.cellValueType()}}:

1. To check whether a column is a counter type. This is safe.

2. In MV, to check whether a cell value (a base non-key column that is part of 
the view's primary key) has changed. 
In the existing implementation, it gets the {{cellValueType}} of a {{frozen 
collection}} and uses it to decode the {{frozen collection bytes}}.
This can easily result in a runtime error, e.g. when the base has a non-key 
column {{frozen>}} in the view's primary key, so {{tuple type}} is used to 
decode {{frozen}}.
 
 {{non-frozen-udt}} cannot be used in a view's primary key, so the issue here 
is only with {{frozen-collection}}.
 
3. In {{sasi}}. I haven't checked how it is affected; will check it later.
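
The behaviour described above, and the change listed elsewhere in this thread ("if non-frozen collection, return value type, otherwise, return column type"), can be modelled roughly as follows. This is a simplified sketch, not the Cassandra implementation: `ColumnType`, `cell_value_type_before` and `cell_value_type_after` are hypothetical names, and a list column is used purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnType:
    """Hypothetical, heavily simplified model of a column type."""
    name: str
    is_collection: bool = False
    is_multi_cell: bool = False  # True only for non-frozen collections
    value_type: Optional["ColumnType"] = None


def cell_value_type_before(t: ColumnType) -> ColumnType:
    # Described old behaviour: any collection yields its value type,
    # even a frozen one that is stored as a single serialized cell.
    return t.value_type if t.is_collection else t


def cell_value_type_after(t: ColumnType) -> ColumnType:
    # Described new behaviour: only a non-frozen collection yields its
    # value type; every other column yields its own type.
    if t.is_collection and t.is_multi_cell:
        return t.value_type
    return t


INT = ColumnType("int")
frozen_list = ColumnType("frozen<list<int>>", is_collection=True, value_type=INT)
live_list = ColumnType("list<int>", is_collection=True, is_multi_cell=True,
                       value_type=INT)

print(cell_value_type_before(frozen_list).name)  # int
print(cell_value_type_after(frozen_list).name)   # frozen<list<int>>
print(cell_value_type_after(live_list).name)     # int
```

In this model the old function hands back the element type for a frozen column, so the whole serialized collection would be decoded with the wrong type, which is the kind of mismatch the comment warns about.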


was (Author: jasonstack):
Some issues are related to {{ColumnMetadata.cellValueType()}} which currently : 
 {{a}}. if CollectionType, returns value's type; {{b}} otherwise,  its own type.

It doesn't handle properly:  {{1}}. frozen collection type,  {{2}}. non-frozen 
udt which requires cellPath to retrieve value's type..  

There are 3 kind of usage for {{ColumnMetadata.cellValueType()}}:

1. to check if column is counter type, it's safe

2. used in MV to check if cell value (base's non key column in view's primary 
key) is changed. 
in the existing implementation, it will get {{cellValueType}} of a {{frozen 
collection}} to decode {{frozen collection bytes}}
it can easily cause runtime error. eg. base has non-key column 
{{frozen>}}  as view's primary key. so  {{tuple type}} is used to decode {{frozen}}
 
 {{non-frozen-udt}} cannot be used as view's primary key. the issue here is 
only use {{frozen-collection}}.
 
3. in {{sasi}}.  haven't check how it is affected.  will check it later

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079873#comment-16079873
 ] 

ZhaoYang edited comment on CASSANDRA-13127 at 7/10/17 5:07 AM:
---

[~tjake] Yes, it will have to read existing data more often.

The extra case is an update to base columns that are not used in the view.
If a user has multiple views on a base table, read-before-write is very likely 
to happen anyway, so the original optimization won't help much.

Another naive option is to ignore the TTL effect on the view in the given 
example, or to disable TTL entirely on tables with MVs, because it may not make 
sense to denormalize short-lived data.




was (Author: jasonstack):
[~tjake]  yes, it will have to read existing data more often. 

the extra cases are:  when updating base's columns which are not used in view.  
If a user has multiple views on a base, read-before-write is very likely to 
happen, the original optimization won't help too much.

another naive way is totally disable TTL on table with MV. because it may not 
make sense to denormalize a short-life data..



> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, ran against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.






[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079644#comment-16079644
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 3:07 AM:
---

Some issues are related to {{ColumnMetadata.cellValueType()}}, which currently: 
 {{a}}. if the column is a CollectionType, returns the value's type; {{b}}. 
otherwise, returns its own type.

It doesn't properly handle:  {{1}}. frozen collection types,  {{2}}. non-frozen 
UDTs, which require a cellPath to retrieve the value's type.

There are 3 kinds of usage of {{ColumnMetadata.cellValueType()}}:

1. To check whether a column is a counter type. This is safe.

2. In MV, to check whether a cell value (a base non-key column that is part of 
the view's primary key) has changed. 
In the existing implementation, it gets the {{cellValueType}} of a {{frozen 
collection}} and uses it to decode the {{frozen collection bytes}}.
This can easily cause a runtime error, e.g. when the base has a non-key 
column {{frozen>}} in the view's primary key, so {{tuple type}} is used to 
decode {{frozen}}.
 
 {{non-frozen-udt}} cannot be used in a view's primary key, so the issue here 
only concerns {{frozen-collection}}.
 
3. In {{sasi}}. I haven't checked how it is affected; will check it later.


was (Author: jasonstack):
Some issues are related to {{ColumnMetadata.cellValueType()}} which currently : 
 {{a}}. if CollectionType, returns value's type; {{b}} otherwise,  its own type.

It doesn't handle properly:  {{1}}. frozen collection type,  {{2}}. non-frozen 
udt which requires cellPath to retrieve value's type..  




> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Updated] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13573:
-
Component/s: Materialized Views
 CQL
 Core

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 3:16 AM:
---

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now returns:  {{a}}. the value type, if the column 
is a non-frozen collection; {{b}}. otherwise, the column type.

Going to add more dtests for sstabledump, MV and SASI.


was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  {{a}}. if non-frozen collect, return value 
type, {{b}}. otherwise, return current column type.

going to add more dtests on sstabledump, MV and SASI.

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Commented] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-09 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang commented on CASSANDRA-13573:
--

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now returns the value type only for a non-frozen 
collection.

Going to add more dtests for Dump, MV and SASI.

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Updated] (CASSANDRA-13573) ColumnMetadata.cellValueType() doesn't return correct type for non-frozen collection

2017-07-10 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13573:
-
Summary: ColumnMetadata.cellValueType() doesn't return correct type for 
non-frozen collection  (was: ColumnMetadata.cellValueType() doesn't return 
correct type for sstabledump, view, sasi)

> ColumnMetadata.cellValueType() doesn't return correct type for non-frozen 
> collection
> 
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Commented] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-10 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081517#comment-16081517
 ] 

ZhaoYang commented on CASSANDRA-13573:
--

About the impact on SASI: SASI currently doesn't support {{complex}} types 
(i.e. types with a cellPath: non-frozen collections or UDTs).  

With {{column.cellValueType}} fixed, {{columnIndex.isLiteral()}} now properly 
returns {{false}} if the indexed column is a {{frozen-collection}}.

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Updated] (CASSANDRA-13573) ColumnMetadata.cellValueType() doesn't return correct type for sstabledump, view, sasi

2017-07-10 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13573:
-
Summary: ColumnMetadata.cellValueType() doesn't return correct type for 
sstabledump, view, sasi  (was: sstabledump doesn't print out tombstone 
information for frozen set collection)

> ColumnMetadata.cellValueType() doesn't return correct type for sstabledump, 
> view, sasi
> --
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data"
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13573) ColumnMetadata.cellValueType() doesn't return correct type for non-frozen collection

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/11/17 6:55 AM:
---

First draft of the patch. If it looks good, I will prepare fixes for 
2.2/3.0/3.11 as well.
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/130] | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13573] |

unit test: passed.
dtest: {{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} and 
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}}
 have both been broken for some time.

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now returns: (a) the value type for a non-frozen 
collection; (b) the column type otherwise.
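
The rule in change 2 can be sketched roughly as follows. This is only an illustration of the described behavior, not the patch itself: {{ColumnType}}, its fields and the string type names are hypothetical stand-ins, not Cassandra's real type classes.

```java
final class CellValueTypeSketch {
    /** Hypothetical, simplified column type; not Cassandra's real type hierarchy. */
    static final class ColumnType {
        final String name;        // full type name, e.g. "list<int>" or "int"
        final String elementType; // value type for collections, null otherwise
        final boolean frozen;

        ColumnType(String name, String elementType, boolean frozen) {
            this.name = name;
            this.elementType = elementType;
            this.frozen = frozen;
        }

        boolean isCollection() {
            return elementType != null;
        }
    }

    /** (a) non-frozen collection: value type; (b) anything else: the column type itself. */
    static String cellValueType(ColumnType column) {
        if (column.isCollection() && !column.frozen) {
            return column.elementType;
        }
        return column.name;
    }

    public static void main(String[] args) {
        // Non-frozen collection: each cell holds one element, so the value type is used.
        System.out.println(cellValueType(new ColumnType("list<int>", "int", false)));
        // Frozen collection: the whole collection is one cell, so the column type is used.
        System.out.println(cellValueType(new ColumnType("frozen<set<int>>", "int", true)));
        // Regular column: the column type is used.
        System.out.println(cellValueType(new ColumnType("int", null, false)));
    }
}
```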



was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/130] | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13573] |

unit test: passed.
dtest: {{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} & 
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}}
 both are broken for some time

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  {{a}}. if non-frozen collection, return 
value type, {{b}}. otherwise, return column type.


> ColumnMetadata.cellValueType() doesn't return correct type for non-frozen 
> collection
> 
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-07 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062967#comment-16062967
 ] 

ZhaoYang edited comment on CASSANDRA-13127 at 7/7/17 9:25 AM:
--

|| source || junit-result || dtest-result||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13127-trunk] 
| [utest|https://circleci.com/gh/jasonstack/cassandra/115] | 
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13576]
{{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} |
| [dtest| https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13127 
 ] | \ | \ |

1. View.mayBeAffectedBy will return true as long as the view-key-values are not 
filtered out.

2. LivenessInfo.supersedes(another) will compare localDeletionTime if the timestamps 
are the same; the greater localDeletionTime supersedes. 
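
A minimal sketch of the tie-break described in point 2, assuming a simplified {{Info}} class with only a timestamp and a localDeletionTime (hypothetical; the real LivenessInfo has more state and this is not the patch code):

```java
final class LivenessSupersedesSketch {
    /** Hypothetical, reduced stand-in for liveness info. */
    static final class Info {
        final long timestamp;        // write timestamp
        final int localDeletionTime; // local time at which the info expires

        Info(long timestamp, int localDeletionTime) {
            this.timestamp = timestamp;
            this.localDeletionTime = localDeletionTime;
        }

        /** Higher timestamp wins; on equal timestamps the later expiration wins. */
        boolean supersedes(Info other) {
            if (timestamp != other.timestamp) {
                return timestamp > other.timestamp;
            }
            return localDeletionTime > other.localDeletionTime;
        }
    }

    public static void main(String[] args) {
        Info expiresEarly = new Info(100L, 1010);
        Info expiresLate = new Info(100L, 1014);
        System.out.println(expiresLate.supersedes(expiresEarly)); // true
        System.out.println(expiresEarly.supersedes(expiresLate)); // false
    }
}
```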
 



was (Author: jasonstack):
|| source || junit-result || dtest-result||
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13127-trunk] 
| [utest|https://circleci.com/gh/jasonstack/cassandra/62] | |
| [dtest| https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13127 
 ] | \ | \ |

1. View.mayBeAffectedBy  will return true as long as view-key-values are not 
filtered

2. LivenessInfo.supersedes(another) will check localDeletionTime if timestamps 
are the same. greater localDeletionTimestamp supersedes. 
 


> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.
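
The reporter's hypothesis (comparing TTLs rather than expiration times) can be illustrated with the numbers from the commands above. This is a sketch of the arithmetic only, under the assumption that the hypothesis is right; the {{Liveness}} class and both pick methods are hypothetical, not Cassandra code.

```java
final class ViewLivenessSketch {
    /** Hypothetical liveness record: write time and TTL, both in seconds. */
    static final class Liveness {
        final int writtenAt;
        final int ttl;

        Liveness(int writtenAt, int ttl) {
            this.writtenAt = writtenAt;
            this.ttl = ttl;
        }

        int expiresAt() {
            return writtenAt + ttl;
        }
    }

    /** Suspected behavior: keep whichever liveness has the larger TTL. */
    static Liveness pickByTtl(Liveness a, Liveness b) {
        return a.ttl >= b.ttl ? a : b;
    }

    /** Alternative: keep whichever liveness expires later. */
    static Liveness pickByExpiration(Liveness a, Liveness b) {
        return a.expiresAt() >= b.expiresAt() ? a : b;
    }

    public static void main(String[] args) {
        Liveness insert = new Liveness(0, 10); // INSERT ... USING TTL 10 at t=0
        Liveness update = new Liveness(6, 8);  // UPDATE ... USING TTL 8 at t=6

        // Comparing TTLs keeps the insert, so the view row would expire at t=10.
        System.out.println(pickByTtl(insert, update).expiresAt());        // 10
        // Comparing expiration times keeps the update, so it would live until t=14.
        System.out.println(pickByExpiration(insert, update).expiresAt()); // 14
    }
}
```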






[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-07 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077805#comment-16077805
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/7/17 8:32 AM:
--

I am thinking of changing all {{type.getString()}} calls in {{JsonTransformer}} to 
{{type.toJSONString}} once 
[13592|https://issues.apache.org/jira/browse/CASSANDRA-13592] is merged.

Currently {{collectionType.getString()}} only renders the entire data as a byte 
string. Imo, {{getString}} is not designed to generate JSON-readable values.


was (Author: jasonstack):
I am thinking to change all {{type.getString()}} in {{JsonTransformer}} to 
{{type.toJSONString}} when 
[13592|https://issues.apache.org/jira/browse/CASSANDRA-13592].

Current {{collectionType.getString()}} only generates entire data as byte 
string. Imo, it's not designed for generate json readable values. 

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Commented] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-07 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077805#comment-16077805
 ] 

ZhaoYang commented on CASSANDRA-13573:
--

I am thinking of changing all {{type.getString()}} calls in {{JsonTransformer}} to 
{{type.toJSONString}} once 
[13592|https://issues.apache.org/jira/browse/CASSANDRA-13592] is merged.

Currently {{collectionType.getString()}} only renders the entire data as a byte 
string. Imo, it's not designed to generate JSON-readable values. 

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Updated] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-07 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13573:
-
Status: Awaiting Feedback  (was: In Progress)

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id| subject_ids
> +--+-
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Assigned] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-07 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-11500:


Assignee: ZhaoYang

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp for 
> any columns participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with a higher 
> timestamp which won't be deleted, as can easily be shown by the failure of the 
> following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a similar problem to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. This means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in the 
> example above).  It's doable (and not that hard really), but it does require 
> a change to the sstable and intra-node protocol, which makes this a bit 
> painful right now.
> But I'll also note that, as CASSANDRA-1096 shows, the timestamp is not even 
> enough, since on equal timestamps the value can be the deciding factor. So in 
> theory we'd have to ship the value of those columns (in the case of a 
> deletion at least, since we have it in the view PK for updates). That said, on 
> that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting timestamps 
> anymore, which would make this problem go away.
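
The second scenario in the description can be walked through with a small sketch. It only models the timestamp comparison and is an interpretation of the text rather than Cassandra code: both method names are hypothetical, and the second method is one assumed reading of "ship the timestamp of any base table column that is part of the view PK".

```java
final class ShadowableDeletionSketch {
    /** Timestamp-only rule: the entry survives only if it is newer than the deletion. */
    static boolean survivesByRowTimestamp(long entryTimestamp, long deletionTimestamp) {
        return entryTimestamp > deletionTimestamp;
    }

    /** Assumed alternative: compare the timestamps of the view-PK base column (a) instead. */
    static boolean survivesByKeyColumnTimestamp(long entryKeyTimestamp, long deletionKeyTimestamp) {
        return entryKeyTimestamp > deletionKeyTimestamp;
    }

    public static void main(String[] args) {
        // "TIMESTAMP 2 SET a = 2" deletes the a=1 view entry with timestamp 10;
        // "TIMESTAMP 3 SET a = 1" re-inserts it with liveness timestamp 3.
        System.out.println(survivesByRowTimestamp(3, 10));      // false: re-insert is shadowed
        // With the timestamp of column a shipped: deletion carries 2, re-insert carries 3.
        System.out.println(survivesByKeyColumnTimestamp(3, 2)); // true: re-insert is visible
        // "TIMESTAMP 4 SET a = 2" must then remove the a=1 entry again.
        System.out.println(survivesByKeyColumnTimestamp(3, 4)); // false: entry is removed
    }
}
```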






[jira] [Assigned] (CASSANDRA-13622) Better config validation/documentation

2017-07-12 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-13622:


Assignee: ZhaoYang

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
>
> There are a number of properties in the yaml that are "in_mb" but 
> resolve to bytes when calculated in {{DatabaseDescriptor.java}}, and are 
> stored in ints. This means that their maximum value is 2047, as any higher 
> value overflows the int when converted to bytes.
> Where possible/reasonable we should convert these to longs and store them as 
> longs. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet make it an error if set 
> higher than that. Note that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary, in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> The following functions could currently be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.
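
The overflow described above is easy to reproduce in isolation. The sketch below is not taken from {{DatabaseDescriptor.java}}; the class and method names are hypothetical and only show the int arithmetic, a long-based conversion, and one possible way to turn the overflow into an error.

```java
final class InMbOverflowSketch {
    /** Risky pattern: megabytes converted to bytes using int arithmetic. */
    static int toBytesAsInt(int megabytes) {
        return megabytes * 1024 * 1024; // silently wraps for values above 2047
    }

    /** Widening to long before multiplying avoids the wrap-around. */
    static long toBytesAsLong(int megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** One possible validation: fail loudly when the byte count does not fit in an int. */
    static int toBytesChecked(int megabytes) {
        return Math.multiplyExact(megabytes, 1024 * 1024); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        System.out.println(toBytesAsInt(2047));  // 2146435072
        System.out.println(toBytesAsInt(2048));  // -2147483648
        System.out.println(toBytesAsLong(2048)); // 2147483648
        try {
            toBytesChecked(2048);
        } catch (ArithmeticException e) {
            System.out.println("rejected: value too large for an int byte count");
        }
    }
}
```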






[jira] [Assigned] (CASSANDRA-13622) Better config validation/documentation

2017-07-12 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-13622:


Assignee: (was: ZhaoYang)

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Priority: Minor
>  Labels: lhf
>
> There are a number of properties in the yaml that are "in_mb" but 
> resolve to bytes when calculated in {{DatabaseDescriptor.java}}, and are 
> stored in ints. This means that their maximum value is 2047, as any higher 
> value overflows the int when converted to bytes.
> Where possible/reasonable we should convert these to longs and store them as 
> longs. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet make it an error if set 
> higher than that. Note that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary, in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> The following functions could currently be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.






[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083327#comment-16083327
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

[~jjirsa] could you review ? thanks..

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Comment Edited] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-12 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083764#comment-16083764
 ] 

ZhaoYang edited comment on CASSANDRA-13127 at 7/12/17 10:24 AM:


{{when updating base's columns which are not used in view.}} is about whether 
we should treat `Update semantics the same as Insert`. (For now they are not the 
same: an update statement has no primary key liveness info, so the PK's liveness 
for an update statement depends on the liveness of the normal columns.)

e.g. current semantics/behavior:
{quote}
base table: create table ks.base (p int, c int, v int, primary key (p, c))
view table:  select p,c from ks.base ... primary key (c, p)
an update query on base table normal column {{v}} will not generate any rows.
a delete query on base table normal column {{v}} will remove the base row.
{quote}

Imo, to avoid user confusion, it's better to keep the view matched with the base 
regardless of the update semantics. (The current patch is not general enough to 
cover all such cases yet.)
[~tjake][~slebresne] what do you think?



was (Author: jasonstack):
{{when updating base's columns which are not used in view.}} is about whether 
we should consider `Update semantic the same as Insert`. (for now it's not the 
same. update-statement has no primary key liveness info. pk's liveness of 
update-statement is depending on liveness of normal columns)

eg.   
{quote}
base table: create table ks.base (p int, c int, v int, primary key (p, c))
view table:  select p,c from ks.base ... primary key (c, p)
an update query on base table normal column {{v}} will not generate any rows..
a delete query on base table normal column {{v}} will remove the base row.
{quote}

imo, to avoid user confusion, it's better to keep view matched with base 
regardless semantic of update..(the current patch is not general enough to 
cover all such cases yet)
[~tjake][~slebresne] what do you think?


> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+
>  0 | 0 |  3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.






[jira] [Updated] (CASSANDRA-12958) Cassandra Not Starting NullPointerException at org.apache.cassandra.db.index.SecondaryIndex.createInstance

2017-07-13 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-12958:
-
Status: Open  (was: Patch Available)

> Cassandra Not Starting NullPointerException at 
> org.apache.cassandra.db.index.SecondaryIndex.createInstance
> --
>
> Key: CASSANDRA-12958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: CentOS
>Reporter: Ashraful Islam
>Assignee: ZhaoYang
>  Labels: Bug, not_start, secondary_index
>
> The whole process leading to this issue is given below: 
> # Dropped a secondary index.
> # Ran repair on the cluster.
> # 15 days after dropping the index, changed the following setting in 
> cassandra.yaml:
> index_summary_resize_interval_in_minutes: -1 
> (because redistributing the index was taking a lot of time while adding nodes)
> # Performed a rolling restart of all nodes.
> # While adding a fresh node, live nodes started going down.
> After two nodes went down, we stopped adding the node. 
> This is the error Cassandra logs in system.log while restarting the down nodes: 
> {noformat}
> INFO  [main] 2016-11-27 00:51:48,220 ColumnFamilyStore.java:382 - Initializing ringid.verifiedmobile
> ERROR [main] 2016-11-27 00:51:48,236 CassandraDaemon.java:651 - Exception encountered during startup
> java.lang.NullPointerException: null
>     at org.apache.cassandra.db.index.SecondaryIndex.createInstance(SecondaryIndex.java:378) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.index.SecondaryIndexManager.addIndexedColumn(SecondaryIndexManager.java:279) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:407) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:354) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:535) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:511) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:342) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:270) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.Keyspace.open(Keyspace.java:116) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.db.Keyspace.open(Keyspace.java:93) ~[apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:256) [apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:529) [apache-cassandra-2.2.4.jar:2.2.4]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:638) [apache-cassandra-2.2.4.jar:2.2.4]
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-18 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091325#comment-16091325
 ] 

ZhaoYang edited comment on CASSANDRA-13127 at 7/18/17 9:16 AM:
---

[~doanduyhai] yes, it has been considered. 

There are 3 cases:
# A TTLed base column used in the view's PK or the view's filter condition will 
wipe the entire view row.
# A TTLed base column selected in the view will work as it does now.
# A TTLed base column not selected in the view: if the view has no other live 
cells or live primary key, the view row is gone.

Please feel free to point out any other cases; I will summarize the solution in 
CASSANDRA-11500.


was (Author: jasonstack):
[~doanduyhai] yes, it's considered. 

There are 3 cases:
TTLed base column as view's PK or view's filter condition will wipe entire view 
row.
TTLed base column as selected in view will work as it is
TTLed base column not selected in view, if there is no other live cells or 
primary key of view, view row is gone.

> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, ran against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.
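
The suspicion in the description can be illustrated with a small sketch. It is not Cassandra's code: the {{Liveness}} class and the two {{pick...}} helpers are hypothetical, and times are plain seconds counted from the INSERT. It only shows why choosing by TTL and choosing by expiration time disagree for the commands above (TTL 10 written at second 0, TTL 8 written at second 6).

```java
final class ExpirationVsTtl {
    /** Hypothetical stand-in for liveness info: write time and TTL, both in seconds. */
    static final class Liveness {
        final int writtenAtSec;
        final int ttlSec;

        Liveness(int writtenAtSec, int ttlSec) {
            this.writtenAtSec = writtenAtSec;
            this.ttlSec = ttlSec;
        }

        int expiresAtSec() {
            return writtenAtSec + ttlSec;
        }
    }

    /** Keeps the entry with the larger TTL (the comparison the reporter suspects). */
    static Liveness pickByTtl(Liveness a, Liveness b) {
        return a.ttlSec >= b.ttlSec ? a : b;
    }

    /** Keeps the entry that expires last. */
    static Liveness pickByExpiration(Liveness a, Liveness b) {
        return a.expiresAtSec() >= b.expiresAtSec() ? a : b;
    }

    public static void main(String[] args) {
        Liveness insert = new Liveness(0, 10); // INSERT ... USING TTL 10
        Liveness update = new Liveness(6, 8);  // UPDATE ... USING TTL 8, after sleep 6

        System.out.println("chosen by TTL, expires at second "
                + pickByTtl(insert, update).expiresAtSec());
        System.out.println("chosen by expiration, expires at second "
                + pickByExpiration(insert, update).expiresAtSec());
    }
}
```

If the view's liveness were chosen by TTL, it would expire at second 10, which is when the view row disappears in the output above, while the base row stays live until second 14.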






[jira] [Commented] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-18 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091325#comment-16091325
 ] 

ZhaoYang commented on CASSANDRA-13127:
--

[~doanduyhai] yes, it's considered. 

There are 3 cases:
* A TTLed base column used in the view's PK or in the view's filter condition 
will wipe the entire view row.
* A TTLed base column selected in the view works as it does today.
* A TTLed base column not selected in the view: if there are no other live 
cells and no live view primary key, the view row is gone.

> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, ran against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-18 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/18/17 3:10 PM:
---

h3. Relation: base -> view

First of all, I think all of us should agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and view has no 
filter conditions or only conditions on base pk.
(filter conditions are not a concern here, since there is no previous view 
data to be cleared)

view row exists if any of the following is true:
* a. base row pk has live livenessInfo(timestamp) and base row pk satisfies 
view's filter conditions if any.
* b. or one of the base row columns selected in view has a live timestamp (via 
update) and base row pk satisfies view's filter conditions if any. This is 
handled by the existing mechanism of liveness and tombstones, since all the 
info is included in the view row.
* c. or one of the base row columns not selected in view has a live timestamp 
(via update) and base row pk satisfies view's filter conditions if any. Those 
unselected columns' timestamp/ttl/cell-deletion info is currently not stored 
on the view row. 

2. base column used in view pk, or view has filter conditions on a base 
non-key column, either of which can also lead to the entire view row being 
wiped.

view row exists if any of the following is true:
* a. base row pk has live livenessInfo(timestamp) && base column used in view 
pk is not null but has no timestamp && conditions are satisfied. (pk having 
live livenessInfo means it is not deleted by a tombstone)
* b. or base row column in view pk has a timestamp (via update) && conditions 
are satisfied. e.g. if the base column used in view pk is TTLed, the entire 
view row should be wiped.

Next thing is to model "shadowable tombstone or shadowable liveness" to 
maintain view data based on the above cases.
 
h3. Previous known issues: 
(I might miss some issues, feel free to ping me..)

ttl
* view row is not wiped when a TTL expires on a base column used in view pk or 
on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of current update semantics, no view 
updates are generated
* corresponding view row liveness does not depend on the liveness of base 
columns

filter conditions or base column used in view pk
* view row is shadowed after a few modifications on the base column used in 
view pk if the base non-key column has a TS greater than base pk's ts and view 
key column's ts. (as mentioned by sylvain: we need to be able to re-insert an 
entry when a prior one had been deleted, and need to be careful to handle 
timestamp ties)

tombstone merging is not commutative
* in current code, shadowable tombstone doesn't co-exist with regular tombstone

sstabledump doesn't support current shadowable tombstone

h3. Model (TO BE UPDATED)

{{ShadowableTombstone}} : 
* deletion-time, isShadowable, and "viewKeyTs", aka the ts of the base column 
which is part of view pk (used to reconcile timestamp ties). If there is no 
timestamp associated with that column, use base pk timestamp instead.
* it's only generated when one base column is a pk in view and this base column 
value is changed in base row, to mark previous view row as deleted (original 
definition of {{shadowable}} in CASSANDRA-10261). In other cases, {{standard 
tombstone}} is generated for view rows.
* if {{ShadowableTombstone}} is superseded by {{LivenessInfo}}, columns 
shadowed by {{ShadowableTombstone}} will come back alive. (original definition 
of {{shadowable}} in CASSANDRA-10261)
* {{ShadowableTombstone}} should co-exist with {{Standard Tombstone}} if 
{{shadowable}}'s deletion time supersedes {{standard tombstone}}, to avoid 
bringing columns older than {{standard tombstone}} back alive (as in 
CASSANDRA-13409)

{{ShadowableLivenessInfo}}:  
* timestamp, and "viewKeyTs"
* if shadowable and not live, all columns in this row are considered deleted as 
in CASSANDRA-13657 and CASSANDRA-13127 to solve partial update issues

When reconciling {{ShadowableTombstone}} and {{ShadowableLivenessInfo}}: 
{quote}
if deletion-time is greater than timestamp, tombstone wins
if deletion-time is smaller than timestamp, livenessInfo wins
when deletion-time ties with timestamp, 
 - if {{ShadowableTombstone}}'s {{viewKeyTs}} >= {{ShadowableLivenessInfo}}'s, 
then tombstone wins
 - else livenessInfo wins.
{quote}
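
The reconcile rule quoted above can be written down as a minimal sketch. The class, enum and parameter names are hypothetical and only mirror the fields named in the proposal (deletion-time, timestamp, viewKeyTs). This is not code from the Cassandra tree or from the patch.

```java
final class ShadowableReconcile {
    enum Winner { TOMBSTONE, LIVENESS_INFO }

    /**
     * Hypothetical helper mirroring the proposed rule:
     * compare deletion-time with the liveness timestamp first,
     * and fall back to viewKeyTs only when the two tie.
     */
    static Winner reconcile(long deletionTime, long tombstoneViewKeyTs,
                            long livenessTimestamp, long livenessViewKeyTs) {
        if (deletionTime > livenessTimestamp) {
            return Winner.TOMBSTONE;
        }
        if (deletionTime < livenessTimestamp) {
            return Winner.LIVENESS_INFO;
        }
        // Tie on the main timestamp: the tombstone wins when its viewKeyTs is >=.
        return tombstoneViewKeyTs >= livenessViewKeyTs
                ? Winner.TOMBSTONE
                : Winner.LIVENESS_INFO;
    }

    public static void main(String[] args) {
        // Same deletion-time and timestamp (10), liveness has the newer viewKeyTs.
        System.out.println(reconcile(10, 2, 10, 3));
        // Same deletion-time and timestamp (10), equal viewKeyTs.
        System.out.println(reconcile(10, 3, 10, 3));
    }
}
```

The point of the tie-break is that a re-inserted view entry can win over an earlier deletion even when the main timestamps are equal, as long as its view key column was written later.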

When inserting to view, always use the greatest timestamp of all base columns 
in view similar to how view deletion timestamp is computed.

h3. *Example*

{quote}
CREATE TABLE t (k int PRIMARY KEY, a int, b int);
CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT 
NULL PRIMARY KEY (k, a);

{{q1}} INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
{{q2}} UPDATE t 

[jira] [Comment Edited] (CASSANDRA-13573) sstabledump doesn't print out tombstone information for frozen set collection

2017-07-10 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079806#comment-16079806
 ] 

ZhaoYang edited comment on CASSANDRA-13573 at 7/10/17 2:06 PM:
---

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13573] |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  {{a}}. if non-frozen collection, return 
value type, {{b}}. otherwise, return column type.



was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13573] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/124] | dtest |

changes:
1. use {{type.toJSONString()}} with {{json.writeRawValue()}} instead of 
{{type.getString()}} to generate readable content 
2. {{column.cellValueType}} now :  {{a}}. if non-frozen collection, return 
value type, {{b}}. otherwise, return column type.

going to add more dtests on sstabledump, MV and SASI.

> sstabledump doesn't print out tombstone information for frozen set collection
> -
>
> Key: CASSANDRA-13573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13573
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, CQL, Materialized Views, Tools
>Reporter: Stefano Ortolani
>Assignee: ZhaoYang
>
> Schema and data:
> {noformat}
> CREATE TABLE ks.cf (
> hash blob,
> report_id timeuuid,
> subject_ids frozen,
> PRIMARY KEY (hash, report_id)
> ) WITH CLUSTERING ORDER BY (report_id DESC);
> INSERT INTO ks.cf (hash, report_id, subject_ids) VALUES (0x1213, now(), 
> {1,2,4,5});
> {noformat}
> sstabledump output is:
> {noformat}
> sstabledump mc-1-big-Data.db 
> [
>   {
> "partition" : {
>   "key" : [ "1213" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 16,
> "clustering" : [ "ec01eed0-49d9-11e7-b39a-97a96f529c02" ],
> "liveness_info" : { "tstamp" : "2017-06-05T10:29:57.434856Z" },
> "cells" : [
>   { "name" : "subject_ids", "value" : "" }
> ]
>   }
> ]
>   }
> ]
> {noformat}
> While the values are really there:
> {noformat}
> cqlsh:ks> select * from cf ;
>  hash   | report_id                            | subject_ids
> --------+--------------------------------------+-------------
>  0x1213 | 02bafff0-49d9-11e7-b39a-97a96f529c02 |   {1, 2, 4}
> {noformat}






[jira] [Assigned] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-07-07 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang reassigned CASSANDRA-13576:


Assignee: Alex Petrov  (was: ZhaoYang)

:D

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Alex Petrov
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 16.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 32.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 64.0 seconds: [Errno 111] Tried connecting to 
> 

[jira] [Comment Edited] (CASSANDRA-13622) Better config validation/documentation

2017-07-16 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087060#comment-16087060
 ] 

ZhaoYang edited comment on CASSANDRA-13622 at 7/17/17 5:14 AM:
---

| [trunk| https://github.com/jasonstack/cassandra/commits/CASSANDRA-13622] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/153] | dtest: except for 1 
known error in bootstrap_test |

Handled the NPE caused by an empty entry in storage_directories, and handled 
the overflow when converting MB values to int32 KB values.


was (Author: jasonstack):
| [trunk| https://github.com/jasonstack/cassandra/commits/CASSANDRA-13622] | 
[unit|https://circleci.com/gh/jasonstack/cassandra/153] | dtest: except for 1 
known error in bootstrap_test |

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
>
> There are a number of properties in the yaml that are "in_mb" but resolve to 
> bytes when calculated in {{DatabaseDescriptor.java}}, and are stored in 
> ints. This means that their maximum value is 2047, as any higher value 
> overflows the int when converted to bytes.
> Where possible/reasonable we should convert these to longs, and store them as 
> longs. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet make it an error if set 
> higher than that. Note that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary, in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> Following are functions that currently could be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.
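
The 2047 limit in the description follows from 32-bit arithmetic, and a small sketch shows it. The helper names are hypothetical and not taken from {{DatabaseDescriptor.java}}. {{Math.multiplyExact}} is the standard JDK method that throws instead of wrapping.

```java
final class MbToBytes {
    /** Plain int arithmetic: silently wraps once the result exceeds Integer.MAX_VALUE. */
    static int toBytesUnchecked(int mb) {
        return mb * 1024 * 1024;
    }

    /** Widening to long before multiplying avoids the wrap. */
    static long toBytesAsLong(int mb) {
        return mb * 1024L * 1024L;
    }

    /** Keeps an int result but fails loudly (ArithmeticException) on overflow. */
    static int toBytesChecked(int mb) {
        return Math.multiplyExact(mb, 1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println(toBytesUnchecked(2047)); // largest MB value that still fits in an int
        System.out.println(toBytesUnchecked(2048)); // wraps to a negative number
        System.out.println(toBytesAsLong(2048));
        try {
            toBytesChecked(2048);
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same reasoning gives the KB limit mentioned in the list: 2096128 is 2047 * 1024, so it is the largest whole-MB value in KB that still fits in an int when multiplied by 1024.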






[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 5:42 AM:
---

| branch | unit | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
running | running |


When there is no local range and the node has joined the token ring, cleanup 
will remove all base local sstables.  


was (Author: jasonstack):
| branch | unit | dtest|
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
running | running |


When there is no local range and the node has joined the token ring, cleanup 
will remove all base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094272#comment-16094272
 ] 

ZhaoYang commented on CASSANDRA-11500:
--

All livenessInfo or row deletions in an MV will be ViewLivenessInfo or 
ViewDeletion, with some extra details to check whether the view row is still 
alive.

The shadowable mechanism is not used (a single flag is not sufficient, and in 
the proposal we don't need to bring back the columns shadowed by a 
shadowable-tombstone).

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp of 
> any column participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with a higher 
> timestamp which won't be deleted, as can easily be shown by the failure of 
> the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a problem similar to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. Which means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in 
> the example above).  It's doable (and not that hard really), but it does 
> require a change to the sstable and intra-node protocol, which makes this a 
> bit painful right now.
> But I'll also note that as CASSANDRA-1096 shows, the timestamp is not even 
> enough since on equal timestamp the value can be the deciding factor. So in 
> theory we'd have to ship the value of those columns (in the case of a 
> deletion at least since we have it in the view PK for updates). That said, on 
> that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting timestamp 
> anymore, which would make this problem go away.
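
The first failing test in the description can be traced with a minimal sketch. All names are hypothetical. The sketch assumes, as a simplification, that a deletion at timestamp T removes every cell written at a timestamp less than or equal to T. It models the old view entry (k=1, a=1) as the list of timestamps it carries: 0 from the INSERT and 4 from the update of {{b}}.

```java
final class ViewDeletionTimestamp {
    /** Largest timestamp carried by the old view entry. */
    static long maxTimestamp(long... timestamps) {
        long max = Long.MIN_VALUE;
        for (long ts : timestamps) {
            max = Math.max(max, ts);
        }
        return max;
    }

    /** Assumption of this sketch: a deletion covers data written at or before its timestamp. */
    static boolean entryFullyDeleted(long deletionTs, long... timestamps) {
        for (long ts : timestamps) {
            if (ts > deletionTs) {
                return false; // a newer cell survives the deletion
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] oldEntry = {0L, 4L}; // INSERT at 0, UPDATE of b at 4
        long viewPkTs = 2L;         // UPDATE of a at 2, used for the new entry's liveness

        // Deleting with the view PK timestamp leaves the b cell (ts 4) behind.
        System.out.println(entryFullyDeleted(viewPkTs, oldEntry));
        // Deleting with the biggest timestamp of the old entry removes it entirely.
        System.out.println(entryFullyDeleted(maxTimestamp(oldEntry), oldEntry));
    }
}
```

This covers only the first test. The second case in the description, where the entry for a=1 has to be re-inserted after a deletion at timestamp 10, is the one that a larger deletion timestamp alone does not solve.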






[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 5:36 AM:
---

| branch | unit | dtest|
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
running | running |


When there is no local range and the node has joined the token ring, cleanup 
will remove all base local sstables.  


was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | 
[dtest-source|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13526]
 |
| [unit|https://circleci.com/gh/jasonstack/cassandra/106] | dtest: 
{{cql_tests.py:SlowQueryTester.local_query_test}}{{cql_tests.py:SlowQueryTester.remote_query_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13592]
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13576]
 |

When there is no local range and the node has joined the token ring, cleanup 
will remove all base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094231#comment-16094231
 ] 

ZhaoYang commented on CASSANDRA-11500:
--

[~KurtG] The branch is not yet ready for you to test, but you could have a look 
at the 
[proposal|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]
 first and see if there is any missing case.

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp of 
> any column participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with a higher 
> timestamp which won't be deleted, as can easily be shown by the failure of 
> the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a problem similar to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. Which means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in 
> the example above).  It's doable (and not that hard really), but it does 
> require a change to the sstable and intra-node protocol, which makes this a 
> bit painful right now.
> But I'll also note that, as CASSANDRA-1096 shows, the timestamp is not even 
> enough, since on equal timestamps the value can be the deciding factor. So in 
> theory we'd have to ship the value of those columns (in the case of a 
> deletion at least, since we have it in the view PK for updates). That said, on 
> that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting timestamps 
> anymore, which would make this problem go away.
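
As a rough illustration of the two examples in the description, here is a toy Python model of a single view entry. The {{ViewEntry}} class, its fields and its visibility rule are assumptions made for this sketch (a deletion covers everything written at or below its timestamp); it is not Cassandra's storage engine code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ViewEntry:
    """Toy stand-in for one view row (hypothetical, not Cassandra's Row class)."""
    liveness_ts: Optional[int] = None   # timestamp of the primary-key liveness info
    deletion_ts: Optional[int] = None   # timestamp of the row deletion, if any
    cell_ts: Dict[str, int] = field(default_factory=dict)  # regular column -> timestamp

    def is_visible(self) -> bool:
        # Assumed rule: a deletion covers anything written at or below its timestamp.
        cutoff = self.deletion_ts if self.deletion_ts is not None else float("-inf")
        liveness_alive = self.liveness_ts is not None and self.liveness_ts > cutoff
        return liveness_alive or any(ts > cutoff for ts in self.cell_ts.values())


def old_entry_after_first_example(use_max_timestamp: bool) -> ViewEntry:
    """State of the a=1 entry after the three statements of the first example."""
    # INSERT ... USING TIMESTAMP 0, then UPDATE b USING TIMESTAMP 4.
    old = ViewEntry(liveness_ts=0, cell_ts={"b": 4})
    # UPDATE a USING TIMESTAMP 2 removes the a=1 entry, either with the update's
    # timestamp (current behaviour) or with the biggest timestamp in the old entry.
    if use_max_timestamp:
        old.deletion_ts = max([old.liveness_ts, *old.cell_ts.values()])
    else:
        old.deletion_ts = 2
    return old


def reinserted_entry_in_second_example() -> ViewEntry:
    """State of the a=1 entry after it is re-inserted at timestamp 3."""
    # b was written at timestamp 10, so the a=1 entry was deleted at timestamp 10.
    entry = ViewEntry(liveness_ts=0, cell_ts={"b": 10}, deletion_ts=10)
    # UPDATE a = 1 USING TIMESTAMP 3 re-inserts the entry with liveness timestamp 3.
    entry.liveness_ts = 3
    return entry
```

In this model the old entry stays visible when it is deleted with timestamp 2 and disappears when the biggest timestamp is used, while the entry re-inserted at timestamp 3 stays hidden behind the deletion at timestamp 10, which is the problem the second example describes.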



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 1:28 PM:
---

h3. Relation: base -> view

First of all, I think we should all agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and view has no 
filter conditions or only conditions on base pk.
(filter conditions are not a concern here, since there is no previous view data 
to be cleared)

view row exists if any of the following is true:
* a. base row pk has live livenessInfo (timestamp) and base row pk satisfies 
view's filter conditions, if any.
* b. or one of the base row columns selected in view has a live timestamp (via 
update) and base row pk satisfies view's filter conditions, if any. This is 
handled by the existing liveness and tombstone mechanism, since all the info is 
included in the view row.
* c. or one of the base row columns not selected in view has a live timestamp 
(via update) and base row pk satisfies view's filter conditions, if any. Those 
unselected columns' timestamp/ttl/cell-deletion info is currently not stored 
on the view row. 

2. a base column is used in view pk, or view has filter conditions on a base 
non-key column, which can also lead to the entire view row being wiped.

view row exists if any of the following is true:
* a. base row pk has live livenessInfo (timestamp) && base column used in view 
pk is not null but has no timestamp && conditions are satisfied. (pk having 
live livenessInfo means it is not deleted by a tombstone)
* b. or base row column in view pk has a timestamp (via update) && conditions 
are satisfied. e.g. if the base column used in view pk is TTLed, the entire 
view row should be wiped.

Next thing is to model "shadowable tombstone or shadowable liveness" to 
maintain view data based on the above cases.
 
h3. Previous known issues: 
(I might have missed some issues, feel free to ping me.)

ttl
* view row is not wiped when a base column used in view pk is TTLed, or when a 
base non-key column with a filter condition is TTLed
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of current update semantics, no view 
updates are generated
* corresponding view row liveness does not depend on the liveness of base 
columns

filter conditions or base column used in view pk
* view row is shadowed after a few modifications of the base column used in 
view pk if the base non-key column has a TS greater than base pk's ts and view 
key column's ts. (as mentioned by sylvain: we need to be able to re-insert an 
entry when a prior one had been deleted; need to be careful to handle 
timestamp ties)

tombstone merging is not commutative
* in current code, shadowable tombstone doesn't co-exist with regular tombstone

sstabledump doesn't support current shadowable tombstone

h3. Model

I can think of two ways to ship all required base column info to view:
* make base columns that are not selected in view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism with additional validation logic to check 
if a view row is alive.
* or store that info in the view's livenessInfo/deletion with additional merge 
logic. 

I will go ahead with the second way since there is an existing shadowable 
tombstone mechanism.


View PrimaryKey LivenessInfo, its timestamp, payloads, merging

{code}
ColumnInfo: // generated from base column
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed, 
depending on whether there is a ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.

ViewLivenessInfo
// corresponding to base pk livenessInfo
0. timestamp
1. ttl / localDeletionTime

// base columns that are used in view pk or have a filter condition.
// if any column is not live or doesn't exist, entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
2. Map keyOrConditions; 

// if any column is live
3. Map unselected;

// to determine if a row is live
isRowAlive(Deletion delete):
get timestamp or columnInfo that is greater than those in Deletion

if any column in {{keyOrConditions}} is TTLed or tombstone(dead) 
or does not exist, false
if {{timestamp or ttl}} are alive, true
if any column in {{unselected}} is alive, true
otherwise check if any columns in view row are alive

// cannot use supersedes, because timestamp can tie, we cannot compare 
keyOrConditions.  
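
The outline above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class and function names mirror the pseudo-code, the {{NO_TTL}} and {{NO_DELETION_TIME}} sentinels are assumed to follow Cassandra's usual convention (0 and the maximum 32-bit int), the first step of {{isRowAlive}} (dropping whatever the Deletion supersedes) is assumed to be done by the caller, and the "does not exist" check is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

NO_TTL = 0                      # assumed sentinel: value carries no ttl
NO_DELETION_TIME = 2**31 - 1    # assumed sentinel: neither expiring nor deleted


@dataclass(frozen=True)
class ColumnInfo:
    timestamp: int
    ttl: int = NO_TTL
    local_deletion_time: int = NO_DELETION_TIME

    def supersedes(self, other: "ColumnInfo") -> bool:
        # Greater timestamp wins; on a tie, greater localDeletionTime wins.
        if self.timestamp != other.timestamp:
            return self.timestamp > other.timestamp
        return self.local_deletion_time > other.local_deletion_time

    def is_live(self, now_in_sec: int) -> bool:
        if self.local_deletion_time == NO_DELETION_TIME:
            return True                      # plain live value
        if self.ttl == NO_TTL:
            return False                     # deletion time without ttl: tombstone
        return now_in_sec < self.local_deletion_time   # expiring value


@dataclass
class ViewLivenessInfo:
    pk: Optional[ColumnInfo] = None          # base pk liveness (timestamp, ttl/localDeletionTime)
    key_or_conditions: Dict[str, ColumnInfo] = field(default_factory=dict)
    unselected: Dict[str, ColumnInfo] = field(default_factory=dict)


def is_row_alive(info: ViewLivenessInfo, now_in_sec: int,
                 any_view_cell_alive: bool = False) -> bool:
    # Any dead view-pk or filtered column wipes the entire view row.
    if any(not c.is_live(now_in_sec) for c in info.key_or_conditions.values()):
        return False
    # Live base pk liveness keeps the row.
    if info.pk is not None and info.pk.is_live(now_in_sec):
        return True
    # A live unselected base column keeps the row as well.
    if any(c.is_live(now_in_sec) for c in info.unselected.values()):
        return True
    # Otherwise fall back to the cells actually stored in the view row.
    return any_view_cell_alive
```

For example, a row whose view-pk column was written with a ttl is reported alive before that column's {{local_deletion_time}} and dead afterwards, even when the base pk liveness itself never expires.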

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 2:07 PM:
---


[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 12:53 PM:



[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 10:36 AM:


| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | [pass|https://circleci.com/gh/jasonstack/cassandra/182] | bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | [running|https://circleci.com/gh/jasonstack/cassandra/186] | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | [pass|https://circleci.com/gh/jasonstack/cassandra/181] | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | [pass|https://circleci.com/gh/jasonstack/cassandra/185] | running |


when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  
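
A minimal Python sketch of the rule stated above, for illustration only: the function name and parameters are hypothetical, and the behaviour for a node that has not yet joined the ring (nothing is removed) is an assumption of this sketch, not something stated in this comment.

```python
from typing import List, Sequence, Tuple

TokenRange = Tuple[int, int]   # hypothetical (start, end) token pair


def sstables_to_remove_entirely(local_ranges: Sequence[TokenRange],
                                has_joined_ring: bool,
                                sstables: Sequence[str]) -> List[str]:
    """Return the sstables that cleanup would drop as a whole."""
    if not has_joined_ring:
        # Assumption: without ring membership the ranges are not trustworthy,
        # so nothing is removed.
        return []
    if not local_ranges:
        # No local range and the node has joined the ring: every local
        # sstable of the keyspace is obsolete.
        return list(sstables)
    # Otherwise the regular per-range cleanup applies (not modelled here).
    return []
```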


was (Author: jasonstack):
| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | running | running |


when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-20 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/20/17 2:43 PM:
---

h3. Relation: base -> view

First of all, I think we should all agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and view has no 
filter conditions or only conditions on base pk.
(filter condition means: {{c = 1}} in view's where clause. filter conditions 
are not a concern here, since there is no previous view data to be cleared.)

view row exists if any of the following is true:
* a. base row pk has live livenessInfo (timestamp) and base row pk satisfies 
view's filter conditions, if any.
* b. or one of the base row columns selected in view has a live timestamp (via 
update) and base row pk satisfies view's filter conditions, if any. This is 
handled by the existing liveness and tombstone mechanism, since all the info is 
included in the view row.
* c. or one of the base row columns not selected in view has a live timestamp 
(via update) and base row pk satisfies view's filter conditions, if any. Those 
unselected columns' timestamp/ttl/cell-deletion info is not currently stored 
on the view row.

2. a base column is used in view pk, or view has filter conditions on a base 
non-key column, which can also lead to the entire view row being wiped.

view row exists if any of the following is true:
* a. base row pk has live livenessInfo (timestamp) && base column used in view 
pk is not null but has no timestamp && conditions are satisfied. (pk having 
live livenessInfo means it is not deleted by a tombstone)
* b. or base row column in view pk has a timestamp (via update) && conditions 
are satisfied. e.g. if the base column used in view pk is TTLed, the entire 
view row should be wiped.

Next thing is to model "view's tombstone and livenessInfo" to maintain view 
data based on the above cases.
 
h3. Previous known issues: 
(I might have missed some issues, feel free to ping me.)

ttl
* view row is not wiped when a base column used in view pk is TTLed, or when a 
base non-key column with a filter condition is TTLed
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of current update semantics, no view 
updates are generated
* corresponding view row's liveness does not depend on the liveness of base 
columns

filter conditions or base column used in view pk
* view row is shadowed after a few modifications of the base column used in 
view pk if the base non-key column has a TS greater than base pk's ts and view 
key column's ts. (as mentioned by sylvain: we need to be able to re-insert an 
entry when a prior one had been deleted; need to be careful to handle 
timestamp ties)

tombstone merging is not commutative
* in current code, shadowable tombstone doesn't co-exist with regular tombstone

sstabledump doesn't support current shadowable tombstone

h3. Model

I can think of two ways to ship all required base column info to view:
* make base columns that are not selected in view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism with additional validation logic to check 
if a view row is alive.
* or store that info in the view's livenessInfo/deletion with additional merge 
logic. 

I will go ahead with the second way since there is an existing shadowable 
tombstone mechanism.


View PrimaryKey LivenessInfo, its timestamp, payloads, merging

{code}
ColumnInfo: // generated from base column as it is.
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed, 
depending on whether there is a ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.

// if a normal column in base row has no timestamp (aka. generated by 
Insert statement), it still has no timestamp when it is sent to view.
// it will implicitly inherit ViewLivenessInfo just like how it works with 
standard LivenessInfo in regular table,
// unlike the shadowable mechanism, which explicitly puts base pk's 
timestamp into not-updated base columns in view data to keep "select 
writetime" correct in view.
// (because in the shadowable mechanism, view's pk timestamp is promoted to a 
bigger value which cannot be used for writetime in view)
ViewLivenessInfo
// corresponding to base pk livenessInfo
0. timestamp
1. ttl / localDeletionTime

// base columns that are used in view pk or have a filter condition.
// if any column is not live or doesn't exist, entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
2. Map keyOrConditions; 

[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-16 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088830#comment-16088830
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

[~jjirsa] thanks for reviewing. {{trunk}} was a draft for review. I will 
prepare the older branches and add more tests.




[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-16 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088835#comment-16088835
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

sure. it's not urgent.




[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-21 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/21/17 6:10 AM:
---

| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | [pass|https://circleci.com/gh/jasonstack/cassandra/182] | bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | [pass|https://circleci.com/gh/jasonstack/cassandra/186] | pass |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | [pass|https://circleci.com/gh/jasonstack/cassandra/181] | offline_tools_test.TestOfflineTools.sstableofflinerelevel_test  auth_test.TestAuth.system_auth_ks_is_alterable_test |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | [pass|https://circleci.com/gh/jasonstack/cassandra/185] | repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test  ttl_test.TestTTL.collection_list_ttl_test |

unit test all passed, some irrelevant dtests failed.

when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  


was (Author: jasonstack):
| branch | unit | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
[pass|https://circleci.com/gh/jasonstack/cassandra/182] | 
bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
 known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
[running|https://circleci.com/gh/jasonstack/cassandra/186] | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/181] | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/185] | running |


when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  




[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-21 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/21/17 6:10 AM:
---

| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | [pass|https://circleci.com/gh/jasonstack/cassandra/182] | bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | [pass|https://circleci.com/gh/jasonstack/cassandra/186] | pass |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | [pass|https://circleci.com/gh/jasonstack/cassandra/181] | offline_tools_test.TestOfflineTools.sstableofflinerelevel_test  auth_test.TestAuth.system_auth_ks_is_alterable_test |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | [pass|https://circleci.com/gh/jasonstack/cassandra/185] | repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test  ttl_test.TestTTL.collection_list_ttl_test |

All unit tests passed; some unrelated dtests failed.

When the node has no local ranges and has joined the token ring, cleanup will 
remove all local sstables.
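
The rule stated above can be sketched as a small decision function. This is a hypothetical illustration, not code from the patch: the class, enum, and method names are invented here, and the SKIP branch for a node that has not yet joined the ring is an assumption, since the comment only describes the joined case.

```java
import java.util.List;

/** Hypothetical sketch of the cleanup decision; names are not taken from the patch. */
final class CleanupDecision {
    enum Action { SKIP, REMOVE_ALL_SSTABLES, CLEANUP_BY_RANGE }

    /**
     * @param localRanges token ranges this node owns for the keyspace (string stand-ins)
     * @param joinedRing  whether the node has joined the token ring
     */
    static Action decide(List<String> localRanges, boolean joinedRing) {
        if (localRanges.isEmpty()) {
            // No replicas for this keyspace on this node: all local data is obsolete,
            // but only once the node is a ring member (assumption: otherwise do nothing).
            return joinedRing ? Action.REMOVE_ALL_SSTABLES : Action.SKIP;
        }
        // Normal path: keep only the data that falls inside the owned ranges.
        return Action.CLEANUP_BY_RANGE;
    }

    public static void main(String[] args) {
        System.out.println(decide(List.of(), true));
        System.out.println(decide(List.of("(0,100]"), true));
    }
}
```

Gating on ring membership avoids deleting data on a node that simply has not learned its ranges yet.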


was (Author: jasonstack):
| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | [pass|https://circleci.com/gh/jasonstack/cassandra/182] | bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | [pass|https://circleci.com/gh/jasonstack/cassandra/186] | pass |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | [pass|https://circleci.com/gh/jasonstack/cassandra/181] | offline_tools_test.TestOfflineTools.sstableofflinerelevel_test auth_test.TestAuth.system_auth_ks_is_alterable_test |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | [pass|https://circleci.com/gh/jasonstack/cassandra/185] | repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test ttl_test.TestTTL.collection_list_ttl_test |

unit test all padded, some irrelevant dtests failed.

when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Commented] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-25 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099734#comment-16099734
 ] 

ZhaoYang commented on CASSANDRA-13723:
--

The affected range is unknown; we need to go through the codebase. Luckily, the 
digest log error was caught by a dtest.

> Trivial format error of digest in StorageProxy
> --
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}
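
One way a log line like the "Current log" above can arise is a `{}` placeholder that never receives its argument. The sketch below uses a minimal stand-in formatter written for this illustration; it is not the SLF4J or Cassandra tracing implementation, and the actual cause in StorageProxy may differ.

```java
/** Hypothetical stand-in for "{}" substitution; not the real tracing code. */
final class PlaceholderDemo {
    /** Replaces each "{}" in the template with the next argument, left to right. */
    static String format(String template, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders: ignore the rest
            }
            out.append(template, from, at).append(arg);
            from = at + 2;
        }
        // Any placeholder without a matching argument is left in the output verbatim.
        return out.append(template.substring(from)).toString();
    }

    public static void main(String[] args) {
        // Argument supplied: the placeholder is filled in.
        System.out.println(format("Digest mismatch: {} on {}", "Mismatch for key ...", "127.0.0.1"));
        // Argument missing: the raw "{}" leaks into the log line.
        System.out.println(format("Digest mismatch: {} on 127.0.0.1"));
    }
}
```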






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-25 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/25/17 9:24 AM:
---

h3. Relation: base -> view

First of all, I think we should all agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. The base pk and the view pk are the same (order doesn't matter) and the view 
has no filter conditions, or only conditions on the base pk.
(A filter condition means something like {{c = 1}} in the view's where clause. 
Filter conditions are not a concern here, since there is no previous view data 
to be cleared.)

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) and the base row pk 
satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp 
(via update) and the base row pk satisfies the view's filter conditions, if any. 
This is handled by the existing liveness and tombstone mechanism, since all the 
info is included in the view row.
* c. or one of the base row columns not selected in the view has a live 
timestamp (via update) and the base row pk satisfies the view's filter 
conditions, if any. Those unselected columns' timestamp/ttl/cell-deletion info 
is not currently stored on the view row.

2. A base column is used in the view pk, or the view has filter conditions on a 
base non-key column; either can also lead to the entire view row being wiped.

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) && the base column used 
in the view pk is not null but has no timestamp && the conditions are satisfied. 
(A pk having live livenessInfo means it is not deleted by a tombstone.)
* b. or the base row column in the view pk has a timestamp (via update) && the 
conditions are satisfied. E.g. if the base column used in the view pk is TTLed, 
the entire view row should be wiped.

The next step is to model the "view's tombstone and livenessInfo" so that view 
data is maintained according to the cases above.
 
h3. Previous known issues: 
(I might have missed some issues, feel free to ping me..)

ttl
* the view row is not wiped when a TTL expires on a base column used in the 
view pk, or on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of the current update semantics, no view 
updates are generated
* the corresponding view row's liveness does not depend on the liveness of the 
base columns

filter conditions or base column used in view pk
* the view row is shadowed after a few modifications of the base column used in 
the view pk if the base non-key column has a TS greater than the base pk's ts 
and the view key column's ts. (As mentioned by sylvain: we need to be able to 
re-insert an entry when a prior one has been deleted, and need to be careful to 
handle timestamp ties.)

tombstone merging is not commutative
* in the current code, a shadowable tombstone doesn't co-exist with a regular 
tombstone

sstabledump does not support the current shadowable tombstone

h3. Model

I can think of two ways to ship all the required base column info to the view:
* make the base columns that are not selected in the view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism with additional validation logic to check 
whether a view row is alive.
* or store that info in the view's livenessInfo/deletion, with additional merge 
logic to make sure view liveness/deletion are ordered properly even in the case 
of a timestamp tie. It's like a replacement strategy which uses the inserted 
view row to replace the old view row. (In a regular table, reconciliation is at 
the cell level. I need to research the concurrent view update cases more; for 
now, it looks fine.) It also implies that on every modification of the base, 
the view row will get replaced entirely.


FIXME: within one node, if multiple ViewLivenessInfo have the same timestamp, 
we do a replacement. What about cross-node merging? What about a partial update 
on normal columns that are part of the view's normal columns? Always send the 
entire row to do the replacement?

I will go ahead with ~second way since there is an existing shadowable 
tombstone mechanism.~ 


View PrimaryKey LivenessInfo, its timestamp, payloads, merging

{code}
ColumnInfo: // generated from base column as it is.
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed 
depends on if there is ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.


Row: // keyOrConditions and unselected are always merged with another row's 
during row merging process

// base column that are used in view pk or has filter condition on 
non-pk column.
// if any column is not live, entire view row is wiped.
// if a column in base is filtered and not selected, it's 

[jira] [Updated] (CASSANDRA-13409) Materialized Views: View cells are resurrected

2017-07-25 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13409:
-
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking this as a duplicate of CASSANDRA-11500; the MV fix will be done there.

> Materialized Views: View cells are resurrected
> --
>
> Key: CASSANDRA-13409
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13409
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, ran against trunk@0f054fee5c:
> {code:xml}
> echo "create keyspace ks WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};" | bin/cqlsh
> echo "create table ks.base (p int primary key, v1 int, v2 int) with 
> gc_grace_seconds = 1;" | bin/cqlsh
> echo "create materialized view ks.my_view as select * from ks.base where p is 
> not null and v1 is not null primary key (v1, p);" | bin/cqlsh
> echo "insert into ks.base (p, v1, v2) values (3, 1, 3) using timestamp 1;" | 
> bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "delete from ks.base using timestamp 2 where p = 3;" | bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "insert into ks.base (p, v1) values (3, 1) using timestamp 3;" | 
> bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "select * from ks.my_view;" | bin/cqlsh
>  v1 | p | v2
> +---+
>   1 | 3 |  3
> (1 rows)
> echo "select * from ks.base;" | bin/cqlsh
>  p | v1 | v2
> ---++--
>  3 |  1 | null
> (1 rows)
> {code}
> As you can see, this incorrectly brings back cell v2=3. 
> There is one definitive problem and a potential one:
> * Merging rows must be commutative. If a shadowable tombstone is applied 
> after a row tombstone, it will replace that tombstone; if a row marker 
> shadows the shadowable tombstone before the row containing the original data 
> is applied, then any dead cells in said data will be resurrected;
> * Shadowable tombstones shouldn't compact away previous row tombstones or 
> even deleted cells; if the relevant tombstones have been GCed from the base 
> table, then a base table update won't carry them anymore (alongside a newer 
> row marker).






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-25 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/25/17 9:25 AM:
---

h3. Relation: base -> view

First of all, I think we should all agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. The base pk and the view pk are the same (order doesn't matter) and the view 
has no filter conditions, or only conditions on the base pk.
(A filter condition means something like {{c = 1}} in the view's where clause. 
Filter conditions are not a concern here, since there is no previous view data 
to be cleared.)

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) and the base row pk 
satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp 
(via update) and the base row pk satisfies the view's filter conditions, if any. 
This is handled by the existing liveness and tombstone mechanism, since all the 
info is included in the view row.
* c. or one of the base row columns not selected in the view has a live 
timestamp (via update) and the base row pk satisfies the view's filter 
conditions, if any. Those unselected columns' timestamp/ttl/cell-deletion info 
is not currently stored on the view row.

2. A base column is used in the view pk, or the view has filter conditions on a 
base non-key column; either can also lead to the entire view row being wiped.

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) && the base column used 
in the view pk is not null but has no timestamp && the conditions are satisfied. 
(A pk having live livenessInfo means it is not deleted by a tombstone.)
* b. or the base row column in the view pk has a timestamp (via update) && the 
conditions are satisfied. E.g. if the base column used in the view pk is TTLed, 
the entire view row should be wiped.

The next step is to model the "view's tombstone and livenessInfo" so that view 
data is maintained according to the cases above.
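
Case 1 above reduces to a simple predicate, which can be sketched as follows. The class and parameter names are hypothetical and chosen only for this illustration; case 2 is deliberately left out because its conditions depend on details not spelled out here.

```java
/** Hypothetical sketch of case 1 (view pk == base pk); names are illustrative only. */
final class ViewRowLiveness {
    /**
     * @param basePkLive              1.a: the base row pk has live livenessInfo
     * @param anySelectedColumnLive   1.b: a base column selected in the view is live
     * @param anyUnselectedColumnLive 1.c: a base column not selected in the view is live
     * @param pkSatisfiesFilter       the base pk passes the view's filter conditions, if any
     */
    static boolean viewRowExists(boolean basePkLive,
                                 boolean anySelectedColumnLive,
                                 boolean anyUnselectedColumnLive,
                                 boolean pkSatisfiesFilter) {
        // The filter on the pk is required in all three sub-cases; any one live source suffices.
        return pkSatisfiesFilter
            && (basePkLive || anySelectedColumnLive || anyUnselectedColumnLive);
    }

    public static void main(String[] args) {
        // Only an unselected column is live (sub-case c): the view row should still exist.
        System.out.println(viewRowExists(false, false, true, true));
    }
}
```

Sub-case c is the one that needs extra information shipped to the view, since the view row does not currently carry the unselected columns' liveness.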
 
h3. Previous known issues: 
(I might have missed some issues, feel free to ping me..)

ttl
* the view row is not wiped when a TTL expires on a base column used in the 
view pk, or on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of the current update semantics, no view 
updates are generated
* the corresponding view row's liveness does not depend on the liveness of the 
base columns

filter conditions or base column used in view pk
* the view row is shadowed after a few modifications of the base column used in 
the view pk if the base non-key column has a TS greater than the base pk's ts 
and the view key column's ts. (As mentioned by sylvain: we need to be able to 
re-insert an entry when a prior one has been deleted, and need to be careful to 
handle timestamp ties.)

tombstone merging is not commutative
* in the current code, a shadowable tombstone doesn't co-exist with a regular 
tombstone

sstabledump does not support the current shadowable tombstone

h3. Model

I can think of two ways to ship all the required base column info to the view:
* make the base columns that are not selected in the view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism with additional validation logic to check 
whether a view row is alive.
* or store that info in the view's livenessInfo/deletion, with additional merge 
logic to make sure view liveness/deletion are ordered properly even in the case 
of a timestamp tie. It's like a replacement strategy which uses the inserted 
view row to replace the old view row. (In a regular table, reconciliation is at 
the cell level. I need to research the concurrent view update cases more; for 
now, it looks fine.) It also implies that on every modification of the base, 
the view row will get replaced entirely.

I will go ahead with ~second way since there is an existing shadowable 
tombstone mechanism.~  VirtualCells..

{code}
ColumnInfo: // generated from base column as it is.
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed 
depends on if there is ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.


Row: // VirtualCells(keyOrConditions and unselected) are always merged with 
another row's during row merging process

// base column that are used in view pk or has filter condition on 
non-pk column.
// if any column is not live, entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
// during base modification, if a view row is removed due to 
base-column-in-view-pk or filter-conditions, then no deletion is issued;
// the virtual cell tombstone is added to Row's keyOrConditions.
2. Map keyOrConditions; 
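
The supersedes() rule described in the ColumnInfo notes above can be sketched in Java. The field names follow the notes; the class itself is an illustration written for this purpose and is not code from the patch.

```java
/** Illustrative sketch of the ColumnInfo ordering described above; not the patch code. */
final class ColumnInfo {
    final long timestamp;
    final int ttl;
    final int localDeletionTime;

    ColumnInfo(long timestamp, int ttl, int localDeletionTime) {
        this.timestamp = timestamp;
        this.ttl = ttl;
        this.localDeletionTime = localDeletionTime;
    }

    /** Greater timestamp wins; on a timestamp tie, greater localDeletionTime wins. */
    boolean supersedes(ColumnInfo other) {
        if (timestamp != other.timestamp) {
            return timestamp > other.timestamp;
        }
        return localDeletionTime > other.localDeletionTime;
    }

    public static void main(String[] args) {
        ColumnInfo older = new ColumnInfo(1L, 0, 500);
        ColumnInfo newer = new ColumnInfo(2L, 0, 100);
        System.out.println(newer.supersedes(older));
    }
}
```

Because the tie-break depends only on the two values being compared, the result does not depend on merge order, which is the property the timestamp-tie discussion is after.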

   

[jira] [Commented] (CASSANDRA-13547) Filtered materialized views missing data

2017-07-25 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100092#comment-16100092
 ] 

ZhaoYang commented on CASSANDRA-13547:
--

[~krishna.koneru] thanks for the patch. I am currently working on 
CASSANDRA-11500 to restructure MV timestamps. Can I mark this issue as a 
duplicate of 11500?

> Filtered materialized views missing data
> 
>
> Key: CASSANDRA-13547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13547
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
> Environment: Official Cassandra 3.10 Docker image (ID 154b919bf8ce).
>Reporter: Craig Nicholson
>Assignee: Krishna Dattu Koneru
>Priority: Blocker
>  Labels: materializedviews
> Fix For: 3.11.x
>
>
> When creating a materialized view against a base table the materialized view 
> does not always reflect the correct data.
> Using the following test schema:
> {code:title=Schema|language=sql}
> DROP KEYSPACE IF EXISTS test;
> CREATE KEYSPACE test
>   WITH REPLICATION = { 
>'class' : 'SimpleStrategy', 
>'replication_factor' : 1 
>   };
> CREATE TABLE test.table1 (
> id int,
> name text,
> enabled boolean,
> foo text,
> PRIMARY KEY (id, name));
> CREATE MATERIALIZED VIEW test.table1_mv1 AS SELECT id, name, foo
> FROM test.table1
> WHERE id IS NOT NULL 
> AND name IS NOT NULL 
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> CREATE MATERIALIZED VIEW test.table1_mv2 AS SELECT id, name, foo, enabled
> FROM test.table1
> WHERE id IS NOT NULL 
> AND name IS NOT NULL 
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> {code}
> When I insert a row into the base table the materialized views are updated 
> appropriately. (+)
> {code:title=Insert row|language=sql}
> cqlsh> INSERT INTO test.table1 (id, name, enabled, foo) VALUES (1, 'One', 
> TRUE, 'Bar');
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> +--+-+-
>   1 |  One |True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> --++-
>   One |  1 | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> --++-+-
>   One |  1 |True | Bar
> (1 rows)
> {code}
> Updating the record in the base table and setting enabled to FALSE will 
> filter the record from both materialized views. (+)
> {code:title=Disable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = FALSE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> +--+-+-
>   1 |  One |   False | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> --++-
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> --++-+-
> (0 rows)
> {code}
> A further update to the base table setting enabled to TRUE should 
> include the record in both materialized views; however, only one view 
> (table1_mv2) gets updated. (-)
> It appears that only the view (table1_mv2) that returns the filtered column 
> (enabled) is updated. (-)
> Additionally, columns that are not part of the partition or clustering key are 
> not updated. You can see that the foo column has a null value in table1_mv2. 
> (-)
> {code:title=Enable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = TRUE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> +--+-+-
>   1 |  One |True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> --++-
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> --++-+--
>   One |  1 |True | null
> (1 rows)
> {code}






[jira] [Updated] (CASSANDRA-12783) Break up large MV mutations to prevent OOMs

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-12783:
-
Fix Version/s: 4.x
  Component/s: Materialized Views
   Local Write-Read Paths

> Break up large MV mutations to prevent OOMs
> ---
>
> Key: CASSANDRA-12783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12783
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Carl Yeksigian
> Fix For: 4.x
>
>
> We only use the code path added in CASSANDRA-12268 for the view builder 
> because otherwise we would break the contract of the batchlog, where some 
> mutations may be written and pushed out before the whole batch log has been 
> saved.
> We would need to ensure that all of the updates make it to the batchlog 
> before allowing the batchlog manager to try to replay them, but also before 
> we start pushing out updates to the paired replicas.






[jira] [Updated] (CASSANDRA-11194) materialized views - support explode() on collections

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-11194:
-
Component/s: Materialized Views

> materialized views - support explode() on collections
> -
>
> Key: CASSANDRA-11194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11194
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Materialized Views
>Reporter: Jon Haddad
>
> I'm working on a database design to model a product catalog.  Products can 
> belong to categories.  Categories can belong to multiple sub categories 
> (think about Amazon's complex taxonomies).
> My category table would look like this, giving me individual categories & 
> their parents:
> {code}
> CREATE TABLE category (
> category_id uuid primary key,
> name text,
> parents set<uuid>
> );
> {code}
> To get a list of all the children of a particular category, I need a table 
> that looks like the following:
> {code}
> CREATE TABLE categories_by_parent (
> parent_id uuid,
> category_id uuid,
> name text,
> primary key (parent_id, category_id)
> );
> {code}
> The important thing to note here is that a single category can have multiple 
> parents.
> I'd like to propose support for collections in materialized views via an 
> explode() function that would create 1 row per item in the collection.  For 
> instance, I'll insert the following 3 rows (2 parents, 1 child) into the 
> category table:
> {code}
> insert into category (category_id, name, parents) values 
> (009fe0e1-5b09-4efc-a92d-c03720324a4f, 'Parent', null);
> insert into category (category_id, name, parents) values 
> (1f2914de-0adf-4afc-b7ad-ddd8dc876ab1, 'Parent2', null);
> insert into category (category_id, name, parents) values 
> (1f93bc07-9874-42a5-a7d1-b741dc9c509c, 'Child', 
> {009fe0e1-5b09-4efc-a92d-c03720324a4f, 1f2914de-0adf-4afc-b7ad-ddd8dc876ab1 
> });
> cqlsh:test> select * from category;
>  category_id  | name| parents
> --+-+--
>  009fe0e1-5b09-4efc-a92d-c03720324a4f |  Parent | 
> null
>  1f2914de-0adf-4afc-b7ad-ddd8dc876ab1 | Parent2 | 
> null
>  1f93bc07-9874-42a5-a7d1-b741dc9c509c |   Child | 
> {009fe0e1-5b09-4efc-a92d-c03720324a4f, 1f2914de-0adf-4afc-b7ad-ddd8dc876ab1}
> (3 rows)
> {code}
> Given the following CQL to select the child category, utilizing an explode 
> function, I would expect to get back 2 rows, 1 for each parent:
> {code}
> select category_id, name, explode(parents) as parent_id from category where 
> category_id = 1f93bc07-9874-42a5-a7d1-b741dc9c509c;
> category_id  | name  | parent_id
> --+---+--
> 1f93bc07-9874-42a5-a7d1-b741dc9c509c | Child | 
> 009fe0e1-5b09-4efc-a92d-c03720324a4f
> 1f93bc07-9874-42a5-a7d1-b741dc9c509c | Child | 
> 1f2914de-0adf-4afc-b7ad-ddd8dc876ab1
> (2 rows)
> {code}
> This functionality would ideally apply to materialized views, since the 
> ability to control partitioning here would allow us to efficiently query our 
> MV for all categories belonging to a parent in a complex taxonomy.
> {code}
> CREATE MATERIALIZED VIEW categories_by_parent as
> SELECT explode(parents) as parent_id,
> category_id, name FROM category WHERE parents IS NOT NULL
> {code}
> The explode() function is available in Spark Dataframes and my proposed 
> function has the same behavior: 
> http://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.functions.explode
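
The proposed semantics, one output row per collection element, can be sketched in plain Java. This only illustrates the row expansion; the helper name and the row encoding are hypothetical, and treating a null collection as producing no rows is an assumption based on the `parents IS NOT NULL` clause in the proposed view.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical illustration of explode(): one row per element of a collection column. */
final class ExplodeSketch {
    /** Returns one {categoryId, name, parentId} row for each element of parents. */
    static List<String[]> explode(String categoryId, String name, Collection<String> parents) {
        List<String[]> rows = new ArrayList<>();
        if (parents == null) {
            return rows; // assumption: a null collection yields no rows
        }
        for (String parentId : parents) {
            rows.add(new String[] { categoryId, name, parentId });
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = explode("1f93bc07", "Child", List.of("009fe0e1", "1f2914de"));
        for (String[] row : rows) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```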






[jira] [Updated] (CASSANDRA-13073) Optimize repair behaviour with MVs

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13073:
-
Component/s: Materialized Views

> Optimize repair behaviour with MVs
> --
>
> Key: CASSANDRA-13073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Benjamin Roth
>
> I am referring to a discussion on the dev list about the MV streaming issues 
> discussed in 12888.
> It turned out that under some circumstances, repairing MVs can lead to 
> inconsistencies. To remove those inconsistencies, it is necessary to repair 
> the base table first and then the MV again. These inconsistencies can be 
> created by either read repair or CF/Keyspace repair.
> Proposition:
> - Exclude MVs from keyspace repairs
> - Disable read repairs on MVs or transform them to a read repair of the base 
> table (maybe complicated but possible)
> Explanation:
> - CF base has PK a and field b
> - MV has PK a, b
> - 2 nodes n1 + n2, no hints
> - Initial state is a=1,b=1 at time t=0
> - Node n2 goes down
> - Mutation a=1, b=2 at time t=1
> - Node n2 comes up and node n1 goes down
> - Mutation a=1, b=3 at time t=2
> - Node n1.mv contains: a1=1, b=3 + tombstone for a1=1,b=1
> - Node n2.mv contains: a1=1, b=2 + tombstone for a1=1,b=1
> When doing a repair on mv _before_ repairing base, mv would look like:
> - Node n1.mv contains: a1=1, b=3 + tombstone for a1=1,b=1 + a1=1, b=2
> - Node n2.mv contains: a1=1, b=2 + tombstone for a1=1,b=1 + a1=1, b=3
> Repairing _only_ the base table would create the correct result:
> - Node n1.mv contains: a1=1, b=3 + tombstone for a1=1,b=1 + tombstone for 
> a1=1,b=2
> - Node n2.mv contains: a1=1, b=3 + tombstone for a1=1,b=1 (TS for a1=2,b=2 
> should not have been created as b=3 was there, which shadows b=2 and should 
> not reach the MV at all)
> All this does not apply if CASSANDRA-13066 will be implemented and enabled
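
The "repair on mv before repairing base" step above can be replayed with a deliberately simplified model: each view replica is a set of live (a,b) rows plus a set of tombstoned rows, and repairing the view directly leaves both replicas with the union. Timestamps and real reconciliation are ignored, and all names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Simplified, hypothetical model of repairing an MV directly; ignores timestamps. */
final class MvRepairSketch {
    /** View replica state; rows are encoded as "a,b" strings. */
    static final class ViewReplica {
        final Set<String> live = new TreeSet<>();
        final Set<String> tombstones = new TreeSet<>();

        /** Rows a reader would see: live rows not covered by a tombstone. */
        Set<String> visible() {
            Set<String> v = new TreeSet<>(live);
            v.removeAll(tombstones);
            return v;
        }
    }

    /** Repairing the view on its own: both replicas end up with the union of both sides. */
    static void repairView(ViewReplica x, ViewReplica y) {
        x.live.addAll(y.live);
        y.live.addAll(x.live);
        x.tombstones.addAll(y.tombstones);
        y.tombstones.addAll(x.tombstones);
    }

    public static void main(String[] args) {
        ViewReplica n1 = new ViewReplica();
        n1.live.add("1,3");
        n1.tombstones.add("1,1");
        ViewReplica n2 = new ViewReplica();
        n2.live.add("1,2");
        n2.tombstones.add("1,1");

        repairView(n1, n2);
        // Both replicas now expose two view rows for a single base row a=1.
        System.out.println(n1.visible());
        System.out.println(n2.visible());
    }
}
```

In this model nothing on the view side ever tombstones (1,2), which is why the description argues the base table has to be repaired first.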






[jira] [Updated] (CASSANDRA-9995) Add background consistency mode for MV

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-9995:

Component/s: Materialized Views

> Add background consistency mode for MV
> --
>
> Key: CASSANDRA-9995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Materialized Views
>Reporter: Carl Yeksigian
>
> Currently, we only support a fast refresh mode which slows down writes, but 
> brings the base and view to consistency quickly. It would be possible to keep 
> reads and writes close to the same performance they have now by sacrificing 
> the time to consistency.
> The way this mode would work is:
> - When data is flushed, the sstable is marked as inconsistent for MV
> - Compaction can only run on either the set of sstables which are consistent, 
> or the set of sstables which are inconsistent, but cannot mix the two
> - A background job would take the sstables which are inconsistent and compare 
> them to the current set which are consistent and generate the appropriate 
> updates for the index to bring it up to date
> - Any newly streamed sstables would be marked as inconsistent and would be 
> included the next time the job ran
> The background consistency job could be configured to run whenever a new 
> sstable is flushed, or at certain time intervals.
> By switching to a job which only looked at the flushed sstables, we wouldn't 
> have to worry about the memtable updates which generate updates to the view 
> but aren't recorded anywhere. We also wouldn't have to do any coordination at 
> write time, use the batchlog for these writes, or issue any new updates when 
> applying the MV update.
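
The compaction constraint in the proposal above (consistent and inconsistent sstables must not be mixed) could be checked along these lines. This is a hypothetical sketch: the `SSTable` class and its `mvConsistent` flag are stand-ins invented for the example, not Cassandra types.

```java
import java.util.List;

/** Hypothetical sketch of the "do not mix consistent and inconsistent sstables" rule. */
final class MvCompactionSketch {
    /** Stand-in for an sstable carrying an MV-consistency marker. */
    static final class SSTable {
        final String name;
        final boolean mvConsistent;

        SSTable(String name, boolean mvConsistent) {
            this.name = name;
            this.mvConsistent = mvConsistent;
        }
    }

    /** A candidate set is allowed only if every sstable carries the same marker. */
    static boolean canCompactTogether(List<SSTable> candidates) {
        return candidates.stream().map(s -> s.mvConsistent).distinct().count() <= 1;
    }

    public static void main(String[] args) {
        SSTable flushed = new SSTable("flushed-1", false);   // newly flushed: inconsistent
        SSTable processed = new SSTable("processed-1", true); // already handled by the job
        System.out.println(canCompactTogether(List.of(flushed, processed)));
    }
}
```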






[jira] [Updated] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-9928:

Component/s: Materialized Views

> Add Support for multiple non-primary key columns in Materialized View primary 
> keys
> --
>
> Key: CASSANDRA-9928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9928
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: T Jake Luciani
>  Labels: materializedviews
> Fix For: 4.x
>
>
> Currently we don't allow > 1 non primary key from the base table in a MV 
> primary key.  We should remove this restriction assuming we continue 
> filtering out nulls.  With allowing nulls in the MV columns there are a lot 
> of multiplicative implications we need to think through.






[jira] [Updated] (CASSANDRA-12463) Unable to create Materialized View on UDT fields saeperately

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-12463:
-
Component/s: Materialized Views

> Unable to create Materialized View on UDT fields saeperately
> 
>
> Key: CASSANDRA-12463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12463
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: SathishKumar Alwar
>Priority: Minor
>
> We are unable to create index on UDT fields individually and there were 
> suggestions/recommendations to create Materialized Views for these UDT 
> individually. Unfortunately we are unable to create Materialized Views by 
> providing part of UDT fields. 
> It would be better if indexing is supported on UDT fields, if not possible 
> providing support in Materialized View will be helpful. We want support on 
> non frozen as well as frozen UDT.
> Example:
> CREATE TYPE mytype (
> id int,
> value text
> )
> CREATE TABLE mytable (
> key int PRIMARY KEY,
> mytype frozen<mytype>,
> val text
> )
> And then creating a materialized view with the UDT 
> CREATE MATERIALIZED VIEW mv AS SELECT key, val, mytype.id, mytype.value FROM 
> mytable WHERE key IS NOT NULL AND mytype IS NOT NULL PRIMARY KEY (key, 
> mytype);
> The error I get is the following :
> [Invalid query] message="Cannot select out a part of type when defining a 
> materialized view"






[jira] [Updated] (CASSANDRA-13293) MV read-before-write can be omitted for some operations

2017-07-23 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13293:
-
Component/s: Materialized Views
 Local Write-Read Paths

> MV read-before-write can be omitted for some operations
> ---
>
> Key: CASSANDRA-13293
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13293
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Benjamin Roth
>
> A view that has the same fields in the primary key as its base table (I call 
> it a congruent key) does not require read-before-writes except for:
> - Range deletes
> - Partition deletes
> If the view uses filters on non-pk columns, either a rbw is required or a 
> write that does not match the filter has to be turned into a delete. When in 
> doubt, I'd stay with the current behaviour and do a rbw.






[jira] [Commented] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098470#comment-16098470
 ] 

ZhaoYang commented on CASSANDRA-13723:
--

Thanks for reviewing. Let's wait for CASSANDRA-12996 to settle.

> Trivial format error of digest in StorageProxy
> --
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}






[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/24/17 6:30 AM:
---

| branch | unit | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
[pass|https://circleci.com/gh/jasonstack/cassandra/182] | 
bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
 known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/186] | pass |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/181] | 
offline_tools_test.TestOfflineTools.sstableofflinerelevel_test  
auth_test.TestAuth.system_auth_ks_is_alterable_test |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/185] |  
ttl_test.TestTTL.collection_list_ttl_test |

All unit tests passed; some unrelated dtests failed.

When a node has no local range and has joined the token ring, cleanup will 
remove all base local sstables.  


was (Author: jasonstack):
| branch | unit | 
[dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
[pass|https://circleci.com/gh/jasonstack/cassandra/182] | 
bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
 known |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/186] | pass |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/181] | 
offline_tools_test.TestOfflineTools.sstableofflinerelevel_test  
auth_test.TestAuth.system_auth_ks_is_alterable_test |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
[pass|https://circleci.com/gh/jasonstack/cassandra/185] |  
repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test  
ttl_test.TestTTL.collection_list_ttl_test |

unit test all passed, some irrelevant dtests failed.

when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097997#comment-16097997
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

[~jjirsa] Sorry for the delay. I updated the dtest results for 
2.2/3.0/3.11/trunk; some unrelated dtests failed. I skipped 2.1 since this is 
not critical.

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Updated] (CASSANDRA-13622) Better config validation/documentation

2017-07-24 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13622:
-
Fix Version/s: 4.0

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> There are a number of properties in the yaml that are "in_mb", however 
> resolve to bytes when calculated in {{DatabaseDescriptor.java}}, but are 
> stored in int's. This means that their maximum values are 2047, as any higher 
> when converted to bytes overflows the int.
> Where possible/reasonable we should convert these to be long's, and stored as 
> long's. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet make it error if set 
> higher than that. Noting that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary and in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> Following are functions that currently could be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.
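The 2047 limit described above can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration and are not taken from {{DatabaseDescriptor.java}}. It shows the silent wrap-around at 2048 MB, the widening to long suggested in the description, and one possible way to fail loudly instead.

```java
public class MbOverflowSketch {
    // Plain int arithmetic: silently wraps once the result exceeds Integer.MAX_VALUE.
    static int toBytesUnchecked(int mb) {
        return mb * 1024 * 1024;
    }

    // Widening to long before multiplying avoids the overflow.
    static long toBytesAsLong(int mb) {
        return mb * 1024L * 1024L;
    }

    // Hypothetical validation: reject values that cannot be stored as an int byte count.
    static int toBytesChecked(int mb) {
        try {
            return Math.multiplyExact(mb, 1024 * 1024);
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException(
                mb + " MB does not fit in an int number of bytes (max is 2047)", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toBytesUnchecked(2047)); // still fits in an int
        System.out.println(toBytesUnchecked(2048)); // wraps to a negative number
        System.out.println(toBytesAsLong(2048));    // correct as a long
    }
}
```

Whether a given property should be widened to long or merely validated depends on how the value is consumed downstream, which this sketch does not try to model.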






[jira] [Commented] (CASSANDRA-13622) Better config validation/documentation

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099415#comment-16099415
 ] 

ZhaoYang commented on CASSANDRA-13622:
--

That comes from Java's ByteBuffer, which is limited to 2GB.

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> There are a number of properties in the yaml that are "in_mb", however 
> resolve to bytes when calculated in {{DatabaseDescriptor.java}}, but are 
> stored in int's. This means that their maximum values are 2047, as any higher 
> when converted to bytes overflows the int.
> Where possible/reasonable we should convert these to be long's, and stored as 
> long's. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet make it error if set 
> higher than that. Noting that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary and in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> Following are functions that currently could be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.






[jira] [Updated] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-27 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13723:
-
Status: Open  (was: Patch Available)

If we switch to {{e.toString()}} everywhere, C* may generate many unnecessary 
String objects (extra GC pressure) even when the logger level is NONE.

Guarding each call with {{if(logger.isXXXEnable())}} would help, but it's 
troublesome.

How about using {{e.getMessage()}} instead of {{e.toString()}}? This will 
affect the driver's trace and the server log.
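To make the trade-off concrete, here is a minimal sketch using java.util.logging from the JDK as a stand-in; it is not the logger or tracing API that StorageProxy uses, and {{MismatchException}} is a hypothetical substitute for DigestMismatchException. It counts how often the message string is actually built when the message is passed as a Supplier, and shows the difference between {{getMessage()}} and {{toString()}}.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLogSketch {
    // Counts how many times the log message string was actually built.
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Hypothetical stand-in for DigestMismatchException.
    static class MismatchException extends Exception {
        MismatchException(String message) {
            super(message);
        }
    }

    // Builds the message eagerly; calling e.toString() allocates a new String.
    static String describe(Exception e) {
        BUILDS.incrementAndGet();
        return "Digest mismatch: " + e + " on 127.0.0.1";
    }

    // Logs at FINE through a Supplier and returns how many messages were built.
    static int logAt(Level loggerLevel, Exception e) {
        Logger logger = Logger.getLogger("lazy-log-sketch");
        logger.setUseParentHandlers(false); // keep the demo quiet
        logger.setLevel(loggerLevel);
        BUILDS.set(0);
        logger.fine(() -> describe(e)); // supplier runs only if FINE is loggable
        return BUILDS.get();
    }

    public static void main(String[] args) {
        Exception e = new MismatchException("Mismatch for key k1");
        System.out.println(logAt(Level.OFF, e));  // message never built
        System.out.println(logAt(Level.FINE, e)); // message built once
        System.out.println(e.getMessage());       // message only
        System.out.println(e.toString());         // class name, then message
    }
}
```

The Supplier form is one way to avoid both the eager String allocation and the explicit level check; whether an equivalent exists for the tracing path discussed in this ticket is not something the sketch establishes.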

> Trivial format error of digest in StorageProxy
> --
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}






[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-26 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101603#comment-16101603
 ] 

ZhaoYang commented on CASSANDRA-11500:
--

WIP 
[branch|https://github.com/jasonstack/cassandra/commits/CASSANDRA-11500-cell]

Changed:
Added an extra "VirtualCell" (a kind of special LivenessInfo for MV) to Row. 
It stores: 
1. Base columns used in the view PK and base columns used in view filter 
conditions. If any such column is dead, the entire view row is dead, 
regardless of LivenessInfo or DeletionTime status. 
2. Unselected base columns. If any such column is alive, the view's pk should 
be alive, unless it is deleted by DeletionTime or by the columns in <1>. 

Todo: 
* optimize the in-memory/storage representation of "virtual-cells" to re-use 
AbstractRow/BTree
* discard dead view rows shadowed by virtualCells in compaction after gc_grace
* more dtests
* support collection types in unselected columns (for now, no 
"virtual-cells" are generated for non-frozen collection types)

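The two VirtualCell rules in the comment above can be read as a small liveness predicate. The following Java sketch is only an interpretation of that comment, not code from the WIP branch: the class, method and parameter names are hypothetical, and regular selected cells are deliberately left out.

```java
import java.util.Arrays;
import java.util.List;

public class VirtualCellSketch {
    /**
     * keyOrConditionsAlive: liveness of base columns used in the view PK or in view filters (group 1).
     * unselectedAlive: liveness of base columns not selected in the view (group 2).
     * pkLivenessLive: whether the view row's own LivenessInfo is live.
     * deletedByDeletionTime: whether a DeletionTime covers the view row.
     */
    static boolean viewRowAlive(List<Boolean> keyOrConditionsAlive,
                                List<Boolean> unselectedAlive,
                                boolean pkLivenessLive,
                                boolean deletedByDeletionTime) {
        // Rule 1: a dead key/condition column kills the row, regardless of anything else.
        if (keyOrConditionsAlive.contains(false)) {
            return false;
        }
        // A row deletion also wins over unselected columns.
        if (deletedByDeletionTime) {
            return false;
        }
        // Rule 2: a live unselected column keeps the view pk alive.
        return pkLivenessLive || unselectedAlive.contains(true);
    }

    public static void main(String[] args) {
        // Only an unselected column is alive: the row is kept.
        System.out.println(viewRowAlive(Arrays.asList(true), Arrays.asList(true), false, false));
        // A key/condition column is dead: the row is gone.
        System.out.println(viewRowAlive(Arrays.asList(true, false), Arrays.asList(true), true, false));
    }
}
```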
> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp of 
> any column participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with a higher 
> timestamp which won't be deleted, as can easily be shown by the failure of 
> the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a problem similar to 
> CASSANDRA-10965, though the solution there of a single flag is not enough, 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. This means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in the 
> example above).  It's doable (and not that hard really), but it does require 
> a change to the sstable and intra-node protocol, which makes this a bit 
> painful right now.
> But I'll also note that as CASSANDRA-1096 shows, the timestamp is not even 
> enough since on equal timestamp the value can be the deciding factor. So in 
> theory we'd have to ship the value of those columns (in the case of a 
> deletion at least since we have it in the view PK for updates). That said, on 
> that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting timestamp 
> anymore, which would make this problem go 

[jira] [Updated] (CASSANDRA-13547) Filtered materialized views missing data

2017-07-26 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13547:
-
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Thanks. Marking this as a duplicate; the work moves to CASSANDRA-11500 with the new MV timestamp.

> Filtered materialized views missing data
> 
>
> Key: CASSANDRA-13547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13547
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
> Environment: Official Cassandra 3.10 Docker image (ID 154b919bf8ce).
>Reporter: Craig Nicholson
>Assignee: Krishna Dattu Koneru
>Priority: Blocker
>  Labels: materializedviews
> Fix For: 3.11.x
>
>
> When creating a materialized view against a base table the materialized view 
> does not always reflect the correct data.
> Using the following test schema:
> {code:title=Schema|language=sql}
> DROP KEYSPACE IF EXISTS test;
> CREATE KEYSPACE test
>   WITH REPLICATION = { 
>'class' : 'SimpleStrategy', 
>'replication_factor' : 1 
>   };
> CREATE TABLE test.table1 (
> id int,
> name text,
> enabled boolean,
> foo text,
> PRIMARY KEY (id, name));
> CREATE MATERIALIZED VIEW test.table1_mv1 AS SELECT id, name, foo
> FROM test.table1
> WHERE id IS NOT NULL 
> AND name IS NOT NULL 
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> CREATE MATERIALIZED VIEW test.table1_mv2 AS SELECT id, name, foo, enabled
> FROM test.table1
> WHERE id IS NOT NULL 
> AND name IS NOT NULL 
> AND enabled = TRUE
> PRIMARY KEY ((name), id);
> {code}
> When I insert a row into the base table the materialized views are updated 
> appropriately. (+)
> {code:title=Insert row|language=sql}
> cqlsh> INSERT INTO test.table1 (id, name, enabled, foo) VALUES (1, 'One', 
> TRUE, 'Bar');
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> ----+------+---------+-----
>   1 |  One |    True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> ------+----+-----
>   One |  1 | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> ------+----+---------+-----
>   One |  1 |    True | Bar
> (1 rows)
> {code}
> Updating the record in the base table and setting enabled to FALSE will 
> filter the record from both materialized views. (+)
> {code:title=Disable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = FALSE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> ----+------+---------+-----
>   1 |  One |   False | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> ------+----+-----
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> ------+----+---------+-----
> (0 rows)
> {code}
> However, a further update to the base table setting enabled to TRUE should 
> include the record in both materialized views, but only one view 
> (table1_mv2) gets updated. (-)
> It appears that only the view (table1_mv2) that returns the filtered column 
> (enabled) is updated. (-)
> Additionally, columns that are not part of the partition or clustering key are 
> not updated. You can see that the foo column has a null value in table1_mv2. 
> (-)
> {code:title=Enable the row|language=sql}
> cqlsh> UPDATE test.table1 SET enabled = TRUE WHERE id = 1 AND name = 'One';
> cqlsh> SELECT * FROM test.table1;
>  id | name | enabled | foo
> ----+------+---------+-----
>   1 |  One |    True | Bar
> (1 rows)
> cqlsh> SELECT * FROM test.table1_mv1;
>  name | id | foo
> ------+----+-----
> (0 rows)
> cqlsh> SELECT * FROM test.table1_mv2;
>  name | id | enabled | foo
> ------+----+---------+------
>   One |  1 |    True | null
> (1 rows)
> {code}






[jira] [Commented] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-26 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101240#comment-16101240
 ] 

ZhaoYang commented on CASSANDRA-13723:
--

Thanks, I will update the patch this week.

> Trivial format error of digest in StorageProxy
> --
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}






[jira] [Updated] (CASSANDRA-13127) Materialized Views: View row expires too soon

2017-07-26 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13127:
-
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking this as a duplicate; the work moves to CASSANDRA-11500 with the new MV timestamp.

> Materialized Views: View row expires too soon
> -
>
> Key: CASSANDRA-13127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | 
> bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT 
> NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I 
> would say this is because in ViewUpdateGenerator#computeLivenessInfoForEntry 
> the TTLs are compared instead of the expiration times, but I'm not sure I'm 
> getting that far ahead in the code when updating a column that's not in the 
> view.
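The reporter's suspicion (comparing TTLs rather than expiration times) is stated tentatively, so the following is only an illustration of why that difference would matter, using the numbers from the commands above: a write at t=0s with TTL 10 and a write at t=6s with TTL 8. The {{Liveness}} class and both pick methods are hypothetical and do not reproduce ViewUpdateGenerator.

```java
public class ExpirationSketch {
    // Hypothetical stand-in for liveness info: write time and TTL, both in seconds.
    static final class Liveness {
        final int writtenAtSeconds;
        final int ttlSeconds;

        Liveness(int writtenAtSeconds, int ttlSeconds) {
            this.writtenAtSeconds = writtenAtSeconds;
            this.ttlSeconds = ttlSeconds;
        }

        int expiresAtSeconds() {
            return writtenAtSeconds + ttlSeconds;
        }
    }

    // Keeps the entry with the larger TTL, ignoring when it was written.
    static Liveness pickByTtl(Liveness a, Liveness b) {
        return a.ttlSeconds >= b.ttlSeconds ? a : b;
    }

    // Keeps the entry that actually lives longer.
    static Liveness pickByExpiration(Liveness a, Liveness b) {
        return a.expiresAtSeconds() >= b.expiresAtSeconds() ? a : b;
    }

    public static void main(String[] args) {
        Liveness insert = new Liveness(0, 10); // INSERT ... USING TTL 10 at t=0
        Liveness update = new Liveness(6, 8);  // UPDATE ... USING TTL 8 at t=6
        System.out.println(pickByTtl(insert, update).expiresAtSeconds());
        System.out.println(pickByExpiration(insert, update).expiresAtSeconds());
    }
}
```

Picking by TTL keeps the insert, which expires at t=10s, while the base row stays live until t=14s; that gap matches the window in which the view row is missing in the output above.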






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-26 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/26/17 3:46 PM:
---

h3. Relation: base -> view

First of all, I think we should all agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and the view has no 
filter conditions, or only conditions on the base pk.
(A filter condition means something like {{c = 1}} in the view's where clause. 
Filter conditions are not a concern here, since there is no previous view data 
to be cleared.)

The view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) and the base row pk 
satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp 
(via update) and the base row pk satisfies the view's filter conditions, if 
any. This is handled by the existing liveness and tombstone mechanism, since 
all the info is included in the view row.
* c. or one of the base row columns not selected in the view has a live 
timestamp (via update) and the base row pk satisfies the view's filter 
conditions, if any. Those unselected columns' timestamp/ttl/cell-deletion info 
is not currently stored on the view row.

2. a base column is used in the view pk, or the view has filter conditions on 
a base non-key column, either of which can also lead to the entire view row 
being wiped.

The view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) && the base column used 
in the view pk is not null but has no timestamp && the conditions are 
satisfied. (The pk having live livenessInfo means it is not deleted by a 
tombstone.)
* b. or the base row column in the view pk has a timestamp (via update) && the 
conditions are satisfied. E.g. if the base column used in the view pk is 
TTLed, the entire view row should be wiped.

The next step is to model the "view's tombstone and livenessInfo" so that view 
data is maintained according to the cases above.
 
h3. Previous known issues: 
(I might have missed some issues; feel free to ping me.)

ttl
* the view row is not wiped when a TTL expires on a base column used in the 
view pk, or on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of the current update semantics, no view 
updates are generated
* the corresponding view row's liveness does not depend on the liveness of the 
base columns

filter conditions or base column used in view pk
* the view row is shadowed after a few modifications of the base column used 
in the view pk if the base non-key column has a TS greater than the base pk's 
ts and the view key column's ts. (As mentioned by Sylvain: we need to be able 
to re-insert an entry when a prior one had been deleted, and need to be 
careful to handle timestamp ties.)

tombstone merging is not commutative
* in the current code, a shadowable tombstone doesn't co-exist with a regular 
tombstone

sstabledump does not support the current shadowable tombstone

h3. Model

I can think of two ways to ship all the required base column info to the view:
* make base columns that are not selected in the view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism, with additional validation logic to 
check whether a view row is alive.
* or store that info in the view's livenessInfo/deletion, with additional 
merge logic to make sure view liveness/deletion are ordered properly even in 
the case of a timestamp tie. It's like a replacement strategy which uses the 
inserted view row to replace the old view row. (In a regular table, 
reconciliation is at the cell level. I need to research the concurrent view 
update cases more; for now, it looks fine.) It also implies that on every 
modification of the base, the view row will get replaced entirely.

I will go ahead with -second way since there is an existing shadowable 
tombstone mechanism.-  VirtualCells to avoid changing low level timestamp 
comparison.
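The ordering rule that the ColumnInfo outline in this comment describes (the greater timestamp wins; on a timestamp tie, the greater localDeletionTime wins) can be sketched in Java as follows. The field types and the class layout are assumptions made for illustration; only the three field names and the supersedes() rule come from the comment.

```java
public class ColumnInfoSketch {
    // Hypothetical shape: the comment names the fields but not their types.
    static final class ColumnInfo {
        final long timestamp;
        final int ttl;
        final int localDeletionTime;

        ColumnInfo(long timestamp, int ttl, int localDeletionTime) {
            this.timestamp = timestamp;
            this.ttl = ttl;
            this.localDeletionTime = localDeletionTime;
        }

        // Greater timestamp supersedes; on a tie, greater localDeletionTime supersedes.
        boolean supersedes(ColumnInfo other) {
            if (timestamp != other.timestamp) {
                return timestamp > other.timestamp;
            }
            return localDeletionTime > other.localDeletionTime;
        }
    }

    public static void main(String[] args) {
        ColumnInfo older = new ColumnInfo(2L, 0, 100);
        ColumnInfo newer = new ColumnInfo(3L, 0, 50);
        ColumnInfo tie = new ColumnInfo(3L, 0, 60);
        System.out.println(newer.supersedes(older)); // decided by timestamp
        System.out.println(tie.supersedes(newer));   // decided by localDeletionTime
    }
}
```

Under this rule two entries with identical timestamp and localDeletionTime do not supersede each other; how such a full tie should be resolved is not covered by the outline.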

{code}
ColumnInfo: // generated from base column as it is.
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed 
depends on if there is ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.


Row: // VirtualCells(keyOrConditions and unselected) are always merged with 
another row's during row merging process

// base columns that are used in the view pk or have a filter condition on a 
non-pk column.
// if any column is not live, the entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
// during base modification, if a view row is removed due to 
base-column-in-view-pk or filter-conditions, then no deletion is issued;
// the virtual cell tombstone is added to Row's keyOrConditions.
2. 

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-26 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101603#comment-16101603
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/26/17 3:34 PM:
---

WIP 
[branch|https://github.com/jasonstack/cassandra/commits/CASSANDRA-11500-cell]

Changed:
Added an extra "VirtualCell" (a kind of special LivenessInfo for MV) to Row. 
It stores: 
1. Base columns used in the view PK and base columns used in view filter 
conditions. If any such column is dead, the entire view row is dead, 
regardless of LivenessInfo or DeletionTime status. 
2. Unselected base columns. If any such column is alive, the view's pk should 
be alive, unless it is deleted by DeletionTime or by the columns in <1>. 

Todo: 
* optimize the in-memory/storage representation of "virtual-cells" to re-use 
AbstractRow/BTree
* more dtests
* optimize the ViewUpdateGenerator process to reduce payload. E.g. if the view 
row has a live pk or another live column, the "virtualCells' unselected 
payload" is not necessary.
* support collection types in unselected columns (for now, no 
"virtual-cells" are generated for non-frozen collection types)


was (Author: jasonstack):
WIP 
[branch|https://github.com/jasonstack/cassandra/commits/CASSANDRA-11500-cell]

Changed:
An extra "VirtualCell" (a kind of special LivenessInfo for MV) in Row. It stores: 
1. base columns used in the view PK and base columns used in view filter 
conditions. If any such column is dead, the entire view row is dead, regardless 
of LivenessInfo or DeletionTime status. 2. unselected base columns. If any such 
column is alive, the view's pk should be alive, unless it is deleted by 
DeletionTime or by the columns in <1>. 

Todo: 
* optimize in-memory/storage representation for "virtual-cells" to re-use 
AbstractRow/BTree
* discard dead view rows shadowed by virtualCells in compaction after gc_grace
* more dtests
* support collection types in unselected columns (for now, it won't generate 
"virtual-cells" for non-frozen collection types)

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp for 
> any columns participating in the view PK. This is not correct for the 
> deletion as the old view entry could have other columns with higher timestamp 
> which won't be deleted, as can easily be shown by the failure of the following 
> test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a similar problem to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution 

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/19/17 2:23 PM:
---

h3. Relation: base -> view

First of all, I think all of us should agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and the view has no 
filter conditions, or only conditions on the base pk.
(The filter condition is not a concern here, since there is no previous view 
data to be cleared.)

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) and the base row pk 
satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp 
(via update) and the base row pk satisfies the view's filter conditions, if 
any. This is handled by the existing liveness and tombstone mechanism, since 
all the info is included in the view row.
* c. or one of the base row columns not selected in the view has a live 
timestamp (via update) and the base row pk satisfies the view's filter 
conditions, if any. The timestamp/ttl/cell-deletion info of those unselected 
columns is currently not stored on the view row. 

2. a base column is used in the view pk, or the view has filter conditions on 
a base non-key column; either can also lead to the entire view row being wiped.

A view row exists if any of the following is true:
* a. the base row pk has live livenessInfo (timestamp) && the base column used 
in the view pk is not null but has no timestamp && the conditions are 
satisfied. (A pk having live livenessInfo means it is not deleted by a 
tombstone.)
* b. or the base row column in the view pk has a timestamp (via update) && the 
conditions are satisfied. E.g. if the base column used in the view pk is 
TTLed, the entire view row should be wiped.

The next step is to model "shadowable tombstone or shadowable liveness" to 
maintain view data based on the cases above.
 
h3. Previous known issues: 
(I might miss some issues, feel free to ping me..)

ttl
* the view row is not wiped when a TTL expires on a base column used in the 
view pk, or on a base non-key column with a filter condition
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in view
* it results in no view data: because of current update semantics, no view 
updates are generated
* the corresponding view row's liveness does not depend on the liveness of 
the base columns

filter conditions or base column used in view pk
* the view row is shadowed after a few modifications on the base column used 
in the view pk, if the base non-key column has a ts greater than the base 
pk's ts and the view key column's ts. (As mentioned by Sylvain: we need to be 
able to re-insert an entry when a prior one had been deleted; we need to be 
careful to handle timestamp ties.)

tombstone merging is not commutative
* in current code, a shadowable tombstone doesn't co-exist with a regular 
tombstone

sstabledump doesn't support the current shadowable tombstone
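
The re-insert problem in the list above can be illustrated with the timestamps from the ticket description. This is a deliberately simplified model, not Cassandra's code: `shadowed` is a hypothetical helper, and it assumes a deletion covers any liveness info whose timestamp is less than or equal to the deletion timestamp.

```python
def shadowed(liveness_ts: int, deletion_ts: int) -> bool:
    # Assumption: a deletion covers liveness info written at or before it.
    return liveness_ts <= deletion_ts


# Old view entry (k=1, a=1): k and a written at ts 0, b updated at ts 10.
old_entry_ts = {"k": 0, "a": 0, "b": 10}
# The deletion uses the biggest timestamp in the old view entry.
deletion_ts = max(old_entry_ts.values())

# Re-insert of (k=1, a=1) by "UPDATE t USING TIMESTAMP 3 SET a = 1":
# the liveness timestamp is the max over the view PK columns (k@0, a@3).
reinsert_ts = max(0, 3)

print(shadowed(reinsert_ts, deletion_ts))  # prints True: the re-insert stays hidden
```

Since the re-insert cannot be required to carry a timestamp above 10, a timestamp comparison alone cannot bring the entry back, which is why extra base column info has to travel with the view row.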

h3. Model

I can think of two ways to ship all required base column info to the view:
* make base columns that are not selected in the view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can 
reuse the current ts/tb/ttl mechanism with additional validation logic to 
check whether a view row is alive.
* or store that info on the view's livenessInfo/deletion with additional merge 
logic. 

I will go ahead with the second way since there is an existing shadowable 
tombstone mechanism.
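
One of the known issues listed earlier is that merging ttls for cells with the same timestamp is not deterministic. A minimal sketch of the deterministic ordering that this comment proposes for per-column info (greater timestamp wins; on a tie, greater localDeletionTime wins) might look as follows. The `ColumnInfo` tuple and its field names here are an illustrative Python stand-in, not the Java class from any patch.

```python
from typing import NamedTuple, Optional


class ColumnInfo(NamedTuple):
    """Illustrative stand-in for the per-column info shipped to the view."""
    timestamp: int
    ttl: Optional[int]           # None when the column has no TTL
    local_deletion_time: int     # tombstone time or TTL expiration time


def supersedes(a: ColumnInfo, b: ColumnInfo) -> bool:
    # Different timestamps: the greater timestamp supersedes.
    if a.timestamp != b.timestamp:
        return a.timestamp > b.timestamp
    # Same timestamp: the greater localDeletionTime supersedes.
    return a.local_deletion_time > b.local_deletion_time
```

With this rule, two replicas that merge the same pair of infos in a different order end up with the same winner, because the comparison never depends on arrival order.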


View PrimaryKey LivenessInfo, its timestamp, payloads, merging

{code}
ColumnInfo: // generated from base column
0. timestamp
1. ttl 
2. localDeletionTime:  represents either a tombstone or a TTL expiration, 
depending on whether there is a ttl

supersedes(): if the timestamps are different, the greater timestamp 
supersedes; if the timestamps are the same, the greater localDeletionTime supersedes.

ViewLivenessInfo
// corresponding to base pk livenessInfo
0. timestamp
1. ttl / localDeletionTime

// base columns that are used in the view pk or have a filter condition.
// if any column is not live or doesn't exist, the entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
2. Map keyOrConditions; 

// if any column is live
3. Map unselected;

// to determine if a row is live
isRowAlive(Deletion delete):
get timestamp or columnInfo that is greater than those in Deletion

if any column in {{keyOrConditions}} is TTLed or a tombstone (dead) 
or does not exist, false
if {{timestamp or ttl}} are alive, true
if any column in {{unselected}} is alive, true
otherwise check whether any columns in view row are alive

// cannot use supersedes, because timestamp can tie, we cannot compare 
keyOrConditions.  

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089856#comment-16089856
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/19/17 2:26 PM:
---

I plan to solve: {{partial update}},{{ttl}}, {{co-existed shadowable 
tombstone}}, {{view timestamp tie}} all inside this ticket using extended 
shadowable approach(mentioned 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]).
 Because all these issues require some storage format changes(extendedFlag), 
it's better to fix them and refactor in one commit.

I will draft a patch using {{ViewTombstone}} and {{ViewLiveness}}.

Any suggestions would be appreciated.




was (Author: jasonstack):
I plan to solve: {{partial update}},{{ttl}}, {{co-existed shadowable 
tombstone}}, {{view timestamp tie}} all inside this ticket using extended 
shadowable approach(mentioned 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]).
 Because all these issues require some storage format changes(extendedFlag), 
it's better to fix them and refactor in one commit.

Drafted a 
[patch|https://github.com/jasonstack/cassandra/commits/CASSANDRA-11500-update-time]..(refactoring
 and adding more dtest)  

Any suggestions would be appreciated.



> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp for 
> any columns participating in the view PK. This is not correct for the 
> deletion as the old view entry could have other columns with higher timestamp 
> which won't be deleted, as can easily be shown by the failure of the following 
> test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a similar problem to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. Which means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in the 
> example below).  It's doable (and not that hard really), but it does require 
> a change to the sstable and intra-node protocol, which makes this a bit 
> painful 

[jira] [Commented] (CASSANDRA-12952) AlterTableStatement propagates base table and affected MV changes inconsistently

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098171#comment-16098171
 ] 

ZhaoYang commented on CASSANDRA-12952:
--

LGTM. Using byteman to test atomicity is really brilliant, and I learned 
something. Now that apache-c*-dtest is ready, we need to move to the new repo. 

The failed dtests are unrelated: 
>  
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
>   -> known
> materialized_views_test.py:TestMaterializedViews.view_tombstone_test  ->  the 
> tracing format for digest is wrong:  "Digest mismatch: {} on 127.0.0.1".  I 
> will open another ticket for it.

> AlterTableStatement propagates base table and affected MV changes 
> inconsistently
> 
>
> Key: CASSANDRA-12952
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12952
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata, Materialized Views
>Reporter: Aleksey Yeschenko
>Assignee: Andrés de la Peña
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> In {{AlterTableStatement}}, when renaming columns or changing their types, we 
> also keep track of all affected MVs - ones that also need column renames or 
> type changes. Then in the end we announce the migration for the table change, 
> and afterwards, separately, one for each affected MV.
> This creates a window in which view definitions and base table definition are 
> not in sync with each other. If a node fails in between receiving those 
> pushes, it's likely to have startup issues.
> The fix is trivial: table change and affected MV change should be pushed as a 
> single schema mutation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-12952) AlterTableStatement propagates base table and affected MV changes inconsistently

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098171#comment-16098171
 ] 

ZhaoYang edited comment on CASSANDRA-12952 at 7/24/17 10:06 AM:


LGTM. Using byteman to test atomicity is really brilliant, and I learned 
something. Now that apache-c*-dtest is ready, we need to move to the new repo. 

The failed dtests are unrelated: 
| 
bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
  -> known
| materialized_views_test.py:TestMaterializedViews.view_tombstone_test  ->  the 
tracing format for digest is wrong:  "Digest mismatch: {} on 127.0.0.1".  I 
will open another ticket for it.


[jira] [Updated] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13723:
-
Description: 
The incorrect tracing log causes 
{{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} to 
fail and affects clients.

Current log: {{Digest mismatch: {} on 127.0.0.1}}

Expected log: {{Digest mismatch: 
org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
DecoratedKey... on 127.0.0.1}}


> Trivial format error of digest in StorageProxy
> --
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
>
> The incorrect tracing log causes 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} to 
> fail and affects clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}




[jira] [Comment Edited] (CASSANDRA-12952) AlterTableStatement propagates base table and affected MV changes inconsistently

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098171#comment-16098171
 ] 

ZhaoYang edited comment on CASSANDRA-12952 at 7/24/17 10:20 AM:


LGTM. Using byteman to test atomicity is really brilliant, and I learned 
something. Now that apache-c*-dtest is ready, we need to move to the new repo. 

The failed dtests are unrelated: 
| 
bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
  -> known
| materialized_views_test.py:TestMaterializedViews.view_tombstone_test  ->  the 
tracing format for digest is wrong:  "Digest mismatch: {} on 127.0.0.1".  I 
will open another 
[ticket|https://issues.apache.org/jira/browse/CASSANDRA-13723] for it.



[jira] [Created] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)
ZhaoYang created CASSANDRA-13723:


 Summary: Trivial format error of digest in StorageProxy
 Key: CASSANDRA-13723
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
 Project: Cassandra
  Issue Type: Bug
Reporter: ZhaoYang
Assignee: ZhaoYang
Priority: Trivial
 Fix For: 4.0







[jira] [Updated] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13723:
-
Attachment: CASSANDRA-13723.patch


[jira] [Updated] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-13723:
-
Status: Patch Available  (was: Open)

It only affects trunk; it was caused by 
[12996|https://issues.apache.org/jira/browse/CASSANDRA-12996]

All tests passed except for bootstrap_test(known) in dtest.




[jira] [Commented] (CASSANDRA-12996) update slf4j dependency to 1.7.21

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098310#comment-16098310
 ] 

ZhaoYang commented on CASSANDRA-12996:
--

[~spo...@gmail.com]  Thanks for upgrading the libraries, but it caused some 
unexpected changes in logging, for example CASSANDRA-13723.

> update slf4j dependency to 1.7.21
> -
>
> Key: CASSANDRA-12996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12996
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Tomas Repik
>Assignee: Stefan Podkowinski
> Fix For: 4.0
>
> Attachments: cassandra-3.11.0-slf4j.patch, 
> jcl-over-slf4j-1.7.25.jar.asc, log4j-over-slf4j-1.7.25.jar.asc, 
> slf4j-api-1.7.25.jar.asc
>
>
> Cassandra 3.11.0 is about to be included in Fedora. There are some tweaks to 
> the sources we need to do in order to successfully build it. Cassandra 
> depends on slf4j 1.7.7, but in Fedora we have the latest upstream version, 
> 1.7.21, which was released on April 6, 2016. I attached a patch 
> updating Cassandra sources to depend on the newer slf4j sources. The only 
> actual change is the number of parameters accepted by SubstituteLogger class. 
> Please consider updating.




[jira] [Comment Edited] (CASSANDRA-13723) Trivial format error of digest in StorageProxy

2017-07-24 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098309#comment-16098309
 ] 

ZhaoYang edited comment on CASSANDRA-13723 at 7/24/17 12:51 PM:


It only affects trunk; it was caused by CASSANDRA-12996

All tests passed except for bootstrap_test(known) in dtest.





[jira] [Commented] (CASSANDRA-13657) Materialized Views: Index MV on TTL'ed column produces orphanized view entry if another column keeps entry live

2017-07-11 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083278#comment-16083278
 ] 

ZhaoYang commented on CASSANDRA-13657:
--

It looks more related to CASSANDRA-13127, but CASSANDRA-13127 didn't handle 
this case.

> Materialized Views: Index MV on TTL'ed column produces orphanized view entry 
> if another column keeps entry live
> ---
>
> Key: CASSANDRA-13657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13657
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Fridtjof Sander
>Assignee: Krishna Dattu Koneru
>  Labels: materializedviews, ttl
>
> {noformat}
> CREATE TABLE t (k int, a int, b int, PRIMARY KEY (k));
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (a, k);
> INSERT INTO t (k) VALUES (1);
> UPDATE t USING TTL 5 SET a = 10 WHERE k = 1;
> UPDATE t SET b = 100 WHERE k = 1;
> SELECT * from t; SELECT * from mv;
>  k | a  | b
> ---+----+-----
>  1 | 10 | 100
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- 5 seconds later
> SELECT * from t; SELECT * from mv;
>  k | a    | b
> ---+------+-----
>  1 | null | 100
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- that view entry's liveness-info is (probably) dead, but the entry is kept 
> alive by b=100
> DELETE b FROM t WHERE k=1;
> SELECT * from t; SELECT * from mv;
>  k | a    | b
> ---+------+------
>  1 | null | null
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> DELETE FROM t WHERE k=1;
> cqlsh:test> SELECT * from t; SELECT * from mv;
>  k | a | b
> ---+---+---
> (0 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- deleting the base-entry doesn't help, because the view-key can not be 
> constructed anymore (a=10 already expired)
> {noformat}
> The problem here is that although the view-entry's liveness-info (probably) 
> expired correctly, a regular column (`b`) keeps the view-entry live. It should 
> have disappeared since its indexed column (`a`) expired in the corresponding 
> base-row. This is pretty severe, since that view-entry is now orphaned.




[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093220#comment-16093220
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

Thanks for reviewing. I will backport to 3.0/3.11.  I was stuck on other 
issues.

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces are not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]




[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093220#comment-16093220
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/19/17 2:54 PM:
---

Thanks for reviewing. I will backport to 3.0/3.11 this week. I was stuck on 
other issues.


was (Author: jasonstack):
Thanks for reviewing. I will backport to 3.0/3.11. I was stuck on other 
issues.

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-DC cluster, but some keyspaces are not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Updated] (CASSANDRA-12484) Unknown exception caught while attempting to update MaterializedView! findkita.kitas java.lang.AssertionErro

2017-06-28 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-12484:
-
Component/s: Materialized Views

> Unknown exception caught while attempting to update MaterializedView! 
> findkita.kitas java.lang.AssertionErro
> 
>
> Key: CASSANDRA-12484
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12484
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
> Environment: Docker Container with Cassandra version 3.7 running on 
> local pc
>Reporter: cordlessWool
>Priority: Critical
>
> After a restart, my Cassandra node does not start anymore. It ends with the 
> following error message:
> ERROR 18:39:37 Unknown exception caught while attempting to update 
> MaterializedView! findkita.kitas
> java.lang.AssertionError: We shouldn't have got there is the base row had no 
> associated entry
> Cassandra has heavy CPU usage and uses 2.1 GB of memory, although 1 GB more is 
> available. I ran nodetool cleanup and repair, but it did not help.
> I have 5 materialized views on this table, but the table has fewer than 2000 
> rows, which is not much.
> Cassandra runs in a Docker container. The container is accessible, but I 
> cannot run cqlsh, and my website could not connect either.






[jira] [Commented] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-28 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067738#comment-16067738
 ] 

ZhaoYang commented on CASSANDRA-13592:
--

Thanks for reviewing. I will update the 2.0/3.0/3.11 branches when trunk CI 
finishes.

"list, set, map or UDT types" are not allowed in a key, so ideally they 
shouldn't cause issues. I think only the key buffer will be reused, e.g. for 
paging state and subsequent row deserialization. 
I have included them in the test.

One more issue is in ToJsonFct.

{code}
public ByteBuffer execute(ProtocolVersion protocolVersion, List<ByteBuffer> parameters) throws InvalidRequestException
{
    assert parameters.size() == 1 : "Expected 1 argument for toJson(), but got " + parameters.size();
    ByteBuffer parameter = parameters.get(0);
    if (parameter == null)
        return ByteBufferUtil.bytes("null");
    // same issue here: pass a duplicate so the caller's buffer position is untouched
    return ByteBufferUtil.bytes(argTypes.get(0).toJSONString(parameter.duplicate(), protocolVersion));
}
{code}
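
The concern behind this comment, that reading a shared buffer can move its position for later readers, can be illustrated with plain java.nio. The following is a minimal sketch, not Cassandra code: BufferPositionDemo and consume() are hypothetical names, and consume() merely stands in for any method that reads a buffer with relative gets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cassandra.
class BufferPositionDemo
{
    // Reads all remaining bytes with a relative get(), which advances the
    // position of the buffer that was passed in.
    static String consume(ByteBuffer buffer)
    {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        ByteBuffer shared = ByteBuffer.wrap("key".getBytes(StandardCharsets.UTF_8));

        // A duplicate shares the content but has its own position,
        // so the shared buffer is still fully readable afterwards.
        consume(shared.duplicate());
        System.out.println(shared.remaining()); // prints 3

        // Reading the shared buffer directly leaves nothing for later readers.
        consume(shared);
        System.out.println(shared.remaining()); // prints 0
    }
}
```

Passing a duplicate is cheap because no bytes are copied; only the position, limit and mark are independent of the original buffer.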

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Commented] (CASSANDRA-13409) Materialized Views: View cells are resurrected

2017-06-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067813#comment-16067813
 ] 

ZhaoYang commented on CASSANDRA-13409:
--

Those MV tombstone tickets already showed that the shadowable tombstone 
mechanism isn't enough. 

Maybe we need to step back and come up with a new mechanism instead of putting 
out fires in each ticket, and probably in future tickets.

> Materialized Views: View cells are resurrected
> --
>
> Key: CASSANDRA-13409
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13409
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths, Materialized Views
>Reporter: Duarte Nunes
>Assignee: ZhaoYang
>
> Consider the following commands, run against trunk@0f054fee5c:
> {code:xml}
> echo "create keyspace ks WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};" | bin/cqlsh
> echo "create table ks.base (p int primary key, v1 int, v2 int) with 
> gc_grace_seconds = 1;" | bin/cqlsh
> echo "create materialized view ks.my_view as select * from ks.base where p is 
> not null and v1 is not null primary key (v1, p);" | bin/cqlsh
> echo "insert into ks.base (p, v1, v2) values (3, 1, 3) using timestamp 1;" | 
> bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "delete from ks.base using timestamp 2 where p = 3;" | bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "insert into ks.base (p, v1) values (3, 1) using timestamp 3;" | 
> bin/cqlsh
> bin/nodetool flush ks my_view base
> echo "select * from ks.my_view;" | bin/cqlsh
>  v1 | p | v2
> +---+
>   1 | 3 |  3
> (1 rows)
> echo "select * from ks.base;" | bin/cqlsh
>  p | v1 | v2
> ---++--
>  3 |  1 | null
> (1 rows)
> {code}
> As you can see, this incorrectly brings back cell v2=3. 
> There is one definitive problem and a potential one:
> * Merging rows must be commutative. If a shadowable tombstone is applied 
> after a row tombstone, it will replace that tombstone; if a row marker 
> shadows the shadowable tombstone before the row containing the original data 
> is applied, then any dead cells in said data will be resurrected;
> * Shadowable tombstones shouldn't compact away previous row tombstones or 
> even deleted cells; if the relevant tombstones have been GCed from the base 
> table, then a base table update won't carry them anymore (alongside a newer 
> row marker).






[jira] [Comment Edited] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068410#comment-16068410
 ] 

ZhaoYang edited comment on CASSANDRA-13592 at 6/30/17 4:17 AM:
---

Thanks. Yes, it's better to solve it within {{type.toJSONString}}.

I will add tests for the specification of {{type.toJSONString}}: no 
implementation should change the buffer position.

I have a question about {{DurationType.toJSONString()}}. IMO, it should return a 
string value with double quotes, {{"-2h9m"}}, like {{TimeType.toJSONString()}}, 
but it only returns the bare string value {{-2h9m}}. Is that expected? It would 
be different from user input. 

One more thing is about EmptyType: it should directly {{return "\"\"";}} 
instead of using the parent method, which causes a null-pointer exception. 
EmptyType does not seem to be used in the JSON path, but it is good to keep it safe.


was (Author: jasonstack):
Thanks. Yes, it's better to solve it within type.toJSONString.

I will add tests for the specification of type.toJSONString: no implementation 
should change the buffer position.

I have a question about DurationType.toJSONString(). IMO, it should return the 
value with double quotes ("-2h9m"), like TimeType.toJSONString(), but it only 
returns the bare value (-2h9m). Is that expected?

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Comment Edited] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-29 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068410#comment-16068410
 ] 

ZhaoYang edited comment on CASSANDRA-13592 at 6/30/17 4:53 AM:
---

Thanks. Yes, it's better to solve it within {{type.toJSONString}}.

I will add tests for the specification of {{type.toJSONString}}: no 
implementation should change the buffer position.

I have a question about {{DurationType.toJSONString()}}. IMO, it should return a 
string value with double quotes, {{"-2h9m"}}, like {{TimeType.toJSONString()}}, 
but it only returns the bare string value {{-2h9m}}. Is that expected? It would 
be different from the user's JSON input. 

One more thing is about EmptyType: it should directly {{return "\"\"";}} 
instead of using the parent method, which causes a null-pointer exception. 
EmptyType does not seem to be used in the JSON path, but it is good to keep it safe.


was (Author: jasonstack):
Thanks. Yes, it's better to solve it within {{type.toJSONString}}.

I will add tests for the specification of {{type.toJSONString}}: no 
implementation should change the buffer position.

I have a question about {{DurationType.toJSONString()}}. IMO, it should return a 
string value with double quotes, {{"-2h9m"}}, like {{TimeType.toJSONString()}}, 
but it only returns the bare string value {{-2h9m}}. Is that expected? It would 
be different from user input. 

One more thing is about EmptyType: it should directly {{return "\"\"";}} 
instead of using the parent method, which causes a null-pointer exception. 
EmptyType does not seem to be used in the JSON path, but it is good to keep it safe.

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Commented] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-30 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069565#comment-16069565
 ] 

ZhaoYang commented on CASSANDRA-13592:
--

|| source || junit-result || dtest-result ||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592] | [junit|https://circleci.com/gh/jasonstack/cassandra/84] | |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.11] | [junit|https://circleci.com/gh/jasonstack/cassandra/82] | |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.0] | [junit|https://circleci.com/gh/jasonstack/cassandra/83] | |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-2.2] | [junit|https://circleci.com/gh/jasonstack/cassandra/85] | |

1. in {{listType, mapType, setType, TupleType}}.toJSONString(), keep buffer 
position the same.
2. change {{DurationType}}.toJSONString() to {{return "\"" + +"\"";}} (with 
double-quote) to be consistent with user json input
3. change {{EmptyType}}.toJSONString() to directly {{return "\"\"";}}, 
otherwise parent method throws NPE.

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Commented] (CASSANDRA-13565) Materialized view usage of commit logs requires large mutation but commitlog_segment_size_in_mb=2048 causes exception

2017-06-30 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069678#comment-16069678
 ] 

ZhaoYang commented on CASSANDRA-13565:
--

I will mark this ticket as `not an issue`; CASSANDRA-13622 is a better place to 
fix all boundary cases.

> Materialized view usage of commit logs requires large mutation but 
> commitlog_segment_size_in_mb=2048 causes exception
> -
>
> Key: CASSANDRA-13565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13565
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration, Materialized Views, Streaming and 
> Messaging
> Environment: Cassandra 3.9.0, Windows 
>Reporter: Tania S Engel
> Attachments: CQLforTable.png
>
>
> We will be upgrading to 3.10 for CASSANDRA-11670. However, there is another 
> scenario (not applyunsafe during JOIN) which leads to :
>   java.lang.IllegalArgumentException: Mutation of 525.847MiB is too large 
> for the maximum size of 512.000MiB
>       at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:262) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:147) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:797) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.view.ViewBuilder.buildKey(ViewBuilder.java:96) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.view.ViewBuilder.run(ViewBuilder.java:165) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.compaction.CompactionManager$14.run(CompactionManager.java:1591)
>  [apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_66]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_66]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_66]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] 
> Due to the relationship of max_mutation_size_in_kb and 
> commitlog_segment_size_in_mb, we increased commitlog_segment_size_in_mb and 
> left Cassandra to calculate max_mutation_size_in_kb as half of 
> commitlog_segment_size_in_mb * 1024.
>  However, we have found that if we set commitlog_segment_size_in_mb=2048 we 
> get an exception upon starting Cassandra, when it is creating a new commit 
> log.
> ERROR [COMMIT-LOG-ALLOCATOR] 2017-05-31 17:01:48,005 
> JVMStabilityInspector.java:82 - Exiting due to error while processing commit 
> log during initialization.
> org.apache.cassandra.io.FSWriteError: java.io.IOException: An attempt was 
> made to move the file pointer before the beginning of the file
> Perhaps the index you are using is not big enough and it goes negative.
> Is the relationship between max_mutation_size_in_kb and 
> commitlog_segment_size_in_mb important to preserve? In our limited stress 
> test we are finding mutation size already over 512mb and we expect more data 
> in our sstables and associated materialized views.
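
The sizing arithmetic described in the report can be sketched as follows. This is an illustration only: SegmentSizeDemo and both helper methods are hypothetical names, and the assumption that the segment size in bytes ends up in a 32-bit int is a guess in line with the reporter's speculation that a value "goes negative", not something this thread confirms.

```java
// Hypothetical demo class; not part of Cassandra.
class SegmentSizeDemo
{
    // Default described in the report: half of the segment size, in KiB.
    static long defaultMaxMutationSizeInKb(long commitlogSegmentSizeInMb)
    {
        return commitlogSegmentSizeInMb * 1024 / 2;
    }

    // Segment size in bytes computed with 32-bit int arithmetic.
    // ASSUMPTION: an int is used somewhere on this path; the thread does not say so.
    static int segmentSizeInBytesAsInt(int commitlogSegmentSizeInMb)
    {
        return commitlogSegmentSizeInMb * 1024 * 1024;
    }

    public static void main(String[] args)
    {
        // A 1024 MB segment gives the 512 MiB mutation limit seen in the stack trace.
        System.out.println(defaultMaxMutationSizeInKb(1024)); // prints 524288

        // 2048 MB is exactly 2^31 bytes, one more than Integer.MAX_VALUE,
        // so the int result wraps around to a negative number.
        System.out.println(segmentSizeInBytesAsInt(2048)); // prints -2147483648
    }
}
```

Under that assumption, any commitlog_segment_size_in_mb of 2048 or more cannot be represented as a positive int byte count, which would be consistent with the "before the beginning of the file" error quoted above.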






[jira] [Comment Edited] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-30 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069565#comment-16069565
 ] 

ZhaoYang edited comment on CASSANDRA-13592 at 6/30/17 9:20 AM:
---

|| source || junit-result || dtest-result||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592] | 
[junit|https://circleci.com/gh/jasonstack/cassandra/84]  | 
{{cql_tests.py:SlowQueryTester.local_query_test}} failed on trunk
{{bootstrap_test.TestBootstrap.simultaneous_bootstrap_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13506]
| 
| 
[3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.11]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/82] | 
{{topology_test.TestTopology.size_estimates_multidc_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13229]
{{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13250] | 
| 
[3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.0]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/83] | 
{{auth_test.TestAuth.system_auth_ks_is_alterable_test}}
 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13113]{{offline_tools_test.TestOfflineTools.sstableofflinerelevel_test}}
 [known|https://issues.apache.org/jira/browse/CASSANDRA-12617]
 {{repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test }} | 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13515]| 
| 
[2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-2.2]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/85] | passed | 

1. in {{listType, mapType, setType, TupleType}}.toJSONString(), keep buffer 
position the same.
2. change {{DurationType}}.toJSONString() to {{return "\"" + +"\"";}} (with 
double-quote) to be consistent with user json input
3. change {{EmptyType}}.toJSONString() to directly {{return "\"\"";}}, 
otherwise parent method throws NPE.


was (Author: jasonstack):
|| source || junit-result || dtest-result||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592] | 
[junit|https://circleci.com/gh/jasonstack/cassandra/84]  | 
{{cql_tests.py:SlowQueryTester.local_query_test}} failed on trunk
{{bootstrap_test.TestBootstrap.simultaneous_bootstrap_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13506]
| 
| 
[3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.11]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/82] | 
{{topology_test.TestTopology.size_estimates_multidc_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13229]
{{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13250] | 
| 
[3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.0]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/83] | | 
| 
[2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-2.2]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/85] | passed | 

1. in {{listType, mapType, setType, TupleType}}.toJSONString(), keep buffer 
position the same.
2. change {{DurationType}}.toJSONString() to {{return "\"" + +"\"";}} (with 
double-quote) to be consistent with user json input
3. change {{EmptyType}}.toJSONString() to directly {{return "\"\"";}}, 
otherwise parent method throws NPE.

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> 

[jira] [Resolved] (CASSANDRA-13565) Materialized view usage of commit logs requires large mutation but commitlog_segment_size_in_mb=2048 causes exception

2017-06-30 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang resolved CASSANDRA-13565.
--
Resolution: Duplicate

> Materialized view usage of commit logs requires large mutation but 
> commitlog_segment_size_in_mb=2048 causes exception
> -
>
> Key: CASSANDRA-13565
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13565
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration, Materialized Views, Streaming and 
> Messaging
> Environment: Cassandra 3.9.0, Windows 
>Reporter: Tania S Engel
> Attachments: CQLforTable.png
>
>
> We will be upgrading to 3.10 for CASSANDRA-11670. However, there is another 
> scenario (not applyunsafe during JOIN) which leads to :
>   java.lang.IllegalArgumentException: Mutation of 525.847MiB is too large 
> for the maximum size of 512.000MiB
>       at 
> org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:262) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Keyspace.apply(Keyspace.java:493) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Keyspace.apply(Keyspace.java:396) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:215) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.Mutation.apply(Mutation.java:227) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.batchlog.BatchlogManager.store(BatchlogManager.java:147) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.service.StorageProxy.mutateMV(StorageProxy.java:797) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.view.ViewBuilder.buildKey(ViewBuilder.java:96) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.view.ViewBuilder.run(ViewBuilder.java:165) 
> ~[apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> org.apache.cassandra.db.compaction.CompactionManager$14.run(CompactionManager.java:1591)
>  [apache-cassandra-3.9.0.jar:3.9.0]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_66]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_66]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_66]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] 
> Due to the relationship of max_mutation_size_in_kb and 
> commitlog_segment_size_in_mb, we increased commitlog_segment_size_in_mb and 
> left Cassandra to calculate max_mutation_size_in_kb as half of 
> commitlog_segment_size_in_mb * 1024.
>  However, we have found that if we set commitlog_segment_size_in_mb=2048 we 
> get an exception upon starting Cassandra, when it is creating a new commit 
> log.
> ERROR [COMMIT-LOG-ALLOCATOR] 2017-05-31 17:01:48,005 
> JVMStabilityInspector.java:82 - Exiting due to error while processing commit 
> log during initialization.
> org.apache.cassandra.io.FSWriteError: java.io.IOException: An attempt was 
> made to move the file pointer before the beginning of the file
> Perhaps the index you are using is not big enough and it goes negative.
> Is the relationship between max_mutation_size_in_kb and 
> commitlog_segment_size_in_mb important to preserve? In our limited stress 
> test we are finding mutation size already over 512mb and we expect more data 
> in our sstables and associated materialized views.






[jira] [Comment Edited] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-06-30 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069565#comment-16069565
 ] 

ZhaoYang edited comment on CASSANDRA-13592 at 6/30/17 9:23 AM:
---

|| source || junit-result || dtest-result ||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592] | [junit|https://circleci.com/gh/jasonstack/cassandra/84] | {{cql_tests.py:SlowQueryTester.local_query_test}} failed on trunk
{{bootstrap_test.TestBootstrap.simultaneous_bootstrap_test}} [known|https://issues.apache.org/jira/browse/CASSANDRA-13506] |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.11] | [junit|https://circleci.com/gh/jasonstack/cassandra/82] | {{topology_test.TestTopology.size_estimates_multidc_test}} [known|https://issues.apache.org/jira/browse/CASSANDRA-13229]
{{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} [known|https://issues.apache.org/jira/browse/CASSANDRA-13250] |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.0] | [junit|https://circleci.com/gh/jasonstack/cassandra/83] | {{auth_test.TestAuth.system_auth_ks_is_alterable_test}} [known|https://issues.apache.org/jira/browse/CASSANDRA-13113]
{{offline_tools_test.TestOfflineTools.sstableofflinerelevel_test}} [known|https://issues.apache.org/jira/browse/CASSANDRA-12617]
{{repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test}} [known|https://issues.apache.org/jira/browse/CASSANDRA-13515] |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-2.2] | [junit|https://circleci.com/gh/jasonstack/cassandra/85] | passed |

1. In {{ListType, MapType, SetType, TupleType}}.toJSONString(), keep the buffer 
position unchanged.
2. Change {{DurationType}}.toJSONString() to {{return "\"" + +"\"";}} (with 
double quotes) to be consistent with user JSON input.
3. Change {{EmptyType}}.toJSONString() to directly {{return "\"\"";}}; 
otherwise the parent method throws an NPE.
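
Item 1 concerns a general java.nio.ByteBuffer pitfall: relative reads advance the caller's position, so a later reader of the same buffer starts from the wrong offset. The following is a minimal sketch of that effect and of one common way to avoid it (reading through {{duplicate()}}). The class and method names are hypothetical, and this is not the actual patch.

```java
import java.nio.ByteBuffer;

// Illustrative only: hypothetical names, not the CASSANDRA-13592 patch.
class BufferPositionSketch {
    // Relative read: advances the position of the caller's buffer as a side effect.
    static int readIntConsuming(ByteBuffer buffer) {
        return buffer.getInt();
    }

    // Reads through a duplicate, which shares the bytes but has its own position,
    // so the caller's buffer is left exactly as it was.
    static int readIntPreserving(ByteBuffer buffer) {
        return buffer.duplicate().getInt();
    }

    // Builds a buffer holding the ints 7 and 9, ready for reading from position 0.
    static ByteBuffer sample() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putInt(7).putInt(9);
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer preserved = sample();
        System.out.println(readIntPreserving(preserved) + " at position " + preserved.position());

        ByteBuffer consumed = sample();
        System.out.println(readIntConsuming(consumed) + " at position " + consumed.position());
    }
}
```

A serializer that reads its argument in the preserving style can be called repeatedly on the same value, which is the behaviour item 1 asks for.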


was (Author: jasonstack):
|| source || junit-result || dtest-result||
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592] | 
[junit|https://circleci.com/gh/jasonstack/cassandra/84]  | 
{{cql_tests.py:SlowQueryTester.local_query_test}} failed on trunk
{{bootstrap_test.TestBootstrap.simultaneous_bootstrap_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13506]
| 
| 
[3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.11]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/82] | 
{{topology_test.TestTopology.size_estimates_multidc_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13229]
{{cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe}} 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13250] | 
| 
[3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-3.0]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/83] | 
{{auth_test.TestAuth.system_auth_ks_is_alterable_test}}
 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13113]{{offline_tools_test.TestOfflineTools.sstableofflinerelevel_test}}
 [known|https://issues.apache.org/jira/browse/CASSANDRA-12617]
 {{repair_tests.incremental_repair_test.TestIncRepair.multiple_repair_test }} | 
[known|https://issues.apache.org/jira/browse/CASSANDRA-13515]| 
| 
[2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13592-cassandra-2.2]
 |  [junit|https://circleci.com/gh/jasonstack/cassandra/85] | passed | 

1. in {{listType, mapType, setType, TupleType}}.toJSONString(), keep buffer 
position the same.
2. change {{DurationType}}.toJSONString() to {{return "\"" + +"\"";}} (with 
double-quote) to be consistent with user json input
3. change {{EmptyType}}.toJSONString() to directly {{return "\"\"";}}, 
otherwise parent method throws NPE.

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>  Labels: beginner
> Attachments: system.log
>
>
> A null pointer exception appears when the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
