[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094231#comment-16094231
 ] 

ZhaoYang commented on CASSANDRA-11500:
--

[~KurtG] The branch is not yet ready for you to test, but you could have a look at the 
[proposal|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]
 first and see if there is any missing case.

> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as the liveness info of the new entry, which is the max timestamp of any 
> column participating in the view PK. This is not correct for the deletion, 
> as the old view entry could have other columns with a higher timestamp that 
> won't be deleted, as is easily shown by the failure of the following test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries: the old 
> (invalid) one and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
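> The timestamp rule above can be sketched with last-write-wins 
> reconciliation: a deletion shadows a cell only if the cell's timestamp is 
> not greater than the deletion's. A minimal illustrative sketch (not the 
> real Cassandra API):

```python
# Illustrative sketch of last-write-wins reconciliation; not the real
# Cassandra API. A cell survives a deletion iff its timestamp is strictly
# greater than the deletion timestamp.
def survives(cell_ts, deletion_ts):
    return cell_ts > deletion_ts

# Old view entry (k=1, a=1) has column b written at ts=4.
# Buggy deletion reuses the new entry's liveness ts=2 (from "SET a = 2"):
print(survives(4, 2))  # True  -> b survives, the stale row stays visible
# Correct deletion uses the max timestamp of the old entry (ts=4):
print(survives(4, 4))  # False -> b is shadowed, the stale row disappears
```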
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one has 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a problem similar to 
> CASSANDRA-10965, though the solution there of a single flag is not enough, 
> since we may have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. This means that both the 
> liveness info (for updates) and the shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in 
> the examples above). It's doable (and not that hard, really), but it does 
> require a change to the sstable format and intra-node protocol, which makes 
> this a bit painful right now.
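> The difference between the current rule and that proposal can be sketched on 
> the second example above (all names are illustrative, not the real code):

```python
# Current rule: an entry is shadowed iff its liveness timestamp is not
# newer than the (shadowable) deletion timestamp.
def shadowed_now(entry_ts, deletion_ts):
    return entry_ts <= deletion_ts

# In the second example, deleting the a=1 entry uses the old entry's max
# timestamp (10, from b). The ts=3 re-insert of a=1 is then wrongly shadowed:
print(shadowed_now(3, 10))  # True -> the bug

# Proposed rule: ship the timestamp of the view-PK base column 'a' on both
# the liveness info and the shadowable deletion, and compare those instead:
def shadowed_proposed(entry_a_ts, deletion_a_ts):
    return entry_a_ts <= deletion_a_ts

# 'a' was set away from 1 at ts=2 and back to 1 at ts=3:
print(shadowed_proposed(3, 2))  # False -> the re-insert correctly survives
```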
> But I'll also note that, as CASSANDRA-1096 shows, the timestamp is not even 
> enough, since on equal timestamps the value can be the deciding factor. So in 
> theory we'd have to ship the values of those columns as well (in the case of 
> a deletion at least, since we have them in the view PK for updates). That 
> said, on that last problem, my preference would be that we start prioritizing 
> CASSANDRA-6123 seriously so we don't have to care about conflicting 
> timestamps anymore, which would make this problem go away.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 5:42 AM:
---

| branch | unit | [dtest|https://github.com/jasonstack/cassandra-dtest/commits/CASSANDRA-13526] |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | running | running |


When the node has no local ranges but has joined the token ring, cleanup will 
remove all of the base table's local sstables.


was (Author: jasonstack):
| branch | unit | dtest|
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] |  
running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11]|  
running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0]|  
running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2]|  
running | running |


when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]
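The behavior discussed in this ticket (remove stale local data when a joined 
node owns no ranges, instead of exiting early) can be sketched as follows; 
this is an illustrative simplification, not the actual CompactionManager code:

```python
# Illustrative simplification of the cleanup decision discussed above;
# not the actual Cassandra CompactionManager code.
def cleanup(owned_ranges, has_joined_ring, sstables):
    if not owned_ranges:
        if has_joined_ring:
            # proposed: the node owns no part of this keyspace, so all of
            # its local sstables are stale and can be removed
            return []
        # before joining the ring, skipping is the safe thing to do
        return sstables
    # normal cleanup: keep only data inside the owned ranges (elided here)
    return sstables

print(cleanup([], True, ["sst-1", "sst-2"]))   # [] -> old data removed
print(cleanup([], False, ["sst-1", "sst-2"]))  # unchanged -> safe no-op
```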






[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094211#comment-16094211
 ] 

ASF GitHub Bot commented on CASSANDRA-13526:


Github user jasonstack closed the pull request at:

https://github.com/apache/cassandra-dtest/pull/1








[jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Anthony Grasso (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094089#comment-16094089
 ] 

Anthony Grasso edited comment on CASSANDRA-9988 at 7/20/17 5:37 AM:


[~jay.zhuang] two other things I noticed as far as the code goes
* The patch needs to be updated so that it applies cleanly to _trunk_.
* Some minor coding style updates are needed. I have left comments on the 
[9988-trunk-onecommit-update2 | 
https://github.com/cooldoger/cassandra/commit/c5003812327d4475a7fb29f11686c13eeb50d693]
 branch.

Let me know if you need a hand with either of these. I am happy to make those 
changes as well.


was (Author: anthony grasso):
[~jay.zhuang] one other thing I noticed is that the patch needs to be updated 
so that it applies cleanly to _trunk_. Let me know if you need a hand with 
that. I am happy to make that change as well.

> Introduce leaf-only iterator
> 
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-trunk-new.txt, 9988-trunk-new-update.txt, 
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.






[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075925#comment-16075925
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/20/17 5:36 AM:
---

| branch | unit | dtest |
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | running | running |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.11] | running | running |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-3.0] | running | running |
| [2.2|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526-2.2] | running | running |


When the node has no local ranges but has joined the token ring, cleanup will 
remove all of the base table's local sstables.


was (Author: jasonstack):
| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13526] | 
[dtest-source|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13526]
 |
| [unit|https://circleci.com/gh/jasonstack/cassandra/106] | dtest: 
{{cql_tests.py:SlowQueryTester.local_query_test}}{{cql_tests.py:SlowQueryTester.remote_query_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13592]
{{bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test}}[known|https://issues.apache.org/jira/browse/CASSANDRA-13576]
 |

when no local range && node has joined token ring,  clean up will remove all 
base local sstables.  







[jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Anthony Grasso (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092832#comment-16092832
 ] 

Anthony Grasso edited comment on CASSANDRA-9988 at 7/20/17 5:33 AM:


Have started reviewing the code.

Ran the partition unit tests on the branch
{noformat}
ant test 
-Dtest.name=org.apache.cassandra.db.partition.PartitionImplementationTest
ant test -Dtest.name=org.apache.cassandra.db.partition.PartitionUpdateTest
{noformat}

So far so good.


was (Author: anthony grasso):
Have reviewed the code.

Ran the Microbench tests
{noformat}
ant test -Dtest.name=org.apache.cassandra.test.microbench.CompactionBench
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.BTreeSearchIteratorBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.DirectorySizerBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.FastThreadExecutor
ant test -Dtest.name=org.apache.cassandra.test.microbench.FastThreadLocalBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.MutationBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.OutputStreamBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.OutputStreamBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.OutputStreamBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.PendingRangesBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.ReadWriteTest
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.StreamingHistogramBench
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.PartitionImplementationTest
{noformat}

Ran the partition unit tests
{noformat}
ant test 
-Dtest.name=org.apache.cassandra.db.partition.PartitionImplementationTest
ant test -Dtest.name=org.apache.cassandra.db.partition.PartitionUpdateTest
{noformat}

Changes look good to me.







[jira] [Commented] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094195#comment-16094195
 ] 

Kurt Greaves commented on CASSANDRA-11500:
--

[~jasonstack] do you have a WIP branch you can link here?







[jira] [Commented] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094175#comment-16094175
 ] 

Jeff Jirsa commented on CASSANDRA-13142:


{quote} tests will need some serious thought{quote}

Byteman is your friend here - you can create a Byteman rule to pause the 
compaction after starting it, and then test how upgradesstables interacts with 
that (running but not progressing) compaction task.
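For example, a Byteman rule along these lines could hold a compaction task at 
its entry point until the test signals it to continue (the class and method 
names here are illustrative and should be checked against the version under 
test):

```
RULE pause compaction for upgradesstables test
CLASS org.apache.cassandra.db.compaction.CompactionTask
METHOD runMayThrow
AT ENTRY
IF TRUE
DO waitFor("resume-compaction")
ENDRULE
```

The test would then start a compaction, run upgradesstables against the paused 
task, and use Byteman's signalWake built-in (e.g. from a companion rule) to 
let the compaction finish.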


> Upgradesstables cancels compactions unnecessarily
> -
>
> Key: CASSANDRA-13142
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13142
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
> Attachments: 13142-v1.patch
>
>
> Since at least 1.2 upgradesstables will cancel any compactions bar 
> validations when run. This was originally determined as a non-issue in 
> CASSANDRA-3430 however can be quite annoying (especially with STCS) as a 
> compaction will output the new version anyway. Furthermore, as per 
> CASSANDRA-12243 it also stops things like view builds and I assume secondary 
> index builds as well which is not ideal.
> We should avoid cancelling compactions unnecessarily.






[jira] [Commented] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily

2017-07-19 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094165#comment-16094165
 ] 

Kurt Greaves commented on CASSANDRA-13142:
--

[~krummas] Did you see my question above? Also, updated link to 
[branch|https://github.com/apache/cassandra/compare/trunk...kgreav:13142?expand=1]
 (didn't mean to PR the first time - just dumb)

If this approach even works, tests will need some serious thought, but I'd 
rather make sense of the code before jumping into that.







[jira] [Comment Edited] (CASSANDRA-13142) Upgradesstables cancels compactions unnecessarily

2017-07-19 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002136#comment-16002136
 ] 

Kurt Greaves edited comment on CASSANDRA-13142 at 7/20/17 4:38 AM:
---

First go at 
https://github.com/apache/cassandra/compare/trunk...kgreav:13142?expand=1

I've attached the patch here as well.
The patch is against 2.2; it doesn't apply cleanly to >=3.0, but I'm happy to 
fix that once it's ready for commit.

I just wrote a unit test that seems to work reliably; however, it only tests 
the interrupt method. It could be made more extensive if deemed necessary, but 
I wanted to see if anyone had better ideas on testing first.


was (Author: kurtg):
First go at https://github.com/apache/cassandra/pull/110/files

I've attached the patch here as well.
Patch is against 2.2. doesn't apply cleanly to >=3.0 but happy to fix that once 
ready for commit.

I just wrote a unit test that seems to work reliably, however it only tests the 
interrupt method. It could be made more extensive if deemed necessary but 
wanted to see if anyone had any better ideas on testing first.







[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094160#comment-16094160
 ] 

Jeff Jirsa commented on CASSANDRA-13701:


Should be blocked by CASSANDRA-13348


> Lower default num_tokens
> 
>
> Key: CASSANDRA-13701
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> For reasons highlighted in CASSANDRA-7032, the high default number of vnodes 
> is not necessary. It is very expensive for operational processes and for 
> scanning. This has come up a lot, and it is now standard practice within the 
> community to reduce num_tokens. We should just lower the default.






[jira] [Updated] (CASSANDRA-12173) Materialized View may turn on TRACING

2017-07-19 Thread Kurt Greaves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-12173:
-
Priority: Minor  (was: Major)

> Materialized View may turn on TRACING
> -
>
> Key: CASSANDRA-12173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12173
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Hiroshi Usami
>Priority: Minor
>
> We observed this in our test cluster (C* 3.0.6), even though TRACING was 
> apparently OFF.
> After creating a Materialized View, the write count jumped up to 20K from 
> 5K, and the ViewWrite count rose to 10K.
> This is supposed to be done by the MV, but some nodes, which had 14,000+ 
> SSTables in the system_traces directory, went down within half a day because 
> they ran out of file descriptors.
> {code}
> Counting by: find /var/lib/cassandra/data/system_traces/ -name "*-Data.db"|wc 
> -l
>   node01: 0
>   node02: 3
>   node03: 1
>   node04: 0
>   node05: 0
>   node06: 0
>   node07: 2
>   node08: 0
>   node09: 0
>   node10: 0
>   node11: 2
>   node12: 2
>   node13: 1
>   node14: 7
>   node15: 1
>   node16: 5
>   node17: 0
>   node18: 0
>   node19: 0
>   node20: 0
>   node21: 1
>   node22: 0
>   node23: 2
>   node24: 14420
>   node25: 0
>   node26: 2
>   node27: 0
>   node28: 1
>   node29: 1
>   node30: 2
>   node31: 1
>   node32: 0
>   node33: 0
>   node34: 0
>   node35: 14371
>   node36: 0
>   node37: 1
>   node38: 0
>   node39: 0
>   node40: 1
> {code}
> In node24, the sstabledump of the oldest SSTable in system_traces/events 
> directory starts with:
> {code}
> [
>   {
> "partition" : {
>   "key" : [ "e07851d0-4421-11e6-abd7-59d7f275ba79" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 30,
> "clustering" : [ "e07878e0-4421-11e6-abd7-59d7f275ba79" ],
> "liveness_info" : { "tstamp" : "2016-07-07T09:04:57.197Z", "ttl" : 
> 86400, "expires_at" : "2016-07-08T09:04:57Z", "expired" : true },
> "cells" : [
>   { "name" : "activity", "value" : "Parsing CREATE MATERIALIZED VIEW
> ...
> {code}
> So this could be where TRACING was implicitly turned ON. In node35, the 
> oldest SSTable also starts with "Parsing CREATE MATERIALIZED VIEW".






[jira] [Resolved] (CASSANDRA-12173) Materialized View may turn on TRACING

2017-07-19 Thread Kurt Greaves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves resolved CASSANDRA-12173.
--
Resolution: Cannot Reproduce







[jira] [Commented] (CASSANDRA-12173) Materialized View may turn on TRACING

2017-07-19 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094136#comment-16094136
 ] 

Kurt Greaves commented on CASSANDRA-12173:
--

OK, thanks. I'll close this ticket in that case, but if it happens again and 
you can find out more info, please re-open it or create a new ticket.

> Materialized View may turn on TRACING
> -
>
> Key: CASSANDRA-12173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12173
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Hiroshi Usami
>
> We observed this in our test cluster (C* 3.0.6), but TRACING was apparently OFF.
> After creating Materialized View, the Write count jumped up to 20K from 5K, 
> and the ViewWrite rose up to 10K.
> This is supposed to be done by MV, but some nodes which had 14,000+ SSTables 
> in the system_traces directory went down in a half day, because of running 
> out of file descriptors.
> {code}
> Counting by: find /var/lib/cassandra/data/system_traces/ -name "*-Data.db"|wc 
> -l
>   node01: 0
>   node02: 3
>   node03: 1
>   node04: 0
>   node05: 0
>   node06: 0
>   node07: 2
>   node08: 0
>   node09: 0
>   node10: 0
>   node11: 2
>   node12: 2
>   node13: 1
>   node14: 7
>   node15: 1
>   node16: 5
>   node17: 0
>   node18: 0
>   node19: 0
>   node20: 0
>   node21: 1
>   node22: 0
>   node23: 2
>   node24: 14420
>   node25: 0
>   node26: 2
>   node27: 0
>   node28: 1
>   node29: 1
>   node30: 2
>   node31: 1
>   node32: 0
>   node33: 0
>   node34: 0
>   node35: 14371
>   node36: 0
>   node37: 1
>   node38: 0
>   node39: 0
>   node40: 1
> {code}
> In node24, the sstabledump of the oldest SSTable in system_traces/events 
> directory starts with:
> {code}
> [
>   {
> "partition" : {
>   "key" : [ "e07851d0-4421-11e6-abd7-59d7f275ba79" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 30,
> "clustering" : [ "e07878e0-4421-11e6-abd7-59d7f275ba79" ],
> "liveness_info" : { "tstamp" : "2016-07-07T09:04:57.197Z", "ttl" : 
> 86400, "expires_at" : "2016-07-08T09:04:57Z", "expired" : true },
> "cells" : [
>   { "name" : "activity", "value" : "Parsing CREATE MATERIALIZED VIEW
> ...
> {code}
> So this could be the beginning of TRACING being turned on implicitly. In 
> node35, the oldest one also starts with "Parsing CREATE MATERIALIZED VIEW".






[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094118#comment-16094118
 ] 

ASF GitHub Bot commented on CASSANDRA-13526:


GitHub user jasonstack opened a pull request:

https://github.com/apache/cassandra-dtest/pull/1

CASSANDRA-13526: nodetool cleanup on KS with no replicas should remov…

JIRA: https://issues.apache.org/jira/browse/CASSANDRA-13526   pending for 
2.2/3.0/3.11

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jasonstack/cassandra-dtest-1 CASSANDRA-13526

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra-dtest/pull/1.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1


commit 3c8877c0fa3eb998ed2ee9945ebb8d43687e65fa
Author: Zhao Yang 
Date:   2017-07-20T03:18:18Z

CASSANDRA-13526: nodetool cleanup on KS with no replicas should remove old 
data, not silently complete




> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]
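The early-exit behavior described above, and the proposed fix, can be sketched 
as a toy model; the names and data representation are illustrative, not the 
actual {{CompactionManager}} code (sstables are modeled by the token of a 
single partition):

```python
def surviving_sstables(owned_ranges, sstable_tokens):
    # owned_ranges: list of (lo, hi) token ranges this node owns for the keyspace
    # sstable_tokens: each sstable modeled as the token of the data it holds
    if not owned_ranges:
        # Proposed behavior: the node owns no ranges for this keyspace,
        # so nothing should survive cleanup. (The current code instead
        # returns early here and silently keeps all the old data.)
        return []
    # Otherwise keep only data that falls inside an owned range.
    return [t for t in sstable_tokens
            if any(lo <= t < hi for lo, hi in owned_ranges)]

# A node with no replicas for this keyspace drops everything:
assert surviving_sstables([], [3, 7, 42]) == []
# A node owning (0, 10) keeps only data inside that range:
assert surviving_sstables([(0, 10)], [3, 7, 42]) == [3, 7]
```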






[jira] [Commented] (CASSANDRA-13561) Purge TTL on expiration

2017-07-19 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094112#comment-16094112
 ] 

Kurt Greaves commented on CASSANDRA-13561:
--

I disregarded HH because hinted handoff cannot be relied on to provide 
consistency guarantees. The scenario is still bad if the node is down for 
longer than the HH window, or if HH fails for some reason.

Also note that GCGS=0 disables hinted handoff for the table (at least last time 
I checked this hadn't changed). So it's not exactly the same.

I see the point that this could be useful for cases where a default TTL is set, 
however even with a default TTL you can still update/remove the TTL of columns. 
This means the risk is only really mitigated where you set a default TTL, and 
you never do anything to alter that TTL. 

The best you can currently do in this case is to set GCGS to the hinted 
handoff window: you don't sacrifice consistency, and you only keep the expired 
cells around for a minimum of 3 hours. This case is really perfectly fine when 
you are using TWCS/DTCS, as the SSTables should expire efficiently and it's 
unlikely you'd be querying expired data anyway.
The only case I can think of where you would really get a benefit without any 
risk is when using the same write strategy (default TTL/always TTL) but on a 
table that doesn't work with TWCS/DTCS, so you use STCS/LCS, and you also have 
to keep GCGS high because you also do manual deletes.

Your proposal would make it so that you can remove data a little bit faster if 
it compacts within GCGS. I'm a bit skeptical that that's actually necessary, 
especially with the introduction of {{provide_overlapping_tombstones}} in 3.10, 
which should allow much more efficient removal of tombstones.
Really, I'd be very surprised if you're generating so many tombstones per 
partition/PK within GCGS plus time-to-compaction that they cause significant 
latency. I'd be interested to see some metrics surrounding this, and to confirm 
first that the other options don't perform well enough. Maybe you could give us 
an example of a use case where this gave a benefit?

I feel that this is a sufficiently dangerous option, with hard-to-understand 
implications, that we should need pretty good justification before making it 
readily available to configure at the table level. That just screams 
impending-shoot-yourself-in-the-foot. Maybe we could put some other safety net 
around it (e.g. a property passed into C*) that doesn't allow changing it 
unless you start C* with that option set, but yeah, let's figure out some 
concrete benefits first.
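The GCGS-equals-hint-window trade-off above can be sketched as a toy model; the 
purge rule is simplified (it ignores the overlapping-SSTable check) and the 
names are illustrative:

```python
HINT_WINDOW_SECONDS = 3 * 3600  # default hint window is 3 hours

def purgeable(local_deletion_time, gc_grace_seconds, now):
    # Simplified rule: a tombstone or expired cell may be dropped at
    # compaction once gc_grace_seconds have passed since its
    # localDeletionTime.
    return now >= local_deletion_time + gc_grace_seconds

# With GCGS set to the hint window, expired cells stay around for at
# least 3 hours after expiry, but hinted handoff is not disabled:
assert not purgeable(0, HINT_WINDOW_SECONDS, HINT_WINDOW_SECONDS - 1)
assert purgeable(0, HINT_WINDOW_SECONDS, HINT_WINDOW_SECONDS)
```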


> Purge TTL on expiration
> ---
>
> Key: CASSANDRA-13561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13561
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 4.0
>
>
> Tables with mostly TTL columns tend to suffer from high droppable tombstone 
> ratio, which results in higher read latency, cpu utilization, and disk usage. 
> Expired TTL data become tombstones, and the nature of purging tombstones 
> during compaction (due to checking for overlapping SSTables) make them 
> susceptible to surviving much longer than expected. A table option to purge 
> TTL on expiration would address this issue, by preventing them from becoming 
> tombstones. A boolean purge_ttl_on_expiration table setting would allow users 
> to easily turn the feature on or off. 
> Being more aggressive with gc_grace could also address the problem of long 
> lasting tombstones, but that would affect tombstones from deletes as well. 
> Even if a purged [expired] cell is revived via repair from a node that hasn't 
> yet compacted away the cell, it would be revived as an expiring cell with the 
> same localDeletionTime, so reads should properly handle them. As well, it 
> would be purged in the next compaction. 
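The revival semantics in the description above can be sketched as a simplified 
model, not Cassandra's actual read path:

```python
def cell_is_live(local_deletion_time, now):
    # An expiring cell is considered live only before its localDeletionTime.
    return now < local_deletion_time

# A purged expired cell revived via repair keeps its original
# localDeletionTime, so a read at any later time still sees it as dead,
# and the next compaction can purge it again.
revived_ldt = 1_000
assert not cell_is_live(revived_ldt, now=2_000)
assert cell_is_live(revived_ldt, now=500)
```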






[jira] [Commented] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Anthony Grasso (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094089#comment-16094089
 ] 

Anthony Grasso commented on CASSANDRA-9988:
---

[~jay.zhuang] one other thing I noticed is that the patch needs to be updated 
so that it applies cleanly to _trunk_. Let me know if you need a hand with 
that. I am happy to make that change as well.

> Introduce leaf-only iterator
> 
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-trunk-new.txt, 9988-trunk-new-update.txt, 
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.






[jira] [Commented] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Anthony Grasso (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094069#comment-16094069
 ] 

Anthony Grasso commented on CASSANDRA-9988:
---

[~benedict], that is a very good point about falling out of the CPU cache. As 
you suggest, we should modify the generation of the objects such that they 
include at the very least a UUID sequence and possibly a string sequence. In 
addition, we should add a fourth B-Tree test, {{btreeExtraLarge}}, which has 
100k elements in it.

Before committing, it would be good to make the above changes and then rerun 
the JMH benchmarks. [~jay.zhuang], if you are busy, I am happy to make the 
changes and re-run the benchmarks.

> Introduce leaf-only iterator
> 
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-trunk-new.txt, 9988-trunk-new-update.txt, 
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.






[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens

2017-07-19 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094039#comment-16094039
 ] 

Nate McCall commented on CASSANDRA-13701:
-

[~cnlwsu] Thanks, and I agree with you, but we'll probably get some opinions on 
this. Regardless, can you add in:
- an update to NEWS.txt with a couple of sentences summarizing the change, with 
ticket numbers
- a fix to the comment in {{cassandra.yaml}} right above {{num_tokens}} to 
point to http://cassandra.apache.org/doc/latest/operating/index.html instead of 
that dead wiki page 

> Lower default num_tokens
> 
>
> Key: CASSANDRA-13701
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
> necessary. It is very expensive for operations processes and scanning. It's 
> come up a lot, and it's now pretty standard and well known within the 
> community to always reduce num_tokens. We should just lower the default.






[jira] [Comment Edited] (CASSANDRA-12173) Materialized View may turn on TRACING

2017-07-19 Thread Hiroshi Usami (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092752#comment-16092752
 ] 

Hiroshi Usami edited comment on CASSANDRA-12173 at 7/20/17 12:23 AM:
-

I checked the conversation log of July 2016 again to see if someone turned on 
TRACING, but I couldn't find any trace of that kind of operation.
And I agree that closing this would be rational if we cannot proceed any 
further from here...


was (Author: hiusami):
I checked the conversation log of Jul 2016 again if someone turned on TRACING, 
but I couldn't discover any track of that 
 kind of operation.
And I agree that enclosing this would be rational if we cannot proceed any more 
from here...

> Materialized View may turn on TRACING
> -
>
> Key: CASSANDRA-12173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12173
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Hiroshi Usami
>
> We observed this in our test cluster (C* 3.0.6), but TRACING was apparently OFF.
> After creating Materialized View, the Write count jumped up to 20K from 5K, 
> and the ViewWrite rose up to 10K.
> This is supposed to be done by MV, but some nodes which had 14,000+ SSTables 
> in the system_traces directory went down in a half day, because of running 
> out of file descriptors.
> {code}
> Counting by: find /var/lib/cassandra/data/system_traces/ -name "*-Data.db"|wc 
> -l
>   node01: 0
>   node02: 3
>   node03: 1
>   node04: 0
>   node05: 0
>   node06: 0
>   node07: 2
>   node08: 0
>   node09: 0
>   node10: 0
>   node11: 2
>   node12: 2
>   node13: 1
>   node14: 7
>   node15: 1
>   node16: 5
>   node17: 0
>   node18: 0
>   node19: 0
>   node20: 0
>   node21: 1
>   node22: 0
>   node23: 2
>   node24: 14420
>   node25: 0
>   node26: 2
>   node27: 0
>   node28: 1
>   node29: 1
>   node30: 2
>   node31: 1
>   node32: 0
>   node33: 0
>   node34: 0
>   node35: 14371
>   node36: 0
>   node37: 1
>   node38: 0
>   node39: 0
>   node40: 1
> {code}
> In node24, the sstabledump of the oldest SSTable in system_traces/events 
> directory starts with:
> {code}
> [
>   {
> "partition" : {
>   "key" : [ "e07851d0-4421-11e6-abd7-59d7f275ba79" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 30,
> "clustering" : [ "e07878e0-4421-11e6-abd7-59d7f275ba79" ],
> "liveness_info" : { "tstamp" : "2016-07-07T09:04:57.197Z", "ttl" : 
> 86400, "expires_at" : "2016-07-08T09:04:57Z", "expired" : true },
> "cells" : [
>   { "name" : "activity", "value" : "Parsing CREATE MATERIALIZED VIEW
> ...
> {code}
> So this could be the beginning of TRACING being turned on implicitly. In 
> node35, the oldest one also starts with "Parsing CREATE MATERIALIZED VIEW".






[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens

2017-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093937#comment-16093937
 ] 

ASF GitHub Bot commented on CASSANDRA-13701:


GitHub user clohfink opened a pull request:

https://github.com/apache/cassandra/pull/132

Reduce default num_tokens for CASSANDRA-13701



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/clohfink/cassandra 13701

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit 87039ed693aff57d0eb59a5e55f2f65bf03fef33
Author: Chris Lohfink 
Date:   2017-07-19T23:15:36Z

Reduce default num_tokens for CASSANDRA-13701




> Lower default num_tokens
> 
>
> Key: CASSANDRA-13701
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
> necessary. It is very expensive for operations processes and scanning. It's 
> come up a lot, and it's now pretty standard and well known within the 
> community to always reduce num_tokens. We should just lower the default.






[jira] [Updated] (CASSANDRA-13701) Lower default num_tokens

2017-07-19 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-13701:
--
Status: Patch Available  (was: Open)

> Lower default num_tokens
> 
>
> Key: CASSANDRA-13701
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>
> For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
> necessary. It is very expensive for operations processes and scanning. It's 
> come up a lot, and it's now pretty standard and well known within the 
> community to always reduce num_tokens. We should just lower the default.






[jira] [Created] (CASSANDRA-13701) Lower default num_tokens

2017-07-19 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-13701:
-

 Summary: Lower default num_tokens
 Key: CASSANDRA-13701
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Lohfink
Assignee: Chris Lohfink
Priority: Minor


For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
necessary. It is very expensive for operations processes and scanning. It's 
come up a lot, and it's now pretty standard and well known within the community 
to always reduce num_tokens. We should just lower the default.






[jira] [Comment Edited] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093870#comment-16093870
 ] 

Romain Hardouin edited comment on CASSANDRA-13699 at 7/19/17 10:01 PM:
---

Thanks for the review. I fixed the coding style while you wrote your comment, 
so it should be correct. I triggered a build here: 
https://circleci.com/gh/rhardouin/cassandra/3


was (Author: rha):
Thanks for the review. I fixed coding style while you wrote your comment, it 
should be correct. I trigged a build here 
https://circleci.com/gh/rhardouin/cassandra/3

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Commented] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093870#comment-16093870
 ] 

Romain Hardouin commented on CASSANDRA-13699:
-

Thanks for the review. I fixed the coding style while you wrote your comment, 
so it should be correct. I triggered a build here: 
https://circleci.com/gh/rhardouin/cassandra/3

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes

2017-07-19 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-13700:
--
Description: 
In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
from the corresponding {{EndpointState}} to the {{EndpointState}} to send. When 
we're getting state for ourselves, this means that we add a reference to the 
local {{HeartBeatState}}. Then, once we've built a message (in either the Syn 
or Ack handler), we send it through the {{MessagingService}}. In the case that 
the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may run 
before serialization of the Syn or Ack. This means that when the {{GossipTask}} 
acquires the gossip {{taskLock}}, it may increment the {{HeartBeatState}} 
version of the local node as stored in the endpoint state map. Then, when we 
finally serialize the Syn or Ack, we'll follow the reference to the 
{{HeartBeatState}} and serialize it with a higher version than we saw when 
constructing the Ack or Ack2.

Consider the case where we see {{HeartBeatState}} with version 4 when 
constructing an Ack and send it through the {{MessagingService}}. Then, we add 
some piece of state with version 5 to our local {{EndpointState}}. If 
{{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
the {{MessageOut}} containing the Ack is serialized, the node receiving the Ack 
will believe it is current to version 6, despite the fact that it has never 
received a message containing the {{ApplicationState}} tagged with version 5.

I've reproduced this in several versions; so far, I believe it is possible in 
all versions.

  was:
In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
from the corresponding {{EndpointState}} to the {{EndpointState}} to send. When 
we're getting state for ourselves, this means that we add a reference to the 
local {{HeartBeatState}}. Then, once we've built a message (in either the Syn 
or Ack handler), we send it through the {{MessagingService}}. In the case that 
the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may run 
before serialization of the Syn or Ack. This means that when the {{GossipTask}} 
acquires the gossip {{taskLock}}, it may increment the {{HeartBeatState}} 
version of the local node as stored in the endpoint state map. Then, when we 
finally serialize the Syn or Ack, we'll follow the reference to the 
{{HeartBeatState}} and serialize it with a higher version than we saw when 
constructing the Ack or Ack2.

Consider the case where we see {{HeartBeatState}} with version 4 when 
constructing an Ack and send it through the {{Messaging Service}}. Then, we add 
some piece of state with version 5 to our local {{EndpointState}}. If 
{{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
the {{MessageOut}} containing the Ack is serialized, the node receiving the Ack 
will believe it is current to version 6, despite the fact that it has never 
received a message containing the {{ApplicationState}} tagged with version 5.

I've reproduced in this in several versions; so far, I believe this is possible 
in all versions.


> Heartbeats can cause gossip information to go permanently missing on certain 
> nodes
> --
>
> Key: CASSANDRA-13700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13700
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Joel Knighton
>Assignee: Joel Knighton
>Priority: Critical
>
> In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
> from the corresponding {{EndpointState}} to the {{EndpointState}} to send. 
> When we're getting state for ourselves, this means that we add a reference to 
> the local {{HeartBeatState}}. Then, once we've built a message (in either the 
> Syn or Ack handler), we send it through the {{MessagingService}}. In the case 
> that the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may 
> run before serialization of the Syn or Ack. This means that when the 
> {{GossipTask}} acquires the gossip {{taskLock}}, it may increment the 
> {{HeartBeatState}} version of the local node as stored in the endpoint state 
> map. Then, when we finally serialize the Syn or Ack, we'll follow the 
> reference to the {{HeartBeatState}} and serialize it with a higher version 
> than we saw when constructing the Ack or Ack2.
> Consider the case where we see {{HeartBeatState}} with version 4 when 
> constructing an Ack and send it through the {{MessagingService}}. Then, we 
> add some piece of state with version 5 to our local {{EndpointState}}. If 
> {{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
> the 

[jira] [Created] (CASSANDRA-13700) Heartbeats can cause gossip information to go permanently missing on certain nodes

2017-07-19 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-13700:
-

 Summary: Heartbeats can cause gossip information to go permanently 
missing on certain nodes
 Key: CASSANDRA-13700
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13700
 Project: Cassandra
  Issue Type: Bug
  Components: Distributed Metadata
Reporter: Joel Knighton
Assignee: Joel Knighton
Priority: Critical


In {{Gossiper.getStateForVersionBiggerThan}}, we add the {{HeartBeatState}} 
from the corresponding {{EndpointState}} to the {{EndpointState}} to send. When 
we're getting state for ourselves, this means that we add a reference to the 
local {{HeartBeatState}}. Then, once we've built a message (in either the Syn 
or Ack handler), we send it through the {{MessagingService}}. In the case that 
the {{MessagingService}} is sufficiently slow, the {{GossipTask}} may run 
before serialization of the Syn or Ack. This means that when the {{GossipTask}} 
acquires the gossip {{taskLock}}, it may increment the {{HeartBeatState}} 
version of the local node as stored in the endpoint state map. Then, when we 
finally serialize the Syn or Ack, we'll follow the reference to the 
{{HeartBeatState}} and serialize it with a higher version than we saw when 
constructing the Ack or Ack2.

Consider the case where we see {{HeartBeatState}} with version 4 when 
constructing an Ack and send it through the {{Messaging Service}}. Then, we add 
some piece of state with version 5 to our local {{EndpointState}}. If 
{{GossipTask}} runs and increases the {{HeartBeatState}} version to 6 before 
the {{MessageOut}} containing the Ack is serialized, the node receiving the Ack 
will believe it is current to version 6, despite the fact that it has never 
received a message containing the {{ApplicationState}} tagged with version 5.

I've reproduced this in several versions; so far, I believe it is possible in 
all versions.
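The race described above can be sketched as a toy model; the names mirror the 
description, but this is not the actual gossip code:

```python
class HeartBeatState:
    def __init__(self, version):
        self.version = version

local_heartbeat = HeartBeatState(version=4)

# Building the Ack stores a *reference* to the live HeartBeatState,
# not a snapshot of it.
ack = {"heartbeat": local_heartbeat}

# Some piece of ApplicationState is then added locally with version 5;
# the Ack we already built does not contain it.
application_state_versions_in_ack = []

# GossipTask runs before the Ack is serialized and bumps the version.
local_heartbeat.version = 6

# Serialization follows the reference and writes version 6.
serialized_version = ack["heartbeat"].version
assert serialized_version == 6

# The receiver now believes it is current to version 6, yet it never
# received the state tagged with version 5.
assert 5 not in application_state_versions_in_ack
```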






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-13699:

Flags:   (was: Patch)

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13699:
---
Reviewer: Jeff Jirsa
  Status: Patch Available  (was: Open)

Looks reasonable. Can you fix the brace style to make sure they're on a new 
line?

We should also really run tests, make sure everything compiles and there are no 
surprises. Do you have a circleCI account to initiate them, or would you like 
me to run it?


> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-13699:

Attachment: 13699-trunk.txt

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Assigned] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-13699:
--

Assignee: Romain Hardouin

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-13699:

Attachment: (was: 13699-trunk.txt)

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-13699:

Attachment: 13699-trunk.txt

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Updated] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-13699:

Attachment: (was: 
0001-Allow-to-set-batch_size_warn_threshold_in_kb-via-JMX.patch)

> Allow to set batch_size_warn_threshold_in_kb via JMX
> 
>
> Key: CASSANDRA-13699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 13699-trunk.txt
>
>
> We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
> {{batch_size_warn_threshold_in_kb}}. 
> The patch allows setting it dynamically and adds an INFO log for both 
> thresholds.






[jira] [Commented] (CASSANDRA-12694) PAXOS Update Corrupted empty row exception

2017-07-19 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093748#comment-16093748
 ] 

Nimi Wariboko Jr. commented on CASSANDRA-12694:
---

Here's the stacktrace I saw with 3.10:

{code}
ERROR [ReadRepairStage:9] 2017-07-19 18:51:32,607 CassandraDaemon.java:229 - 
Exception in thread Thread[ReadRepairStage:9,5,main]
java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered 
partition
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:178)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:270)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:98) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:203) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:87)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:233)
 ~[apache-cassandra-3.10.jar:3.10]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_91]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91]
Caused by: java.io.IOException: Corrupt empty row found in unfiltered partition
at 
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:445)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
 ~[apache-cassandra-3.10.jar:3.10]
... 12 common frames omitted
{code}

However, I upgraded to 3.11, and the issue seems to have gone away.
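
For readers tracing the failure path: the exception surfaces while ReadRepairStage reduces each replica response to a digest and compares them; in Cassandra the real code digests serialized rows via UnfilteredRowIterators.digest. The following is only a rough analogy of that compare step, with stand-in byte strings instead of serialized partitions:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Illustrative sketch of the digest-compare step in the ReadRepairStage trace:
// each replica response is reduced to a hash, and any mismatch forces a
// full-data read repair round. The byte strings stand in for serialized
// partitions; this is an analogy, not Cassandra's actual code path.
public class DigestCompare {
    static byte[] digest(byte[] serializedPartition) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("MD5").digest(serializedPartition);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] replica1 = "test_id=test1,last_updated=1474494363669".getBytes();
        byte[] replica2 = "test_id=test1,last_updated=null".getBytes(); // divergent replica
        boolean match = Arrays.equals(digest(replica1), digest(replica2));
        System.out.println(match ? "digests match" : "digest mismatch -> read repair");
    }
}
```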

> PAXOS Update Corrupted empty row exception
> --
>
> Key: CASSANDRA-12694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
> Environment: 3 node cluster using RF=3 running on cassandra 3.7
>Reporter: Cameron Zemek
>Assignee: Alex Petrov
> Fix For: 3.0.11, 3.10
>
>
> {noformat}
> cqlsh> create table test.test (test_id TEXT, last_updated TIMESTAMP, 
> message_id TEXT, PRIMARY KEY(test_id));
> update test.test set last_updated = 1474494363669 where test_id = 'test1' if 
> message_id = null;
> {noformat}
> Then run nodetool flush on all 3 nodes.
> {noformat}
> cqlsh> update test.test set last_updated = 1474494363669 where test_id = 
> 'test1' if message_id = null;
> ServerError: 
> {noformat}
> From cassandra log
> {noformat}
> ERROR [SharedPool-Worker-1] 2016-09-23 12:09:13,179 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0x7a22599e, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:58297]
> java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered 
> partition
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:224)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:212)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:125)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:249)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:87) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:192)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:80) 
> ~[main/:na]
> at 
> 

[jira] [Comment Edited] (CASSANDRA-12694) PAXOS Update Corrupted empty row exception

2017-07-19 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093748#comment-16093748
 ] 

Nimi Wariboko Jr. edited comment on CASSANDRA-12694 at 7/19/17 8:35 PM:


Here's the stacktrace I saw with 3.10:

{code}
ERROR [ReadRepairStage:9] 2017-07-19 18:51:32,607 CassandraDaemon.java:229 - 
Exception in thread Thread[ReadRepairStage:9,5,main]
java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered 
partition
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:178)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:270)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:98) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:203) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:87)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:233)
 ~[apache-cassandra-3.10.jar:3.10]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_91]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
 ~[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91]
Caused by: java.io.IOException: Corrupt empty row found in unfiltered partition
at 
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:445)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
 ~[apache-cassandra-3.10.jar:3.10]
... 12 common frames omitted
{code}

However, I upgraded to 3.11, and the issue seems to have gone away.


[jira] [Commented] (CASSANDRA-12694) PAXOS Update Corrupted empty row exception

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093747#comment-16093747
 ] 

Jeff Jirsa commented on CASSANDRA-12694:


It appears to be 
[here|https://github.com/apache/cassandra/commit/1e067746e432dc0a450ad111a8ec545011bb5bc7], 
which seems to be in 3.0.10, 3.10, and 3.11.0.

The commit is still missing a CHANGES entry, and the JIRA number is typo'd in 
it, but it appears to be in 3.10.

If you're still seeing it in 3.10, can you please paste the full stack trace? 




> PAXOS Update Corrupted empty row exception
> --
>
> Key: CASSANDRA-12694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
> Environment: 3 node cluster using RF=3 running on cassandra 3.7
>Reporter: Cameron Zemek
>Assignee: Alex Petrov
> Fix For: 3.0.11, 3.10
>
>
> {noformat}
> cqlsh> create table test.test (test_id TEXT, last_updated TIMESTAMP, 
> message_id TEXT, PRIMARY KEY(test_id));
> update test.test set last_updated = 1474494363669 where test_id = 'test1' if 
> message_id = null;
> {noformat}
> Then run nodetool flush on all 3 nodes.
> {noformat}
> cqlsh> update test.test set last_updated = 1474494363669 where test_id = 
> 'test1' if message_id = null;
> ServerError: 
> {noformat}
> From cassandra log
> {noformat}
> ERROR [SharedPool-Worker-1] 2016-09-23 12:09:13,179 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0x7a22599e, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:58297]
> java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered 
> partition
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:224)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:212)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:125)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:249)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:87) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:192)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:80) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:139) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1714)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1663) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1497) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1491) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:249) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:441)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:416)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) 
> ~[main/:na]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [main/:na]
> {noformat}




[jira] [Created] (CASSANDRA-13699) Allow to set batch_size_warn_threshold_in_kb via JMX

2017-07-19 Thread Romain Hardouin (JIRA)
Romain Hardouin created CASSANDRA-13699:
---

 Summary: Allow to set batch_size_warn_threshold_in_kb via JMX
 Key: CASSANDRA-13699
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13699
 Project: Cassandra
  Issue Type: Improvement
Reporter: Romain Hardouin
Priority: Minor
 Fix For: 4.x
 Attachments: 
0001-Allow-to-set-batch_size_warn_threshold_in_kb-via-JMX.patch

We can set {{batch_size_fail_threshold_in_kb}} via JMX but not 
{{batch_size_warn_threshold_in_kb}}. 

The patch allows setting it dynamically and adds an INFO log for both thresholds.
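
To make the proposal concrete, here is a minimal sketch of a runtime-settable warn threshold with an INFO-style log on update. The class name, method names, defaults, and the warn-below-fail validation are illustrative assumptions, not the attached patch or Cassandra's actual StorageServiceMBean API; in Cassandra the setter would be exposed on an MBean so JMX clients (nodetool, jconsole) can invoke it without a restart.

```java
// Illustrative sketch only: names, defaults, and validation are assumptions,
// not the actual Cassandra API or the attached patch.
public class BatchThresholds {
    // Assumed to mirror cassandra.yaml's shipped defaults (5 KB warn, 50 KB fail).
    private volatile int warnThresholdKB = 5;
    private volatile int failThresholdKB = 50;

    // JMX-style setter: validate, update, and log at INFO so operators can
    // audit runtime changes.
    public void setBatchSizeWarnThresholdInKB(int kb) {
        if (kb < 0 || kb > failThresholdKB)
            throw new IllegalArgumentException(
                "warn threshold must be in [0, " + failThresholdKB + "]");
        warnThresholdKB = kb;
        System.out.println("INFO  Updated batch_size_warn_threshold_in_kb to " + kb);
    }

    public int getBatchSizeWarnThresholdInKB() { return warnThresholdKB; }

    public static void main(String[] args) {
        BatchThresholds t = new BatchThresholds();
        t.setBatchSizeWarnThresholdInKB(10); // what a JMX client invocation would do
        System.out.println(t.getBatchSizeWarnThresholdInKB());
    }
}
```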






[jira] [Commented] (CASSANDRA-12694) PAXOS Update Corrupted empty row exception

2017-07-19 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093636#comment-16093636
 ] 

Nimi Wariboko Jr. commented on CASSANDRA-12694:
---

Was the issue resolved in 3.10, or 3.11? Do I have to do anything special to 
fix a table after upgrading to 3.10? I'm still hitting this error when trying 
to update a static column with LWT.

> PAXOS Update Corrupted empty row exception
> --
>
> Key: CASSANDRA-12694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
> Environment: 3 node cluster using RF=3 running on cassandra 3.7
>Reporter: Cameron Zemek
>Assignee: Alex Petrov
> Fix For: 3.0.11, 3.10
>
>
> {noformat}
> cqlsh> create table test.test (test_id TEXT, last_updated TIMESTAMP, 
> message_id TEXT, PRIMARY KEY(test_id));
> update test.test set last_updated = 1474494363669 where test_id = 'test1' if 
> message_id = null;
> {noformat}
> Then run nodetool flush on all 3 nodes.
> {noformat}
> cqlsh> update test.test set last_updated = 1474494363669 where test_id = 
> 'test1' if message_id = null;
> ServerError: 
> {noformat}
> From cassandra log
> {noformat}
> ERROR [SharedPool-Worker-1] 2016-09-23 12:09:13,179 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0x7a22599e, 
> L:/127.0.0.1:9042 - R:/127.0.0.1:58297]
> java.io.IOError: java.io.IOException: Corrupt empty row found in unfiltered 
> partition
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:224)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:212)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:125)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:249)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:87) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.ReadResponse$DataResponse.digest(ReadResponse.java:192)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:80) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:139) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1714)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1663) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1497) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.readOne(StorageProxy.java:1491) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:249) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:441)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:416)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208)
>  ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:239) 
> ~[main/:na]
> at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:224) 
> ~[main/:na]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>  ~[main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
>  [main/:na]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
>  [main/:na]
> {noformat}






[jira] [Commented] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093606#comment-16093606
 ] 

Jeff Jirsa commented on CASSANDRA-13694:


{{TimestampSerializer}} is used for a lot of things other than SSTableExport - 
changing its default seems like a big hammer?
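
The minute-level truncation in question is purely a formatting-pattern choice: the stored value keeps full precision, but rendering it with a minute-granularity pattern drops the rest. A small java.time illustration (the patterns here are hypothetical, not necessarily TimestampSerializer's actual pattern list):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Shows why a minute-precision pattern renders a stored value like
// 2017-07-14 14:52:39.919000+0000 as "2017-07-14 15:52+0100"-style output.
// Patterns are illustrative stand-ins for the serializer's format list.
public class TimestampPrecision {
    public static void main(String[] args) {
        Instant ts = Instant.parse("2017-07-14T14:52:39.919Z");
        DateTimeFormatter coarse =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mmX").withZone(ZoneOffset.UTC);
        DateTimeFormatter fine =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSX").withZone(ZoneOffset.UTC);
        System.out.println(coarse.format(ts)); // 2017-07-14 14:52Z
        System.out.println(fine.format(ts));   // 2017-07-14 14:52:39.919Z
    }
}
```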


> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch-available
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time | ack_time
> -++-+-
> 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+ | 2017-07-14 
> 14:52:39.919000+
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamps in the clustering key and the regular column are both 
> truncated to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Status: Ready to Commit  (was: Patch Available)

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time | ack_time
> -++-+-
> 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+ | 2017-07-14 
> 14:52:39.919000+
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamps in the clustering key and the regular column are both 
> truncated to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Labels: patch-available  (was: patch)
Status: Patch Available  (was: Open)

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch-available
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time | ack_time
> -++-+-
> 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+ | 2017-07-14 
> 14:52:39.919000+
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamps in the clustering key and the regular column are both 
> truncated to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Status: Open  (was: Ready to Commit)

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time                     | ack_time
> ---------+------------+---------------------------------+---------------------------------
>     1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093491#comment-16093491
 ] 

Jay Zhuang commented on CASSANDRA-13696:


{quote}
Question: does this happen in mixed version cluster or all the nodes actually 
have the same protocol version?
{quote}
All the nodes are on the same messagingVersion 
{{[VERSION_3014|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/net/MessagingService.java#L95]}}.
It can be reproduced with a new 3.0.14 cluster.

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
> ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the steps to reproduce on a 3-node cluster, RF=3:
> 1. stop node1
> 2. send some data with quorum (or one); it will generate hints files on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster: when it happens, almost all the nodes are down at 
> the same time, and we have to remove all the hints files (which contain the 
> dropped table) to bring the nodes back.
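Cassandra's real hints format is more involved, but the failure mode can be loosely simulated: in a checksummed record stream, any point where the reader's view of the payload no longer matches the stored digest is reported as whole-file corruption. Everything below (the toy length/payload/CRC layout, the function names) is illustrative, not the actual HintsReader implementation:

```python
import io
import struct
import zlib

def write_hint(buf: io.BytesIO, payload: bytes) -> None:
    # toy record layout: length | payload | crc32(payload)
    buf.write(struct.pack(">I", len(payload)))
    buf.write(payload)
    buf.write(struct.pack(">I", zlib.crc32(payload)))

def read_hint(buf: io.BytesIO) -> bytes:
    (length,) = struct.unpack(">I", buf.read(4))
    payload = buf.read(length)
    (stored_crc,) = struct.unpack(">I", buf.read(4))
    if zlib.crc32(payload) != stored_crc:
        raise IOError("Digest mismatch exception")
    return payload

buf = io.BytesIO()
write_hint(buf, b"mutation for a known table")
assert read_hint(io.BytesIO(buf.getvalue())) == b"mutation for a known table"

# Flip one payload bit: CRC32 detects every single-bit error, so the
# reader reports the file as corrupted -- analogous to what happens when
# an unknown-table hint leaves the stream misaligned with its digest.
raw = bytearray(buf.getvalue())
raw[7] ^= 0x01
try:
    read_hint(io.BytesIO(bytes(raw)))
    caught = None
except IOError as e:
    caught = str(e)
assert caught == "Digest mismatch exception"
```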






[jira] [Issue Comment Deleted] (CASSANDRA-13561) Purge TTL on expiration

2017-07-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13561:
---
Comment: was deleted

(was: I'm just trying to page this into my mind and also try to correlate with 
other recent tickets I've seen. This seems pretty close to what [~bdeggleston] 
touched on with CASSANDRA-13643 , though this is more aggressive (and more 
invasive, in the sense that it needs a new table property). Does Blake's 
addition to call purge on the tombstone generated have a similar effect here?


)

> Purge TTL on expiration
> ---
>
> Key: CASSANDRA-13561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13561
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 4.0
>
>
> Tables with mostly TTL columns tend to suffer from high droppable tombstone 
> ratio, which results in higher read latency, cpu utilization, and disk usage. 
> Expired TTL data become tombstones, and the nature of purging tombstones 
> during compaction (due to checking for overlapping SSTables) makes them 
> susceptible to surviving much longer than expected. A table option to purge 
> TTL on expiration would address this issue, by preventing them from becoming 
> tombstones. A boolean purge_ttl_on_expiration table setting would allow users 
> to easily turn the feature on or off. 
> Being more aggressive with gc_grace could also address the problem of long 
> lasting tombstones, but that would affect tombstones from deletes as well. 
> Even if a purged [expired] cell is revived via repair from a node that hasn't 
> yet compacted away the cell, it would be revived as an expiring cell with the 
> same localDeletionTime, so reads should properly handle them. As well, it 
> would be purged in the next compaction. 
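The proposed option boils down to one extra branch in the compaction purge decision. A toy model of that decision (all names hypothetical, not Cassandra's actual classes):

```python
class ExpiringCell:
    # toy cell: a value plus the second at which its TTL runs out
    def __init__(self, value, expires_at: int):
        self.value = value
        self.local_deletion_time = expires_at

def survives_compaction(cell: ExpiringCell, now: int,
                        gc_grace_seconds: int,
                        purge_ttl_on_expiration: bool) -> bool:
    if now < cell.local_deletion_time:
        return True                    # still live
    if purge_ttl_on_expiration:
        return False                   # drop immediately: never a tombstone
    # default behaviour: the expired cell lingers as a tombstone until
    # gc_grace passes (and, in real Cassandra, overlap checks allow purging)
    return now < cell.local_deletion_time + gc_grace_seconds

now = 1_000_000
cell = ExpiringCell("v", expires_at=now - 60)          # expired a minute ago
assert survives_compaction(cell, now, 864000, purge_ttl_on_expiration=False)
assert not survives_compaction(cell, now, 864000, purge_ttl_on_expiration=True)
```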






[jira] [Commented] (CASSANDRA-13561) Purge TTL on expiration

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093463#comment-16093463
 ] 

Jeff Jirsa commented on CASSANDRA-13561:


I'm just trying to page this into my mind and also try to correlate with other 
recent tickets I've seen. This seems pretty close to what [~bdeggleston] 
touched on with CASSANDRA-13643 , though this is more aggressive (and more 
invasive, in the sense that it needs a new table property). Does Blake's 
addition to call purge on the tombstone generated have a similar effect here?




> Purge TTL on expiration
> ---
>
> Key: CASSANDRA-13561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13561
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 4.0
>
>
> Tables with mostly TTL columns tend to suffer from high droppable tombstone 
> ratio, which results in higher read latency, cpu utilization, and disk usage. 
> Expired TTL data become tombstones, and the nature of purging tombstones 
> during compaction (due to checking for overlapping SSTables) makes them 
> susceptible to surviving much longer than expected. A table option to purge 
> TTL on expiration would address this issue, by preventing them from becoming 
> tombstones. A boolean purge_ttl_on_expiration table setting would allow users 
> to easily turn the feature on or off. 
> Being more aggressive with gc_grace could also address the problem of long 
> lasting tombstones, but that would affect tombstones from deletes as well. 
> Even if a purged [expired] cell is revived via repair from a node that hasn't 
> yet compacted away the cell, it would be revived as an expiring cell with the 
> same localDeletionTime, so reads should properly handle them. As well, it 
> would be purged in the next compaction. 






[jira] [Commented] (CASSANDRA-13561) Purge TTL on expiration

2017-07-19 Thread Andrew Whang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093343#comment-16093343
 ] 

Andrew Whang commented on CASSANDRA-13561:
--

Correct, this implementation would still require a compaction to purge the 
expired cells.

Also correct: in the scenario you described there is a risk of the expired cell 
returning from the dead. The scenario does disregard hinted handoff (HH); if hints 
are properly delivered, the risk is mitigated. To be clear, the scenario is similar 
to setting GCGS (gc_grace_seconds) = 0, and the user has to understand the risk of 
using these settings. 

The risk is also mitigated in use cases where there is a default TTL on the 
table or client mutations use a default TTL. These are the scenarios for which 
we use this feature in our environment. In these use cases, we noticed the 
table suffered from a high droppable tombstone ratio and high read latency. Using 
this feature to purge TTLed cells as they expired addressed both the droppable 
tombstone ratio and the read latency. 
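The revival-safety argument from the ticket description can be restated in a few lines: because repair streams the expiring cell back with its original localDeletionTime, the revived copy is already past its expiry. This is an illustrative sketch, not Cassandra code:

```python
def is_live(local_deletion_time: int, now: int) -> bool:
    return now < local_deletion_time

original_expiry = 500
assert not is_live(original_expiry, now=600)   # dead on the repaired node

# Repair streams the cell back with the SAME localDeletionTime, so the
# revived copy is already expired: reads ignore it, and the next
# compaction purges it again.
revived_expiry = original_expiry
assert not is_live(revived_expiry, now=600)
```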

> Purge TTL on expiration
> ---
>
> Key: CASSANDRA-13561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13561
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 4.0
>
>
> Tables with mostly TTL columns tend to suffer from high droppable tombstone 
> ratio, which results in higher read latency, cpu utilization, and disk usage. 
> Expired TTL data become tombstones, and the nature of purging tombstones 
> during compaction (due to checking for overlapping SSTables) makes them 
> susceptible to surviving much longer than expected. A table option to purge 
> TTL on expiration would address this issue, by preventing them from becoming 
> tombstones. A boolean purge_ttl_on_expiration table setting would allow users 
> to easily turn the feature on or off. 
> Being more aggressive with gc_grace could also address the problem of long 
> lasting tombstones, but that would affect tombstones from deletes as well. 
> Even if a purged [expired] cell is revived via repair from a node that hasn't 
> yet compacted away the cell, it would be revived as an expiring cell with the 
> same localDeletionTime, so reads should properly handle them. As well, it 
> would be purged in the next compaction. 






[jira] [Updated] (CASSANDRA-11483) Enhance sstablemetadata

2017-07-19 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-11483:

Reviewer: Marcus Eriksson  (was: Yuki Morishita)

> Enhance sstablemetadata
> ---
>
> Key: CASSANDRA-11483
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11483
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.0
>
> Attachments: CASSANDRA-11483.txt, CASSANDRA-11483v2.txt, 
> CASSANDRA-11483v3.txt, CASSANDRA-11483v4.txt, CASSANDRA-11483v5.txt, Screen 
> Shot 2016-04-03 at 11.40.32 PM.png
>
>
> sstablemetadata provides quite a bit of useful information, but there are a few 
> hiccups I would like to see addressed:
> * Does not use client mode
> * Units are not provided (or anything, for that matter). There is data in 
> micros, millis, and seconds, as durations and as timestamps from epoch, but there 
> is no way to tell which is which without a non-trivial code dive
> * In general, it is pretty frustrating to parse






[jira] [Comment Edited] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093220#comment-16093220
 ] 

ZhaoYang edited comment on CASSANDRA-13526 at 7/19/17 2:54 PM:
---

Thanks for reviewing; I will back-port to 3.0/3.11 this week. I was stuck on 
other issues.


was (Author: jasonstack):
thanks for reviewing, I will back port to 3.0/3.11.  I was stuck in other 
issues..

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]
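The proposed change is small: when the node owns no ranges for the keyspace, cleanup should treat every sstable as removable instead of returning early. A toy rendering of that decision (hypothetical names, not the actual CompactionManager code):

```python
def perform_cleanup(local_ranges, sstables):
    # Toy version of the decision at the top of CompactionManager's cleanup.
    if not local_ranges:
        # Old behaviour: exit early, silently keeping everything.
        # Proposed behaviour: the node owns no data for this keyspace,
        # so all of its sstables are stale and can be removed.
        return []          # nothing survives cleanup
    return [s for s in sstables if s["overlaps_local_range"]]

stale = [{"name": "mb-1-big", "overlaps_local_range": False}]
assert perform_cleanup(local_ranges=[], sstables=stale) == []
```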






[jira] [Commented] (CASSANDRA-13526) nodetool cleanup on KS with no replicas should remove old data, not silently complete

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093220#comment-16093220
 ] 

ZhaoYang commented on CASSANDRA-13526:
--

Thanks for reviewing; I will back-port to 3.0/3.11. I was stuck on other 
issues.

> nodetool cleanup on KS with no replicas should remove old data, not silently 
> complete
> -
>
> Key: CASSANDRA-13526
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13526
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: ZhaoYang
>  Labels: usability
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> From the user list:
> https://lists.apache.org/thread.html/5d49cc6bbc6fd2e5f8b12f2308a3e24212a55afbb441af5cb8cd4167@%3Cuser.cassandra.apache.org%3E
> If you have a multi-dc cluster, but some keyspaces not replicated to a given 
> DC, you'll be unable to run cleanup on those keyspaces in that DC, because 
> [the cleanup code will see no ranges and exit 
> early|https://github.com/apache/cassandra/blob/4cfaf85/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L427-L441]






[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089856#comment-16089856
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/19/17 2:26 PM:
---

I plan to solve {{partial update}}, {{ttl}}, {{co-existed shadowable 
tombstone}}, and {{view timestamp tie}} all inside this ticket, using the extended 
shadowable approach (mentioned 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]).
Because all these issues require some storage format changes (extendedFlag), 
it's better to fix them and refactor in one commit.

I will draft a patch using {{ViewTombstone}} and {{ViewLiveness}}.

Any suggestions would be appreciated.




was (Author: jasonstack):
I plan to solve: {{partial update}},{{ttl}}, {{co-existed shadowable 
tombstone}}, {{view timestamp tie}} all inside this ticket using extended 
shadowable approach(mentioned 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11500?focusedCommentId=16082241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16082241]).
 Because all these issues require some storage format changes(extendedFlag), 
it's better to fix them and refactor in one commit.

Drafted a 
[patch|https://github.com/jasonstack/cassandra/commits/CASSANDRA-11500-update-time]..(refactoring
 and adding more dtest)  

Any suggestions would be appreciated.



> Obsolete MV entry may not be properly deleted
> -
>
> Key: CASSANDRA-11500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11500
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Sylvain Lebresne
>Assignee: ZhaoYang
>
> When a Materialized View uses a non-PK base table column in its PK, if an 
> update changes that column value, we add the new view entry and remove the 
> old one. When doing that removal, the current code uses the same timestamp 
> as for the liveness info of the new entry, which is the max timestamp for 
> any column participating in the view PK. This is not correct for the 
> deletion, as the old view entry could have other columns with a higher timestamp 
> which won't be deleted, as is easily shown by the failure of the following 
> test:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 4 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1;
> SELECT * FROM mv WHERE k = 1; // This currently returns 2 entries, the old 
> (invalid) and the new one
> {noformat}
> So the correct timestamp to use for the deletion is the biggest timestamp in 
> the old view entry (which we know since we read the pre-existing base row), 
> and that is what CASSANDRA-11475 does (the test above thus doesn't fail on 
> that branch).
> Unfortunately, even then we can still have problems if further updates 
> require us to override the old entry. Consider the following case:
> {noformat}
> CREATE TABLE t (k int PRIMARY KEY, a int, b int);
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS 
> NOT NULL PRIMARY KEY (k, a);
> INSERT INTO t(k, a, b) VALUES (1, 1, 1) USING TIMESTAMP 0;
> UPDATE t USING TIMESTAMP 10 SET b = 2 WHERE k = 1;
> UPDATE t USING TIMESTAMP 2 SET a = 2 WHERE k = 1; // This will delete the 
> entry for a=1 with timestamp 10
> UPDATE t USING TIMESTAMP 3 SET a = 1 WHERE k = 1; // This needs to re-insert 
> an entry for a=1 but shouldn't be deleted by the prior deletion
> UPDATE t USING TIMESTAMP 4 SET a = 2 WHERE k = 1; // ... and we can play this 
> game more than once
> UPDATE t USING TIMESTAMP 5 SET a = 1 WHERE k = 1;
> ...
> {noformat}
> In a way, this is saying that the "shadowable" deletion mechanism is not 
> general enough: we need to be able to re-insert an entry when a prior one had 
> been deleted before, but we can't rely on timestamps being strictly bigger on 
> the re-insert. In that sense, this can be thought of as a similar problem to 
> CASSANDRA-10965, though the solution there of a single flag is not enough 
> since we can have to replace more than once.
> I think the proper solution would be to ship enough information to always be 
> able to decide when a view deletion is shadowed. Which means that both 
> liveness info (for updates) and shadowable deletion would need to ship the 
> timestamp of any base table column that is part of the view PK (so {{a}} in the 
> example below).  It's doable (and not that hard really), but it does require 
> a change to the sstable and intra-node protocol, which makes this a bit 
> painful 
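The first problem described above (using the new entry's timestamp for the deletion) can be reproduced with a toy last-write-wins model. Using the ticket's timestamps, the old view entry for a=1 carries b written at timestamp 4, so a deletion at timestamp 2 fails to shadow it (illustrative sketch only, not Cassandra's cell model):

```python
# column -> write timestamp of the old view entry (a=1):
# k and a were written at ts 0 (the INSERT), b at ts 4 (the first UPDATE)
old_entry = {"k": 0, "a": 0, "b": 4}

def surviving_cells(entry, deletion_ts):
    # last-write-wins: a deletion shadows only cells with ts <= deletion_ts
    return {col for col, ts in entry.items() if ts > deletion_ts}

# Deleting with the new key column's timestamp (2) leaves b@4 behind,
# so the stale view row for a=1 still answers reads:
assert surviving_cells(old_entry, deletion_ts=2) == {"b"}

# Deleting with the max timestamp of the old entry (4) shadows everything:
assert surviving_cells(old_entry, deletion_ts=4) == set()
```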

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-07-19 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082241#comment-16082241
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 7/19/17 2:23 PM:
---

h3. Relation: base -> view

First of all, I think all of us should agree on the cases in which a view row 
should exist.

IMO, there are two main cases:

1. base pk and view pk are the same (order doesn't matter) and the view has no 
filter conditions, or only conditions on the base pk.
(filter conditions are not a concern here, since there is no previous view data 
to be cleared)

view row exists if any of the following is true:
* a. the base row pk has a live livenessInfo (timestamp) and the base row pk 
satisfies the view's filter conditions, if any.
* b. or one of the base row columns selected in the view has a live timestamp (via 
update) and the base row pk satisfies the view's filter conditions, if any. This is 
handled by the existing mechanism of liveness and tombstones, since all info is 
included in the view row.
* c. or one of the base row columns not selected in the view has a live timestamp (via 
update) and the base row pk satisfies the view's filter conditions, if any. Those 
unselected columns' timestamp/ttl/cell-deletion info is currently not stored 
on the view row. 

2. a base column is used in the view pk, or the view has filter conditions on a base 
non-key column, which can also lead to the entire view row being wiped.

view row exists if any of the following is true:
* a. the base row pk has a live livenessInfo (timestamp) && the base column used in the 
view pk is not null but has no timestamp && the conditions are satisfied. (the pk having 
a live livenessInfo means it is not deleted by a tombstone)
* b. or the base row column in the view pk has a timestamp (via update) && the conditions 
are satisfied. e.g. if the base column used in the view pk is TTLed, the entire view row 
should be wiped.

The next thing is to model a "shadowable tombstone or shadowable liveness" to 
maintain view data based on the above cases.
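The case-1 rules above can be condensed into a predicate. This is a toy sketch with hypothetical flag names, not the actual view-update code:

```python
def view_row_exists(base_pk_live: bool,
                    selected_column_live: bool,
                    unselected_column_live: bool,
                    conditions_satisfied: bool) -> bool:
    # Case 1: base pk == view pk. The row exists when the filter
    # conditions hold and any of sub-cases a/b/c provides liveness.
    if not conditions_satisfied:
        return False
    return base_pk_live or selected_column_live or unselected_column_live

# 1c: only an unselected base column is alive -> the view row should exist
assert view_row_exists(False, False, True, True)
# a failing filter condition wipes the row regardless of liveness
assert not view_row_exists(True, True, True, False)
```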
 
h3. Previous known issues: 
(I might miss some issues, feel free to ping me..)

ttl
* the view row is not wiped when a base column used in the view pk is TTLed, or a 
base non-key column with a filter condition is TTLed
* for cells with the same timestamp, merging ttls is not deterministic.

partial update on base columns not selected in the view
* it results in no view data: because of the current update semantics, no view 
updates are generated
* the corresponding view row's liveness does not depend on the liveness of the base columns

filter conditions, or a base column used in the view pk, cause:
* the view row is shadowed after a few modifications of the base column used in the view pk, 
if the base non-key column has a TS greater than the base pk's ts and the view key 
column's ts. (as mentioned by Sylvain: we need to be able to re-insert an entry 
when a prior one had been deleted, and need to be careful to handle timestamp ties)

tombstone merging is not commutative
* in the current code, a shadowable tombstone doesn't co-exist with a regular tombstone

sstabledump does not support the current shadowable tombstone

h3. Model

I can think of two ways to ship all the required base column info to the view:
* make base columns that are not selected in the view "virtual cells" and 
store their timestamp/ttl in the view without their actual values, so we can reuse the 
current ts/tb/ttl mechanism with additional validation logic to check whether a view 
row is alive.
* or store that info in the view's livenessInfo/deletion with additional merge 
logic. 

I will go ahead with the second way, since there is an existing shadowable tombstone 
mechanism.


View PrimaryKey LivenessInfo, its timestamp, payloads, merging

{code}
ColumnInfo: // generated from base column
0. timestamp
1. ttl 
2. localDeletionTime:  could be used to represent tombstone or TTLed 
depends on if there is ttl

supersedes(): if timestamps are different, greater timestamp 
supersedes; if timestamps are same, greater localDeletionTime supersedes.

ViewLivenessInfo
// corresponding to base pk livenessInfo
0. timestamp
1. ttl / localDeletionTime

// base column that are used in view pk or has filter condition.
// if any column is not live or doesn't exist, entire view row is wiped.
// if a column in base is filtered and not selected, it's stored here.
2. Map keyOrConditions; 

// if any column is live
3. Map unselected;

// to determine if a row is live
isRowAlive(Deletion delete):
get timestamp or columnInfo that is greater than those in Deletion

if any column in {{keyOrConditions}} is TTLed or tombstoned (dead) 
or does not exist, false
if {{timestamp or ttl}} are alive, true
if any column in {{unselected}} is alive, true
otherwise check any columns in view row are alive

// cannot use supersedes: because timestamps can tie, we cannot compare 
keyOrConditions.  

[jira] [Comment Edited] (CASSANDRA-13387) Metrics for repair

2017-07-19 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093096#comment-16093096
 ] 

Stefan Podkowinski edited comment on CASSANDRA-13387 at 7/19/17 1:53 PM:
-

-I think we should have a dedicated RepairMetrics class, so we can add more 
metrics later. It would be really interesting to have numbers on sessions, 
validation results, streamed sstables and repair durations.- 
What was the reason for removing getExceptionCount()? 

Edit: just realized we already have some of the suggested metrics on ks/table 
level (as you mentioned)


was (Author: spo...@gmail.com):
I think we should have a dedicated RepairMetrics class, so we can add more 
metrics later. It would be really interesting to have numbers on sessions, 
validation results, streamed sstables and repair durations. 
What was the reason for removing getExceptionCount()? 

> Metrics for repair
> --
>
> Key: CASSANDRA-13387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13387
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
>
> We're missing metrics for repair, especially for errors. From what I observed 
> now, the exception will be caught by UncaughtExceptionHandler set in 
> CassandraDaemon and is categorized as StorageMetrics.exceptions. This is one 
> example:
> {code}
> ERROR [AntiEntropyStage:1] 2017-03-27 18:17:08,385 CassandraDaemon.java:207 - 
> Exception in thread Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: Parent repair session with id = 
> 8c85d260-1319-11e7-82a2-25090a89015f has failed.
> at 
> org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:392)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}






[jira] [Commented] (CASSANDRA-13387) Metrics for repair

2017-07-19 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093096#comment-16093096
 ] 

Stefan Podkowinski commented on CASSANDRA-13387:


I think we should have a dedicated RepairMetrics class, so we can add more 
metrics later. It would be really interesting to have numbers on sessions, 
validation results, streamed sstables and repair durations. 
What was the reason for removing getExceptionCount()? 
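A dedicated RepairMetrics registry would be a thin layer: repair failures increment a repair-specific counter in addition to the catch-all StorageMetrics.exceptions. A toy sketch (hypothetical names, not Cassandra's Dropwizard-based metrics API):

```python
from collections import Counter

metrics = Counter()   # stand-in for a metrics registry

def on_repair_exception(exc: Exception) -> None:
    # A dedicated counter makes repair failures visible on their own,
    # instead of only bumping the global storage exception count.
    metrics["Repair.Exceptions"] += 1
    metrics["Storage.Exceptions"] += 1   # existing global behaviour

on_repair_exception(RuntimeError("Parent repair session has failed."))
assert metrics["Repair.Exceptions"] == 1
assert metrics["Storage.Exceptions"] == 1
```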

> Metrics for repair
> --
>
> Key: CASSANDRA-13387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13387
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
>
> We're missing metrics for repair, especially for errors. From what I observed 
> now, the exception will be caught by UncaughtExceptionHandler set in 
> CassandraDaemon and is categorized as StorageMetrics.exceptions. This is one 
> example:
> {code}
> ERROR [AntiEntropyStage:1] 2017-03-27 18:17:08,385 CassandraDaemon.java:207 - 
> Exception in thread Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: Parent repair session with id = 
> 8c85d260-1319-11e7-82a2-25090a89015f has failed.
> at 
> org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:392)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}






[jira] [Commented] (CASSANDRA-13043) Unable to achieve CL while applying counters from commitlog

2017-07-19 Thread Stefano Ortolani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092977#comment-16092977
 ] 

Stefano Ortolani commented on CASSANDRA-13043:
--

[~iamaleksey] did you manage to give it a look by any chance?

> Unable to achieve CL while applying counters from commitlog
> ---
>
> Key: CASSANDRA-13043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian
>Reporter: Catalin Alexandru Zamfir
>
> In version 3.9 of Cassandra, we get the following exceptions in the 
> system.log whenever booting an agent. They seem to grow in number with each 
> reboot. Any idea where they come from or what we can do about them? Note that 
> the cluster is healthy (has sufficient live nodes).
> {noformat}
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat java.lang.Thread.run(Thread.java:745) 
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM

[jira] [Comment Edited] (CASSANDRA-13436) Stopping Cassandra shows status "failed" due to non-zero exit status

2017-07-19 Thread Tomas Repik (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092971#comment-16092971
 ] 

Tomas Repik edited comment on CASSANDRA-13436 at 7/19/17 12:07 PM:
---

In Fedora we are using the following procedure, which is basically doing the 
same thing with the {{nc}} tool.

{code:none}
wait_for_service_available()
{
  host=$(head -1 /etc/hosts | cut -d' ' -f1)
  port=$(cat $CASSANDRA_CONF/cassandra.yaml | grep native_transport_port | head -1 | cut -d' ' -f2)
  if ! nc -z $host $port; then
    # echo "Waiting for Cassandra to start..."
    while ! nc -z $host $port; do
      sleep 1
    done
    # echo "Cassandra is ready."
  fi
}
{code}



was (Author: trepik):
In Fedora we are using the following procedure, which is basically doing the 
same thing with the {{nc}} tool.

{code:bash}
wait_for_service_available()
{
  host=$(head -1 /etc/hosts | cut -d' ' -f1)
  port=$(cat $CASSANDRA_CONF/cassandra.yaml | grep native_transport_port | head -1 | cut -d' ' -f2)
  if ! nc -z $host $port; then
    # echo "Waiting for Cassandra to start..."
    while ! nc -z $host $port; do
      sleep 1
    done
    # echo "Cassandra is ready."
  fi
}
{code}


> Stopping Cassandra shows status "failed" due to non-zero exit status
> 
>
> Key: CASSANDRA-13436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13436
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Packaging
>Reporter: Stefan Podkowinski
>
> Systemd will monitor the process from the pid file and save the return status 
> once it has been stopped. In case the process terminates with a status other 
> than zero, it will assume the process terminated abnormally. Stopping 
> Cassandra using the cassandra script will send a kill signal to the JVM, 
> causing it to terminate. If this happens, the JVM will exit with status 143, 
> no matter whether shutdown hooks have been executed or not. In order to make 
> systemd recognize this as a normal exit code, the following should be added 
> to the yet-to-be-created unit file:
> {noformat}
> [Service]
> ...
> SuccessExitStatus=0 143
> ...
> {noformat}






[jira] [Commented] (CASSANDRA-13436) Stopping Cassandra shows status "failed" due to non-zero exit status

2017-07-19 Thread Tomas Repik (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092971#comment-16092971
 ] 

Tomas Repik commented on CASSANDRA-13436:
-

In Fedora we are using the following procedure, which is basically doing the 
same thing with the {{nc}} tool.

{code:bash}
wait_for_service_available()
{
  host=$(head -1 /etc/hosts | cut -d' ' -f1)
  port=$(cat $CASSANDRA_CONF/cassandra.yaml | grep native_transport_port | head -1 | cut -d' ' -f2)
  if ! nc -z $host $port; then
    # echo "Waiting for Cassandra to start..."
    while ! nc -z $host $port; do
      sleep 1
    done
    # echo "Cassandra is ready."
  fi
}
{code}


> Stopping Cassandra shows status "failed" due to non-zero exit status
> 
>
> Key: CASSANDRA-13436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13436
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Packaging
>Reporter: Stefan Podkowinski
>
> Systemd will monitor the process from the pid file and save the return status 
> once it has been stopped. In case the process terminates with a status other 
> than zero, it will assume the process terminated abnormally. Stopping 
> Cassandra using the cassandra script will send a kill signal to the JVM, 
> causing it to terminate. If this happens, the JVM will exit with status 143, 
> no matter whether shutdown hooks have been executed or not. In order to make 
> systemd recognize this as a normal exit code, the following should be added 
> to the yet-to-be-created unit file:
> {noformat}
> [Service]
> ...
> SuccessExitStatus=0 143
> ...
> {noformat}






[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-07-19 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13397:
---
Reproduced In: 3.0.10  (was: 3.0.14, 3.11.0)

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into the repair code, I realized that we should check the 
> return value of CountDownLatch.await(). In most of the places where we don't 
> check the return value, nothing bad would happen due to other protections. 
> However, ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
> CountDownLatch latch = new CountDownLatch(2);
> latch.countDown();
> new Thread(() -> {
> try {
> Thread.sleep(1200);
> } catch (InterruptedException e) {
> System.err.println("interrupted");
> }
> latch.countDown();
> System.out.println("counted down");
> }).start();
> latch.await(1, TimeUnit.SECONDS);
> if (latch.getCount() > 0) {
> System.err.println("failed");
> } else {
> System.out.println("success");
> }
> }
> {code}
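The fix the reporter suggests amounts to acting on the boolean that {{await(timeout, unit)}} returns, which is true only if the latch reached zero before the timeout. A minimal sketch (the {{AwaitCheck}} class and {{waitForPrepare}} helper are hypothetical names, not the actual patch):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AwaitCheck {
    // Returns true only if the latch reached zero within the timeout.
    // Swallows the interrupt after restoring the interrupt flag, so callers
    // can treat an interruption like a timeout.
    static boolean waitForPrepare(CountDownLatch latch, long millis) {
        try {
            return latch.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(1); // nobody counts down
        if (!waitForPrepare(latch, 100)) {
            // This is the branch that prepareForRepair should surface as a
            // failure instead of silently proceeding.
            System.out.println("prepare timed out");
        }
    }
}
```

The key point is that ignoring the return value makes a timed-out await indistinguishable from a successful one.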






[jira] [Updated] (CASSANDRA-13397) Return value of CountDownLatch.await() not being checked in Repair

2017-07-19 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13397:
---
Fix Version/s: (was: 3.0.x)
   3.0.14
   3.11.0
   4.0

> Return value of CountDownLatch.await() not being checked in Repair
> --
>
> Key: CASSANDRA-13397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13397
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-13397-v1.patch
>
>
> While looking into the repair code, I realized that we should check the 
> return value of CountDownLatch.await(). In most of the places where we don't 
> check the return value, nothing bad would happen due to other protections. 
> However, ActiveRepairService#prepareForRepair should have the check. Code to reproduce:
> {code}
> public static void testLatch() throws InterruptedException {
> CountDownLatch latch = new CountDownLatch(2);
> latch.countDown();
> new Thread(() -> {
> try {
> Thread.sleep(1200);
> } catch (InterruptedException e) {
> System.err.println("interrupted");
> }
> latch.countDown();
> System.out.println("counted down");
> }).start();
> latch.await(1, TimeUnit.SECONDS);
> if (latch.getCount() > 0) {
> System.err.println("failed");
> } else {
> System.out.println("success");
> }
> }
> {code}






[jira] [Comment Edited] (CASSANDRA-13436) Stopping Cassandra shows status "failed" due to non-zero exit status

2017-07-19 Thread Felix Paetow (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092933#comment-16092933
 ] 

Felix Paetow edited comment on CASSANDRA-13436 at 7/19/17 11:34 AM:


ok, my solution is now to add

{code:bash}
ExecStartPost=cqlsh -e exit
{code}

The cqlsh statement is in a while loop, though.

But I'm still not sure if this is the best way to test that Cassandra is up 
and reachable. Any suggestions?


was (Author: hoall):
ok, my solution is now to add

{code:bash}
ExecStartPost=cqlsh -e exit
{code}

But I'm still not sure if this is the best way to test that Cassandra is up 
and reachable. Any suggestions?

> Stopping Cassandra shows status "failed" due to non-zero exit status
> 
>
> Key: CASSANDRA-13436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13436
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Packaging
>Reporter: Stefan Podkowinski
>
> Systemd will monitor the process from the pid file and save the return status 
> once it has been stopped. In case the process terminates with a status other 
> than zero, it will assume the process terminated abnormally. Stopping 
> Cassandra using the cassandra script will send a kill signal to the JVM, 
> causing it to terminate. If this happens, the JVM will exit with status 143, 
> no matter whether shutdown hooks have been executed or not. In order to make 
> systemd recognize this as a normal exit code, the following should be added 
> to the yet-to-be-created unit file:
> {noformat}
> [Service]
> ...
> SuccessExitStatus=0 143
> ...
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092952#comment-16092952
 ] 

Alex Petrov edited comment on CASSANDRA-13696 at 7/19/17 11:30 AM:
---

Maybe we want to use either 30 or 3014, depending on which one is active?

Question: does this happen in a mixed-version cluster, or do all the nodes 
actually have the same protocol version?


was (Author: ifesdjeen):
Maybe we want to use either 30 or 3014, depending on which one is active?

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
> ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the steps to reproduce on a 3-node cluster, RF=3:
> 1. stop node1
> 2. send some data with quorum (or one); it will generate hints files on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster: when it happens, almost all the nodes are down at 
> the same time, and we have to remove all the hints files (which contain the 
> dropped table) to bring the nodes back.






[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092952#comment-16092952
 ] 

Alex Petrov commented on CASSANDRA-13696:
-

Maybe we want to use either 30 or 3014, depending on which one is active?

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
> ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the steps to reproduce on a 3-node cluster, RF=3:
> 1. stop node1
> 2. send some data with quorum (or one); it will generate hints files on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster: when it happens, almost all the nodes are down at 
> the same time, and we have to remove all the hints files (which contain the 
> dropped table) to bring the nodes back.






[jira] [Commented] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092942#comment-16092942
 ] 

Varun Barala commented on CASSANDRA-13694:
--

After this patch, the output will look like:

{noformat}
[
  {
    "partition" : {
      "key" : [ "1234", "TEST EVENT" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 38,
        "clustering" : [ "1970-01-18 16:16:13.000183+0730" ],
        "liveness_info" : { "tstamp" : "2017-07-18T09:19:55.623Z" },
        "cells" : [
          { "name" : "ack_time", "value" : "1970-01-18 16:16:13.03+0730" }
        ]
      }
    ]
  }
]
{noformat}

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time                     | ack_time
> ---------+------------+---------------------------------+---------------------------------
>     1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Attachment: (was: CASSANDRA-13694)

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time                     | ack_time
> ---------+------------+---------------------------------+---------------------------------
>     1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Attachment: CASSANDRA-13694.patch

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time                     | ack_time
> ---------+------------+---------------------------------+---------------------------------
>     1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
Attachment: CASSANDRA-13694

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
> Attachments: CASSANDRA-13694.patch
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time | ack_time
> -++-+-
> 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 
> 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Updated] (CASSANDRA-13694) sstabledump does not show full precision of timestamp columns

2017-07-19 Thread Varun Barala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Barala updated CASSANDRA-13694:
-
   Labels: patch  (was: )
Reproduced In: 3.7
   Status: Patch Available  (was: Open)

To match the cqlsh input, I added one more date format in 
`*TimestampSerializer.java*`.

Previously the default format was `yyyy-MM-dd HH:mmXX`, which has minute-level 
precision. In this patch I changed it to `yyyy-MM-dd HH:mm:ss.SSXX`.

I appended it at the end of the *dateStringPatterns* array to keep the changes 
minimal.

Please let me know if I didn't consider any case. Thank you!
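As a rough illustration of the precision difference between a minute-level and a 
second/millisecond-level pattern (a standalone {{SimpleDateFormat}} sketch, not 
the actual {{TimestampSerializer}} code; the epoch value is chosen to match the 
example timestamp in the issue description):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimestampPrecisionDemo {
    // Format an epoch-millisecond value with the given pattern, in UTC.
    public static String format(String pattern, long epochMillis) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long ts = 1500043959919L; // 2017-07-14 14:52:39.919 UTC
        // Minute-level pattern drops seconds and milliseconds entirely:
        System.out.println(format("yyyy-MM-dd HH:mmXX", ts));       // 2017-07-14 14:52Z
        // Second/millisecond pattern keeps the full value:
        System.out.println(format("yyyy-MM-dd HH:mm:ss.SSXX", ts)); // 2017-07-14 14:52:39.919Z
    }
}
```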

> sstabledump does not show full precision of timestamp columns
> -
>
> Key: CASSANDRA-13694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu 16.04 LTS
>Reporter: Tim Reeves
>  Labels: patch
> Fix For: 3.7
>
>
> Create a table:
> CREATE TABLE test_table (
> unit_no bigint,
> event_code text,
> active_time timestamp,
> ack_time timestamp,
> PRIMARY KEY ((unit_no, event_code), active_time)
> ) WITH CLUSTERING ORDER BY (active_time DESC)
> Insert a row:
> INSERT INTO test_table (unit_no, event_code, active_time, ack_time)
>   VALUES (1234, 'TEST EVENT', toTimestamp(now()), 
> toTimestamp(now()));
> Verify that it is in the database with a full timestamp:
> cqlsh:pentaho> select * from test_table;
>  unit_no | event_code | active_time | ack_time
> -++-+-
> 1234 | TEST EVENT | 2017-07-14 14:52:39.919000+0000 | 2017-07-14 
> 14:52:39.919000+0000
> (1 rows)
> Write file:
> nodetool flush
> nodetool compact pentaho
> Use sstabledump:
> treeves@ubuntu:~$ sstabledump 
> /var/lib/cassandra/data/pentaho/test_table-99ba228068a311e7ac30953b79ac2c3e/mb-2-big-Data.db
> [
>   {
> "partition" : {
>   "key" : [ "1234", "TEST EVENT" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 38,
> "clustering" : [ "2017-07-14 15:52+0100" ],
> "liveness_info" : { "tstamp" : "2017-07-14T14:52:39.888701Z" },
> "cells" : [
>   { "name" : "ack_time", "value" : "2017-07-14 15:52+0100" }
> ]
>   }
> ]
>   }
> ]
> treeves@ubuntu:~$ 
> The timestamp in the cluster key, and the regular column, are both truncated 
> to the minute.






[jira] [Commented] (CASSANDRA-13436) Stopping Cassandra shows status "failed" due to non-zero exit status

2017-07-19 Thread Felix Paetow (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092933#comment-16092933
 ] 

Felix Paetow commented on CASSANDRA-13436:
--

OK, my solution for now is to add

{code:bash}
ExecStartPost=cqlsh -e exit
{code}

But I'm still not sure this is the best way to test that Cassandra is up and 
reachable. Any suggestions?
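Putting this check together with the {{SuccessExitStatus}} suggestion from the 
issue description, a minimal unit file might look like the following (a sketch 
only; the paths, user, and {{Type}} are assumptions, not a tested unit):

{noformat}
[Unit]
Description=Apache Cassandra
After=network.target

[Service]
Type=forking
User=cassandra
PIDFile=/var/run/cassandra/cassandra.pid
ExecStart=/usr/sbin/cassandra -p /var/run/cassandra/cassandra.pid
# Verify the node accepts CQL connections before reporting "started":
ExecStartPost=/usr/bin/cqlsh -e exit
# The JVM exits with 143 on SIGTERM; treat that as a clean stop:
SuccessExitStatus=0 143

[Install]
WantedBy=multi-user.target
{noformat}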

> Stopping Cassandra shows status "failed" due to non-zero exit status
> 
>
> Key: CASSANDRA-13436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13436
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Packaging
>Reporter: Stefan Podkowinski
>
> Systemd will monitor the process from the pid file and save the return status 
> once it has been stopped. In case the process terminates with a status other 
> than zero, it will assume the process terminated abnormally. Stopping 
> Cassandra using the cassandra script will send a kill signal to the JVM, 
> causing it to terminate. If this happens, the JVM will exit with status 143, 
> no matter whether shutdown hooks have been executed or not. In order to make 
> systemd recognize this as a normal exit code, the following should be added 
> to the yet-to-be-created unit file:
> {noformat}
> [Service]
> ...
> SuccessExitStatus=0 143
> ...
> {noformat}






[jira] [Commented] (CASSANDRA-13072) Cassandra failed to run on Linux-aarch64

2017-07-19 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092922#comment-16092922
 ] 

Stefan Podkowinski commented on CASSANDRA-13072:


Misses CHANGES.txt update.

> Cassandra failed to run on Linux-aarch64
> 
>
> Key: CASSANDRA-13072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Hardware: ARM aarch64
> OS: Ubuntu 16.04.1 LTS
>Reporter: Jun He
>Assignee: Benjamin Lerer
>  Labels: incompatible
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: compat_report.html
>
>
> Steps to reproduce:
> 1. Download cassandra latest source
> 2. Build it with "ant"
> 3. Run with "./bin/cassandra". Daemon is crashed with following error message:
> {quote}
> INFO  05:30:21 Initializing system.schema_functions
> INFO  05:30:21 Initializing system.schema_aggregates
> ERROR 05:30:22 Exception in thread Thread[MemtableFlushWriter:1,5,main]
> java.lang.NoClassDefFoundError: Could not initialize class com.sun.jna.Native
> at 
> org.apache.cassandra.utils.memory.MemoryUtil.allocate(MemoryUtil.java:97) 
> ~[main/:na]
> at org.apache.cassandra.io.util.Memory.(Memory.java:74) 
> ~[main/:na]
> at org.apache.cassandra.io.util.SafeMemory.(SafeMemory.java:32) 
> ~[main/:na]
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Writer.(CompressionMetadata.java:316)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.compress.CompressionMetadata$Writer.open(CompressionMetadata.java:330)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.compress.CompressedSequentialWriter.(CompressedSequentialWriter.java:76)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:163) 
> ~[main/:na]
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.(BigTableWriter.java:73)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.sstable.format.big.BigFormat$WriterFactory.open(BigFormat.java:93)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.create(SSTableWriter.java:96)
>  ~[main/:na]
> at 
> org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.create(SimpleSSTableMultiWriter.java:114)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.createSSTableMultiWriter(AbstractCompactionStrategy.java:519)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.createSSTableMultiWriter(CompactionStrategyManager.java:497)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ColumnFamilyStore.createSSTableMultiWriter(ColumnFamilyStore.java:480)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.Memtable.createFlushWriter(Memtable.java:439) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:371) 
> ~[main/:na]
> at org.apache.cassandra.db.Memtable.flush(Memtable.java:332) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1054)
>  ~[main/:na]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_111]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> {quote}
> Analysis:
> This issue is caused by the bundled jna-4.0.0.jar, which doesn't come with 
> aarch64 native support. Replacing lib/jna-4.0.0.jar with jna-4.2.0.jar from 
> http://central.maven.org/maven2/net/java/dev/jna/jna/4.2.0/ fixes this 
> problem.
> Attached is the binary compatibility report of jna.jar between 4.0 and 4.2. 
> The result is good (97.4%). So is there a possibility of upgrading jna to 
> 4.2.0 upstream? If there are any tests that should be executed, please kindly 
> point me to them. Thanks a lot.






[jira] [Commented] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092846#comment-16092846
 ] 

Benedict commented on CASSANDRA-9988:
-

Before committing, you may want to consider modifying the microbenchmark to use 
objects that are of a representative complexity, and over a large enough domain 
that the contents will not all be in the processor cache.  This might affect 
the choice of exponential search vs. binary search, as the extra comparisons 
needed are not currently representatively counted.
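For context, a generic sketch of the exponential vs. binary search trade-off 
being discussed (illustrative Java only, not Cassandra's BTree code): 
exponential search first gallops to bracket the key, then binary-searches the 
bracket, so it wins when hits cluster near the start but pays extra comparisons 
otherwise.

```java
import java.util.Arrays;

public class ExponentialSearchDemo {
    // Exponential search over a sorted array: double the bound until it passes
    // the key, then binary-search only the bracketed range. Returns the index
    // of key, or -1 if absent.
    public static int exponentialSearch(int[] a, int key) {
        if (a.length == 0) return -1;
        int bound = 1;
        while (bound < a.length && a[bound] < key) bound <<= 1;
        int lo = bound >> 1;                       // last bound known to be < key (or 0)
        int hi = Math.min(bound, a.length - 1);    // first bound known to be >= key
        int idx = Arrays.binarySearch(a, lo, hi + 1, key);
        return idx >= 0 ? idx : -1;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9, 11};
        System.out.println(exponentialSearch(a, 7)); // 3
    }
}
```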

> Introduce leaf-only iterator
> 
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-trunk-new.txt, 9988-trunk-new-update.txt, 
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.






[jira] [Commented] (CASSANDRA-9988) Introduce leaf-only iterator

2017-07-19 Thread Anthony Grasso (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092832#comment-16092832
 ] 

Anthony Grasso commented on CASSANDRA-9988:
---

Have reviewed the code.

Ran the Microbench tests
{noformat}
ant test -Dtest.name=org.apache.cassandra.test.microbench.CompactionBench
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.BTreeSearchIteratorBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.DirectorySizerBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.FastThreadExecutor
ant test -Dtest.name=org.apache.cassandra.test.microbench.FastThreadLocalBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.MutationBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.OutputStreamBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.PendingRangesBench
ant test -Dtest.name=org.apache.cassandra.test.microbench.ReadWriteTest
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.StreamingHistogramBench
ant test 
-Dtest.name=org.apache.cassandra.test.microbench.PartitionImplementationTest
{noformat}

Ran the partition unit tests
{noformat}
ant test 
-Dtest.name=org.apache.cassandra.db.partition.PartitionImplementationTest
ant test -Dtest.name=org.apache.cassandra.db.partition.PartitionUpdateTest
{noformat}

Changes look good to me.

> Introduce leaf-only iterator
> 
>
> Key: CASSANDRA-9988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: patch
> Fix For: 4.0
>
> Attachments: 9988-trunk-new.txt, 9988-trunk-new-update.txt, 
> trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf 
> page. In this case it _may_ be more efficient to specialise our iterator.






[jira] [Updated] (CASSANDRA-13627) Index queries are rejected on COMPACT tables

2017-07-19 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13627:
---
Fix Version/s: 4.0
   3.11.1
   3.0.15

> Index queries are rejected on COMPACT tables
> 
>
> Key: CASSANDRA-13627
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13627
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> Since {{3.0}}, {{compact}} tables are using under the hood {{static}} 
> columns. Due to that {{SELECT}} queries using secondary indexes get rejected 
> with the following error:
> {{Queries using 2ndary indexes don't support selecting only static columns}}.
> This problem can be reproduced using the following unit test:
> {code}@Test
> public void testIndicesOnCompactTable() throws Throwable
> {
> createTable("CREATE TABLE %s (pk int PRIMARY KEY, v int) WITH COMPACT 
> STORAGE");
> createIndex("CREATE INDEX ON %s(v)");
> execute("INSERT INTO %S (pk, v) VALUES (?, ?)", 1, 1);
> execute("INSERT INTO %S (pk, v) VALUES (?, ?)", 2, 1);
> execute("INSERT INTO %S (pk, v) VALUES (?, ?)", 3, 3);
> assertRows(execute("SELECT pk, v FROM %s WHERE v = 1"),
>row(1, 1),
>row(2, 1));
> }{code}






[jira] [Commented] (CASSANDRA-12606) CQLSSTableWriter unable to use blob conversion functions

2017-07-19 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092813#comment-16092813
 ] 

Stefan Podkowinski commented on CASSANDRA-12606:


[~ifesdjeen], can you double-check the CHANGES.txt entry (I think it should be 
3.11.1 instead of 3.11.0) and set the fix version?

> CQLSSTableWriter unable to use blob conversion functions
> 
>
> Key: CASSANDRA-12606
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12606
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL, Tools
>Reporter: Mark Reddy
>Assignee: Alex Petrov
>Priority: Minor
> Fix For: 4.0, 3.0.x, 3.11.x
>
>
> Attempting to use blob conversion functions e.g. textAsBlob, from 3.0 - 3.7 
> results in:
> {noformat}
> Exception in thread "main" 
> org.apache.cassandra.exceptions.InvalidRequestException: Unknown function 
> textasblob called
>   at 
> org.apache.cassandra.cql3.functions.FunctionCall$Raw.prepare(FunctionCall.java:136)
>   at 
> org.apache.cassandra.cql3.Operation$SetValue.prepare(Operation.java:163)
>   at 
> org.apache.cassandra.cql3.statements.UpdateStatement$ParsedInsert.prepareInternal(UpdateStatement.java:173)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:785)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:771)
>   at 
> org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.prepareInsert(CQLSSTableWriter.java:567)
>   at 
> org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.build(CQLSSTableWriter.java:510)
> {noformat}
> The following snippet will reproduce the issue
> {code}
> String table = String.format("%s.%s", "test_ks", "test_table");
> String schema = String.format("CREATE TABLE %s (test_text text, test_blob 
> blob, PRIMARY KEY(test_text));", table);
> String insertStatement = String.format("INSERT INTO %s (test_text, test_blob) 
> VALUES (?, textAsBlob(?))", table);
> File tempDir = Files.createTempDirectory("tempDir").toFile();
> CQLSSTableWriter sstableWriter = CQLSSTableWriter.builder()
> .forTable(schema)
> .using(insertStatement)
> .inDirectory(tempDir)
> .build();
> {code}
> This is caused in FunctionResolver.get(...) when 
> candidates.addAll(Schema.instance.getFunctions(name.asNativeFunction())); is 
> called, as there is no system keyspace initialised.






[jira] [Commented] (CASSANDRA-13639) SSTableLoader always uses hostname to stream files from

2017-07-19 Thread Jan Karlsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092807#comment-16092807
 ] 

Jan Karlsson commented on CASSANDRA-13639:
--

If SSL is enabled, {{SSTableLoader}} always uses the hostname no matter how your 
routing is set up. If you have a second interface that you route all 
{{SSTableLoader}} traffic from, it will still pick your first network interface, 
because that is the one corresponding to your hostname, thereby overriding any 
routing you might have set up. This screams bug to me.

The correct behavior would be for {{SSTableLoader}} to use the normal routing of 
the server. I am unclear why we set the from address ourselves instead of 
leaving it blank. I can see that it might be useful to have it as a command-line 
option as well; however, it is quite strange to set a 'connect from' address.
{code}
if (encryptionOptions != null && encryptionOptions.internode_encryption 
!= EncryptionOptions.ServerEncryptionOptions.InternodeEncryption.none)
{
if (outboundBindAny)
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort);
else
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort, FBUtilities.getLocalAddress(), 0);
}{code}

I am a little unclear why the code is the way it is. The method is only called 
with {{outboundBindAny}} set to false. It seems to me that calling it without 
the {{FBUtilities}} call would be the correct way of calling it.
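To illustrate the difference (a self-contained sketch, not the SSLFactory code): 
binding the socket to a local address pins the outbound interface, the way the 
snippet above does with FBUtilities.getLocalAddress(), while leaving it unbound 
lets the kernel's routing table choose the interface.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class OutboundBindDemo {
    // Connect to peer:port. If localAddr is non-null, bind the socket to it
    // first, forcing traffic out of that interface regardless of routing;
    // with null, the OS picks the source address via its routing table.
    public static Socket connect(InetAddress peer, int port, InetAddress localAddr)
            throws Exception {
        Socket s = new Socket();
        if (localAddr != null)
            s.bind(new InetSocketAddress(localAddr, 0)); // 0 = any ephemeral port
        s.connect(new InetSocketAddress(peer, port));
        return s;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket pinned = connect(InetAddress.getLoopbackAddress(),
                                     server.getLocalPort(),
                                     InetAddress.getLoopbackAddress())) {
            System.out.println(pinned.getLocalAddress().isLoopbackAddress()); // true
        }
    }
}
```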

> SSTableLoader always uses hostname to stream files from
> ---
>
> Key: CASSANDRA-13639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13639
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jan Karlsson
>Assignee: Jan Karlsson
> Fix For: 4.x
>
> Attachments: 13639-trunk
>
>
> I stumbled upon an issue where SSTableLoader was ignoring our routing by 
> using the wrong interface to send the SSTables to the other nodes. Looking at 
> the code, it seems that we are using FBUtilities.getLocalAddress() to fetch 
> out the hostname, even if the yaml file specifies a different host. I am not 
> sure why we call this function instead of using the routing by leaving it 
> blank, perhaps someone could enlighten me.
> This behaviour comes from the fact that we use a default created 
> DatabaseDescriptor which does not set the values for listenAddress and 
> listenInterface. This causes the aforementioned function to retrieve the 
> hostname at all times, even if it is not the interface used in the yaml file.
> I propose we break out the function that handles listenAddress and 
> listenInterface and call it so that listenAddress or listenInterface is 
> getting populated in the DatabaseDescriptor.






[jira] [Comment Edited] (CASSANDRA-13639) SSTableLoader always uses hostname to stream files from

2017-07-19 Thread Jan Karlsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092807#comment-16092807
 ] 

Jan Karlsson edited comment on CASSANDRA-13639 at 7/19/17 8:54 AM:
---

If SSL is enabled, {{SSTableLoader}} always uses the hostname no matter how 
your routing is set up. If you have a second interface that you route all 
{{SSTableLoader}} traffic from, it will still pick your first network interface, 
because that is the one corresponding to your hostname, thereby overriding any 
routing you might have set up. This screams bug to me.

The correct behavior would be for {{SSTableLoader}} to use the normal routing 
of the server. I am unclear why we set the from address ourselves instead of 
leaving it blank. I can see that it might be useful to have it as a command-line 
option as well; however, it is quite strange to set a 'connect from' address.
{code}
if (encryptionOptions != null && encryptionOptions.internode_encryption 
!= EncryptionOptions.ServerEncryptionOptions.InternodeEncryption.none)
{
if (outboundBindAny)
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort);
else
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort, FBUtilities.getLocalAddress(), 0);
}{code}

I am a little unclear why the code is the way it is. The method is only 
called with {{outboundBindAny}} set to false. It seems to me that calling it 
without the {{FBUtilities}} call would be the correct way of calling it.


was (Author: jan karlsson):
If SSL is enabled, {SSTableLoader} always uses the hostname no matter how your 
routing is set up. If you have a second interface that you route all 
{SSTableLoader} traffic from, it will still pick your first network interface 
because it corresponds with your hostname. Thereby overriding any routing you 
might have set up. This screams bug to me.

The correct behavior would be for {SSTableLoader} to use the normal routing of 
the server. I am unclear why we set the from address specifically ourself 
instead of leaving it blank. I can see that it might be useful to have it as a 
command variable as well. However it is quite strange to set up a 'connect 
from' address.
{code}
if (encryptionOptions != null && encryptionOptions.internode_encryption 
!= EncryptionOptions.ServerEncryptionOptions.InternodeEncryption.none)
{
if (outboundBindAny)
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort);
else
return SSLFactory.getSocket(encryptionOptions, peer, 
secureStoragePort, FBUtilities.getLocalAddress(), 0);
}{code}

I am a little unclear of why the code is the way it is. The method is only 
called with {outboundBindAny} set to false. It seems to me that calling it 
without the {FBUtilities} call would be the correct way of calling it.

> SSTableLoader always uses hostname to stream files from
> ---
>
> Key: CASSANDRA-13639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13639
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jan Karlsson
>Assignee: Jan Karlsson
> Fix For: 4.x
>
> Attachments: 13639-trunk
>
>
> I stumbled upon an issue where SSTableLoader was ignoring our routing by 
> using the wrong interface to send the SSTables to the other nodes. Looking at 
> the code, it seems that we are using FBUtilities.getLocalAddress() to fetch 
> out the hostname, even if the yaml file specifies a different host. I am not 
> sure why we call this function instead of using the routing by leaving it 
> blank, perhaps someone could enlighten me.
> This behaviour comes from the fact that we use a default created 
> DatabaseDescriptor which does not set the values for listenAddress and 
> listenInterface. This causes the aforementioned function to retrieve the 
> hostname at all times, even if it is not the interface used in the yaml file.
> I propose we break out the function that handles listenAddress and 
> listenInterface and call it so that listenAddress or listenInterface is 
> getting populated in the DatabaseDescriptor.






[jira] [Updated] (CASSANDRA-13482) NPE on non-existing row read when row cache is enabled

2017-07-19 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13482:
---
Fix Version/s: (was: 3.11.x)
   (was: 3.0.x)
   3.11.1
   3.0.15

> NPE on non-existing row read when row cache is enabled
> --
>
> Key: CASSANDRA-13482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13482
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> The problem is reproducible on 3.0 with:
> {code}
> -# row_cache_class_name: org.apache.cassandra.cache.OHCProvider
> +row_cache_class_name: org.apache.cassandra.cache.OHCProvider
> -row_cache_size_in_mb: 0
> +row_cache_size_in_mb: 100
> {code}
> Table setup:
> {code}
> CREATE TABLE cache_tables (pk int, v1 int, v2 int, v3 int, primary key (pk, 
> v1)) WITH CACHING = { 'keys': 'ALL', 'rows_per_partition': '1' } ;
> {code}
> No data is required, only a head query (or any pk/ck query but with full 
> partitions cached). 
> {code}
> select * from cross_page_queries where pk = 1 ;
> {code}
> {code}
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators.concat(UnfilteredRowIterators.java:193)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.getThroughCache(SinglePartitionReadCommand.java:461)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:358)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:395) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1794)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2472)
>  ~[main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}






[jira] [Commented] (CASSANDRA-12173) Materialized View may turn on TRACING

2017-07-19 Thread Hiroshi Usami (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092752#comment-16092752
 ] 

Hiroshi Usami commented on CASSANDRA-12173:
---

I checked the conversation log of July 2016 again to see if someone had turned 
on TRACING, but I couldn't find any trace of that kind of operation.
And I agree that closing this would be reasonable if we cannot make any more 
progress from here...

> Materialized View may turn on TRACING
> -
>
> Key: CASSANDRA-12173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12173
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Hiroshi Usami
>
> We observed this in our test cluster (C* 3.0.6), but TRACING was apparently 
> OFF.
> After creating the Materialized View, the Write count jumped up to 20K from 
> 5K, and the ViewWrite count rose to 10K.
> This is supposed to be done by the MV, but some nodes which had 14,000+ 
> SSTables in the system_traces directory went down within half a day because 
> they ran out of file descriptors.
> {code}
> Counting by: find /var/lib/cassandra/data/system_traces/ -name "*-Data.db"|wc 
> -l
>   node01: 0
>   node02: 3
>   node03: 1
>   node04: 0
>   node05: 0
>   node06: 0
>   node07: 2
>   node08: 0
>   node09: 0
>   node10: 0
>   node11: 2
>   node12: 2
>   node13: 1
>   node14: 7
>   node15: 1
>   node16: 5
>   node17: 0
>   node18: 0
>   node19: 0
>   node20: 0
>   node21: 1
>   node22: 0
>   node23: 2
>   node24: 14420
>   node25: 0
>   node26: 2
>   node27: 0
>   node28: 1
>   node29: 1
>   node30: 2
>   node31: 1
>   node32: 0
>   node33: 0
>   node34: 0
>   node35: 14371
>   node36: 0
>   node37: 1
>   node38: 0
>   node39: 0
>   node40: 1
> {code}
> In node24, the sstabledump of the oldest SSTable in system_traces/events 
> directory starts with:
> {code}
> [
>   {
> "partition" : {
>   "key" : [ "e07851d0-4421-11e6-abd7-59d7f275ba79" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 30,
> "clustering" : [ "e07878e0-4421-11e6-abd7-59d7f275ba79" ],
> "liveness_info" : { "tstamp" : "2016-07-07T09:04:57.197Z", "ttl" : 
> 86400, "expires_at" : "2016-07-08T09:04:57Z", "expired" : true },
> "cells" : [
>   { "name" : "activity", "value" : "Parsing CREATE MATERIALIZED VIEW
> ...
> {code}
> So this could be the begining of TRACING ON implicitly. In node35, the oldest 
> one also starts with the "Parsing CREATE MATERIALIZED VIEW".






[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092704#comment-16092704
 ] 

Jeff Jirsa commented on CASSANDRA-13696:


cc [~iamaleksey] as well.


> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> {noformat}
> WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - 
> Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - 
> table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
> ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 
> HintsDispatchExecutor.java:234 - Failed to dispatch hints file 
> a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch 
> exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) 
> ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229)
>  [main/:na]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208)
>  [main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> Caused by: java.io.IOException: Digest mismatch exception
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216)
>  ~[main/:na]
> at 
> org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190)
>  ~[main/:na]
> ... 16 common frames omitted
> {noformat}
> It causes multiple Cassandra nodes to stop [by 
> default|https://github.com/apache/cassandra/blob/cassandra-3.0/conf/cassandra.yaml#L188].
> Here are the reproduce steps on a 3-node cluster, RF=3:
> 1. stop node1
> 2. send some data with QUORUM (or ONE); it will generate hints files on 
> node2/node3
> 3. drop the table
> 4. start node1
> node2/node3 will report "corrupted hints file" and stop. The impact is very 
> bad for a large cluster: when it happens, almost all the nodes are down at 
> the same time and we have to remove all the hints files (which contain the 
> dropped table) to bring the nodes back.






[jira] [Updated] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13696:
---
Priority: Blocker  (was: Critical)

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>






[jira] [Updated] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13696:
---
Fix Version/s: 4.x
   3.11.x
   3.0.x

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>






[jira] [Comment Edited] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092658#comment-16092658
 ] 

Jay Zhuang edited comment on CASSANDRA-13696 at 7/19/17 6:32 AM:
-

Thanks [~jjirsa].
I did more investigation today. It seems more serious than I thought: even 
with no down node, "drop table" plus write traffic will trigger the problem.
Here are the reproduce steps:
1. Create a 3 nodes cluster:
  {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk; if 
you use another yaml file, set RF=3)
  {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
native cql3 -node 127.0.0.1}}
3. While the traffic is running, drop the table
  {{$ cqlsh -e "drop table  stresscql.blogposts"}}
*All 3 nodes go down because of "Digest mismatch Exception".*

The CRC calculation problem has been there for a long time, but only got 
exposed after CASSANDRA-13004 because of the MessagingService version bump. In 
the normal case when the versions are the same, HintsDispatcher uses 
{{[page.buffersIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L138]}}
 instead of 
{{[page.hintsIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L139]}}.
 {{buffersIterator()}} doesn't need to decode hints, so it won't have the 
problem.

I think the messagingVersion for the hints file should be updated: 
https://github.com/apache/cassandra/compare/cassandra-3.0...cooldoger:13696.2-3.0?expand=1
 so it could dispatch hints in an optimized way. Not sure if we need to 
check/bump other {{MessagingService.VERSION_30}}s in the 3.0 branch.
cc [~ifesdjeen]


was (Author: jay.zhuang):
Thanks [~jjirsa].
I did more investigation today. It seems more serious than I thought: even 
with no down node, "drop table" while there's write traffic will trigger 
the problem.
Here are the reproduce steps:
1. Create a 3 nodes cluster:
  {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk; if 
you use another yaml file, set RF=3)
  {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
native cql3 -node 127.0.0.1}}
3. While the traffic is running, drop the table
  {{$ cqlsh -e "drop table  stresscql.blogposts"}}
*All 3 nodes go down because of "Digest mismatch Exception".*

The CRC calculation problem has been there for a long time, but only got 
exposed after CASSANDRA-13004 because of the MessagingService version bump. In 
the normal case when the versions are the same, HintsDispatcher uses 
{{[page.buffersIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L138]}}
 instead of 
{{[page.hintsIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L139]}}.
 {{buffersIterator()}} doesn't need to decode hints, so it won't have the 
problem.

I think the messagingVersion for the hints file should be updated: 
https://github.com/apache/cassandra/compare/cassandra-3.0...cooldoger:13696.2-3.0?expand=1
 so it could dispatch hints in an optimized way. Not sure if we need to 
check/bump other {{MessagingService.VERSION_30}}s in the 3.0 branch.
cc [~ifesdjeen]

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Critical
>

[jira] [Commented] (CASSANDRA-13696) Digest mismatch Exception if hints file has UnknownColumnFamily

2017-07-19 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092658#comment-16092658
 ] 

Jay Zhuang commented on CASSANDRA-13696:


Thanks [~jjirsa].
I did more investigation today. It seems more serious than I thought: even 
with no down node, "drop table" while there's write traffic will trigger 
the problem.
Here are the reproduce steps:
1. Create a 3 nodes cluster:
  {{$ ccm create test13696 -v 3.0.14 && ccm populate -n 3 && ccm start}}
2. Send some traffic with cassandra-stress (blogpost.yaml is only in trunk; if 
you use another yaml file, set RF=3)
  {{$ tools/bin/cassandra-stress user profile=test/resources/blogpost.yaml 
cl=QUORUM truncate=never ops\(insert=1\) duration=30m -rate threads=2 -mode 
native cql3 -node 127.0.0.1}}
3. While the traffic is running, drop the table
  {{$ cqlsh -e "drop table  stresscql.blogposts"}}
*All 3 nodes go down because of "Digest mismatch Exception".*

The CRC calculation problem has been there for a long time, but only got 
exposed after CASSANDRA-13004 because of the MessagingService version bump. In 
the normal case when the versions are the same, HintsDispatcher uses 
{{[page.buffersIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L138]}}
 instead of 
{{[page.hintsIterator()|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsDispatcher.java#L139]}}.
 {{buffersIterator()}} doesn't need to decode hints, so it won't have the 
problem.

I think the messagingVersion for the hints file should be updated: 
https://github.com/apache/cassandra/compare/cassandra-3.0...cooldoger:13696.2-3.0?expand=1
 so it could dispatch hints in an optimized way. Not sure if we need to 
check/bump other {{MessagingService.VERSION_30}}s in the 3.0 branch.
cc [~ifesdjeen]
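The failure mode described above (a decoder that bails out mid-payload on an unknown table id, so the digest is computed over only part of the record) can be illustrated with a minimal, self-contained sketch. The framing below ([length][payload][crc32]) and all names are hypothetical, chosen only for illustration; this is not Cassandra's actual hints file format or the HintsReader code:

```python
import io
import struct
import zlib


def write_record(buf, payload):
    # Hypothetical frame layout: [4-byte length][payload][4-byte crc32(payload)].
    buf.write(struct.pack(">I", len(payload)))
    buf.write(payload)
    buf.write(struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF))


def read_record(buf, consume=None):
    """Read one framed record and verify its CRC.

    `consume` simulates a decoder that stops early (e.g. after hitting an
    unknown table id) and feeds only a prefix of the payload into the digest.
    """
    (length,) = struct.unpack(">I", buf.read(4))
    payload = buf.read(length)
    (stored_crc,) = struct.unpack(">I", buf.read(4))
    digested = payload if consume is None else payload[:consume]
    if zlib.crc32(digested) & 0xFFFFFFFF != stored_crc:
        raise IOError("Digest mismatch exception")
    return payload


buf = io.BytesIO()
write_record(buf, b"hint-for-known-table")

# Full decode: the digest covers the whole payload and matches.
buf.seek(0)
assert read_record(buf) == b"hint-for-known-table"

# Partial decode: the digest covers a truncated payload and mismatches,
# mirroring the "Digest mismatch exception" reported above.
buf.seek(0)
try:
    read_record(buf, consume=4)
except IOError as exc:
    print("dispatch failed:", exc)
```

This is consistent with the observation above that {{buffersIterator()}} avoids the problem: when messaging versions match, raw buffers are shipped without decoding hints, so no partial-decode path can disturb the digest check.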

> Digest mismatch Exception if hints file has UnknownColumnFamily
> ---
>
> Key: CASSANDRA-13696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Critical
>