[jira] [Updated] (CASSANDRA-13933) Handle mutateRepaired failure in nodetool verify
[ https://issues.apache.org/jira/browse/CASSANDRA-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-13933:
    Reviewer: Marcus Eriksson

> Handle mutateRepaired failure in nodetool verify
>
> Key: CASSANDRA-13933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13933
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Sumanth Pasupuleti
> Priority: Major
> Labels: lhf
> Attachments: CASSANDRA-13933-3.0.txt, CASSANDRA-13933-3.11.txt, CASSANDRA-13933-trunk.txt
>
> See comment here:
> https://issues.apache.org/jira/browse/CASSANDRA-13922?focusedCommentId=16189875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16189875

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13816) Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
[ https://issues.apache.org/jira/browse/CASSANDRA-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354942#comment-16354942 ]

Ariel Weisberg edited comment on CASSANDRA-13816 at 2/7/18 4:12 AM:

The part where this isn't LHF is that I wasn't able to verify the difference in behavior with a unit test. I know the behavior is not as expected just from some comparisons done with some other data I can't share.

That's the reason I never made progress on this ticket. The increased difficulty dropped it down my priority list relative to its perceived value. I would work with someone new on it though, so it wouldn't be insurmountable.

was (Author: aweisberg):
The part where this isn't LHF is that I wasn't able to verify the difference in behavior with a unit test. I know the behavior is not as expected just from some comparisons done with some other data I can't share.

That's the reason I never made progress on this ticket. The increased difficulty dropped it down my priority list relative to its perceived value.

> Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
>
> Key: CASSANDRA-13816
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13816
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website, Testing
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Priority: Minor
> Labels: lhf
> Fix For: 3.11.x, 4.x
>
> Add/verify unit tests for inserting and deleting the same key for cell deletion, partition deletion, and range tombstone deletion for both CAS and non-CAS batches.
> Don't change the existing behavior.
> The behavior differs between batch and CAS, so in both the CAS and batch documentation mention that the behavior is not consistent. Make sure it is visible in both high level and reference docs for that functionality.
[jira] [Commented] (CASSANDRA-13816) Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
[ https://issues.apache.org/jira/browse/CASSANDRA-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354942#comment-16354942 ]

Ariel Weisberg commented on CASSANDRA-13816:

The part where this isn't LHF is that I wasn't able to verify the difference in behavior with a unit test. I know the behavior is not as expected just from some comparisons done with some other data I can't share.

That's the reason I never made progress on this ticket. The increased difficulty dropped it down my priority list relative to its perceived value.

> Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
>
> Key: CASSANDRA-13816
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13816
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website, Testing
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Priority: Minor
> Labels: lhf
> Fix For: 3.11.x, 4.x
>
> Add/verify unit tests for inserting and deleting the same key for cell deletion, partition deletion, and range tombstone deletion for both CAS and non-CAS batches.
> Don't change the existing behavior.
> The behavior differs between batch and CAS, so in both the CAS and batch documentation mention that the behavior is not consistent. Make sure it is visible in both high level and reference docs for that functionality.
[jira] [Updated] (CASSANDRA-12245) initial view build can be parallel
[ https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown updated CASSANDRA-12245:
    Fix Version/s: (was: 4.x)
                   4.0

> initial view build can be parallel
>
> Key: CASSANDRA-12245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12245
> Project: Cassandra
> Issue Type: Improvement
> Components: Materialized Views
> Reporter: Tom van der Woerdt
> Assignee: Andrés de la Peña
> Priority: Major
> Fix For: 4.0
>
> On a node with lots of data (~3TB) building a materialized view takes several weeks, which is not ideal. It's doing this in a single thread.
> There are several potential ways this can be optimized:
> * Do vnodes in parallel, instead of going through the entire range in one thread.
> * Just iterate through sstables, not worrying about duplicates, and include the timestamp of the original write in the MV mutation. Since this doesn't exclude duplicates it does increase the amount of work and could temporarily surface ghost rows (yikes) but I guess that's why they call it eventual consistency. Doing it this way can avoid holding references to all tables on disk, allows parallelization, and removes the need to check other sstables for existing data. This is essentially the 'do a full repair' path.
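The first option in the CASSANDRA-12245 description (processing ranges in parallel instead of walking the entire range in one thread) can be illustrated with a minimal sketch. Everything below is hypothetical: `ParallelViewBuildSketch`, `split`, `buildInParallel` and the `buildRange` callback are illustration names, not Cassandra APIs, tokens are modeled as plain `long` values, and the sketch assumes `end - start` fits in a `long`. It is not the approach taken in any attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongBinaryOperator;

// Hypothetical sketch only: none of these names exist in Cassandra.
final class ParallelViewBuildSketch {

    /** Splits [start, end) into at most `parts` contiguous, non-overlapping sub-ranges. */
    static List<long[]> split(long start, long end, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long span = end - start;          // assumed not to overflow
        long step = span / parts;
        long rem = span % parts;
        long lo = start;
        for (int i = 0; i < parts; i++) {
            // The first `rem` slices get one extra token so the whole span is covered.
            long hi = lo + step + (i < rem ? 1 : 0);
            if (lo < hi) {
                ranges.add(new long[] {lo, hi});
            }
            lo = hi;
        }
        return ranges;
    }

    /**
     * Builds every slice on its own worker thread and returns the sum of the
     * per-slice results (for example, the number of rows written to the view).
     */
    static long buildInParallel(long start, long end, int threads, LongBinaryOperator buildRange) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] r : split(start, end, threads)) {
                Callable<Long> task = () -> buildRange.applyAsLong(r[0], r[1]);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();         // waits for each slice to finish
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException("view build slice failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass a callback that builds the view for one slice, for example `ParallelViewBuildSketch.buildInParallel(0L, 1000L, 4, (lo, hi) -> hi - lo)`, where the lambda stands in for the real per-range work.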
[jira] [Assigned] (CASSANDRA-14210) Optimize SSTables upgrade task scheduling
[ https://issues.apache.org/jira/browse/CASSANDRA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Greaves reassigned CASSANDRA-14210:
    Assignee: Kurt Greaves

> Optimize SSTables upgrade task scheduling
>
> Key: CASSANDRA-14210
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14210
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Oleksandr Shulgin
> Assignee: Kurt Greaves
> Priority: Major
>
> When starting the SSTable-rewrite process by running {{nodetool upgradesstables --jobs N}}, with N > 1, not all of the provided N slots are used.
> For example, we were testing with {{concurrent_compactors=5}} and {{N=4}}. What we observed, both for version 2.2 and 3.0, is that initially all 4 provided slots are used for "Upgrade sstables" compactions, but later, when some of the 4 tasks are finished, no new tasks are scheduled immediately. Only after the last of the 4 tasks finishes are 4 new tasks scheduled. This happens on every node we've observed.
> This doesn't utilize available resources to the full extent allowed by the --jobs N parameter. In the field, on a cluster of 12 nodes with 4-5 TiB data each, we've seen that the whole process was taking more than 7 days, instead of the estimated 1.5-2 days (provided there would be close to full N slots utilization).
> Instead, new tasks should be scheduled as soon as there is a free compaction slot.
> Additionally, starting from the biggest SSTables could further reduce the total time required for the whole process to finish on any given node.
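The difference between the observed batch behavior and the proposed "schedule as soon as a slot is free" behavior in CASSANDRA-14210 can be made concrete with a small simulation. This is a sketch under stated assumptions, not Cassandra's compaction scheduler: `UpgradeSchedulingSketch` and its methods are hypothetical names, and task durations are supplied as arbitrary time units rather than measured.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical simulation; it does not use or model real Cassandra classes.
final class UpgradeSchedulingSketch {

    /**
     * Behavior described in the report: `slots` tasks start together and the
     * next group starts only after the slowest task of the group has finished.
     */
    static long batchedMakespan(long[] durations, int slots) {
        long total = 0;
        for (int i = 0; i < durations.length; i += slots) {
            long batchMax = 0;
            for (int j = i; j < Math.min(i + slots, durations.length); j++) {
                batchMax = Math.max(batchMax, durations[j]);
            }
            total += batchMax;
        }
        return total;
    }

    /** Proposed behavior: each task starts as soon as any slot becomes free. */
    static long greedyMakespan(long[] durations, int slots) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < slots; i++) {
            freeAt.add(0L);                       // every slot is free at time 0
        }
        long makespan = 0;
        for (long d : durations) {
            long finish = freeAt.poll() + d;      // take the earliest free slot
            freeAt.add(finish);
            makespan = Math.max(makespan, finish);
        }
        return makespan;
    }

    /** Returns a copy sorted in descending order (the "biggest SSTables first" idea). */
    static long[] largestFirst(long[] durations) {
        long[] sorted = durations.clone();
        Arrays.sort(sorted);
        for (int i = 0, j = sorted.length - 1; i < j; i++, j--) {
            long tmp = sorted[i];
            sorted[i] = sorted[j];
            sorted[j] = tmp;
        }
        return sorted;
    }
}
```

With the illustrative input `{8, 1, 1, 1, 8, 1, 1, 1}` and 4 slots, the batched strategy needs 16 time units, the greedy strategy 9, and greedy scheduling over the `largestFirst` ordering 8. The numbers are made up for the example and say nothing about real cluster timings.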
[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremiah Jordan updated CASSANDRA-13740:
    Status: Patch Available (was: Open)

> Orphan hint file gets created while node is being removed from cluster
>
> Key: CASSANDRA-13740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13740
> Project: Cassandra
> Issue Type: Bug
> Components: Core, Hints
> Reporter: Jaydeepkumar Chovatia
> Assignee: Jaydeepkumar Chovatia
> Priority: Minor
> Fix For: 3.0.x, 3.11.x
> Attachments: 13740-3.0.15.txt, gossip_hang_test.py
>
> I found this new issue during my testing: whenever a node is being removed, a hint file for that node gets written and stays inside the hint directory forever. I debugged the code and found that it is due to a race condition between [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195] and [HintsWriteExecutor.java::closeWriter | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106].
>
> *Time t1* Node is down; as a result, hints are being written by [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> *Time t2* Node is removed from the cluster; as a result it calls [HintsService.java-exciseStore | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327], which removes hint files for the node being removed
> *Time t3* Mutation stage keeps pumping hints through [HintsService.java::write | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145], which again calls [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215], and a new orphan file gets created
>
> I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308} and that helped me reproduce this new bug. I will submit a patch for this new dtest later.
>
> I also tried the following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints }} but it fails as the node is no longer part of the ring
> 2. I then tried {{nodetool truncatehints}}, which still doesn't remove the hint file because it is not yet included in the [dispatchDequeue | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53]
>
> Reproducible steps:
> Please find the dtest Python file {{gossip_hang_test.py}} attached, which reproduces this bug.
>
> Solution:
> This is due to the race condition mentioned above. Since {{HintsWriteExecutor.java}} creates a thread pool with only 1 worker, the solution is fairly simple. Whenever we [HintsService.java::excise | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303] a host, just store it in memory, and check for an already evicted host inside [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215]. If an already evicted host is found, then ignore the hints.
>
> Jaydeep
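The fix proposed in the CASSANDRA-13740 description (remember excised hosts in memory and have the flush path skip them) can be sketched as below. This is an illustration under assumptions, not the attached patch: `HintsWriterSketch` and its methods are hypothetical, hint files are modeled as in-memory lists of strings, and, following the description's note that the write executor has a single worker, both methods are assumed to be called from that one thread, so no locking is shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for the hint write path; not the real HintsWriteExecutor.
final class HintsWriterSketch {
    // Stand-in for on-disk hint files, keyed by target host id.
    private final Map<UUID, List<String>> hintsByHost = new HashMap<>();
    // Hosts that were removed from the cluster; hints for them must be dropped.
    private final Set<UUID> excisedHosts = new HashSet<>();

    /** Time t2 in the report: the host leaves the cluster and its hints are deleted. */
    void excise(UUID hostId) {
        excisedHosts.add(hostId);
        hintsByHost.remove(hostId);
    }

    /**
     * Times t1 and t3 in the report: a buffered hint is flushed. Returns false
     * and writes nothing when the host was already excised, so a late flush
     * cannot recreate an orphan file.
     */
    boolean flush(UUID hostId, String hint) {
        if (excisedHosts.contains(hostId)) {
            return false;
        }
        hintsByHost.computeIfAbsent(hostId, id -> new ArrayList<>()).add(hint);
        return true;
    }

    boolean hasHintsFor(UUID hostId) {
        return hintsByHost.containsKey(hostId);
    }
}
```

In this model a flush that arrives after `excise` is simply ignored. A real implementation would also need to decide when an entry can leave the excised set, which the description does not cover.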
[jira] [Updated] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)
[ https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-14212:
    Fix Version/s: 3.11.x
    Status: Patch Available (was: In Progress)

> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
> Issue Type: Improvement
> Reporter: mck
> Assignee: mck
> Priority: Major
> Fix For: 3.11.x
>
> Backport CASSANDRA-13080 to 3.11.x
>
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally relevant to users of Cassandra-3.11.1
[jira] [Comment Edited] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354724#comment-16354724 ]

Jeff Jirsa edited comment on CASSANDRA-14215 at 2/6/18 11:31 PM:

We don't know when the node is going to come back up, and we don't know what you as a user expect to happen. We guarantee we won't replay a hint past the time you specify, but as I explained, if your limit is N hours, and you're down for N + 1 hour, we can still replay N hours of data via hints, making you repair only 1 hour instead of N. For many users, this is preferable to replaying nothing.

There is a {{deleteAllHintsForEndpoint}} JMX target that would let you purge hints (manually), but it does perhaps seem like a missing feature that we don't more aggressively clean up hints that are expired. [~iamaleksey] any thoughts here?

Edit: reading your last comment, I may be wrong (it happens a lot, so maybe I shouldn't be surprised). Aleksey will know for sure.

was (Author: jjirsa):
We don't know when the node is going to come back up, and we don't know what you as a user expect to happen. We guarantee we won't replay a hint past the time you specify, but as I explained, if your limit is N hours, and you're down for N + 1 hour, we can still replay N hours of data via hints, making you repair only 1 hour instead of N. For many users, this is preferable to replaying nothing.

There is a {{deleteAllHintsForEndpoint}} JMX target that would let you purge hints (manually), but it does perhaps seem like a missing feature that we don't more aggressively clean up hints that are expired. [~iamaleksey] any thoughts here?

> Cassandra does not seem to be respecting max hint window
>
> Key: CASSANDRA-14215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14215
> Project: Cassandra
> Issue Type: Bug
> Components: Hints, Streaming and Messaging
> Reporter: Arijit
> Priority: Major
>
> On Cassandra 3.0.9, it was observed that Cassandra continues to write hints even though a node remains down (and does not come up) for longer than the default 3 hour window.
>
> After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the following log line in cassandra (debug) logs:
> StorageProxy.java:2625 - Adding hints for [/10.0.100.84]
>
> One possible code path seems to be:
> cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in StorageProxy.java)
>
> The "true" parameter above explicitly states that a hint should be recorded and ignores the time window calculation performed by the shouldHint method invoked in other code paths. Is there a reason for this behavior?
>
> Edit: There are actually two stacks that seem to be producing hints, the "cas" and "syncWriteBatchedMutations" methods. I have posted them below.
>
> A third issue seems to be that Cassandra seems to reset the timer which counts how long a node has been down after a restart. Thus if Cassandra is restarted on a good node, it continues to accumulate hints for a down node over the next three hours.
>
> WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603)
>     at org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540)
>     at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407)
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
>     at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>     at
[jira] [Commented] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354724#comment-16354724 ]

Jeff Jirsa commented on CASSANDRA-14215:

We don't know when the node is going to come back up, and we don't know what you as a user expect to happen. We guarantee we won't replay a hint past the time you specify, but as I explained, if your limit is N hours, and you're down for N + 1 hour, we can still replay N hours of data via hints, making you repair only 1 hour instead of N. For many users, this is preferable to replaying nothing.

There is a {{deleteAllHintsForEndpoint}} JMX target that would let you purge hints (manually), but it does perhaps seem like a missing feature that we don't more aggressively clean up hints that are expired. [~iamaleksey] any thoughts here?

> Cassandra does not seem to be respecting max hint window
>
> Key: CASSANDRA-14215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14215
> Project: Cassandra
> Issue Type: Bug
> Components: Hints, Streaming and Messaging
> Reporter: Arijit
> Priority: Major
>
> On Cassandra 3.0.9, it was observed that Cassandra continues to write hints even though a node remains down (and does not come up) for longer than the default 3 hour window.
>
> After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the following log line in cassandra (debug) logs:
> StorageProxy.java:2625 - Adding hints for [/10.0.100.84]
>
> One possible code path seems to be:
> cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in StorageProxy.java)
>
> The "true" parameter above explicitly states that a hint should be recorded and ignores the time window calculation performed by the shouldHint method invoked in other code paths. Is there a reason for this behavior?
>
> Edit: There are actually two stacks that seem to be producing hints, the "cas" and "syncWriteBatchedMutations" methods. I have posted them below.
>
> A third issue seems to be that Cassandra seems to reset the timer which counts how long a node has been down after a restart. Thus if Cassandra is restarted on a good node, it continues to accumulate hints for a down node over the next three hours.
>
> WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603)
>     at org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540)
>     at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407)
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
>     at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>     at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
>     at java.lang.Thread.run(Thread.java:748)
>
> WARN [SharedPool-Worker-8] 2018-02-06 22:15:51,153 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1247)
>     at
[jira] [Commented] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354723#comment-16354723 ]

Arijit commented on CASSANDRA-14215:

[~jjirsa] Is Hint.isLive a bit orthogonal to this, it seems to be dealing with gc grace period? Also hints *are not written* in some code paths once the node is observed to be down for more than the window. Also the documentation around hints explicitly says that hints will not *be generated* when a node is down for longer than the window.

Finally, check out the bit around the node down timer being reset on Cassandra restarts. This does not matter if we are ok with generating hints beyond the max hint window, but I really don't think that we should be...

> Cassandra does not seem to be respecting max hint window
>
> Key: CASSANDRA-14215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14215
> Project: Cassandra
> Issue Type: Bug
> Components: Hints, Streaming and Messaging
> Reporter: Arijit
> Priority: Major
>
> On Cassandra 3.0.9, it was observed that Cassandra continues to write hints even though a node remains down (and does not come up) for longer than the default 3 hour window.
>
> After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the following log line in cassandra (debug) logs:
> StorageProxy.java:2625 - Adding hints for [/10.0.100.84]
>
> One possible code path seems to be:
> cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in StorageProxy.java)
>
> The "true" parameter above explicitly states that a hint should be recorded and ignores the time window calculation performed by the shouldHint method invoked in other code paths. Is there a reason for this behavior?
>
> Edit: There are actually two stacks that seem to be producing hints, the "cas" and "syncWriteBatchedMutations" methods. I have posted them below.
>
> A third issue seems to be that Cassandra seems to reset the timer which counts how long a node has been down after a restart. Thus if Cassandra is restarted on a good node, it continues to accumulate hints for a down node over the next three hours.
>
> WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603)
>     at org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540)
>     at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407)
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
>     at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>     at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
>     at java.lang.Thread.run(Thread.java:748)
>
> WARN [SharedPool-Worker-8] 2018-02-06 22:15:51,153 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1247)
>     at org.apache.cassandra.service.StorageProxy.syncWriteBatchedMutations(StorageProxy.ja
[jira] [Commented] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354712#comment-16354712 ]

Arijit commented on CASSANDRA-14215:

[~jjirsa] Thanks for the response. What's the reason for writing hints beyond the hint window if there's no chance that they would be replayed? On our production cluster, this causes downtime as the Cassandra partition fills up quickly with hints when a node goes down.

> Cassandra does not seem to be respecting max hint window
>
> Key: CASSANDRA-14215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14215
> Project: Cassandra
> Issue Type: Bug
> Components: Hints, Streaming and Messaging
> Reporter: Arijit
> Priority: Major
>
> On Cassandra 3.0.9, it was observed that Cassandra continues to write hints even though a node remains down (and does not come up) for longer than the default 3 hour window.
>
> After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the following log line in cassandra (debug) logs:
> StorageProxy.java:2625 - Adding hints for [/10.0.100.84]
>
> One possible code path seems to be:
> cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in StorageProxy.java)
>
> The "true" parameter above explicitly states that a hint should be recorded and ignores the time window calculation performed by the shouldHint method invoked in other code paths. Is there a reason for this behavior?
>
> Edit: There are actually two stacks that seem to be producing hints, the "cas" and "syncWriteBatchedMutations" methods. I have posted them below.
>
> A third issue seems to be that Cassandra seems to reset the timer which counts how long a node has been down after a restart. Thus if Cassandra is restarted on a good node, it continues to accumulate hints for a down node over the next three hours.
>
> WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603)
>     at org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540)
>     at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432)
>     at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407)
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
>     at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
>     at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>     at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
>     at java.lang.Thread.run(Thread.java:748)
>
> WARN [SharedPool-Worker-8] 2018-02-06 22:15:51,153 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable:
>     at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
>     at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
>     at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1247)
>     at org.apache.cassandra.service.StorageProxy.syncWriteBatchedMutations(StorageProxy.java:1014)
>     at org.apache.cassandra.service.StorageProxy.mutateAtomically(StorageProxy.java:899)
>     at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:834)
>     at org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java
[jira] [Commented] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354710#comment-16354710 ] Jeff Jirsa commented on CASSANDRA-14215: Fairly sure this is working as intended; we should write hints beyond the hint window, we just won't replay hints beyond the hint window (so if a host is down for 4 hours, and hint lifetime is 3 hours, we'll lose the first hour of hints, but still replay the last 3). Check out {{Hint.isLive()}} for more info, and let me know if we can close this if you agree. > Cassandra does not seem to be respecting max hint window > > > Key: CASSANDRA-14215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14215 > Project: Cassandra > Issue Type: Bug > Components: Hints, Streaming and Messaging >Reporter: Arijit >Priority: Major > > On Cassandra 3.0.9, it was observed that Cassandra continues to write hints > even though a node remains down (and does not come up) for longer than the > default 3 hour window. > > After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the > following log line in cassandra (debug) logs: > StorageProxy.java:2625 - Adding hints for [/10.0.100.84] > > One possible code path seems to be: > cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in > StorageProxy.java) > > The "true" parameter above explicitly states that a hint should be recorded > and ignores the time window calculation performed by the shouldHint method > invoked in other code paths. Is there a reason for this behavior? 
> > Edit: There are actually two stacks that seem to be producing hints, the > "cas" and "syncWriteBatchedMutations" methods > WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - > Adding hints for [/10.0.100.84] with stack trace: java.lang.Throwable: at > org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608) > at > org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617) > at > org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603) > at > org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540) > at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282) at > org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432) > at > org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206) > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237) > at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115) > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > at > io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) > at > io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) at > java.lang.Thread.run(Thread.java:748) WARN > > > [SharedPool-Worker-8] 2018-02-06 22:15:51,153 StorageProxy.java:2636 - Adding > hints for [/10.0.100.84] with stack trace: java.lang.Throwable: at > org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608) > at > org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617) > at > org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1247) > at > org.apache.cassandra.service.StorageProxy.syncWriteBatchedMutations(StorageProxy.java:1014) > at > org.apache.cassandra.service.StorageProxy.mutateAtomically(StorageProxy.java:899) > at > org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:834) > at > org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:365) > at > org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:343) > at > org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:
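The comment above distinguishes writing a hint from replaying it: hints are still written past the window, but only hints younger than the window are replayed. A minimal sketch of that replay-time check follows, assuming a hint counts as live while its age is strictly below the window. The class and method names are hypothetical; this is not Cassandra's actual {{Hint.isLive()}} code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Cassandra's Hint class.
class HintWindowSketch
{
    // Assumption: a hint is replayable while its age is strictly less than the window.
    static boolean isLive(long createdAtMillis, long nowMillis, long windowMillis)
    {
        return nowMillis - createdAtMillis < windowMillis;
    }

    public static void main(String[] args)
    {
        long window = TimeUnit.HOURS.toMillis(3);
        long now = TimeUnit.HOURS.toMillis(4); // the node comes back after 4 hours down

        long hintFromFirstHour = TimeUnit.MINUTES.toMillis(30);
        long hintFromLastHour = TimeUnit.MINUTES.toMillis(210);

        // The hint written 30 minutes into the outage is 3.5 hours old: not replayed.
        System.out.println(isLive(hintFromFirstHour, now, window));
        // The hint written 3.5 hours into the outage is 30 minutes old: replayed.
        System.out.println(isLive(hintFromLastHour, now, window));
    }
}
```

Under this model, the "down for 4 hours, window of 3 hours" case from the comment loses only the hints from the first hour.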
[jira] [Updated] (CASSANDRA-14215) Cassandra does not seem to be respecting max hint window
[ https://issues.apache.org/jira/browse/CASSANDRA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arijit updated CASSANDRA-14215:
---
Description:
On Cassandra 3.0.9, it was observed that Cassandra continues to write hints even though a node remains down (and does not come up) for longer than the default 3 hour window.

After doing "nodetool setlogginglevel org.apache.cassandra TRACE", we see the following log line in cassandra (debug) logs:
StorageProxy.java:2625 - Adding hints for [/10.0.100.84]

One possible code path seems to be:
cas -> commitPaxos(proposal, consistencyForCommit, true); -> submitHint (in StorageProxy.java)

The "true" parameter above explicitly states that a hint should be recorded and ignores the time window calculation performed by the shouldHint method invoked in other code paths. Is there a reason for this behavior?

Edit: There are actually two stacks that seem to be producing hints, the "cas" and "syncWriteBatchedMutations" methods. I have posted them below.

A third issue seems to be that Cassandra seems to reset the timer which counts how long a node has been down after a restart. Thus if Cassandra is restarted on a good node, it continues to accumulate hints for a down node over the next three hours.

WARN [SharedPool-Worker-14] 2018-02-06 22:15:51,136 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace:
java.lang.Throwable:
    at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
    at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
    at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2603)
    at org.apache.cassandra.service.StorageProxy.commitPaxos(StorageProxy.java:540)
    at org.apache.cassandra.service.StorageProxy.cas(StorageProxy.java:282)
    at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithCondition(ModificationStatement.java:432)
    at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:407)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
    at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
    at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
    at java.lang.Thread.run(Thread.java:748)

WARN [SharedPool-Worker-8] 2018-02-06 22:15:51,153 StorageProxy.java:2636 - Adding hints for [/10.0.100.84] with stack trace:
java.lang.Throwable:
    at org.apache.cassandra.service.StorageProxy.stackTrace(StorageProxy.java:2608)
    at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:2617)
    at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:1247)
    at org.apache.cassandra.service.StorageProxy.syncWriteBatchedMutations(StorageProxy.java:1014)
    at org.apache.cassandra.service.StorageProxy.mutateAtomically(StorageProxy.java:899)
    at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:834)
    at org.apache.cassandra.cql3.statements.BatchStatement.executeWithoutConditions(BatchStatement.java:365)
    at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:343)
    at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:329)
    at org.apache.cassandra.cql3.statements.BatchStatement.execute(BatchStatement.java:324)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:237)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222)
    at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.
[jira] [Updated] (CASSANDRA-13816) Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
[ https://issues.apache.org/jira/browse/CASSANDRA-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13816:
---
Labels: lhf (was: )

> Document and test CAS and non-CAS batch behavior for deleting and inserting the same key
>
> Key: CASSANDRA-13816
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13816
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation and Website, Testing
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Priority: Minor
> Labels: lhf
> Fix For: 3.11.x, 4.x
>
> Add/verify unit tests for inserting and deleting the same key for cell deletion, partition deletion, and range tombstone deletion for both CAS and non-CAS batches.
> Don't change the existing behavior.
> The behavior differs between batch and CAS, so in both the CAS and batch documentation mention that the behavior is not consistent. Make sure it is visible in both high level and reference docs for that functionality.
[jira] [Resolved] (CASSANDRA-14174) Remove GossipDigestSynVerbHandler#doSort()
[ https://issues.apache.org/jira/browse/CASSANDRA-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown resolved CASSANDRA-14174.
-
Resolution: Fixed
Fix Version/s: (was: 4.x)
               4.0

Thanks for the review and confirmation, [~jkni]. Committed as sha {{f2fc2e96738505118d9ad161ec77d66a2369fe44}}.

> Remove GossipDigestSynVerbHandler#doSort()
> --
>
> Key: CASSANDRA-14174
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14174
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jason Brown
> Assignee: Jason Brown
> Priority: Minor
> Fix For: 4.0
>
> I have personally tripped up on this function a couple of times over the years, believing that it contributes to bugs in some way or another. While I have not found that (necessarily!) to be the case, I feel this function is completely useless in the grand scope of things.
> Going back through the mists of time (that is, {{git log}}), it appears this function was part of the original code drop from Facebook when they open sourced cassandra. Looking at the {{#doSort()}} method, all it does is sort the incoming list of {{GossipDigest}}s by the difference between the remote node's maxValue for a given peer and the local node's maxValue.
> The only universe where this is actually an optimization is if you go back and read the [Scuttlebutt paper|https://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf] (upon which cassandra's Gossip anti-entropy reconciliation is based). The end of section 3.2 describes ordering of the incoming digests such that, in the case where you do not return all of the differences (because you are optimizing for the return message size), you can gather the differences for the peers which are most out of sync. The ordering implemented in cassandra is the second ordering described in the paper, called "scuttle depth".
> As we always send all differences between two nodes (message size be damned), this optimization, borrowed from the paper, is largely irrelevant for Cassandra's purposes.
> Thus, I propose we remove this method for the following gains:
> - less garbage created
> - less CPU (sure, it's mostly trivial; see next point)
> - less time spent on unnecessary functionality on the *single threaded* gossip stage.
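For readers unfamiliar with the ordering discussed in this ticket, here is a simplified sketch of it: digests are ordered so that the peers with the largest version gap come first. The {{Digest}} class and the {{localMaxVersions}} map are hypothetical stand-ins, not Cassandra's {{GossipDigest}} or {{Gossiper}} API; an endpoint unknown locally is assumed to have local version 0.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the ordering; not Cassandra code.
class ScuttleDepthSketch
{
    static final class Digest
    {
        final String endpoint;
        final int maxVersion;

        Digest(String endpoint, int maxVersion)
        {
            this.endpoint = endpoint;
            this.maxVersion = maxVersion;
        }
    }

    // Returns a copy of the remote digests, largest |local - remote| version gap first.
    static List<Digest> sortByVersionGap(List<Digest> remote, Map<String, Integer> localMaxVersions)
    {
        List<Digest> sorted = new ArrayList<>(remote);
        sorted.sort(Comparator.comparingInt(
            (Digest d) -> Math.abs(localMaxVersions.getOrDefault(d.endpoint, 0) - d.maxVersion)).reversed());
        return sorted;
    }

    public static void main(String[] args)
    {
        Map<String, Integer> local = new HashMap<>();
        local.put("10.0.0.1", 10);
        local.put("10.0.0.2", 10);

        List<Digest> remote = new ArrayList<>();
        remote.add(new Digest("10.0.0.1", 12)); // gap 2
        remote.add(new Digest("10.0.0.2", 50)); // gap 40
        remote.add(new Digest("10.0.0.3", 5));  // gap 5, unknown locally

        for (Digest d : sortByVersionGap(remote, local))
            System.out.println(d.endpoint);
    }
}
```

The ordering only matters when the reply is cut off after the first few entries; if every difference is sent anyway, as the ticket notes, the sort changes nothing about the result.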
cassandra git commit: Remove GossipDigestSynVerbHandler#doSort()
Repository: cassandra
Updated Branches:
  refs/heads/trunk a8ce4b6cd -> f2fc2e967

Remove GossipDigestSynVerbHandler#doSort()

patch by jasobrown; reviewed by Joel Knighton for CASSANDRA-14174

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f2fc2e96
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f2fc2e96
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f2fc2e96

Branch: refs/heads/trunk
Commit: f2fc2e96738505118d9ad161ec77d66a2369fe44
Parents: a8ce4b6
Author: Jason Brown
Authored: Tue Feb 6 13:38:46 2018 -0800
Committer: Jason Brown
Committed: Tue Feb 6 13:38:46 2018 -0800

--
 CHANGES.txt                                     |  1 +
 .../gms/GossipDigestSynVerbHandler.java         | 47
 2 files changed, 1 insertion(+), 47 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2fc2e96/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a10b6eb..62775ce 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Remove GossipDigestSynVerbHandler#doSort() (CASSANDRA-14174)
  * Add nodetool clientlist (CASSANDRA-13665)
  * Revert ProtocolVersion changes from CASSANDRA-7544 (CASSANDRA-14211)
  * Non-disruptive seed node list reload (CASSANDRA-14190)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2fc2e96/src/java/org/apache/cassandra/gms/GossipDigestSynVerbHandler.java
--
diff --git a/src/java/org/apache/cassandra/gms/GossipDigestSynVerbHandler.java b/src/java/org/apache/cassandra/gms/GossipDigestSynVerbHandler.java
index 9619f4e..b06c24d 100644
--- a/src/java/org/apache/cassandra/gms/GossipDigestSynVerbHandler.java
+++ b/src/java/org/apache/cassandra/gms/GossipDigestSynVerbHandler.java
@@ -19,8 +19,6 @@ package org.apache.cassandra.gms;
 
 import java.util.*;
 
-import com.google.common.collect.Maps;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -99,8 +97,6 @@ public class GossipDigestSynVerbHandler implements IVerbHandler
             logger.trace("Gossip syn digests are : {}", sb);
         }
 
-        doSort(gDigestList);
-
         List deltaGossipDigestList = new ArrayList();
         Map deltaEpStateMap = new HashMap();
         Gossiper.instance.examineGossiper(gDigestList, deltaGossipDigestList, deltaEpStateMap);
@@ -112,47 +108,4 @@ public class GossipDigestSynVerbHandler implements IVerbHandler
         logger.trace("Sending a GossipDigestAckMessage to {}", from);
         MessagingService.instance().sendOneWay(gDigestAckMessage, from);
     }
-
-    /*
-     * First construct a map whose key is the endpoint in the GossipDigest and the value is the
-     * GossipDigest itself. Then build a list of version differences i.e difference between the
-     * version in the GossipDigest and the version in the local state for a given InetAddressAndPort.
-     * Sort this list. Now loop through the sorted list and retrieve the GossipDigest corresponding
-     * to the endpoint from the map that was initially constructed.
-    */
-    private void doSort(List gDigestList)
-    {
-        /* Construct a map of endpoint to GossipDigest. */
-        Map epToDigestMap = Maps.newHashMapWithExpectedSize(gDigestList.size());
-        for (GossipDigest gDigest : gDigestList)
-        {
-            epToDigestMap.put(gDigest.getEndpoint(), gDigest);
-        }
-
-        /*
-         * These digests have their maxVersion set to the difference of the version
-         * of the local EndpointState and the version found in the GossipDigest.
-        */
-        List diffDigests = new ArrayList(gDigestList.size());
-        for (GossipDigest gDigest : gDigestList)
-        {
-            InetAddressAndPort ep = gDigest.getEndpoint();
-            EndpointState epState = Gossiper.instance.getEndpointStateForEndpoint(ep);
-            int version = (epState != null) ? Gossiper.instance.getMaxEndpointStateVersion(epState) : 0;
-            int diffVersion = Math.abs(version - gDigest.getMaxVersion());
-            diffDigests.add(new GossipDigest(ep, gDigest.getGeneration(), diffVersion));
-        }
-
-        gDigestList.clear();
-        Collections.sort(diffDigests);
-        int size = diffDigests.size();
-        /*
-         * Report the digests in descending order. This takes care of the endpoints
-         * that are far behind w.r.t this local endpoint
-        */
-        for (int i = size - 1; i >= 0; --i)
-        {
-            gDigestList.add(epToDigestMap.get(diffDigests.get(i).getEndpoint()));
-        }
-    }
 }
[jira] [Commented] (CASSANDRA-13314) Config file based SSL settings
[ https://issues.apache.org/jira/browse/CASSANDRA-13314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354207#comment-16354207 ]

Ron Blechman commented on CASSANDRA-13314:
--
Adding the ability for Cassandra to use SSLContext.getDefault() and/or SSLFactory.getDefault() would be helpful here, particularly in cases where one would like to implement additional checking / custom implementations for certificate validation (e.g. certificate revocation, host name validation, etc.)

> Config file based SSL settings
> --
>
> Key: CASSANDRA-13314
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13314
> Project: Cassandra
> Issue Type: Improvement
> Components: Configuration, Tools
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 4.x
>
> As follow up of CASSANDRA-13259, I'd like to continue discussing how we can make SSL less awkward to use and further move SSL related code out of our code base. Currently we construct our own SSLContext in SSLFactory based on EncryptionOptions passed by the MessagingService or any individual tool where we need to offer SSL support. This leads to a situation where the user has not only to learn how to enable the correct settings in cassandra.yaml, but these settings must also be reflected in each tool's own command line options. As argued in CASSANDRA-13259, these settings could be done as well by setting the appropriate system and security properties ([overview|http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#InstallationAndCustomization]) and we should just point the user to the right files to do that (jvm.options and java.security) and make sure that daemon and all affected tools will source them.
> Since giving this a quick try on my WIP branch, I've noticed the following issues in doing so:
> * Keystore passwords will show up in process list (-Djavax.net.ssl.keyStorePassword=..). We should keep the password setting in cassandra.yaml and clis and do a System.setProperty() if they have been provided.
> * It's only possible to configure settings for a single default key-/truststore. Since we currently allow configuring both ServerEncryptionOptions and ClientEncryptionOptions with different settings, we'd have to make this a breaking change. I don't really see why you would want to use different stores for node-to-node and node-to-client, but that wouldn't be possible anymore.
> * This would probably only make sense if we really remove the affected CLI options, or we'll end up with just another way to configure this stuff. This will break existing scripts and obsolete existing documentation.
> Any opinions?
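The first bullet in the quoted description (keep the password in cassandra.yaml and call System.setProperty() when one is provided) could look roughly like the sketch below. The class and method names are hypothetical and not taken from any Cassandra patch; only the {{javax.net.ssl.keyStorePassword}} property name is the standard JSSE one.

```java
// Hypothetical helper; not part of Cassandra.
class KeystorePasswordSketch
{
    static final String KEYSTORE_PASSWORD_PROPERTY = "javax.net.ssl.keyStorePassword";

    // Sets the JSSE system property only when a password was configured.
    // Returns true if the property was set.
    static boolean applyIfConfigured(String configuredPassword)
    {
        if (configuredPassword == null || configuredPassword.isEmpty())
            return false;
        System.setProperty(KEYSTORE_PASSWORD_PROPERTY, configuredPassword);
        return true;
    }

    public static void main(String[] args)
    {
        // The value would come from the parsed configuration, not from a -D flag,
        // so it does not appear in the process list.
        boolean applied = applyIfConfigured("example-password");
        System.out.println("property set: " + applied);
    }
}
```

Setting the property in-process keeps the password off the command line, which is the concern raised in that bullet. It does not address the single default key-/truststore limitation from the second bullet.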
[jira] [Comment Edited] (CASSANDRA-14217) nodetool verify needs to use the correct digest file and reload sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-14217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353931#comment-16353931 ]

Marcus Eriksson edited comment on CASSANDRA-14217 at 2/6/18 2:28 PM:
-
3.0: https://github.com/krummas/cassandra/commits/marcuse/14217
3.11: https://github.com/krummas/cassandra/commits/marcuse/14217-3.11
trunk only needs the sstable metadata reload patch and that will go in CASSANDRA-14201

was (Author: krummas):
3.0: https://github.com/krummas/cassandra/commits/marcuse/14217
3.11: https://github.com/krummas/cassandra/commits/marcuse/14217
trunk only needs the sstable metadata reload patch and that will go in CASSANDRA-14201

> nodetool verify needs to use the correct digest file and reload sstable metadata
>
> Key: CASSANDRA-14217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14217
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Major
> Fix For: 3.0.x, 3.11.x
>
> {{nodetool verify}} tries to use the wrong digest file when verifying old version sstables and it also needs to reload the sstable metadata and notify compaction strategies when it mutates the repaired at field
[jira] [Updated] (CASSANDRA-14217) nodetool verify needs to use the correct digest file and reload sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-14217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-14217:
Fix Version/s: (was: 4.x)
Status: Patch Available (was: Open)

3.0: https://github.com/krummas/cassandra/commits/marcuse/14217
3.11: https://github.com/krummas/cassandra/commits/marcuse/14217
trunk only needs the sstable metadata reload patch and that will go in CASSANDRA-14201

> nodetool verify needs to use the correct digest file and reload sstable metadata
>
> Key: CASSANDRA-14217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14217
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Major
> Fix For: 3.0.x, 3.11.x
>
> {{nodetool verify}} tries to use the wrong digest file when verifying old version sstables and it also needs to reload the sstable metadata and notify compaction strategies when it mutates the repaired at field
[jira] [Created] (CASSANDRA-14217) nodetool verify needs to use the correct digest file and reload sstable metadata
Marcus Eriksson created CASSANDRA-14217:
---
Summary: nodetool verify needs to use the correct digest file and reload sstable metadata
Key: CASSANDRA-14217
URL: https://issues.apache.org/jira/browse/CASSANDRA-14217
Project: Cassandra
Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
Fix For: 3.0.x, 3.11.x, 4.x

{{nodetool verify}} tries to use the wrong digest file when verifying old version sstables and it also needs to reload the sstable metadata and notify compaction strategies when it mutates the repaired at field
[jira] [Commented] (CASSANDRA-14216) node map does not handle InetAddressAndPort correctly.
[ https://issues.apache.org/jira/browse/CASSANDRA-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353760#comment-16353760 ]

Jason Brown commented on CASSANDRA-14216:
-
/cc [~aweisberg]

> node map does not handle InetAddressAndPort correctly.
> --
>
> Key: CASSANDRA-14216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14216
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Dave Brosius
> Assignee: Dave Brosius
> Priority: Trivial
> Fix For: 4.0
> Attachments: 14216.txt
>
> Collection of node information in nodeMap does not use the correct types for accessing data. Since these maps are keyed by Strings, they are not metatype-safe, and so I can't be certain what data was meant to be in them. I'm assuming it was meant that host and port information should be used, but perhaps it's just host.
>
> I have created a patch assuming it's host and port info.
[jira] [Commented] (CASSANDRA-14173) JDK 8u161 breaks JMX integration
[ https://issues.apache.org/jira/browse/CASSANDRA-14173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353682#comment-16353682 ]

Thomas Steinmaurer commented on CASSANDRA-14173:

[~beobal]: locally built Cassandra 3.11 from source including the fix and deployed in our loadtest environment. Starts up fine now with 8u162. Thanks a lot!

> JDK 8u161 breaks JMX integration
>
> Key: CASSANDRA-14173
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14173
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Priority: Critical
> Fix For: 3.11.2, 4.0
>
> {{org.apache.cassandra.utils.JMXServerUtils}} which is used to programmatically configure the JMX server and RMI registry (CASSANDRA-2967, CASSANDRA-10091) depends on some JDK internal classes/interfaces. A change to one of these, introduced in Oracle JDK 1.8.0_162 is incompatible, which means we cannot build using that JDK version. Upgrading the JVM on a node running 3.6+ will result in Cassandra being unable to start.
> {noformat}
> ERROR [main] 2018-01-18 07:33:18,804 CassandraDaemon.java:706 - Exception encountered during startup
> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
> {noformat}
> This is also a problem for CASSANDRA-9608, as the internals are completely re-organised in JDK9, so a more stable solution that can be applied to both JDK8 & JDK9 is required.
[jira] [Commented] (CASSANDRA-14201) Add a few options to nodetool verify
[ https://issues.apache.org/jira/browse/CASSANDRA-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353666#comment-16353666 ]

Marcus Eriksson commented on CASSANDRA-14201:

Realised we shouldn't change the API for this in 3.0/3.11, so [here|https://github.com/krummas/cassandra/commits/marcuse/14201-trunk] is a new trunk-only patch which does the following:
* add option to check sstable versions
* add option to not invoke the disk failure policy when corrupt sstables are found; this is true by default in this patch, since it can stop/kill the node if the disk_failure_policy is configured to do that
* add option to not mutate repair status; this is true by default, since it could be dangerous moving repaired sstables to unrepaired, see CASSANDRA-9947
* add option to verify that all tokens in the sstable are owned by the node

Will open a new issue with the sstable metadata/digest fixes for 3.0/3.11.

> Add a few options to nodetool verify
>
> Key: CASSANDRA-14201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14201
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Major
> Fix For: 4.x
>
> {{nodetool verify}} currently invokes the disk failure policy when it finds a corrupt sstable - we should add an option to avoid that. It should also have an option to check if all sstables are the latest version to be able to run {{nodetool verify}} as a pre-upgrade check
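For readers following along, the four options listed in the CASSANDRA-14201 comment can be pictured as a small immutable options object. This is only a rough sketch: the class name `VerifyOptions`, the field names, and the builder methods are all hypothetical and are not taken from the linked patch. The two defaults the comment states (do not invoke the disk failure policy, do not mutate repair status) are encoded as written; the defaults for the version check and the token ownership check are not stated in the comment and are assumed to be off here.

```java
// Hypothetical sketch of the verify options described in the comment.
// None of these names come from the actual patch; they are illustrative only.
final class VerifyOptions {
    final boolean checkVersion;            // check that sstables are on the latest version
    final boolean invokeDiskFailurePolicy; // act on disk_failure_policy when corruption is found
    final boolean mutateRepairStatus;      // move corrupt repaired sstables to unrepaired
    final boolean checkOwnsTokens;         // verify all tokens in the sstable are owned by the node

    private VerifyOptions(Builder b) {
        this.checkVersion = b.checkVersion;
        this.invokeDiskFailurePolicy = b.invokeDiskFailurePolicy;
        this.mutateRepairStatus = b.mutateRepairStatus;
        this.checkOwnsTokens = b.checkOwnsTokens;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Stated in the comment: by default, do NOT invoke the disk failure policy
        // and do NOT mutate repair status.
        private boolean invokeDiskFailurePolicy = false;
        private boolean mutateRepairStatus = false;
        // Assumption: the comment gives no default for these two, so they start off.
        private boolean checkVersion = false;
        private boolean checkOwnsTokens = false;

        Builder checkVersion(boolean v) { this.checkVersion = v; return this; }
        Builder invokeDiskFailurePolicy(boolean v) { this.invokeDiskFailurePolicy = v; return this; }
        Builder mutateRepairStatus(boolean v) { this.mutateRepairStatus = v; return this; }
        Builder checkOwnsTokens(boolean v) { this.checkOwnsTokens = v; return this; }

        VerifyOptions build() { return new VerifyOptions(this); }
    }

    @Override
    public String toString() {
        return "VerifyOptions{checkVersion=" + checkVersion
             + ", invokeDiskFailurePolicy=" + invokeDiskFailurePolicy
             + ", mutateRepairStatus=" + mutateRepairStatus
             + ", checkOwnsTokens=" + checkOwnsTokens + "}";
    }

    public static void main(String[] args) {
        // Conservative defaults, as described in the comment.
        System.out.println(VerifyOptions.builder().build());
        // A pre-upgrade style check that also looks at sstable versions.
        System.out.println(VerifyOptions.builder().checkVersion(true).build());
    }
}
```

Keeping the two potentially destructive behaviours opt-in mirrors the reasoning in the comment: invoking the disk failure policy can stop or kill the node, and moving repaired sstables to unrepaired could be dangerous (see CASSANDRA-9947).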
[jira] [Updated] (CASSANDRA-14201) Add a few options to nodetool verify
[ https://issues.apache.org/jira/browse/CASSANDRA-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-14201:

Fix Version/s: (was: 3.11.x)
               (was: 3.0.x)

> Add a few options to nodetool verify
>
> Key: CASSANDRA-14201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14201
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Major
> Fix For: 4.x
>
> {{nodetool verify}} currently invokes the disk failure policy when it finds a corrupt sstable - we should add an option to avoid that. It should also have an option to check if all sstables are the latest version to be able to run {{nodetool verify}} as a pre-upgrade check