[jira] [Commented] (CASSANDRA-13871) cassandra-stress user command misbehaves when retrying operations
[ https://issues.apache.org/jira/browse/CASSANDRA-13871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640549#comment-16640549 ] Andy Tolbert commented on CASSANDRA-13871:
--
I recently ran into this and was confounded as to why stress had started misbehaving so oddly. I had {{-errors ignore}} specified, so the exception wasn't surfaced at all, but after removing that option and updating the code to include the full cause trace, I realized I was encountering this issue. It looks like the patch is no longer valid because of recent changes; I will attach a follow-on patch sometime this weekend.

> cassandra-stress user command misbehaves when retrying operations
> --
>
> Key: CASSANDRA-13871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13871
> Project: Cassandra
> Issue Type: Bug
> Components: Stress
> Reporter: Daniel Cranford
> Priority: Minor
> Attachments: 0001-Fixing-cassandra-stress-user-operations-retry.patch
>
> o.a.c.stress.Operation will retry queries a configurable number of times. When the "user" command is invoked, the o.a.c.stress.operations.userdefined SchemaInsert and SchemaQuery operations are used.
> When SchemaInsert and SchemaQuery are retried (e.g. after a Read/WriteTimeout exception), they advance the PartitionIterator used to generate the keys to insert/query (SchemaInsert.java:85, SchemaQuery.java:129). This means each retry will use a different set of keys.
> The predefined set of operations avoids this problem by packaging the arguments to bind to the query into the RunOp object, so that retrying the operation results in exactly the same query with the same arguments being run.
> This problem was introduced by CASSANDRA-7964. Prior to CASSANDRA-7964, the PartitionIterator (Partition.RowIterator before the change) was reinitialized prior to each query retry, thus generating the same set of keys each time.
> This problem is reported rather confusingly.
> The only error that shows up in a log file (specified with -log file=foo.log) is the unhelpful
> {noformat}
> java.io.IOException Operation x10 on key(s) [foobarkey]: Error executing: (NoSuchElementException)
> at org.apache.cassandra.stress.Operation.error(Operation.java:136)
> at org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:114)
> at org.apache.cassandra.stress.userdefined.SchemaQuery.run(SchemaQuery.java:158)
> at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:459)
> {noformat}
> Standard error is only slightly more helpful, displaying the ignorable initial read/write error and confusing java.util.NoSuchElementException lines (caused by PartitionIterator exhaustion), followed by the above IOException with stack trace, e.g.
> {noformat}
> com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.util.NoSuchElementException
> java.io.IOException Operation x10 on key(s) [foobarkey]: Error executing: (NoSuchElementException)
> at org.apache.cassandra.stress.Operation.error(Operation.java:136)
> at org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:114)
> at org.apache.cassandra.stress.userdefined.SchemaQuery.run(SchemaQuery.java:158)
> at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:459)
> {noformat}
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
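The failure mode described above can be sketched in isolation (hypothetical names; this is not the actual SchemaQuery code): pulling fresh values from the key iterator inside the retry loop makes every attempt a different query, and exhausts the iterator, while binding the values once before the loop, as the predefined operations do via their RunOp objects, makes retries idempotent.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the bug: the key source is an iterator, and the
// retry loop re-reads it, so each attempt targets different keys.
public class RetryDemo {
    static String runWithRetryBroken(Iterator<String> keys, int attempts) {
        String lastKey = null;
        for (int i = 0; i < attempts; i++) {
            lastKey = keys.next();   // BUG: advances the iterator per retry
            // execute(lastKey) would run here and may time out
        }
        return lastKey;              // a different key on every attempt
    }

    static String runWithRetryFixed(Iterator<String> keys, int attempts) {
        String key = keys.next();    // bind the arguments once, outside the loop
        for (int i = 0; i < attempts; i++) {
            // execute(key): every retry reruns the identical query
        }
        return key;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("k1", "k2", "k3");
        System.out.println(runWithRetryBroken(keys.iterator(), 3)); // k3
        System.out.println(runWithRetryFixed(keys.iterator(), 3));  // k1
        // With more retries than remaining keys, the broken version throws
        // NoSuchElementException, matching the log spam in the description.
    }
}
```

This also explains the repeated java.util.NoSuchElementException lines on stderr: once retries exhaust the iterator, every further call to next() throws.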
[jira] [Commented] (CASSANDRA-14495) Memory Leak /High Memory usage post 3.11.2 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640544#comment-16640544 ] Chris Lohfink commented on CASSANDRA-14495:
---
> heap memory usage bumps up

This is how the JVM works: objects created sit on the heap and build up until a GC. Heap usage going up is expected, normal behavior.

> Memory Leak /High Memory usage post 3.11.2 upgrade
> --
>
> Key: CASSANDRA-14495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14495
> Project: Cassandra
> Issue Type: Bug
> Components: Metrics
> Reporter: Abdul Patel
> Priority: Major
> Attachments: cas_heap.txt
>
> Hi All,
> I recently upgraded my non-prod Cassandra cluster (4 nodes, single DC) from 3.10 to 3.11.2.
> No issues reported, apart from nodetool info reporting 80% usage.
> I initially had 16GB memory on each node; later I bumped it up to 20GB and rebooted all nodes.
> Waited for a week, and now I have again seen memory usage of more than 80%, 16GB+.
> This means some memory leak is happening over time.
> Has anyone faced such an issue, or do we have any workaround? My 3.11.2 upgrade rollout has been halted because of this bug.
> ===
> ID                     : 65b64f5a-7fe6-4036-94c8-8da9c57718cc
> Gossip active          : true
> Thrift active          : true
> Native Transport active: true
> Load                   : 985.24 MiB
> Generation No          : 1526923117
> Uptime (seconds)       : 1097684
> Heap Memory (MB)       : 16875.64 / 20480.00
> Off Heap Memory (MB)   : 20.42
> Data Center            : DC7
> Rack                   : rac1
> Exceptions             : 0
> Key Cache              : entries 3569, size 421.44 KiB, capacity 100 MiB, 7931933 hits, 8098632 requests, 0.979 recent hit rate, 14400 save period in seconds
> Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
> Chunk Cache            : entries 2361, size 147.56 MiB, capacity 3.97 GiB, 2412803 misses, 72594047 requests, 0.967 recent hit rate, NaN microseconds miss latency
> Percent Repaired       : 99.88086234106282%
> Token                  : (invoke with -T/--tokens to see all 256 tokens)
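The point about normal heap growth can be illustrated with a small sketch using only java.lang.Runtime (exact numbers depend on the JVM and collector in use, so this is a hedged demonstration, not a diagnostic): strongly-referenced allocations raise reported heap usage without any leak being present.

```java
// Sketch: live, strongly-referenced allocations raise reported heap usage;
// that alone does not indicate a leak. Numbers vary by JVM and collector.
public class HeapGrowthDemo {
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.gc(); // best-effort baseline before measuring
        long before = usedMb();
        byte[][] live = new byte[50][];
        for (int i = 0; i < live.length; i++) {
            live[i] = new byte[1024 * 1024]; // hold ~50 MB strongly referenced
        }
        long after = usedMb();
        System.out.println("grew by ~" + (after - before) + " MB");
        // A leak is only implied if usage stays high AFTER a full GC with no
        // live references -- exactly what a heap dump would confirm.
        if (after - before < 10) throw new AssertionError("unexpected GC interference");
        System.out.println(live.length); // keep 'live' reachable to the end
    }
}
```

In other words, "nodetool info shows 16 GB used of a 20 GB heap" is consistent with a healthy heap between collections; a heap dump after a full GC is what would distinguish a real leak.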
[jira] [Commented] (CASSANDRA-14495) Memory Leak /High Memory usage post 3.11.2 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640541#comment-16640541 ] Abdul Patel commented on CASSANDRA-14495:
-
I have seen the same pattern in 3.11.3 as well: it works for 2-3 weeks, then suddenly heap memory usage bumps up, and then I get alerts every hour. The only new thing is that I am also installing Cassandra Reaper with the new patch, but even with Reaper down I see the same behavior. Do we just bump up the max heap, or is it a bug?

> Memory Leak /High Memory usage post 3.11.2 upgrade
> --
[jira] [Resolved] (CASSANDRA-14804) Running repair on multiple nodes in parallel could halt entire repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston resolved CASSANDRA-14804.
-
Resolution: Fixed

No problem, glad you got it figured out.

> Running repair on multiple nodes in parallel could halt entire repair
> --
>
> Key: CASSANDRA-14804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14804
> Project: Cassandra
> Issue Type: Bug
> Components: Repair
> Reporter: Jaydeepkumar Chovatia
> Priority: Major
> Fix For: 3.0.18
>
> Possible deadlock if we run repair on multiple nodes at the same time. We have come across a situation in production in which, if we repair multiple nodes at the same time, repair hangs forever. Here are the details:
> Time t1
> {{node-1}} has issued a repair command to {{node-2}}, but for some reason {{node-2}} didn't receive the request, hence {{node-1}} is waiting at [prepareForRepair|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/ActiveRepairService.java#L333] for 1 hour *with the lock*.
> Time t2
> {{node-2}} sent a prepare repair request to {{node-1}}; some exception occurred on {{node-1}} and it is trying to clean up the parent session [here|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/repair/RepairMessageVerbHandler.java#L172], but {{node-1}} cannot get the lock, as the 1 hour has not yet elapsed (above).
> Snippet of jstack on {{node-1}}:
> {quote}"Thread-888" #262588 daemon prio=5 os_prio=0 waiting on condition
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for (a java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at org.apache.cassandra.service.ActiveRepairService.prepareForRepair(ActiveRepairService.java:332)
> - locked <> (a org.apache.cassandra.service.ActiveRepairService)
> at org.apache.cassandra.repair.RepairRunnable.runMayThrow(RepairRunnable.java:214)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$9/864248990.run(Unknown Source)
> at java.lang.Thread.run(Thread.java:748)
> "AntiEntropyStage:1" #1789 daemon prio=5 os_prio=0 waiting for monitor entry []
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:421)
> - waiting to lock <> (a org.apache.cassandra.service.ActiveRepairService)
> at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172)
> at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$9/864248990.run(Unknown Source)
> at java.lang.Thread.run(Thread.java:748){quote}
> Time t3:
> {{node-2}} (and possibly other nodes, {{node-3}}…) sent a [prepare request|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/ActiveRepairService.java#L333] to {{node-1}}, but {{node-1}}'s AntiEntropyStage thread is busy waiting for the lock at {{ActiveRepairService.removeParentRepairSession}}, hence {{node-2}}, {{node-3}} (and possibly other nodes) will also go into the 1-hour wait *with the lock*. This rolling effect continues and stalls repair across the entire ring.
> If we totally stop triggering repair then the system would recover slowly, but here are the two major problems with this:
> 1. Externally there is no way to decide whether to trigger a new repair or wait for the system to recover
> 2. In this case the system recovers eventually
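The two stacks quoted above can be reproduced with a minimal model (hypothetical class and method names, not the real ActiveRepairService): a synchronized method that awaits a latch while holding the object monitor blocks the synchronized cleanup path that would have released it, until the timeout expires. Per the discussion on this ticket, upstream's prepareForRepair awaits the latch without the monitor (CASSANDRA-13849).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical minimal model of the hang: prepare() holds the object monitor
// for the whole wait, so the cleanup path blocks on the same monitor.
class RepairServiceModel {
    private final CountDownLatch prepareLatch = new CountDownLatch(1);

    // Modeling the reporter's branch where this method was synchronized
    // (fixed by CASSANDRA-13849, which awaits the latch without the monitor).
    synchronized boolean prepare(long timeoutMs) throws InterruptedException {
        return prepareLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    synchronized void removeParentSession() {
        prepareLatch.countDown(); // unreachable while prepare() holds the monitor
    }
}

public class RepairHangDemo {
    public static void main(String[] args) throws Exception {
        RepairServiceModel svc = new RepairServiceModel();
        Thread repair = new Thread(() -> {
            try { svc.prepare(500); } catch (InterruptedException ignored) {}
        });
        repair.start();
        Thread.sleep(100); // let prepare() take the monitor and start waiting
        Thread cleanup = new Thread(svc::removeParentSession);
        cleanup.start();
        Thread.sleep(100);
        // cleanup is BLOCKED entering the synchronized method, mirroring the
        // AntiEntropyStage stack; it proceeds only after prepare() times out.
        System.out.println(cleanup.getState());
        repair.join();
        cleanup.join();
        System.out.println("cleanup completed only after timeout");
    }
}
```

Run as-is, this prints the cleanup thread in the BLOCKED state while the prepare thread sits in TIMED_WAITING on the latch, the same pair of states visible in the jstack snippet.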
[jira] [Commented] (CASSANDRA-14776) Transient Replication: Hints on timeout should be disabled for writes to transient nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640424#comment-16640424 ] Benedict commented on CASSANDRA-14776:
--
My only issue here is that, if we're depending on hints anyway, why don't we just rely on them for the non-transient node? It feels like once we've lost the ability to meet any of our guarantees *and* promptness, we may as well avoid polluting the nodes. Hints delivered after a repair, while a node is being brought back online, for instance, will only cause unnecessary read-repairs until the next repair, despite the data being consistently replicated everywhere. I agree it's a bit of a grey area, though.

> Transient Replication: Hints on timeout should be disabled for writes to transient nodes
> --
>
> Key: CASSANDRA-14776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14776
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination
> Reporter: Benedict
> Priority: Minor
> Fix For: 4.0
[jira] [Commented] (CASSANDRA-14761) Rename speculative_write_threshold to something more appropriate
[ https://issues.apache.org/jira/browse/CASSANDRA-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640423#comment-16640423 ] Benedict commented on CASSANDRA-14761:
--
I thought we had agreed on {{transient_write_threshold}}, although for the *counters* (which we also need to rename) {{transient_write}} is ambiguous, as we are really counting only those triggered by our threshold. In that case, maybe simply {{transient_threshold}} for the percentile, and {{transient_threshold_writes}} for the counts? We can later have a straightforward {{transient_writes}} to include all those triggered by the failure detector, perhaps.

> Rename speculative_write_threshold to something more appropriate
> --
>
> Key: CASSANDRA-14761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14761
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Ariel Weisberg
> Priority: Major
> Fix For: 4.0
>
> It's not really speculative. This commit is where it was last named, and shows what to update:
> https://github.com/aweisberg/cassandra/commit/e1df8e977d942a1b0da7c2a7554149c781d0e6c3
[jira] [Commented] (CASSANDRA-14804) Running repair on multiple nodes in parallel could halt entire repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640334#comment-16640334 ] Jaydeepkumar Chovatia commented on CASSANDRA-14804:
---
In our branch we still have {{prepareForRepair}} *{{synchronized}}*; it was fixed in CASSANDRA-13849, which we missed backporting. Let me backport CASSANDRA-13849 to our branch, and then hopefully this will fix the issue. Thanks a lot [~bdeggleston] for your help!

> Running repair on multiple nodes in parallel could halt entire repair
> --
[jira] [Commented] (CASSANDRA-14804) Running repair on multiple nodes in parallel could halt entire repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640295#comment-16640295 ] Blake Eggleston commented on CASSANDRA-14804:
-
[~chovatia.jayd...@gmail.com] I’m not sure how we'd get to the state in t2. We wait for an hour on a semaphore we instantiate in {{prepareForRepair}}, and {{removeParentRepairSession}} is synchronized on the object monitor. One shouldn’t block the other. I think the jstack in the description is missing the thread where the {{ActiveRepairService}} monitor is being held.

> Running repair on multiple nodes in parallel could halt entire repair
> --
[jira] [Commented] (CASSANDRA-14776) Transient Replication: Hints on timeout should be disabled for writes to transient nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640274#comment-16640274 ] Ariel Weisberg commented on CASSANDRA-14776:
-
I don't think this is totally true if we are trying to achieve EACH_QUORUM in every data center. There are cases where, yes, we already achieved it and hinting is not useful, but there are also cases where we achieved local quorum but not each quorum, and we might like transient replicas to be hinted. Hints are a pretty effective mechanism for bringing remote DCs up to date without running repair, and they work when repair can't run, such as when nodes are down. I feel like if we went to the trouble of attempting a write to a transient replica, it's OK for us to then hint it?

> Transient Replication: Hints on timeout should be disabled for writes to transient nodes
> --
[jira] [Commented] (CASSANDRA-14761) Rename speculative_write_threshold to something more appropriate
[ https://issues.apache.org/jira/browse/CASSANDRA-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640271#comment-16640271 ] Ariel Weisberg commented on CASSANDRA-14761:
-
[~benedict] What are we going to change this to? I want to get this done so I can reference it in other material.

> Rename speculative_write_threshold to something more appropriate
> --
[jira] [Updated] (CASSANDRA-14807) Avoid querying “self” through messaging service when collecting full data during read repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14807:
-
Description:
Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
|[patch|https://github.com/apache/cassandra/pull/278]|[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]|[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|

was: Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.

> Avoid querying “self” through messaging service when collecting full data during read repair
> --
>
> Key: CASSANDRA-14807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14807
> Project: Cassandra
> Issue Type: Bug
> Reporter: Alex Petrov
> Assignee: Alex Petrov
> Priority: Major
>
> Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
> |[patch|https://github.com/apache/cassandra/pull/278]|[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]|[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|
[jira] [Created] (CASSANDRA-14807) Avoid querying “self” through messaging service when collecting full data during read repair
Alex Petrov created CASSANDRA-14807:
---
Summary: Avoid querying “self” through messaging service when collecting full data during read repair
Key: CASSANDRA-14807
URL: https://issues.apache.org/jira/browse/CASSANDRA-14807
Project: Cassandra
Issue Type: Bug
Reporter: Alex Petrov
Assignee: Alex Petrov

Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
[jira] [Updated] (CASSANDRA-14807) Avoid querying “self” through messaging service when collecting full data during read repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14807:
-
Description:
Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
||[patch|https://github.com/apache/cassandra/pull/278]||[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]||
|[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|

was:
Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
||[patch|https://github.com/apache/cassandra/pull/278]||[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]||
[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|

> Avoid querying “self” through messaging service when collecting full data during read repair
> --
>
> Key: CASSANDRA-14807
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14807
> Project: Cassandra
> Issue Type: Bug
> Reporter: Alex Petrov
> Assignee: Alex Petrov
> Priority: Major
>
> Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally.
> ||[patch|https://github.com/apache/cassandra/pull/278]||[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]||
> |[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|
[jira] [Updated] (CASSANDRA-14807) Avoid querying “self” through messaging service when collecting full data during read repair
[ https://issues.apache.org/jira/browse/CASSANDRA-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14807: Description: Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally. ||[patch|https://github.com/apache/cassandra/pull/278]||[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]|| [utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]| was: Currently, when collecting full requests during read-repair, we go through the messaging service instead of executing the query locally. |[patch|https://github.com/apache/cassandra/pull/278]|[dtest-patch|https://github.com/apache/cassandra-dtest/pull/39]|[utest|https://circleci.com/gh/ifesdjeen/cassandra/641]|[dtest-vnode|https://circleci.com/gh/ifesdjeen/cassandra/640]|[dtest-novnode|https://circleci.com/gh/ifesdjeen/cassandra/639]|
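The change described above boils down to a per-replica routing decision when full-data reads are dispatched. A toy sketch of that decision (class, method, and stage names here are illustrative only, not Cassandra's actual internals):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the dispatch decision: when the full-data replica is
// the local node, execute the read locally instead of round-tripping
// the request to ourselves through the messaging service.
public class ReadDispatch
{
    static final String LOCAL = "127.0.0.1";

    /** Map each full-data replica to the path its read request takes. */
    static Map<String, String> routeReads(List<String> fullDataReplicas)
    {
        Map<String, String> routes = new LinkedHashMap<>();
        for (String replica : fullDataReplicas)
            routes.put(replica, replica.equals(LOCAL) ? "local-read-stage"
                                                      : "messaging-service");
        return routes;
    }

    public static void main(String[] args)
    {
        System.out.println(routeReads(List.of("127.0.0.1", "10.0.0.2")));
    }
}
```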
[jira] [Commented] (CASSANDRA-14373) Allow using custom script for chronicle queue BinLog archival
[ https://issues.apache.org/jira/browse/CASSANDRA-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639945#comment-16639945 ] Ariel Weisberg commented on CASSANDRA-14373: +1 One interesting thing to note is that the retry mechanism is going to reorder the things it is archiving when it supplies them to the archive script. It's probably fine, but I wonder if people can tell when files are missing? Like are they sequentially numbered? > Allow using custom script for chronicle queue BinLog archival > - > > Key: CASSANDRA-14373 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14373 > Project: Cassandra > Issue Type: Improvement >Reporter: Stefan Podkowinski >Assignee: Pramod K Sivaraju >Priority: Major > Labels: lhf, pull-request-available > Fix For: 4.x > > Time Spent: 10m > Remaining Estimate: 0h > > It would be nice to allow the user to configure an archival script that will > be executed in {{BinLog.onReleased(cycle, file)}} for every deleted bin log, > just as we do in {{CommitLogArchiver}}. The script should be able to copy the > released file to an external location or do whatever the author had in mind. > Deleting the log file should be delegated to the script as well. > See CASSANDRA-13983, CASSANDRA-12151 for use cases.
[jira] [Commented] (CASSANDRA-14373) Allow using custom script for chronicle queue BinLog archival
[ https://issues.apache.org/jira/browse/CASSANDRA-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639846#comment-16639846 ] Marcus Eriksson commented on CASSANDRA-14373: - thanks for the review, pushed a couple of commits to address the comments bq. Removing documentation of defaults doesn't seem like a pure win since they still seem to be there? I removed the defaults from the nodetool command to be able to override options that are set in cassandra.yaml, do you have a suggestion how to do it in a better way? We don't know the actual set defaults in the nodetool command, the user would have to check out cassandra.yaml which might not be very user friendly. bq. This isn't just a path right it's a format specified of sorts with %path? right, changed to 'command' instead bq. This is BinLogOptions but the comments reference Audit log and there is a typo in the first sentence here fixed bq. Depending on what the archiving script does and why it failed there could be unfortunate consequences to retrying repeatedly. added a new configuration param that allows users to set max retries to 0 to avoid this (defaults to 10 retries as retrying forever might also be bad) bq. exec forks, forking can be slow because of page table copying which in the past was slow under things like Xen.. I'm just mentioning it. I don't think you need to make it better right now. I don't know offhand how you invoke an external command more efficiently from Java. yeah not sure what to do here, a quick search tells me ProcessBuilder seems to be the way to do this bq. Is this going to enable it for all tests? Is that a good idea can we only enable it for just the unit tests that require it? yeah, removed, should not be there bq. Should use execute instead of submit unless consuming the result future fixed bq. Same here here we actually wait for the future bq. The dtests are good tests but could they be unit tests since they are single node. 
In general I agree, but in this case it executes the nodetool command as an end user would, against a running cassandra cluster (well, node, but anyway). I suppose we could stand up a real cluster in a unit test and execute the nodetool script as well, but I assume that would take about as long as doing it using ccm.
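As a rough illustration of the pieces discussed in this review — a %path-style command template, ProcessBuilder for invoking the external command, and a bounded retry count where 0 disables retries — here is a hypothetical sketch (the class and method names are made up for illustration, not the actual patch):

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not the committed implementation.
public class BinLogArchiver
{
    /** Expand an archive command template, substituting %path with the released file. */
    static List<String> buildCommand(String template, Path releasedFile)
    {
        List<String> cmd = new ArrayList<>();
        for (String token : template.trim().split("\\s+"))
            cmd.add(token.replace("%path", releasedFile.toString()));
        return cmd;
    }

    /**
     * Run the archive command via ProcessBuilder, retrying at most maxRetries
     * times; maxRetries = 0 means a single attempt with no retries.
     */
    static boolean archive(String template, Path releasedFile, int maxRetries)
    {
        List<String> cmd = buildCommand(template, releasedFile);
        for (int attempt = 0; attempt <= maxRetries; attempt++)
        {
            try
            {
                Process p = new ProcessBuilder(cmd).inheritIO().start();
                if (p.waitFor() == 0)
                    return true;
            }
            catch (IOException | InterruptedException e)
            {
                // fall through and retry, or give up once attempts are exhausted
            }
        }
        return false;
    }

    public static void main(String[] args)
    {
        System.out.println(buildCommand("/usr/local/bin/archive-audit %path",
                                        Paths.get("/tmp/example.cq4")));
    }
}
```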
[jira] [Comment Edited] (CASSANDRA-14713) Update docker image used for testing
[ https://issues.apache.org/jira/browse/CASSANDRA-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623200#comment-16623200 ] Stefan Podkowinski edited comment on CASSANDRA-14713 at 10/5/18 1:04 PM: - Dtest results from b.a.o when run with "spod/cassandra-testing-ubuntu18-java11" image: ||Branch||Results|| |2.1|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=650!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/650/]| |2.2|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=657!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/657/]| |3.0|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=648!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/648/]| |3.11|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=651!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/651/]| |trunk|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=646!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/646/]| |Unit Tests|[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14713.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14713]| was (Author: spo...@gmail.com): Dtest results from b.a.o when run with "spod/cassandra-testing-ubuntu18-java11" image: ||Branch||Results|| |2.1|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=650!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/650/]| |2.2|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=654!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/654/]| 
|3.0|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=648!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/648/]| |3.11|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=651!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/651/]| |trunk|[!https://builds.apache.org/buildStatus/icon?job=Cassandra-devbranch-dtest&build=646!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/646/]| |Unit Tests|[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14713.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14713]| > Update docker image used for testing > > > Key: CASSANDRA-14713 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14713 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Major > Attachments: Dockerfile > > > Tests executed on builds.apache.org ({{docker/jenkins/jenkinscommand.sh}}) > and circleCI ({{.circleci/config.yml}}) will currently use the same > [cassandra-test|https://hub.docker.com/r/kjellman/cassandra-test/] docker > image ([github|https://github.com/mkjellman/cassandra-test-docker]) by > [~mkjellman]. > We should manage this image on our own as part of cassandra-builds, to keep > it updated. There's also an [Apache > user|https://hub.docker.com/u/apache/?page=1] on docker hub for publishing > images.
[jira] [Resolved] (CASSANDRA-14805) Fails on running Cassandra server
[ https://issues.apache.org/jira/browse/CASSANDRA-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie resolved CASSANDRA-14805. - Resolution: Invalid Please reach out to the community on #cassandra on freenode or via the [user mailing lists|http://cassandra.apache.org/community/]. This Jira is for tracking development of the database. > Fails on running Cassandra server > -- > > Key: CASSANDRA-14805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14805 > Project: Cassandra > Issue Type: Bug >Reporter: Ravi Gangwar >Priority: Critical > > Full Product Version : > Fails on java version. It's Java "1.8.0_181" > Os Version : Ubuntu 16.04 LTS > EXTRA RELEVANT SYSTEM CONFIGURATION : > Just installed Cassandra 2.2.11 > A DESCRIPTION OF THE PROBLEM : > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGBUS (0x7) at pc=0x7f42fc492e70, pid=12128, tid=0x7f42fc3c9700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build > 1.8.0_181-b13) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # C [liblz4-java8098863625230398555.so+0x5e70] LZ4_decompress_fast+0xd0 > # > # Failed to write core dump. Core dumps have been disabled. To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # The crash happened outside the Java Virtual Machine in native code. > # See problematic frame for where to report the bug. 
> # > --- T H R E A D --- > Current thread (0x006d0800): JavaThread "CompactionExecutor:7" daemon > [_thread_in_native, id=5278, stack(0x7f42fc389000,0x7f42fc3ca000)] > siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x7f2026b8b000 > [register values and top-of-stack hex dump omitted; the columns were garbled in transit and carry no recoverable information] > > SYSTEM CONFIGURATION : > > 1. CPU - Intel Core i7-4771 @ 3.50 GHz x 8 > 2. RAM - 16 GB > 3. STORAGE - 967.6 GB > 4. OS - Ubuntu 16.04 LTS > 5. Apache Cassandra - 2.2.11 > 6. CPP Driver - 2.2.1-1 > 7. libuv - 1.4.2-1 > 8. 
Java version - "1.8.0_181" > Java (TM) SE Runtime Environment (build 1.8.0_181-b13) > Java Hotspot (TM) 64-Bit Server VM (build 25.181-b13, mixed mode) > > Cassandra was working normally before generating this bug. Now when I am > trying to restart my server I am getting this bug.
[jira] [Created] (CASSANDRA-14806) CircleCI workflow improvements and Java 11 support
Stefan Podkowinski created CASSANDRA-14806: -- Summary: CircleCI workflow improvements and Java 11 support Key: CASSANDRA-14806 URL: https://issues.apache.org/jira/browse/CASSANDRA-14806 Project: Cassandra Issue Type: Improvement Components: Build, Testing Reporter: Stefan Podkowinski Assignee: Stefan Podkowinski The current CircleCI config could use some cleanup and improvements. First of all, the config has been made more modular by using the new CircleCI 2.1 executor and command elements. Based on CASSANDRA-14713, there's now also a Java 11 executor that allows running tests under Java 11. The {{build}} step will be done using Java 11 in all cases, so we can catch any regressions there, and the dtests will also exercise the Java 11 multi-jar artifact that we'd create during the release process. The job workflow has also been changed to make use of the [manual job approval|https://circleci.com/docs/2.0/workflows/#holding-a-workflow-for-a-manual-approval] feature, which allows running dtest jobs only on request instead of automatically with every commit. The Java 8 unit tests still run automatically, but that could easily be changed if needed. See this [example workflow|https://circleci.com/workflow-run/08ecb879-9aaa-4d75-84d6-b00dc9628425], where the start_ jobs act as gates that need manual approval before the actual jobs run. Previously there was some churn from manually editing the config for paid and non-paid resource tiers. This has now been mostly mitigated by using project settings to override the lower defaults (see below) and by scheduling dtests on request; since dtests will only run on paid accounts anyway, we can use the high settings for them right away. The only remaining issue is how we might dynamically adjust the {{resource_class}} and {{parallelism}} settings for unit tests. 
So at this point, the CircleCI config will work for both paid and non-paid accounts by default, but paid accounts will see slower unit test results, as only medium instances are used (i.e. 15 min instead of 4 min). Attention CircleCI paid account users: you'll have to add "{{CCM_MAX_HEAP_SIZE: 2048M}}" and "{{CCM_HEAP_NEWSIZE: 512M}}" to your project's environment settings or create a context to override the lower defaults for free instances!
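For reference, the manual-approval gating described above looks roughly like this in a CircleCI 2.1 workflow (the job names here are illustrative, not the actual config; `type: approval` is the real CircleCI mechanism):

```yaml
version: 2.1
workflows:
  java8_tests:
    jobs:
      - build                 # built with Java 11 in all cases
      - j8_unit_tests:        # unit tests still run automatically
          requires: [build]
      - start_j8_dtests:      # approval gate shown in the CircleCI UI
          type: approval
          requires: [build]
      - j8_dtests:            # runs only after start_j8_dtests is approved
          requires: [start_j8_dtests]
```

Until a `start_` job is approved in the UI, the downstream dtest job simply holds, which is what makes on-request dtest runs possible without editing the config per commit.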