[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194696#comment-15194696 ] Ruoran Wang edited comment on CASSANDRA-9935 at 3/15/16 3:51 AM:
-
[~pauloricardomg] Here is the recent error message, followed by the sstables involved and their metadata.
{noformat}
ERROR [ValidationExecutor:8] 2016-03-15 03:19:25,473 Validator.java:245 - Failed creating a merkle tree for [repair #b82c4cf0-ea5c-11e5-8b54-71e192c0496a on KEYSPACE/COLUM_FAMILY, (8825693858844788422,8825705737822637605]], /10.57.198.67 (see log for details)
ERROR [ValidationExecutor:8] 2016-03-15 03:19:25,474 CassandraDaemon.java:229 - Exception in thread Thread[ValidationExecutor:8,1,main]
java.lang.AssertionError: row DecoratedKey(8825694477039867191, 000403b708015363e13ed200) received out of order wrt DecoratedKey(8825705587125016582, 0004004208015363141ed900)
    at org.apache.cassandra.repair.Validator.add(Validator.java:126) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1051) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:89) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:662) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
{noformat}
{noformat}
nodetool getsstables --hex-format -- KEYSPACE COLUM_FAMILY 000403b708015363e13ed200
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389-Data.db
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225-Data.db
{noformat}
{noformat}
nodetool getsstables --hex-format -- KEYSPACE COLUM_FAMILY 0004004208015363141ed900
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389-Data.db
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225-Data.db
{noformat}
{noformat}
SSTable: /var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 1457647152189000
Maximum timestamp: 1457683010045000
SSTable max local deletion time: 1458287810
Compression ratio: 0.2804368699432709
Estimated droppable tombstones: 0.1136631298580633
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1457685762291, position=384)
{noformat}
{noformat}
SSTable: /var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 1457647152172001
Maximum timestamp: 1458009746854000
SSTable max local deletion time: 1458614546
Compression ratio: 0.2809352366738701
Estimated droppable tombstones: 0.11049303066041988
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1457995474961, position=24034207)
{noformat}
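For context, the AssertionError above comes from a sanity check: rows fed into the merkle-tree builder during validation must arrive in strictly increasing key order. A minimal sketch of that invariant, using hypothetical simplified types rather than the actual Cassandra `Validator`/`DecoratedKey` classes:

```java
// Simplified stand-in for a DecoratedKey: ordered by its (Murmur3) token.
class Key implements Comparable<Key> {
    final long token;
    Key(long token) { this.token = token; }
    public int compareTo(Key o) { return Long.compare(token, o.token); }
}

class ValidatorSketch {
    private Key last;

    // Mirrors the invariant Validator.add enforces: each key must sort
    // strictly after the previously added one, or validation fails.
    void add(Key key) {
        if (last != null && key.compareTo(last) <= 0)
            throw new AssertionError("row " + key.token + " received out of order wrt " + last.token);
        last = key;
    }

    public static void main(String[] args) {
        ValidatorSketch v = new ValidatorSketch();
        v.add(new Key(8825693900000000000L));
        v.add(new Key(8825705587125016582L));
        boolean failed = false;
        // A smaller token arriving after a larger one, as in the log above.
        try { v.add(new Key(8825694477039867191L)); }
        catch (AssertionError e) { failed = true; }
        System.out.println(failed ? "out-of-order detected" : "ok");
    }
}
```

Overlapping sstables whose keys interleave incorrectly (as with the two `ka-59225`/`ka-59389` files above) break exactly this ordering assumption.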
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194696#comment-15194696 ] Ruoran Wang commented on CASSANDRA-9935:
{noformat}
ERROR [ValidationExecutor:8] 2016-03-15 03:19:25,473 Validator.java:245 - Failed creating a merkle tree for [repair #b82c4cf0-ea5c-11e5-8b54-71e192c0496a on KEYSPACE/COLUM_FAMILY, (8825693858844788422,8825705737822637605]], /10.57.198.67 (see log for details)
ERROR [ValidationExecutor:8] 2016-03-15 03:19:25,474 CassandraDaemon.java:229 - Exception in thread Thread[ValidationExecutor:8,1,main]
java.lang.AssertionError: row DecoratedKey(8825694477039867191, 000403b708015363e13ed200) received out of order wrt DecoratedKey(8825705587125016582, 0004004208015363141ed900)
    at org.apache.cassandra.repair.Validator.add(Validator.java:126) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1051) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:89) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:662) ~[apache-cassandra-2.1.13.jar:2.1.13]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
{noformat}
{noformat}
getsstables --hex-format -- KEYSPACE COLUM_FAMILY 000403b708015363e13ed200
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389-Data.db
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225-Data.db
{noformat}
{noformat}
nodetool getsstables --hex-format -- KEYSPACE COLUM_FAMILY 0004004208015363141ed900
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389-Data.db
/var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225-Data.db
{noformat}
{noformat}
SSTable: /var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59225
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 1457647152189000
Maximum timestamp: 1457683010045000
SSTable max local deletion time: 1458287810
Compression ratio: 0.2804368699432709
Estimated droppable tombstones: 0.1136631298580633
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1457685762291, position=384)
{noformat}
{noformat}
SSTable: /var/lib/cassandra/data/KEYSPACE/COLUM_FAMILY-d0500b80d14a11e5a42361571269f00d/KEYSPACE-COLUM_FAMILY-ka-59389
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.10
Minimum timestamp: 1457647152172001
Maximum timestamp: 1458009746854000
SSTable max local deletion time: 1458614546
Compression ratio: 0.2809352366738701
Estimated droppable tombstones: 0.11049303066041988
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1457995474961, position=24034207)
{noformat}

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
> Reporter: mlowicki
> Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range (6084602890817326921,6088328703025510057] finished
>
[jira] [Created] (CASSANDRA-11352) Include units of metrics in the cassandra-stress tool
Rajath Subramanyam created CASSANDRA-11352:
--
Summary: Include units of metrics in the cassandra-stress tool
Key: CASSANDRA-11352
URL: https://issues.apache.org/jira/browse/CASSANDRA-11352
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Rajath Subramanyam
Priority: Minor
Fix For: 2.1.0

As an improvement, cassandra-stress could include units for the metrics in the Results section to make the tool more usable.

Results:
op rate : 14668 [READ:7334, WRITE:7334]
partition rate: 14668 [READ:7334, WRITE:7334]
row rate : 14668 [READ:7334, WRITE:7334]
latency mean : 0.7 [READ:0.7, WRITE:0.7]
latency median: 0.6 [READ:0.6, WRITE:0.6]
latency 95th percentile : 0.8 [READ:0.8, WRITE:0.8]
latency 99th percentile : 1.2 [READ:1.2, WRITE:1.2]
latency 99.9th percentile : 8.8 [READ:8.9, WRITE:9.0]
latency max : 448.7 [READ:162.3, WRITE:448.7]
Total partitions : 105612753 [READ:52805915, WRITE:52806838]
Total errors : 0 [READ:0, WRITE:0]
total gc count: 0
total gc mb : 0
total gc time (s) : 0
avg gc time(ms) : NaN
stdev gc time(ms) : 0
Total operation time : 02:00:00
END
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
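To illustrate the requested improvement, a small sketch of attaching units to stress metrics when printing. The metric names and unit strings here are assumptions for illustration, not actual cassandra-stress code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StressUnits {
    // Hypothetical metric -> unit mapping; the real tool would own this table.
    static final Map<String, String> UNITS = new LinkedHashMap<>();
    static {
        UNITS.put("op rate", "op/s");
        UNITS.put("partition rate", "pk/s");
        UNITS.put("latency mean", "ms");
        UNITS.put("latency max", "ms");
    }

    // Render "metric : value unit", falling back to no unit when unknown.
    static String format(String metric, double value) {
        String unit = UNITS.getOrDefault(metric, "");
        return String.format("%-14s: %.1f %s", metric, value, unit).trim();
    }

    public static void main(String[] args) {
        System.out.println(format("op rate", 14668));
        System.out.println(format("latency mean", 0.7));
    }
}
```

With units attached, a line like `op rate : 14668` becomes self-describing without the reader consulting the documentation.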
[jira] [Updated] (CASSANDRA-10099) Improve concurrency in CompactionStrategyManager
[ https://issues.apache.org/jira/browse/CASSANDRA-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-10099:
---
Status: Ready to Commit (was: Patch Available)

> Improve concurrency in CompactionStrategyManager
>
> Key: CASSANDRA-10099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10099
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Yuki Morishita
> Assignee: Marcus Eriksson
> Labels: compaction, lcs
> Fix For: 3.x
>
>
> Continue discussion from CASSANDRA-9882.
> CompactionStrategyManager (WrappingCompactionStrategy for <3.0) tracks SSTable changes, mainly for separating repaired / unrepaired SSTables (plus LCS manages levels).
> This is a blocking operation, and it can block flushes etc. when determining the next background task takes too long.
> Explore ways to mitigate this concurrency issue.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9285) LEAK DETECTED in sstwriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-9285:
--
Resolution: Won't Fix
Status: Resolved (was: Patch Available)

Since this affects 2.1 only (it is fixed in 2.2+) and it is not critical, I'd like to close this as Won't Fix.

> LEAK DETECTED in sstwriter
> --
>
> Key: CASSANDRA-9285
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9285
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Pierre N.
> Fix For: 2.1.x
>
>
> To reproduce the bug (imports and class wrapper added so the snippet compiles as-is):
> {code}
> import java.io.File;
> import org.apache.cassandra.io.sstable.CQLSSTableWriter;
>
> public class LeakRepro
> {
>     public static void main(String[] args) throws Exception {
>         System.setProperty("cassandra.debugrefcount", "true");
>
>         String ks = "ks1";
>         String table = "t1";
>
>         String schema = "CREATE TABLE " + ks + "." + table + "(a1 INT, PRIMARY KEY (a1));";
>         String insert = "INSERT INTO " + ks + "." + table + "(a1) VALUES(?);";
>
>         File dir = new File("/var/tmp/" + ks + "/" + table);
>         dir.mkdirs();
>
>         CQLSSTableWriter writer = CQLSSTableWriter.builder().forTable(schema).using(insert).inDirectory(dir).build();
>
>         writer.addRow(1);
>         writer.close();
>         writer = null;
>
>         Thread.sleep(1000); System.gc();
>         Thread.sleep(1000); System.gc();
>     }
> }
> {code}
> {quote}
> [2015-05-01 16:09:59,139] [Reference-Reaper:1] ERROR org.apache.cassandra.utils.concurrent.Ref - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@79fa9da9) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@2053866990:Memory@[7f87f8043b20..7f87f8043b48) was not released before the reference was garbage collected
> [2015-05-01 16:09:59,143] [Reference-Reaper:1] ERROR org.apache.cassandra.utils.concurrent.Ref - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@79fa9da9: Thread[Thread-2,5,main]
> at java.lang.Thread.getStackTrace(Thread.java:1552)
> at org.apache.cassandra.utils.concurrent.Ref$Debug.<init>(Ref.java:200)
> at org.apache.cassandra.utils.concurrent.Ref$State.<init>(Ref.java:133)
> at org.apache.cassandra.utils.concurrent.Ref.<init>(Ref.java:60)
> at org.apache.cassandra.io.util.SafeMemory.<init>(SafeMemory.java:32)
> at org.apache.cassandra.io.util.SafeMemoryWriter.<init>(SafeMemoryWriter.java:33)
> at org.apache.cassandra.io.sstable.IndexSummaryBuilder.<init>(IndexSummaryBuilder.java:111)
> at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:576)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:140)
> at org.apache.cassandra.io.sstable.AbstractSSTableSimpleWriter.getWriter(AbstractSSTableSimpleWriter.java:58)
> at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:227)
> [2015-05-01 16:09:59,144] [Reference-Reaper:1] ERROR org.apache.cassandra.utils.concurrent.Ref - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@664382e3) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@899100784:Memory@[7f87f8043990..7f87f8043994) was not released before the reference was garbage collected
> [2015-05-01 16:09:59,144] [Reference-Reaper:1] ERROR org.apache.cassandra.utils.concurrent.Ref - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@664382e3: Thread[Thread-2,5,main]
> at java.lang.Thread.getStackTrace(Thread.java:1552)
> at org.apache.cassandra.utils.concurrent.Ref$Debug.<init>(Ref.java:200)
> at org.apache.cassandra.utils.concurrent.Ref$State.<init>(Ref.java:133)
> at org.apache.cassandra.utils.concurrent.Ref.<init>(Ref.java:60)
> at org.apache.cassandra.io.util.SafeMemory.<init>(SafeMemory.java:32)
> at org.apache.cassandra.io.util.SafeMemoryWriter.<init>(SafeMemoryWriter.java:33)
> at org.apache.cassandra.io.sstable.IndexSummaryBuilder.<init>(IndexSummaryBuilder.java:110)
> at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:576)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:140)
> at org.apache.cassandra.io.sstable.AbstractSSTableSimpleWriter.getWriter(AbstractSSTableSimpleWriter.java:58)
> at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:227)
> [2015-05-01 16:09:59,144] [Reference-Reaper:1] ERROR org.apache.cassandra.utils.concurrent.Ref - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3cca0ac2) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@499043670:Memory@[7f87f8039940..7f87f8039c60) was not released before the reference was
[jira] [Commented] (CASSANDRA-11351) rethink stream throttling logic
[ https://issues.apache.org/jira/browse/CASSANDRA-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1519#comment-1519 ] Paulo Motta commented on CASSANDRA-11351:
-
Is this a duplicate of CASSANDRA-11303? The idea there is to add {{stream_throughput_inbound_megabits_per_sec}} and {{inter_dc_stream_throughput_inbound_megabits_per_sec}} and throttle reading from the socket on the receiving side (analogous to what we currently do for outbound throttling). If outbound > inbound, the TCP receive buffer would fill up and the write side would dynamically adjust to the inbound consumption rate. Would that be sufficient, or is there any problem with this approach?

> rethink stream throttling logic
> ---
>
> Key: CASSANDRA-11351
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11351
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Brandon Williams
>
> Currently, we throttle streaming from the outbound side, because throttling from the inbound side is thought to be infeasible. This creates a problem because the total stream throughput depends on the number of nodes involved, so it varies with the operation being performed. This creates operational overhead, as the throttle has to be constantly adjusted.
> I propose we flip this logic on its head, and instead limit the total inbound throughput. How? It's simple: we ask. Given a total inbound limit of 200Mb/s, if a node is going to stream from 10 nodes, it would simply tell the source nodes to only stream at 20Mb/s when asking for the stream, thereby never going over the 200Mb/s inbound limit.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
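The arithmetic in the proposal, dividing the inbound budget across the source nodes when requesting streams, can be sketched as follows. The method and names are hypothetical, not an existing Cassandra API:

```java
class InboundStreamBudget {
    // Split a total inbound limit evenly across the nodes we will stream from,
    // so the sum of per-source rates never exceeds the inbound cap.
    static double perSourceMbps(double inboundLimitMbps, int sourceNodes) {
        if (sourceNodes <= 0)
            throw new IllegalArgumentException("need at least one source node");
        return inboundLimitMbps / sourceNodes;
    }

    public static void main(String[] args) {
        // The ticket's example: a 200 Mb/s inbound cap, streaming from 10 nodes.
        System.out.println(perSourceMbps(200, 10) + " Mb/s per source"); // 20.0 Mb/s per source
    }
}
```

An even split is the simplest policy; a real implementation would also have to decide how to rebalance the budget when one source finishes early.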
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194431#comment-15194431 ] Ruoran Wang commented on CASSANDRA-9935:
-
[~pauloricardomg] Thanks, that patch works. I am reproducing the error and will post the result when it shows up again. Btw, I noticed those two failing column families have a high sstable count at level 1. The following output is the sstable count for the 6 nodes we have: the top two lines per node are the column families that had the issue, the bottom two are two normal ones. I noticed this last Friday, and the level 1 count didn't drop until today. I don't see any pending compactions. (This is a performance testing cluster, and I stopped reads and writes last Friday.)
{noformat}
SSTables in each level: [2, 20/10, 88, 0, 0, 0, 0, 0, 0]
SSTables in each level: [0, 20/10, 103/100, 90, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 39, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 58, 0, 0, 0, 0, 0, 0]
SSTables in each level: [50/4, 20/10, 85, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 18/10, 108/100, 81, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 35, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 59, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 22/10, 97, 0, 0, 0, 0, 0, 0]
SSTables in each level: [0, 18/10, 107/100, 91, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 43, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 67, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 20/10, 91, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 20/10, 108/100, 102, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 37, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 61, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 21/10, 95, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 18/10, 114/100, 84, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 41, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 67, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 20/10, 88, 0, 0, 0, 0, 0, 0]
SSTables in each level: [1, 20/10, 110/100, 151, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 37, 0, 0, 0, 0, 0, 0]
SSTables in each level: [2, 10, 56, 0, 0, 0, 0, 0, 0]
{noformat}

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
> Reporter: mlowicki
> Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log, db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range (-7695808664784761779,-7693529816291585568] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - Repair
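In the {{SSTables in each level}} lines quoted earlier, an entry like {{20/10}} means the level holds 20 sstables against a soft limit of 10. A small sketch of reading those entries and flagging over-limit levels; the parsing logic is an assumption about the nodetool output format, not Cassandra code:

```java
class LevelReport {
    // Parse one "count" or "count/limit" token and report whether it is over its limit.
    static boolean overLimit(String entry) {
        String[] parts = entry.split("/");
        if (parts.length == 1)
            return false; // no limit shown -> the level is within bounds
        return Integer.parseInt(parts[0].trim()) > Integer.parseInt(parts[1].trim());
    }

    public static void main(String[] args) {
        // One of the problematic rows from the comment above.
        String[] levels = {"1", "20/10", "110/100", "151"};
        for (int l = 0; l < levels.length; l++)
            if (overLimit(levels[l]))
                System.out.println("L" + l + " over limit: " + levels[l]);
    }
}
```

Persistently over-limit L1/L2 counts with no pending compactions, as reported here, suggest leveling is stuck rather than merely behind.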
[jira] [Commented] (CASSANDRA-10099) Improve concurrency in CompactionStrategyManager
[ https://issues.apache.org/jira/browse/CASSANDRA-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194393#comment-15194393 ] Yuki Morishita commented on CASSANDRA-10099:
-
Thanks for the update. +1. At first I thought we should also remove the {{while}} loop entirely from {{getNextBackgroundTask}}, but returning null just stops submitting background tasks (until the 5-minute resubmit kicks in), so for now removing {{synchronized}} will do.

> Improve concurrency in CompactionStrategyManager
>
> Key: CASSANDRA-10099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10099
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Yuki Morishita
> Assignee: Marcus Eriksson
> Labels: compaction, lcs
> Fix For: 3.x
>
>
> Continue discussion from CASSANDRA-9882.
> CompactionStrategyManager (WrappingCompactionStrategy for <3.0) tracks SSTable changes, mainly for separating repaired / unrepaired SSTables (plus LCS manages levels).
> This is a blocking operation, and it can block flushes etc. when determining the next background task takes too long.
> Explore ways to mitigate this concurrency issue.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11344) Fix bloom filter sizing with LCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194350#comment-15194350 ] Paulo Motta commented on CASSANDRA-11344:
-
The CASSANDRA-9830 results are much more consistent, and bloom filter sizes have gone down from 60~100MB to ~10MB. The fix is also [confirmed|https://www.mail-archive.com/user@cassandra.apache.org/msg46633.html] by a mailing-list user who has been facing OOMs due to large bloom filters. I created a [regression dtest|https://github.com/pauloricardomg/cassandra-dtest/tree/11344] to check that bloom filters are within a certain range (100K keys; the bf should be between 50KB and 100KB, while it's 500KB in the broken version) and resubmitted the above dtests with this branch. Feel free to mark as ready to commit when/if tests are passing.

> Fix bloom filter sizing with LCS
>
> Key: CASSANDRA-11344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11344
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> Since CASSANDRA-7272 we most often over-allocate the bloom filter size with LCS
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11351) rethink stream throttling logic
[ https://issues.apache.org/jira/browse/CASSANDRA-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194314#comment-15194314 ] Jeff Jirsa commented on CASSANDRA-11351:
-
Is there any confusion in overloading the meaning of zero? Everywhere else it seems to indicate unthrottled.

> rethink stream throttling logic
> ---
>
> Key: CASSANDRA-11351
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11351
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Brandon Williams
>
> Currently, we throttle streaming from the outbound side, because throttling from the inbound side is thought to be infeasible. This creates a problem because the total stream throughput depends on the number of nodes involved, so it varies with the operation being performed. This creates operational overhead, as the throttle has to be constantly adjusted.
> I propose we flip this logic on its head, and instead limit the total inbound throughput. How? It's simple: we ask. Given a total inbound limit of 200Mb/s, if a node is going to stream from 10 nodes, it would simply tell the source nodes to only stream at 20Mb/s when asking for the stream, thereby never going over the 200Mb/s inbound limit.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
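The convention Jeff refers to (a throttle setting of 0 meaning unthrottled) could be preserved by the inbound path as well. A hypothetical helper, not actual Cassandra code:

```java
class Throttle {
    static final double UNLIMITED = Double.MAX_VALUE;

    // Follow the existing convention: zero (or a negative value) disables
    // throttling; anything positive is a hard rate cap in Mb/s.
    static double effectiveRateMbps(double configuredMbps) {
        return configuredMbps <= 0 ? UNLIMITED : configuredMbps;
    }

    public static void main(String[] args) {
        System.out.println(Throttle.effectiveRateMbps(0) == Throttle.UNLIMITED); // zero stays "unthrottled"
        System.out.println(Throttle.effectiveRateMbps(200)); // a positive value is used as-is
    }
}
```

Keeping zero as "unthrottled" in any new inbound setting avoids the operator confusion the comment raises.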
[jira] [Updated] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3
[ https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Deng updated CASSANDRA-9302:
-
Labels: docs-impacting (was: )

> Optimize cqlsh COPY FROM, part 3
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Jonathan Ellis
> Assignee: Stefania
> Priority: Critical
> Labels: docs-impacting
> Fix For: 2.1.13, 2.2.5, 3.0.3, 3.2
>
>
> We've had some discussion about moving to Spark CSV import for bulk load in 3.x, but people need a good bulk load tool now. One option is to add a separate Java bulk load tool (CASSANDRA-9048), but if we can match that performance from cqlsh I would prefer to leave COPY FROM as the preferred option to which we point people, rather than adding more tools that need to be supported indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and CASSANDRA-8225.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11168) Hint Metrics are updated even if hinted_hand-offs=false
[ https://issues.apache.org/jira/browse/CASSANDRA-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-11168:
-
Resolution: Fixed
Fix Version/s: (was: 3.0.x)
               (was: 2.2.x)
               3.0.5
               2.2.6
Status: Resolved (was: Ready to Commit)

[Committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=719caa67649bf6f27cdd99dd7d6055d2aa8546ae], with some whitespace and formatting fixes on the 3.0 patch.

> Hint Metrics are updated even if hinted_hand-offs=false
> ---
>
> Key: CASSANDRA-11168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11168
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination, Observability
> Reporter: Anubhav Kale
> Assignee: Anubhav Kale
> Priority: Minor
> Fix For: 2.2.6, 3.0.5, 3.5
>
> Attachments: 0001-Hinted-Handoff-Fix.patch, 0001-Hinted-Handoff-fix-2_2.patch, 0001-Hinted-handoff-metrics.patch, 0001-Hinted-handoffs-fix.patch
>
>
> In our PROD logs, we noticed a lot of hint metrics even though we have disabled hinted handoffs.
> The reason is StorageProxy.shouldHint has an inverted if condition.
> We should also wrap the if (hintWindowExpired) block in if (DatabaseDescriptor.hintedHandoffEnabled()).
> The fix is easy, and I can provide a patch.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.5
Merge branch 'cassandra-3.0' into cassandra-3.5

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/950b1a3d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/950b1a3d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/950b1a3d

Branch: refs/heads/trunk
Commit: 950b1a3d2e5ef544a29013f3dc338675ae7682cf
Parents: de44900 a479fb0
Author: Joshua McKenzie
Authored: Mon Mar 14 18:02:06 2016 -0400
Committer: Joshua McKenzie
Committed: Mon Mar 14 18:02:06 2016 -0400
--
.../apache/cassandra/service/StorageProxy.java | 35 ++--
1 file changed, 17 insertions(+), 18 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/950b1a3d/src/java/org/apache/cassandra/service/StorageProxy.java
--
[02/10] cassandra git commit: Hinted Handoff metrics fix
Hinted Handoff metrics fix

Patch by akale; reviewed by jkni for CASSANDRA-11168

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/719caa67
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/719caa67
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/719caa67

Branch: refs/heads/cassandra-3.0
Commit: 719caa67649bf6f27cdd99dd7d6055d2aa8546ae
Parents: 971d649
Author: anubhavk
Authored: Thu Mar 10 10:55:37 2016 -0800
Committer: Joshua McKenzie
Committed: Mon Mar 14 17:59:47 2016 -0400
--
.../apache/cassandra/service/StorageProxy.java | 31 ++--
1 file changed, 16 insertions(+), 15 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/719caa67/src/java/org/apache/cassandra/service/StorageProxy.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageProxy.java b/src/java/org/apache/cassandra/service/StorageProxy.java
index 841e980..8e5ba0f 100644
--- a/src/java/org/apache/cassandra/service/StorageProxy.java
+++ b/src/java/org/apache/cassandra/service/StorageProxy.java
@@ -2117,29 +2117,30 @@ public class StorageProxy implements StorageProxyMBean
     public static boolean shouldHint(InetAddress ep)
     {
-        if (DatabaseDescriptor.shouldHintByDC())
+        if (DatabaseDescriptor.hintedHandoffEnabled())
         {
-            final String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(ep);
-            //Disable DC specific hints
-            if(!DatabaseDescriptor.hintedHandoffEnabled(dc))
+            if (DatabaseDescriptor.shouldHintByDC())
+            {
+                final String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(ep);
+                // Disable DC specific hints
+                if (!DatabaseDescriptor.hintedHandoffEnabled(dc))
+                {
+                    return false;
+                }
+            }
+
+            boolean hintWindowExpired = Gossiper.instance.getEndpointDowntime(ep) > DatabaseDescriptor.getMaxHintWindow();
+            if (hintWindowExpired)
             {
                 HintedHandOffManager.instance.metrics.incrPastWindow(ep);
-                return false;
+                Tracing.trace("Not hinting {} which has been down {}ms", ep, Gossiper.instance.getEndpointDowntime(ep));
             }
+            return !hintWindowExpired;
         }
-        else if (!DatabaseDescriptor.hintedHandoffEnabled())
+        else
         {
-            HintedHandOffManager.instance.metrics.incrPastWindow(ep);
             return false;
         }
-
-        boolean hintWindowExpired = Gossiper.instance.getEndpointDowntime(ep) > DatabaseDescriptor.getMaxHintWindow();
-        if (hintWindowExpired)
-        {
-            HintedHandOffManager.instance.metrics.incrPastWindow(ep);
-            Tracing.trace("Not hinting {} which has been down {}ms", ep, Gossiper.instance.getEndpointDowntime(ep));
-        }
-        return !hintWindowExpired;
     }

 /**
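The patched decision flow can be reduced to a pure function for clarity. This is a simplified sketch with the dependencies passed in as plain values; it is not the actual StorageProxy code:

```java
class ShouldHintSketch {
    // Inputs mirror the checks in the patched shouldHint: the global enable
    // flag, a per-DC disable, and the hint window check against downtime.
    static boolean shouldHint(boolean hintedHandoffEnabled,
                              boolean dcDisabled,
                              long endpointDowntimeMs,
                              long maxHintWindowMs) {
        if (!hintedHandoffEnabled)
            return false; // globally disabled: no hint, and no past-window metric bump
        if (dcDisabled)
            return false; // hints disabled for this endpoint's datacenter
        boolean hintWindowExpired = endpointDowntimeMs > maxHintWindowMs;
        return !hintWindowExpired; // only a truly expired window counts as past-window
    }

    public static void main(String[] args) {
        System.out.println(shouldHint(true, false, 1_000, 10_000));  // hint: enabled, within window
        System.out.println(shouldHint(false, false, 1_000, 10_000)); // no hint: the CASSANDRA-11168 case
        System.out.println(shouldHint(true, false, 20_000, 10_000)); // no hint: window expired
    }
}
```

The bug being fixed was that the disabled-handoff path also incremented the past-window metric; in the patched flow, only the expired-window branch touches it.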
[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a479fb0d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a479fb0d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a479fb0d

Branch: refs/heads/cassandra-3.5
Commit: a479fb0dbe9e2e94e7b1c2c410c4436fd9d3c4d3
Parents: 854a243 719caa6
Author: Joshua McKenzie
Authored: Mon Mar 14 18:00:19 2016 -0400
Committer: Joshua McKenzie
Committed: Mon Mar 14 18:01:14 2016 -0400

----------------------------------------------------------------------
 .../apache/cassandra/service/StorageProxy.java | 35 ++--
 1 file changed, 17 insertions(+), 18 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a479fb0d/src/java/org/apache/cassandra/service/StorageProxy.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/service/StorageProxy.java
index 5cebf27,8e5ba0f..1395470
@@@ -2311,31 -2117,30 +2311,30 @@@ public class StorageProxy implements StorageProxyMBean

Merged result of shouldHint:

    public static boolean shouldHint(InetAddress ep)
    {
        if (DatabaseDescriptor.hintedHandoffEnabled())
        {
            Set<String> disabledDCs = DatabaseDescriptor.hintedHandoffDisabledDCs();
            if (!disabledDCs.isEmpty())
            {
                final String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(ep);
                if (disabledDCs.contains(dc))
                {
                    Tracing.trace("Not hinting {} since its data center {} has been disabled {}", ep, dc, disabledDCs);
                    return false;
                }
            }

            boolean hintWindowExpired = Gossiper.instance.getEndpointDowntime(ep) > DatabaseDescriptor.getMaxHintWindow();
            if (hintWindowExpired)
            {
                HintsService.instance.metrics.incrPastWindow(ep);
                Tracing.trace("Not hinting {} which has been down {} ms", ep, Gossiper.instance.getEndpointDowntime(ep));
            }
            return !hintWindowExpired;
        }
        else
        {
            return false;
        }
    }

    /**
[04/10] cassandra git commit: Hinted Handoff metrics fix
Hinted Handoff metrics fix

Patch by akale; reviewed by jkni for CASSANDRA-11168

Branch: refs/heads/trunk
Commit: 719caa67649bf6f27cdd99dd7d6055d2aa8546ae
Parents: 971d649
Author: anubhavk
Committer: Joshua McKenzie

(Same patch and diff as the cassandra-3.0 copy above.)
[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Branch: refs/heads/trunk
Commit: a479fb0dbe9e2e94e7b1c2c410c4436fd9d3c4d3
Parents: 854a243 719caa6
Author: Joshua McKenzie

(Same merge diff as the cassandra-3.5 copy above.)
[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0

Branch: refs/heads/cassandra-3.0
Commit: a479fb0dbe9e2e94e7b1c2c410c4436fd9d3c4d3
Parents: 854a243 719caa6
Author: Joshua McKenzie

(Same merge diff as the cassandra-3.5 copy above.)
[01/10] cassandra git commit: Hinted Handoff metrics fix
Repository: cassandra

Updated Branches:
  refs/heads/cassandra-2.2 971d64954 -> 719caa676
  refs/heads/cassandra-3.0 854a243af -> a479fb0db
  refs/heads/cassandra-3.5 de44900a3 -> 950b1a3d2
  refs/heads/trunk 0ac03a20c -> b12413d4e

Hinted Handoff metrics fix

Patch by akale; reviewed by jkni for CASSANDRA-11168

Branch: refs/heads/cassandra-2.2
Commit: 719caa67649bf6f27cdd99dd7d6055d2aa8546ae
Parents: 971d649
Author: anubhavk
Authored: Thu Mar 10 10:55:37 2016 -0800
Committer: Joshua McKenzie
Committed: Mon Mar 14 17:59:47 2016 -0400

(Same patch and diff as the cassandra-3.0 copy above.)
[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.5
Merge branch 'cassandra-3.0' into cassandra-3.5

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/950b1a3d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/950b1a3d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/950b1a3d

Branch: refs/heads/cassandra-3.5
Commit: 950b1a3d2e5ef544a29013f3dc338675ae7682cf
Parents: de44900 a479fb0
Author: Joshua McKenzie
Authored: Mon Mar 14 18:02:06 2016 -0400
Committer: Joshua McKenzie
Committed: Mon Mar 14 18:02:06 2016 -0400

----------------------------------------------------------------------
 .../apache/cassandra/service/StorageProxy.java | 35 ++--
 1 file changed, 17 insertions(+), 18 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/950b1a3d/src/java/org/apache/cassandra/service/StorageProxy.java
[03/10] cassandra git commit: Hinted Handoff metrics fix
Hinted Handoff metrics fix

Patch by akale; reviewed by jkni for CASSANDRA-11168

Branch: refs/heads/cassandra-3.5
Commit: 719caa67649bf6f27cdd99dd7d6055d2aa8546ae
Parents: 971d649
Author: anubhavk
Committer: Joshua McKenzie

(Same patch and diff as the cassandra-3.0 copy above.)
[10/10] cassandra git commit: Merge branch 'cassandra-3.5' into trunk
Merge branch 'cassandra-3.5' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b12413d4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b12413d4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b12413d4

Branch: refs/heads/trunk
Commit: b12413d4e260d30f175c19e36a772d003aec0895
Parents: 0ac03a2 950b1a3
Author: Joshua McKenzie
Authored: Mon Mar 14 18:02:19 2016 -0400
Committer: Joshua McKenzie
Committed: Mon Mar 14 18:02:19 2016 -0400

----------------------------------------------------------------------
 .../apache/cassandra/service/StorageProxy.java | 35 ++--
 1 file changed, 17 insertions(+), 18 deletions(-)
----------------------------------------------------------------------
[jira] [Updated] (CASSANDRA-11341) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_HEAD_UpTo_2_2.whole_list_conditional_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-11341: --- Reviewer: Jim Witschey > dtest failure in > upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_HEAD_UpTo_2_2.whole_list_conditional_test > --- > > Key: CASSANDRA-11341 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11341 > Project: Cassandra > Issue Type: Test >Reporter: Jim Witschey >Assignee: Russ Hatch > Labels: dtest > > example failure: > http://cassci.datastax.com/job/upgrade_tests-all/22/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_2_1_HEAD_UpTo_2_2/whole_list_conditional_test > Failed on CassCI build upgrade_tests-all #22 > There's only one flap in the history currently. This was the failure: > {code} > Expected [[0, ['foo', 'bar', 'foobar']]] from SELECT * FROM tlist, but got > [[0, None]] > >> begin captured logging << > dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-SF2dOV > dtest: DEBUG: Custom init_config not found. Setting defaults. 
> dtest: DEBUG: Done setting configuration options: > { 'initial_token': None, > 'num_tokens': '32', > 'phi_convict_threshold': 5, > 'range_request_timeout_in_ms': 1, > 'read_request_timeout_in_ms': 1, > 'request_timeout_in_ms': 1, > 'truncate_request_timeout_in_ms': 1, > 'write_request_timeout_in_ms': 1} > dtest: DEBUG: upgrading node1 to 2.2.5 > dtest: DEBUG: Querying upgraded node > - >> end captured logging << - > Stacktrace > File "/usr/lib/python2.7/unittest/case.py", line 329, in run > testMethod() > File "/home/automaton/cassandra-dtest/tools.py", line 253, in wrapped > f(obj) > File "/home/automaton/cassandra-dtest/upgrade_tests/cql_tests.py", line > 4294, in whole_list_conditional_test > check_applies("l != null AND l IN (['foo', 'bar', 'foobar'])") > File "/home/automaton/cassandra-dtest/upgrade_tests/cql_tests.py", line > 4282, in check_applies > assert_one(cursor, "SELECT * FROM %s" % (table,), [0, ['foo', 'bar', > 'foobar']]) > File "/home/automaton/cassandra-dtest/assertions.py", line 50, in assert_one > assert list_res == [expected], "Expected %s from %s, but got %s" % > ([expected], query, list_res) > "Expected [[0, ['foo', 'bar', 'foobar']]] from SELECT * FROM tlist, but got > [[0, None]]\n >> begin captured logging << > \ndtest: DEBUG: cluster ccm directory: > /mnt/tmp/dtest-SF2dOV\ndtest: DEBUG: Custom init_config not found. Setting > defaults.\ndtest: DEBUG: Done setting configuration options:\n{ > 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': > 5,\n'range_request_timeout_in_ms': 1,\n > 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n > 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': > 1}\ndtest: DEBUG: upgrading node1 to 2.2.5\ndtest: DEBUG: Querying > upgraded node\n- >> end captured logging << > -" > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11341) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_HEAD_UpTo_2_2.whole_list_conditional_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russ Hatch updated CASSANDRA-11341:
-----------------------------------
    Status: Patch Available  (was: Open)

(Quoted issue description unchanged from the previous update.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11341) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_HEAD_UpTo_2_2.whole_list_conditional_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194196#comment-15194196 ]

Russ Hatch commented on CASSANDRA-11341:
----------------------------------------

I believe I've got a fix in place on dtest at: https://github.com/riptano/cassandra-dtest/pull/857

(Quoted issue description unchanged from the previous updates.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11351) rethink stream throttling logic
[ https://issues.apache.org/jira/browse/CASSANDRA-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194172#comment-15194172 ]

Jason Brown commented on CASSANDRA-11351:
-----------------------------------------

I think this makes sense. The sending node can throttle itself as needed (perhaps using existing mechanisms), and having the recipient indicate the maximum throughput it will allow from each peer should tune things better. It might also be interesting for the recipient to send updates to the senders as its circumstances change (some streams are complete, others have begun, and so on). For example, if streaming from 10 peers, then after some subset of the streams have finished, the recipient could send a message to the remaining senders telling them it can now accept more Mb/s.

> rethink stream throttling logic
> -------------------------------
>
>                 Key: CASSANDRA-11351
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11351
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Brandon Williams
>
> Currently, we throttle streaming from the outbound side, because throttling
> from the inbound side is thought to be not doable. This creates a problem
> because the total stream throughput depends on the number of nodes involved,
> so it can vary with the operation being performed. This creates operational
> overhead, as the throttle has to be constantly adjusted.
> I propose we flip this logic on its head, and instead limit the total inbound
> throughput. How? It's simple: we ask. Given a total inbound throughput of
> 200 Mb/s, if a node is going to stream from 10 nodes, it would simply tell the
> source nodes to only stream at 20 Mb/s each when asking for the stream, thereby
> never going over the 200 Mb/s inbound limit.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
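The arithmetic proposed in the ticket is simple enough to sketch: divide the receiver's total inbound budget evenly among the current senders, and recompute the per-sender rate whenever the set of active streams changes. A minimal illustration (InboundStreamBudget is a hypothetical helper, not part of Cassandra):

```java
// Hypothetical receiver-side budget: each sender is asked to stream at an
// equal share of the total inbound limit; shares are recomputed as streams
// finish, as suggested in the comment above.
public class InboundStreamBudget {
    private final int totalMegabitsPerSec;

    public InboundStreamBudget(int totalMegabitsPerSec) {
        this.totalMegabitsPerSec = totalMegabitsPerSec;
    }

    // Rate to request from each peer when streaming from `activeSenders` peers.
    public int perSenderRate(int activeSenders) {
        if (activeSenders <= 0)
            throw new IllegalArgumentException("need at least one active sender");
        return totalMegabitsPerSec / activeSenders;
    }
}
```

With a 200 Mb/s budget and 10 senders each peer is asked for 20 Mb/s; once five streams finish, a recomputed request of 40 Mb/s per remaining peer keeps the link busy without exceeding the cap.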
[jira] [Commented] (CASSANDRA-11351) rethink stream throttling logic
[ https://issues.apache.org/jira/browse/CASSANDRA-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194138#comment-15194138 ]

sankalp kohli commented on CASSANDRA-11351:
-------------------------------------------

By setting this to zero, maybe it can pause the stream and implement CASSANDRA-6752.

(Quoted issue description unchanged from the previous comment.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194091#comment-15194091 ] Ruoran Wang commented on CASSANDRA-9625: Here are the thread-dump result. First one is when the reporter is still working, the second one is when reporter is stopped. {noformat} "metrics-graphite-reporter-thread-1" #574 daemon prio=5 os_prio=0 tid=0x7fae39b21800 nid=0x4940 waiting on condition [0x7fa57191] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x7fa67d7972d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} {noformat} "metrics-graphite-reporter-thread-1" #555 daemon prio=5 os_prio=0 tid=0x7fdf4e7f7800 nid=0xe43 waiting for monitor entry [0x7fd6bb86b000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getEstimatedRemainingTasks(WrappingCompactionStrategy.java:162) - waiting to lock <0x7fd72ced3e38> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) at org.apache.cassandra.metrics.ColumnFamilyMetrics$13.value(ColumnFamilyMetrics.java:357) at org.apache.cassandra.metrics.ColumnFamilyMetrics$13.value(ColumnFamilyMetrics.java:354) at 
org.apache.cassandra.metrics.ColumnFamilyMetrics$33.value(ColumnFamilyMetrics.java:662) at org.apache.cassandra.metrics.ColumnFamilyMetrics$33.value(ColumnFamilyMetrics.java:656) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:304) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:26) at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) at com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:247) at com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:213) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. 
> Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; we can reproduce this on all 6 of our production nodes, but not > on a 3-node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
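The second dump above shows the reporter thread BLOCKED on the WrappingCompactionStrategy monitor while evaluating a gauge. A minimal, self-contained reproduction of that pattern (all names here are illustrative stand-ins, not Cassandra code): a gauge whose value() needs a monitor that another thread holds indefinitely leaves the reporter thread BLOCKED forever.

```java
import java.util.concurrent.CountDownLatch;

public class BlockedGaugeSketch {
    static final Object strategyLock = new Object(); // stands in for the WrappingCompactionStrategy monitor

    // Stands in for the getEstimatedRemainingTasks gauge seen in the dump.
    static int estimatedRemainingTasks() {
        synchronized (strategyLock) { return 0; }
    }

    // Holds the lock on one thread, then reports the state the "reporter"
    // thread ends up in when its gauge tries to take the same lock.
    static Thread.State reporterStateUnderContention() throws InterruptedException {
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (strategyLock) {
                lockHeld.countDown();
                try { Thread.sleep(Long.MAX_VALUE); } catch (InterruptedException ignored) {}
            }
        });
        holder.setDaemon(true);
        holder.start();
        lockHeld.await(); // the lock is now definitely held

        Thread reporter = new Thread(BlockedGaugeSketch::estimatedRemainingTasks);
        reporter.setDaemon(true);
        reporter.start();
        // Wait (bounded) for the reporter to hit the contended monitor.
        for (int i = 0; i < 500 && reporter.getState() != Thread.State.BLOCKED; i++)
            Thread.sleep(10);
        return reporter.getState();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(reporterStateUnderContention()); // BLOCKED, matching the second dump
    }
}
```

This matches the symptom reported: one batch of samples goes out, then the reporter silently stops, with no errors logged, because the scheduled task never returns.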
[jira] [Commented] (CASSANDRA-11346) Can't create User Defined Functions with same name, different args/types
[ https://issues.apache.org/jira/browse/CASSANDRA-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194058#comment-15194058 ] Kishan Karunaratne commented on CASSANDRA-11346: Yes, I've figured it out. However, this code did work as-is in C* 3.0 through 3.3, and this error only started showing up in C* 3.4. Sorry for the dupe, JIRA was acting super slow on Friday. > Can't create User Defined Functions with same name, different args/types > > > Key: CASSANDRA-11346 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11346 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.4 | Ruby driver 3.0.0-rc1 >Reporter: Kishan Karunaratne > > As of Cassandra 3.4, I can no longer create multiple UDFs with the same name, > but different args/types: > {noformat} > CREATE FUNCTION state_group_and_sum(state map, star_rating int) > CALLED ON NULL INPUT > RETURNS map > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > CREATE FUNCTION state_group_and_sum(state map , > star_rating smallint) > CALLED ON NULL INPUT > RETURNS map > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > {noformat} > Returns to the client: > {noformat} > InvalidRequest: code=2200 [Invalid query] message="Could not compile function > 'simplex.state_group_and_sum' from Java source: > org.apache.cassandra.exceptions.InvalidRequestException: Java source > compilation failed: > Line 1: The method put(Integer, Short) in the type Map is not > applicable for the arguments (Short, int) > Line 1: The method put(Integer, Short) in the type Map is not > applicable for the arguments (Short, int) > Line 1: Cannot cast from Short to Integer > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
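For illustration, the compiler error above names the types Map<Integer, Short>, Short, and int (the generic parameters were stripped from the CQL in this message, so the state type here is inferred from the error, not confirmed). A hedged pure-Java sketch of that boxing mismatch and one way to write the body so the types line up:

```java
import java.util.HashMap;
import java.util.Map;

public class BoxingSketch {
    // With a Map<Integer, Short> state (the types named by the compiler error),
    // the naive body fails to compile:
    //   state.put(star_rating, 1);   // put(Integer, Short) not applicable for (Short, int)
    // Widening the key and narrowing the value explicitly makes the types match:
    static Map<Integer, Short> bump(Map<Integer, Short> state, short starRating) {
        int key = starRating;                      // short widens to int, then boxes to Integer
        Short current = state.get(key);
        short next = (short) ((current == null ? 0 : current) + 1); // arithmetic in int, cast back
        state.put(key, next);                      // (Integer, Short) now matches put's signature
        return state;
    }
}
```

Note this only explains the surfaced compile error; the underlying regression reported here is that 3.4 rejects same-name UDF overloads that 3.0-3.3 accepted.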
[jira] [Created] (CASSANDRA-11351) rethink stream throttling logic
Brandon Williams created CASSANDRA-11351: Summary: rethink stream throttling logic Key: CASSANDRA-11351 URL: https://issues.apache.org/jira/browse/CASSANDRA-11351 Project: Cassandra Issue Type: Improvement Reporter: Brandon Williams Currently, we throttle streaming from the outbound side, because throttling from the inbound side is thought to be not doable. This creates a problem because the total stream throughput depends on the number of nodes involved, so it can vary with the operation being performed. This creates operational overhead, as the throttle has to be constantly adjusted. I propose we flip this logic on its head and instead limit the total inbound throughput. How? It's simple: we ask. Given a total inbound throughput of 200Mb/s, if a node is going to stream from 10 nodes, it would simply tell the source nodes to stream at only 20Mb/s each when asking for the stream, thereby never going over the 200Mb/s inbound limit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
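The arithmetic of the proposal can be sketched as follows (a minimal illustration; the method name is made up and is not a Cassandra API):

```java
public class InboundThrottleSketch {
    // Per-source rate such that the sum across all sources never exceeds the
    // node's total inbound budget (all values in Mb/s).
    static double perSourceRateMbps(double totalInboundMbps, int sourceCount) {
        if (sourceCount <= 0)
            throw new IllegalArgumentException("need at least one source node");
        return totalInboundMbps / sourceCount;
    }

    public static void main(String[] args) {
        // The ticket's example: a 200 Mb/s inbound budget split across 10 source
        // nodes means each source is asked to stream at 20 Mb/s.
        System.out.println(perSourceRateMbps(200, 10)); // 20.0
    }
}
```

The interesting design consequence is that the per-source rate is negotiated when the stream is requested, so the operator sets one stable inbound limit instead of re-tuning the outbound throttle for every operation.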
[jira] [Commented] (CASSANDRA-11303) New inbound throughput parameters for streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194055#comment-15194055 ] Paulo Motta commented on CASSANDRA-11303: - Thanks for the patch [~skonno]. Overall your approach looks good, but you need to make a few adjustments: * As you already noticed, limiting the inbound throughput will be a bit trickier because the data consumption is performed internally by {{SSTableMultiWriter.append}}. One idea is to wrap the {{LZFInputStream}} in a new {{ThrottledInputStream}} (extending {{FilterInputStream}}) that receives a {{RateLimiter}} object and transparently acquires the permit before each {{read*}} operation. * The same static {{RateLimiter}} object is being shared between inbound and outgoing streams on {{StreamRateLimiter}}; you'll need to have 4 static {{RateLimiter}} objects: {{inboundLimiter, outboundLimiter, inboundInterDCLimiter, outboundInterDCLimiter}}. * As {{StreamRateLimiter}} will become more complex, I think it can move to its own file instead of being part of {{StreamManager}}. * Add the property and a description to {{cassandra.yaml}} * Add similar support to {{CompressedStreamReader}} After you have your initial version ready, if you could spin up a test cluster (check out [ccm|https://github.com/pcmanus/ccm]) and verify that the new property works as expected, that would be great. > New inbound throughput parameters for streaming > --- > > Key: CASSANDRA-11303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11303 > Project: Cassandra > Issue Type: New Feature > Components: Configuration >Reporter: Satoshi Konno >Priority: Minor > Attachments: cassandra_inbound_stream.diff > > > Hi, > To specify stream throughputs of a node more clearly, I would like to add the > following new inbound parameters like existing outbound parameters in the > cassandra.yaml.
> - stream_throughput_inbound_megabits_per_sec > - inter_dc_stream_throughput_inbound_megabits_per_sec > We use only the existing outbound parameters now, but it is difficult to > control the total throughput of a node. In our production network, some > critical alerts occur when a node exceeds the specified total throughput, > which is the sum of the input and output throughputs. > In our operation of Cassandra, the alerts occur during bootstrap or > repair processing when a new node is added. In the worst case, we have to > stop the operation of the exceeding node. > I have attached the patch under consideration. I would like to add a new > limiter class, StreamInboundRateLimiter, and use the limiter class in the > StreamDeserializer class. I use Row::dataSize() to get the input throughput > in StreamDeserializer::newPartition(), but I am not sure whether > dataSize() returns the correct data size. > Can someone please tell me how to do it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
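The ThrottledInputStream suggested in the review comment above could be sketched roughly as below. This is an assumption-laden illustration, not the actual patch: a real version would wrap the LZFInputStream and delegate to a shared Guava RateLimiter, whereas here a crude sleep-based byte budget stands in so the sketch stays dependency-free.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long consumedBytes;

    public ThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        this.bytesPerSecond = bytesPerSecond;
    }

    // Sleep until `permits` more bytes fit within the rate budget. A real
    // implementation would call RateLimiter.acquire(permits) here instead.
    private void throttle(int permits) throws IOException {
        consumedBytes += permits;
        long dueNanos = consumedBytes * 1_000_000_000L / bytesPerSecond;
        long waitNanos = dueNanos - (System.nanoTime() - startNanos);
        if (waitNanos > 0) {
            try {
                Thread.sleep(waitNanos / 1_000_000L, (int) (waitNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }

    @Override
    public int read() throws IOException {
        throttle(1);
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        throttle(len); // conservative: charges the requested length, not the bytes actually read
        return super.read(b, off, len);
    }
}
```

Because the wrapper throttles at the stream level, the internal consumption by SSTableMultiWriter.append is limited transparently, without touching the writer itself.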
[jira] [Comment Edited] (CASSANDRA-11350) Max_SSTable_Age isn't really deprecated in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193983#comment-15193983 ] Marcus Eriksson edited comment on CASSANDRA-11350 at 3/14/16 7:30 PM: -- max_sstable_age_days is left at 1000 years; if someone wants to get the exact behaviour they had before, they can change it back. With those settings you will do compaction in 1-day-sized windows until they are 1 year old. But the thing is, you will actually not do any compaction unless you have to (like after repair etc.) since the windows don't change, so keeping max_sstable_age_days at 1 year is quite pointless. The problem with max_sstable_age_days is that if you bootstrap a new node, for example, you can end up with a huge number of sstables older than max_sstable_age_days was (Author: krummas): max_sstable_age_days is left at 1000 years, if someone wants to get the exact behaviour they had before With those settings you will do compaction in 1day-sized windows until they are 1 year old. But the thing is you will actually not do any compaction unless you have to (like after repair etc) since the windows don't change, so keeping max_sstable_age_days at 1 year is quite pointless. Problem with max_sstable_age_days is that. If you bootstrap a new node for example you can end up with a huge amount of sstables older than max_sstable_age_days > Max_SSTable_Age isn't really deprecated in DTCS > --- > > Key: CASSANDRA-11350 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11350 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: PROD >Reporter: Anubhav Kale >Priority: Minor > > Based on the comments in > https://issues.apache.org/jira/browse/CASSANDRA-10280, and changes made to > DateTieredCompactionStrategyOptions.java, the Max_SSTable_Age field is marked > as deprecated. > However, this is still used to filter the old SS Tables in > DateTieredCompactionStrategy.java.
Once those tables are filtered, > Max_Window_Size is used to limit how back in time we can go (essentially how > Max_SSTable_Age was used previously). > So I am somewhat confused on the exact use of these two fields. Should > Max_SSTable_Age be really removed and Max_Window_Size be used to filter old > tables (in which case it should be set to 1 year as well) ? > Currently, Max_SSTable_Age = 1 Year, and Max_Window_Size = 1 Day. What is the > expected behavior with these settings ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-11350) Max_SSTable_Age isn't really deprecated in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-11350. - Resolution: Not A Problem closing as not a problem - better to ask the mailing lists for clarification of these things > Max_SSTable_Age isn't really deprecated in DTCS > --- > > Key: CASSANDRA-11350 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11350 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: PROD >Reporter: Anubhav Kale >Priority: Minor > > Based on the comments in > https://issues.apache.org/jira/browse/CASSANDRA-10280, and changes made to > DateTieredCompactionStrategyOptions.java, the Max_SSTable_Age field is marked > as deprecated. > However, this is still used to filter the old SS Tables in > DateTieredCompactionStrategy.java. Once those tables are filtered, > Max_Window_Size is used to limit how back in time we can go (essentially how > Max_SSTable_Age was used previously). > So I am somewhat confused on the exact use of these two fields. Should > Max_SSTable_Age be really removed and Max_Window_Size be used to filter old > tables (in which case it should be set to 1 year as well) ? > Currently, Max_SSTable_Age = 1 Year, and Max_Window_Size = 1 Day. What is the > expected behavior with these settings ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11350) Max_SSTable_Age isn't really deprecated in DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193983#comment-15193983 ] Marcus Eriksson commented on CASSANDRA-11350: - max_sstable_age_days is left at 1000 years, in case someone wants to get the exact behaviour they had before. With those settings you will do compaction in 1-day-sized windows until they are 1 year old. But the thing is, you will actually not do any compaction unless you have to (like after repair etc.) since the windows don't change, so keeping max_sstable_age_days at 1 year is quite pointless. The problem with max_sstable_age_days is that if you bootstrap a new node, for example, you can end up with a huge number of sstables older than max_sstable_age_days > Max_SSTable_Age isn't really deprecated in DTCS > --- > > Key: CASSANDRA-11350 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11350 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: PROD >Reporter: Anubhav Kale >Priority: Minor > > Based on the comments in > https://issues.apache.org/jira/browse/CASSANDRA-10280, and changes made to > DateTieredCompactionStrategyOptions.java, the Max_SSTable_Age field is marked > as deprecated. > However, this is still used to filter the old SS Tables in > DateTieredCompactionStrategy.java. Once those tables are filtered, > Max_Window_Size is used to limit how back in time we can go (essentially how > Max_SSTable_Age was used previously). > So I am somewhat confused on the exact use of these two fields. Should > Max_SSTable_Age be really removed and Max_Window_Size be used to filter old > tables (in which case it should be set to 1 year as well) ? > Currently, Max_SSTable_Age = 1 Year, and Max_Window_Size = 1 Day. What is the > expected behavior with these settings ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10091) Align JMX authentication with internal authentication
[ https://issues.apache.org/jira/browse/CASSANDRA-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-10091: -- Reviewer: (was: Aleksey Yeschenko) > Align JMX authentication with internal authentication > - > > Key: CASSANDRA-10091 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10091 > Project: Cassandra > Issue Type: New Feature >Reporter: Jan Karlsson >Assignee: Sam Tunnicliffe >Priority: Minor > Fix For: 3.x > > > It would be useful to authenticate with JMX through Cassandra's internal > authentication. This would reduce the overhead of keeping passwords in files > on the machine and would consolidate passwords to one location. It would also > allow the possibility to handle JMX permissions in Cassandra. > It could be done by creating our own JMX server and setting custom classes > for the authenticator and authorizer. We could then add some parameters where > the user could specify what authenticator and authorizer to use in case they > want to make their own. > This could also be done by creating a premain method which creates a jmx > server. This would give us the feature without changing the Cassandra code > itself. However I believe this would be a good feature to have in Cassandra. > I am currently working on a solution which creates a JMX server and uses a > custom authenticator and authorizer. It is currently build as a premain, > however it would be great if we could put this in Cassandra instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10876) Alter behavior of batch WARN and fail on single partition batches
[ https://issues.apache.org/jira/browse/CASSANDRA-10876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193783#comment-15193783 ] Aleksey Yeschenko commented on CASSANDRA-10876: --- LGTM. The patch still applies cleanly to trunk fwiw, but feel free to apply and rerun the tests if you feel like it (I'd say given the scope of the ticket it's not necessary). > Alter behavior of batch WARN and fail on single partition batches > - > > Key: CASSANDRA-10876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10876 > Project: Cassandra > Issue Type: Improvement >Reporter: Patrick McFadin >Assignee: Sylvain Lebresne >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: 10876.txt > > > In an attempt to give operator insight into potentially harmful batch usage, > Jiras were created to log WARN or fail on certain batch sizes. This ignores > the single partition batch, which doesn't create the same issues as a > multi-partition batch. > The proposal is to ignore size on single partition batch statements. > Reference: > [CASSANDRA-6487|https://issues.apache.org/jira/browse/CASSANDRA-6487] > [CASSANDRA-8011|https://issues.apache.org/jira/browse/CASSANDRA-8011] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10876) Alter behavior of batch WARN and fail on single partition batches
[ https://issues.apache.org/jira/browse/CASSANDRA-10876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-10876: -- Status: Ready to Commit (was: Patch Available) > Alter behavior of batch WARN and fail on single partition batches > - > > Key: CASSANDRA-10876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10876 > Project: Cassandra > Issue Type: Improvement >Reporter: Patrick McFadin >Assignee: Sylvain Lebresne >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: 10876.txt > > > In an attempt to give operator insight into potentially harmful batch usage, > Jiras were created to log WARN or fail on certain batch sizes. This ignores > the single partition batch, which doesn't create the same issues as a > multi-partition batch. > The proposal is to ignore size on single partition batch statements. > Reference: > [CASSANDRA-6487|https://issues.apache.org/jira/browse/CASSANDRA-6487] > [CASSANDRA-8011|https://issues.apache.org/jira/browse/CASSANDRA-8011] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11350) Max_SSTable_Age isn't really deprecated in DTCS
Anubhav Kale created CASSANDRA-11350: Summary: Max_SSTable_Age isn't really deprecated in DTCS Key: CASSANDRA-11350 URL: https://issues.apache.org/jira/browse/CASSANDRA-11350 Project: Cassandra Issue Type: Bug Components: Compaction Environment: PROD Reporter: Anubhav Kale Priority: Minor Based on the comments in https://issues.apache.org/jira/browse/CASSANDRA-10280, and changes made to DateTieredCompactionStrategyOptions.java, the Max_SSTable_Age field is marked as deprecated. However, this is still used to filter the old SS Tables in DateTieredCompactionStrategy.java. Once those tables are filtered, Max_Window_Size is used to limit how back in time we can go (essentially how Max_SSTable_Age was used previously). So I am somewhat confused on the exact use of these two fields. Should Max_SSTable_Age be really removed and Max_Window_Size be used to filter old tables (in which case it should be set to 1 year as well) ? Currently, Max_SSTable_Age = 1 Year, and Max_Window_Size = 1 Day. What is the expected behavior with these settings ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11168) Hint Metrics are updated even if hinted_hand-offs=false
[ https://issues.apache.org/jira/browse/CASSANDRA-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193558#comment-15193558 ] Joel Knighton commented on CASSANDRA-11168: --- +1. CI looks clean - I ran for 2.2, 3.0, 3.5, and trunk. For committer: there are two patches to be applied. One for 2.2 and one for 3.0+. These are the two most recent patches attached to the issue. The 2.2 patch is attached to the issue as 0001-Hinted-Handoff-fix-2_2.patch. The 3.0+ patch is attached to the issue as 0001-Hinted-handoffs-fix.patch. This patch should merge forward cleanly up to trunk. > Hint Metrics are updated even if hinted_hand-offs=false > --- > > Key: CASSANDRA-11168 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11168 > Project: Cassandra > Issue Type: Bug >Reporter: Anubhav Kale >Assignee: Anubhav Kale >Priority: Minor > Attachments: 0001-Hinted-Handoff-Fix.patch, > 0001-Hinted-Handoff-fix-2_2.patch, 0001-Hinted-handoff-metrics.patch, > 0001-Hinted-handoffs-fix.patch > > > In our PROD logs, we noticed a lot of hint metrics even though we have > disabled hinted handoffs. > The reason is StorageProxy.ShouldHint has an inverted if condition. > We should also wrap the if (hintWindowExpired) block in if > (DatabaseDescriptor.hintedHandoffEnabled()). > The fix is easy, and I can provide a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
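The shape of the fix described in the issue (check that hinted handoff is enabled before any metric is touched) can be sketched as follows. All names here are illustrative stand-ins for DatabaseDescriptor and the hint metric, not the actual StorageProxy code:

```java
public class ShouldHintSketch {
    // Stand-ins for DatabaseDescriptor.hintedHandoffEnabled() and the hint metric.
    static boolean hintedHandoffEnabled = false;
    static int hintsPastWindowMetric = 0;

    static boolean shouldHint(long endpointDowntimeMs, long maxHintWindowMs) {
        if (!hintedHandoffEnabled)
            return false; // disabled: no hints, and crucially no metric updates either
        boolean hintWindowExpired = endpointDowntimeMs > maxHintWindowMs;
        if (hintWindowExpired) {
            hintsPastWindowMetric++; // the metric only moves when handoff is enabled
            return false;
        }
        return true;
    }
}
```

The key point is the ordering: the enabled check guards the entire method, so a cluster with hinted handoff disabled never records hint metrics, which is exactly the behavior the reporter expected in production.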
[jira] [Updated] (CASSANDRA-11168) Hint Metrics are updated even if hinted_hand-offs=false
[ https://issues.apache.org/jira/browse/CASSANDRA-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-11168: -- Fix Version/s: 3.0.x 2.2.x 3.5 Component/s: Observability Coordination > Hint Metrics are updated even if hinted_hand-offs=false > --- > > Key: CASSANDRA-11168 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11168 > Project: Cassandra > Issue Type: Bug > Components: Coordination, Observability >Reporter: Anubhav Kale >Assignee: Anubhav Kale >Priority: Minor > Fix For: 3.5, 2.2.x, 3.0.x > > Attachments: 0001-Hinted-Handoff-Fix.patch, > 0001-Hinted-Handoff-fix-2_2.patch, 0001-Hinted-handoff-metrics.patch, > 0001-Hinted-handoffs-fix.patch > > > In our PROD logs, we noticed a lot of hint metrics even though we have > disabled hinted handoffs. > The reason is StorageProxy.ShouldHint has an inverted if condition. > We should also wrap the if (hintWindowExpired) block in if > (DatabaseDescriptor.hintedHandoffEnabled()). > The fix is easy, and I can provide a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-11179: Labels: doc-impacting (was: ) +1 on defaulting to 2 threads. I like having the default be fairly safe. > Parallel cleanup can lead to disk space exhaustion > -- > > Key: CASSANDRA-11179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11179 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Tools >Reporter: Tyler Hobbs >Assignee: Marcus Eriksson > Labels: doc-impacting > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > In CASSANDRA-5547, we made cleanup (among other things) run in parallel > across multiple sstables. There have been reports on IRC of this leading to > disk space exhaustion, because multiple sstables are (almost entirely) > rewritten at the same time. This seems particularly problematic because > cleanup is frequently run after a cluster is expanded due to low disk space. > I'm not really familiar with how we perform free disk space checks now, but > it sounds like we can make some improvements here. It would be good to > reduce the concurrency of cleanup operations if there isn't enough free disk > space to support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11168) Hint Metrics are updated even if hinted_hand-offs=false
[ https://issues.apache.org/jira/browse/CASSANDRA-11168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-11168: -- Status: Ready to Commit (was: Patch Available) > Hint Metrics are updated even if hinted_hand-offs=false > --- > > Key: CASSANDRA-11168 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11168 > Project: Cassandra > Issue Type: Bug >Reporter: Anubhav Kale >Assignee: Anubhav Kale >Priority: Minor > Attachments: 0001-Hinted-Handoff-Fix.patch, > 0001-Hinted-Handoff-fix-2_2.patch, 0001-Hinted-handoff-metrics.patch, > 0001-Hinted-handoffs-fix.patch > > > In our PROD logs, we noticed a lot of hint metrics even though we have > disabled hinted handoffs. > The reason is StorageProxy.ShouldHint has an inverted if condition. > We should also wrap the if (hintWindowExpired) block in if > (DatabaseDescriptor.hintedHandoffEnabled()). > The fix is easy, and I can provide a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193541#comment-15193541 ] Marcus Eriksson commented on CASSANDRA-11179: - updated to use --jobs X or -j X and make it default to 2 threads > Parallel cleanup can lead to disk space exhaustion > -- > > Key: CASSANDRA-11179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11179 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Tools >Reporter: Tyler Hobbs >Assignee: Marcus Eriksson > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > In CASSANDRA-5547, we made cleanup (among other things) run in parallel > across multiple sstables. There have been reports on IRC of this leading to > disk space exhaustion, because multiple sstables are (almost entirely) > rewritten at the same time. This seems particularly problematic because > cleanup is frequently run after a cluster is expanded due to low disk space. > I'm not really familiar with how we perform free disk space checks now, but > it sounds like we can make some improvements here. It would be good to > reduce the concurrency of cleanup operations if there isn't enough free disk > space to support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11340) Heavy read activity on system_auth tables can cause apparent livelock
[ https://issues.apache.org/jira/browse/CASSANDRA-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193529#comment-15193529 ] Nate McCall commented on CASSANDRA-11340: - [~jjirsa] I think the barrage of RRs triggered on a big cluster backing up the SimpleCondition's queue is the main issue. > Heavy read activity on system_auth tables can cause apparent livelock > - > > Key: CASSANDRA-11340 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11340 > Project: Cassandra > Issue Type: Bug >Reporter: Jeff Jirsa >Assignee: Aleksey Yeschenko > > Reproduced in at least 2.1.9. > It appears possible for queries against system_auth tables to trigger > speculative retry, which causes auth to block on traffic going off node. In > some cases, it appears possible for threads to become deadlocked, causing > load on the nodes to increase sharply. This happens even in clusters with RF > of system_auth == N, as all requests being served locally puts the bar for > 99% SR pretty low. 
> Incomplete stack trace below, but we haven't yet figured out what exactly is > blocking: > {code} > Thread 82291: (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information > may be imprecise) > - java.util.concurrent.locks.LockSupport.parkNanos(long) @bci=11, line=338 > (Compiled frame) > - > org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUntil(long) > @bci=28, line=307 (Compiled frame) > - org.apache.cassandra.utils.concurrent.SimpleCondition.await(long, > java.util.concurrent.TimeUnit) @bci=76, line=63 (Compiled frame) > - org.apache.cassandra.service.ReadCallback.await(long, > java.util.concurrent.TimeUnit) @bci=25, line=92 (Compiled frame) > - > org.apache.cassandra.service.AbstractReadExecutor$SpeculatingReadExecutor.maybeTryAdditionalReplicas() > @bci=39, line=281 (Compiled frame) > - org.apache.cassandra.service.StorageProxy.fetchRows(java.util.List, > org.apache.cassandra.db.ConsistencyLevel) @bci=175, line=1338 (Compiled frame) > - org.apache.cassandra.service.StorageProxy.readRegular(java.util.List, > org.apache.cassandra.db.ConsistencyLevel) @bci=9, line=1274 (Compiled frame) > - org.apache.cassandra.service.StorageProxy.read(java.util.List, > org.apache.cassandra.db.ConsistencyLevel, > org.apache.cassandra.service.ClientState) @bci=57, line=1199 (Compiled frame) > - > org.apache.cassandra.cql3.statements.SelectStatement.execute(org.apache.cassandra.service.pager.Pageable, > org.apache.cassandra.cql3.QueryOptions, int, long, > org.apache.cassandra.service.QueryState) @bci=35, line=272 (Compiled frame) > - > org.apache.cassandra.cql3.statements.SelectStatement.execute(org.apache.cassandra.service.QueryState, > org.apache.cassandra.cql3.QueryOptions) @bci=105, line=224 (Compiled frame) > - org.apache.cassandra.auth.Auth.selectUser(java.lang.String) @bci=27, > line=265 (Compiled frame) > - org.apache.cassandra.auth.Auth.isExistingUser(java.lang.String) @bci=1, > line=86 (Compiled frame) > - > 
org.apache.cassandra.service.ClientState.login(org.apache.cassandra.auth.AuthenticatedUser) > @bci=11, line=206 (Compiled frame) > - > org.apache.cassandra.transport.messages.AuthResponse.execute(org.apache.cassandra.service.QueryState) > @bci=58, line=82 (Compiled frame) > - > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(io.netty.channel.ChannelHandlerContext, > org.apache.cassandra.transport.Message$Request) @bci=75, line=439 (Compiled > frame) > - > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(io.netty.channel.ChannelHandlerContext, > java.lang.Object) @bci=6, line=335 (Compiled frame) > - > io.netty.channel.SimpleChannelInboundHandler.channelRead(io.netty.channel.ChannelHandlerContext, > java.lang.Object) @bci=17, line=105 (Compiled frame) > - > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(java.lang.Object) > @bci=9, line=333 (Compiled frame) > - > io.netty.channel.AbstractChannelHandlerContext.access$700(io.netty.channel.AbstractChannelHandlerContext, > java.lang.Object) @bci=2, line=32 (Compiled frame) > - io.netty.channel.AbstractChannelHandlerContext$8.run() @bci=8, line=324 > (Compiled frame) > - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 > (Compiled frame) > - > org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run() > @bci=5, line=164 (Compiled frame) > - org.apache.cassandra.concurrent.SEPWorker.run() @bci=87, line=105 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > In a cluster with many connected clients (potentially thousands), a > reconnection flood (for example, restarting all at once) is likely to trigger > this bug.
[jira] [Commented] (CASSANDRA-4791) When a node bootstrap in a multi-datacenter environment it should try to get data only from local nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193401#comment-15193401 ] Thom Valley commented on CASSANDRA-4791: In our 5-DC configuration, we frequently see bootstrapping nodes stream from some of the most distant alternate DCs, even though we are using GPFS with the appropriate DCs defined. It almost seems like random selection. Cassandra 2.1.13 (DSE 4.8.5) > When a node bootstrap in a multi-datacenter environment it should try to get > data only from local nodes > --- > > Key: CASSANDRA-4791 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4791 > Project: Cassandra > Issue Type: Improvement >Reporter: Eran Chinthaka Withana > > When a node bootstraps, in a multi-datacenter environment, currently it is > streaming data from any node available. But, to cut down the cross-datacenter > traffic, it should first try to stream data within the same datacenter, or > there should be a flag to explicitly limit it to stream data from its own > data center. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193322#comment-15193322 ] T Jake Luciani commented on CASSANDRA-11179: It wouldn't be much of a change, so I think you should make this an int rather than a boolean, so you can constrain it to 1-N of these at a time: just block on N futures per iteration. Right now it's 1 or ALL. > Parallel cleanup can lead to disk space exhaustion > -- > > Key: CASSANDRA-11179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11179 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Tools >Reporter: Tyler Hobbs >Assignee: Marcus Eriksson > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > In CASSANDRA-5547, we made cleanup (among other things) run in parallel > across multiple sstables. There have been reports on IRC of this leading to > disk space exhaustion, because multiple sstables are (almost entirely) > rewritten at the same time. This seems particularly problematic because > cleanup is frequently run after a cluster is expanded due to low disk space. > I'm not really familiar with how we perform free disk space checks now, but > it sounds like we can make some improvements here. It would be good to > reduce the concurrency of cleanup operations if there isn't enough free disk > space to support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
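The "block on N futures per iteration" idea above can be sketched roughly as follows. This is a minimal illustration, not the actual Cassandra patch; `runBounded` and its parameters are invented names for the purpose of the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class BoundedCleanup {
    /**
     * Runs the given tasks with at most `jobs` in flight at a time: submit up
     * to `jobs` tasks, block until that batch completes, then move on to the
     * next batch. jobs == 1 reproduces sequential behaviour; jobs ==
     * tasks.size() reproduces the fully-parallel behaviour.
     */
    public static void runBounded(ExecutorService executor,
                                  List<Callable<Void>> tasks,
                                  int jobs) throws Exception {
        for (int i = 0; i < tasks.size(); i += jobs) {
            List<Future<Void>> batch = new ArrayList<>();
            for (Callable<Void> task : tasks.subList(i, Math.min(i + jobs, tasks.size())))
                batch.add(executor.submit(task));
            for (Future<Void> f : batch)   // block on N futures per iteration
                f.get();
        }
    }
}
```

With `jobs = 2`, at most two sstables would be rewritten concurrently, which bounds the peak transient disk usage that the ticket describes.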
[jira] [Resolved] (CASSANDRA-11342) Native transport TP stats aren't getting logged
[ https://issues.apache.org/jira/browse/CASSANDRA-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-11342. -- Resolution: Not A Problem Oh, then since 2.1 is only for critical fixes at this point and this is not critical, closing. > Native transport TP stats aren't getting logged > --- > > Key: CASSANDRA-11342 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11342 > Project: Cassandra > Issue Type: Improvement >Reporter: Sebastian Estevez > Labels: lhf > > Native-Transports was added back to tpstats in CASSANDRA-10044 but I think it > was missed in the StatusLogger because I'm not seeing it in my system.log. > {code} > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,582 StatusLogger.java:51 - Pool > NameActive Pending Completed Blocked All Time > Blocked > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > CounterMutationStage 2 02534760 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > ReadStage 1 0 447464 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > RequestResponseStage 2 16035382 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > ReadRepairStage 0 1282 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > MutationStage 0 07187156 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > GossipStage 0 0 5535 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > AntiEntropyStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > CacheCleanupExecutor 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > MigrationStage0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > ValidationExecutor0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > Sampler 0 0 0 0 > 0 > INFO 
[ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > MiscStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > CommitLogArchiver 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > MemtableFlushWriter 115106 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > PendingRangeCalculator0 0381 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > MemtableReclaimMemory 0 0106 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > MemtablePostFlush 116170 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > CompactionExecutor2 191636 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > InternalResponseStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > HintedHandoff 2 5 60 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:75 - > CompactionManager 2 4 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:87 - > MessagingServicen/a 1/4 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:97 - > Cache Type Size Capacity > KeysToSave > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:99 - > KeyCache 93954904104857600
[jira] [Commented] (CASSANDRA-11342) Native transport TP stats aren't getting logged
[ https://issues.apache.org/jira/browse/CASSANDRA-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193310#comment-15193310 ] Paulo Motta commented on CASSANDRA-11342: - this will only affect 2.1, since CASSANDRA-10018 fixed this for 2.2+. > Native transport TP stats aren't getting logged > --- > > Key: CASSANDRA-11342 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11342 > Project: Cassandra > Issue Type: Improvement >Reporter: Sebastian Estevez > Labels: lhf > > Native-Transports was added back to tpstats in CASSANDRA-10044 but I think it > was missed in the StatusLogger because I'm not seeing it in my system.log. > {code} > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,582 StatusLogger.java:51 - Pool > NameActive Pending Completed Blocked All Time > Blocked > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > CounterMutationStage 2 02534760 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > ReadStage 1 0 447464 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,583 StatusLogger.java:66 - > RequestResponseStage 2 16035382 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > ReadRepairStage 0 1282 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > MutationStage 0 07187156 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > GossipStage 0 0 5535 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,584 StatusLogger.java:66 - > AntiEntropyStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > CacheCleanupExecutor 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > MigrationStage0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > ValidationExecutor0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,585 StatusLogger.java:66 - > Sampler 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 
20:01:26,585 StatusLogger.java:66 - > MiscStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > CommitLogArchiver 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > MemtableFlushWriter 115106 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > PendingRangeCalculator0 0381 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,586 StatusLogger.java:66 - > MemtableReclaimMemory 0 0106 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > MemtablePostFlush 116170 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > CompactionExecutor2 191636 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > InternalResponseStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,587 StatusLogger.java:66 - > HintedHandoff 2 5 60 0 > 0 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:75 - > CompactionManager 2 4 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:87 - > MessagingServicen/a 1/4 > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:97 - > Cache Type Size Capacity > KeysToSave > INFO [ScheduledTasks:1] 2016-03-10 20:01:26,588 StatusLogger.java:99 - > KeyCache 93954904104857600 > all >
[jira] [Reopened] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita reopened CASSANDRA-5977: --- Assignee: Shogo Hoshii Thanks for the patch! I will review shortly. > Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Alyssa Kwan >Assignee: Shogo Hoshii >Priority: Minor > Labels: Tools > Fix For: 3.x > > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch, trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
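The requested `--format` behaviour amounts to serializing the same per-table counters into a machine-readable form. A toy sketch of the JSON case, using only the JDK (this is an illustration of the idea, not the attached patch; names are invented, and no string escaping is done):

```java
import java.util.Map;

class TableStatsFormatter {
    /** Renders a flat map of stat name -> value as a minimal JSON object. */
    public static String toJson(Map<String, Object> stats) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : stats.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof Number) sb.append(v);       // numbers stay bare
            else sb.append("\"").append(v).append("\""); // everything else quoted
        }
        return sb.append("}").toString();
    }
}
```

Output in this shape is trivial to consume from MCollective-style tooling, which is exactly the use case the description mentions.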
[jira] [Updated] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-5977: -- Labels: Tools (was: ) Reviewer: Yuki Morishita Fix Version/s: 3.x Status: Patch Available (was: Reopened) > Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Alyssa Kwan >Assignee: Shogo Hoshii >Priority: Minor > Labels: Tools > Fix For: 3.x > > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch, trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193292#comment-15193292 ] Marcus Eriksson commented on CASSANDRA-11179: - ... working on the failing tests > Parallel cleanup can lead to disk space exhaustion > -- > > Key: CASSANDRA-11179 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11179 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Tools >Reporter: Tyler Hobbs >Assignee: Marcus Eriksson > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > In CASSANDRA-5547, we made cleanup (among other things) run in parallel > across multiple sstables. There have been reports on IRC of this leading to > disk space exhaustion, because multiple sstables are (almost entirely) > rewritten at the same time. This seems particularly problematic because > cleanup is frequently run after a cluster is expanded due to low disk space. > I'm not really familiar with how we perform free disk space checks now, but > it sounds like we can make some improvements here. It would be good to > reduce the concurrency of cleanup operations if there isn't enough free disk > space to support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-7118) Exception around IOException doesnt report file or table getting exception
[ https://issues.apache.org/jira/browse/CASSANDRA-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-7118. -- Resolution: Cannot Reproduce > Exception around IOException doesnt report file or table getting exception > -- > > Key: CASSANDRA-7118 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7118 > Project: Cassandra > Issue Type: Improvement >Reporter: Laura Adney >Priority: Minor > > Saw this in Cassandra version: 1.2.11.2 > Run into several situations where an IOException indicates that corruption > has occurred. The exception does not provide the sstable or the table name > making it very difficult to determine what files are involved. > The request is to update the error/exception to include more relevant > table/file information. > Example Exception: > ERROR [ReadStage:146665] 2014-02-25 06:28:18,286 CassandraDaemon.java (line > 191) Exception in thread Thread[ReadStage:146665,5,main] > java.lang.RuntimeException: > org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1613) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: > java.io.EOFException > Caused by: java.io.EOFException > at java.io.RandomAccessFile.readFully(RandomAccessFile.java:446) > at java.io.RandomAccessFile.readFully(RandomAccessFile.java:424) > at > org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380) > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392) > at > org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355) > at > 
org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:94) > at > org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92) > at > org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73) > at > org.apache.cassandra.db.columniterator.IndexedSliceReader$SimpleBlockFetcher.(IndexedSliceReader.java:477) > at > org.apache.cassandra.db.columniterator.IndexedSliceReader.(IndexedSliceReader.java:94) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
[ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-11179: Assignee: Marcus Eriksson (was: T Jake Luciani) Reviewer: Carl Yeksigian (was: Marcus Eriksson) Fix Version/s: 3.x 2.2.x 2.1.x Status: Patch Available (was: Open) patch to add a --seq option to scrub/upgradesstables/cleanup to only use a single thread for the operation ||branch||testall||dtest|| |[marcuse/11179|https://github.com/krummas/cassandra/tree/marcuse/11179]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-dtest]| |[marcuse/11179-2.2|https://github.com/krummas/cassandra/tree/marcuse/11179-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-2.2-dtest]| |[marcuse/11179-3.0|https://github.com/krummas/cassandra/tree/marcuse/11179-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-3.0-dtest]| |[marcuse/11179-3.5|https://github.com/krummas/cassandra/tree/marcuse/11179-3.5]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-3.5-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-3.5-dtest]| |[marcuse/11179-trunk|https://github.com/krummas/cassandra/tree/marcuse/11179-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11179-trunk-dtest]| [~carlyeks] to review since we were poking this in CASSANDRA-10829 > Parallel cleanup can lead to disk space exhaustion > -- > > Key: CASSANDRA-11179 > URL: 
https://issues.apache.org/jira/browse/CASSANDRA-11179 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Tools >Reporter: Tyler Hobbs >Assignee: Marcus Eriksson > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > > In CASSANDRA-5547, we made cleanup (among other things) run in parallel > across multiple sstables. There have been reports on IRC of this leading to > disk space exhaustion, because multiple sstables are (almost entirely) > rewritten at the same time. This seems particularly problematic because > cleanup is frequently run after a cluster is expanded due to low disk space. > I'm not really familiar with how we perform free disk space checks now, but > it sounds like we can make some improvements here. It would be good to > reduce the concurrency of cleanup operations if there isn't enough free disk > space to support this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11258) Repair scheduling - Resource locking API
[ https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193176#comment-15193176 ] Marcus Olsson commented on CASSANDRA-11258: --- I'm on vacation and I will be back in office 21st of March! /Marcus Olsson > Repair scheduling - Resource locking API > > > Key: CASSANDRA-11258 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11258 > Project: Cassandra > Issue Type: Sub-task >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > > Create a resource locking API & implementation that is able to lock a > resource in a specified data center. It should handle priorities to avoid > node starvation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11258) Repair scheduling - Resource locking API
[ https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193173#comment-15193173 ] Paulo Motta commented on CASSANDRA-11258: - Nice. I'll also start work on CASSANDRA-11190 soon. > Repair scheduling - Resource locking API > > > Key: CASSANDRA-11258 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11258 > Project: Cassandra > Issue Type: Sub-task >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > > Create a resource locking API & implementation that is able to lock a > resource in a specified data center. It should handle priorities to avoid > node starvation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
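The API described in the ticket (lock a resource in a specified data center, with priorities to avoid starvation) might look something like the sketch below. Names, signatures, and the single-JVM implementation are guesses for illustration; the real proposal would back this with a distributed store:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class RepairLocking {
    /** A held lock on a repair resource; closing it frees the resource for others. */
    interface LockedResource extends AutoCloseable {
        @Override void close();
    }

    /** Locks a named resource in a given data center, with a priority hint. */
    interface LockFactory {
        Optional<LockedResource> tryLock(String dataCenter, String resource, int priority);
    }

    /**
     * Toy single-JVM implementation: one holder per (dc, resource) key. The
     * priority is recorded so that a real implementation could let long-waiting
     * nodes raise their priority over time instead of starving.
     */
    static class InMemoryLockFactory implements LockFactory {
        private final Map<String, Integer> held = new HashMap<>();

        @Override
        public synchronized Optional<LockedResource> tryLock(String dataCenter, String resource, int priority) {
            String key = dataCenter + "/" + resource;
            if (held.containsKey(key))
                return Optional.empty();           // already locked by someone else
            held.put(key, priority);
            return Optional.of(() -> { synchronized (this) { held.remove(key); } });
        }
    }
}
```

Making `LockedResource` an `AutoCloseable` lets callers scope a repair session with try-with-resources, so the lock is released even on failure.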
[jira] [Commented] (CASSANDRA-10956) Enable authentication of native protocol users via client certificates
[ https://issues.apache.org/jira/browse/CASSANDRA-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193168#comment-15193168 ] Sam Tunnicliffe commented on CASSANDRA-10956: - Thanks for the patch, I second [~spo...@gmail.com]'s comment that this looks like a really useful addition. I don't think that extending {{IAuthenticator}} is necessarily the right way to go though. Making {{ICertificateAuthenticator}} a standalone top level interface makes the options around failure handling cleaner, because then it's pretty straightforward to first attempt cert based auth, but optionally fall back to SASL if necessary. To illustrate, I've pushed a branch [here|https://github.com/beobal/cassandra/tree/10956-trunk-v2] with an additional commit on top of the patch. As well as being able to specify a cert-based authenticator in cassandra.yaml, admins can also set the requirement level of that authentication, from a choice of 3 options: {{REQUIRED}}, {{NOT_REQUIRED}}, {{OPTIONAL}}. These new settings are in the {{client_encryption_options}} section of cassandra.yaml. * Choosing {{REQUIRED}} means that clients *must* provide credentials via the certificate (and hence the server should only accept encrypted connections). Furthermore, those credentials *must* be valid & the associated role *must* have {{LOGIN}} privilege. * {{OPTIONAL}} also prioritizes cert based auth, but should it fail for any reason, the server falls back to the existing auth mechanism, however that is configured in cassandra.yaml. * {{NOT_REQUIRED}} is the default if no other option is set in the yaml, meaning cert based auth is disabled and the connection is established as it is currently. There's probably no reason for an admin to actually configure it in this way, so it could be rejected (with an appropriate warning) if explicitly set. [~sklock], [~spo...@gmail.com] what are your thoughts on this approach? 
> Enable authentication of native protocol users via client certificates > -- > > Key: CASSANDRA-10956 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10956 > Project: Cassandra > Issue Type: New Feature >Reporter: Samuel Klock >Assignee: Samuel Klock > Attachments: 10956.patch > > > Currently, the native protocol only supports user authentication via SASL. > While this is adequate for many use cases, it may be superfluous in scenarios > where clients are required to present an SSL certificate to connect to the > server. If the certificate presented by a client is sufficient by itself to > specify a user, then an additional (series of) authentication step(s) via > SASL merely add overhead. Worse, for uses wherein it's desirable to obtain > the identity from the client's certificate, it's necessary to implement a > custom SASL mechanism to do so, which increases the effort required to > maintain both client and server and which also duplicates functionality > already provided via SSL/TLS. > Cassandra should provide a means of using certificates for user > authentication in the native protocol without any effort above configuring > SSL on the client and server. Here's a possible strategy: > * Add a new authenticator interface that returns {{AuthenticatedUser}} > objects based on the certificate chain presented by the client. > * If this interface is in use, the user is authenticated immediately after > the server receives the {{STARTUP}} message. It then responds with a > {{READY}} message. > * Otherwise, the existing flow of control is used (i.e., if the authenticator > requires authentication, then an {{AUTHENTICATE}} message is sent to the > client). > One advantage of this strategy is that it is backwards-compatible with > existing schemes; current users of SASL/{{IAuthenticator}} are not impacted. > Moreover, it can function as a drop-in replacement for SASL schemes without > requiring code changes (or even config changes) on the client side. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
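The control flow described in the review comment (attempt cert-based auth first, then either fail hard, fall back to SASL, or skip it entirely) can be sketched as follows. The enum values mirror the comment, but the interface and method names are illustrative, not the code on the linked branch:

```java
import java.util.Optional;

class CertAuthSketch {
    /** Requirement level for certificate-based authentication, per the comment above. */
    enum CertAuthRequirement { REQUIRED, OPTIONAL, NOT_REQUIRED }

    /** Stand-in for an authenticator deriving a role from the client cert chain. */
    interface CertificateAuthenticator {
        Optional<String> authenticate(String[] certChain);
    }

    /**
     * Returns the authenticated role, or empty if the connection should
     * proceed to the normal SASL exchange. Throws if REQUIRED cert auth fails.
     */
    public static Optional<String> tryCertAuth(CertAuthRequirement requirement,
                                               CertificateAuthenticator auth,
                                               String[] certChain) {
        if (requirement == CertAuthRequirement.NOT_REQUIRED)
            return Optional.empty();                  // existing flow, unchanged
        Optional<String> user = auth.authenticate(certChain);
        if (user.isPresent())
            return user;                              // READY right after STARTUP
        if (requirement == CertAuthRequirement.REQUIRED)
            throw new SecurityException("certificate authentication required but failed");
        return Optional.empty();                      // OPTIONAL: fall back to SASL
    }
}
```

Keeping the cert authenticator as a standalone interface means this decision happens once, before the SASL state machine is ever entered, which is what makes the fallback clean.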
[jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format
[ https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193082#comment-15193082 ] Stefania commented on CASSANDRA-11206: -- bq. IndexInfo is also used from {{UnfilteredRowIteratorWithLowerBound#getPartitionIndexLowerBound}} (CASSANDRA-8180) - not sure whether it's worth to deserialize the index for this functionality, *as it is currently restricted to the entries that are present in the key cache*. I tend to remove this access. If I am not mistaken, when the sstable iterator is created, the partition should be added to the key cache if not already present. Please have a look at BigTableReader {{iterator()}} and {{getPosition()}} to confirm. The reason we need the index info is that the lower bounds in the sstable metadata do not work for tombstones. This is the only lower bound we have for tombstones. If it's removed, then the optimization of CASSANDRA-8180 no longer works in the presence of tombstones (whether this is acceptable is up for discussion). Can't we add the partition bounds to the offset map? For completeness, I also add that we don't necessarily need a lower bound for the partition, it can be a lower bound for the entire sstable if easier. However it should work for tombstones, that is, it should be an instance of {{ClusteringPrefix}} rather than an array of {{ByteBuffer}} as it is currently stored in the sstable metadata. > Support large partitions on the 3.0 sstable format > -- > > Key: CASSANDRA-11206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11206 > Project: Cassandra > Issue Type: Improvement >Reporter: Jonathan Ellis >Assignee: Robert Stupp > Fix For: 3.x > > > Cassandra saves a sample of IndexInfo objects that store the offset within > each partition of every 64KB (by default) range of rows. To find a row, we > binary search this sample, then scan the partition of the appropriate range. 
> The problem is that this scales poorly as partitions grow: on a cache miss, > we deserialize the entire set of IndexInfo, which both creates a lot of GC > overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity > (relative to reading a single 64KB row range) as partitions get truly large. > We introduced an "offset map" in CASSANDRA-10314 that allows us to perform > the IndexInfo bsearch while only deserializing IndexInfo that we need to > compare against, i.e. log(N) deserializations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
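The offset-map idea the description refers to (binary-search the index while deserializing only the log(N) entries actually compared, never the full set) can be illustrated with a simplified model. Plain fixed-width longs stand in for clustering keys here; the real CASSANDRA-10314 format stores variable-width IndexInfo entries plus a trailing offset map:

```java
import java.nio.ByteBuffer;

class OffsetMapSearch {
    /** Serializes sorted keys as fixed-width longs (a stand-in for real IndexInfo). */
    public static ByteBuffer serialize(long[] sortedKeys) {
        ByteBuffer buf = ByteBuffer.allocate(sortedKeys.length * 8);
        for (long k : sortedKeys) buf.putLong(k);
        buf.flip();
        return buf;
    }

    /**
     * Binary search over the serialized form: each probe deserializes exactly
     * one entry, so only O(log N) entries are ever materialized on heap.
     * Returns the index of the last entry <= target, or -1 if none.
     */
    public static int floorIndex(ByteBuffer index, int count, long target) {
        int lo = 0, hi = count - 1, result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long key = index.getLong(mid * 8);   // deserialize just this entry
            if (key <= target) { result = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        return result;
    }
}
```

With variable-width entries the same scheme works by first reading the entry's offset from the offset map, then deserializing only that entry, which is the GC and I/O win the ticket is after.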
[jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format
[ https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192977#comment-15192977 ] Robert Stupp commented on CASSANDRA-11206: -- Quick progress status: * refactored the code to be able to handle "flat byte structures" (i.e. a {{byte[]}} at the moment - as a pre-requisite to directly access the index file) * IndexInfo is only used from {{AbstractSSTableIterator.IndexState}} - an instance to an open index-file is available, so removing the {{byte[]}} and accessing the index file directly is the next step. * unit and dtests are mostly passing (i.e. there are some flakey ones on cassci, which are passing locally). Still need to identify what's going on with the failing paging dtests. * cstar tests show similar results compared to current trunk * IndexInfo is also used from {{UnfilteredRowIteratorWithLowerBound#getPartitionIndexLowerBound}} (CASSANDRA-8180) - not sure whether it's worth to deserialize the index for this functionality, as it is currently restricted to the entries that are present in the key cache. I tend to remove this access. (/cc [~Stefania]) Observations: * accesses to IndexInfo objects are "random" during the binary search operation (as expected) * accesses to IndexInfo objects are "nearly sequential" during scan operations - "nearly" means, it accesses index N, then index N-1, then index N+1 before it actually moves ahead - but does some random accesses to previously accessed IndexInfo instances afterwards. Therefore {{IndexState}} "caches" the already deserialised {{IndexInfo}} objects. These should stay in new-gen as these are only referenced during the lifetime of the actual read. Alternatively it is possible to use a plain & boring LRU like cache for the 10 last IndexInfo objects in IndexState. 
* index-file writes (flushes/compactions) also used {{IndexInfo}} objects - replaced with a buffered write ({{DataOutputBuffer}}) Assumptions: * heap pressure due to the vast amount of {{IndexInfo}} objects is already handled by this patch (exchanged to one {{byte[]}} at the moment) both for reads and flushes/compactions * after replacing the {{byte[]}} with index file access, we could lower the (default) key-cache size since we then no longer cache {{IndexInfo}} objects on heap So the next step is to remove the {{byte[]}} from {{IndexedEntry}} and replace it with index-file access from {{IndexState}}. > Support large partitions on the 3.0 sstable format > -- > > Key: CASSANDRA-11206 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11206 > Project: Cassandra > Issue Type: Improvement >Reporter: Jonathan Ellis >Assignee: Robert Stupp > Fix For: 3.x > > > Cassandra saves a sample of IndexInfo objects that store the offset within > each partition of every 64KB (by default) range of rows. To find a row, we > binary search this sample, then scan the partition of the appropriate range. > The problem is that this scales poorly as partitions grow: on a cache miss, > we deserialize the entire set of IndexInfo, which both creates a lot of GC > overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity > (relative to reading a single 64KB row range) as partitions get truly large. > We introduced an "offset map" in CASSANDRA-10314 that allows us to perform > the IndexInfo bsearch while only deserializing IndexInfo that we need to > compare against, i.e. log(N) deserializations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
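The "plain & boring LRU like cache for the 10 last IndexInfo objects" floated above falls out of the JDK almost for free via `LinkedHashMap` in access order. A sketch (the class name and generic keying are illustrative; the real {{IndexState}} would key by entry index):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IndexInfoCache<K, V> {
    private final int capacity;
    private final LinkedHashMap<K, V> map;

    public IndexInfoCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true makes LinkedHashMap iterate least-recently-used
        // first, which is exactly what an LRU eviction policy needs.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > IndexInfoCache.this.capacity;
            }
        };
    }

    public V get(K key) { return map.get(key); }
    public void put(K key, V value) { map.put(key, value); }
    public int size() { return map.size(); }
}
```

Because the cache lives only for the duration of a single read, its contents stay in new-gen, matching the heap behaviour described in the comment.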
[jira] [Resolved] (CASSANDRA-11346) Can't create User Defined Functions with same name, different args/types
[ https://issues.apache.org/jira/browse/CASSANDRA-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp resolved CASSANDRA-11346. -- Resolution: Not A Problem Your Java code for this UDF itself is wrong - i.e. it cannot be compiled. That's what the error message says. > Can't create User Defined Functions with same name, different args/types > > > Key: CASSANDRA-11346 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11346 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.4 | Ruby driver 3.0.0-rc1 >Reporter: Kishan Karunaratne > > As of Cassandra 3.4, I can no longer create multiple UDFs with the same name, > but different args/types: > {noformat} > CREATE FUNCTION state_group_and_sum(state map<int, int>, star_rating int) > CALLED ON NULL INPUT > RETURNS map<int, int> > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > CREATE FUNCTION state_group_and_sum(state map<int, smallint>, > star_rating smallint) > CALLED ON NULL INPUT > RETURNS map<int, smallint> > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > {noformat} > Returns to the client: > {noformat} > InvalidRequest: code=2200 [Invalid query] message="Could not compile function > 'simplex.state_group_and_sum' from Java source: > org.apache.cassandra.exceptions.InvalidRequestException: Java source > compilation failed: > Line 1: The method put(Integer, Short) in the type Map<Integer, Short> is not > applicable for the arguments (Short, int) > Line 1: The method put(Integer, Short) in the type Map<Integer, Short> is not > applicable for the arguments (Short, int) > Line 1: Cannot cast from Short to Integer > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
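The compile errors quoted in the ticket stem from mixing the map's Integer keys with a smallint (Short) argument. A body that boxes the types explicitly would compile; the following is a hypothetical fix, shown as plain Java rather than embedded in the CQL string:

```java
import java.util.Map;

class UdfBodySketch {
    /**
     * Equivalent of the failing UDF body for a Map<Integer, Short> state with
     * a smallint (Short) rating argument: widen the Short to an Integer key
     * before calling get/put, and narrow the incremented value back to Short.
     */
    public static Map<Integer, Short> groupAndSum(Map<Integer, Short> state, Short starRating) {
        if (starRating == null)
            return state;
        Integer key = starRating.intValue();           // explicit Short -> Integer
        Short current = state.get(key);
        state.put(key, current == null ? (short) 1 : (short) (current + 1));
        return state;
    }
}
```

Note this only addresses the compiler errors; the separate question of overloading a UDF name with different argument types is what the ticket title is about.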
[jira] [Resolved] (CASSANDRA-11347) Can't create User Defined Functions with same name, different args/types
[ https://issues.apache.org/jira/browse/CASSANDRA-11347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp resolved CASSANDRA-11347. -- Resolution: Duplicate > Can't create User Defined Functions with same name, different args/types > > > Key: CASSANDRA-11347 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11347 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.4 | Ruby driver 3.0.0-rc1 >Reporter: Kishan Karunaratne > > As of Cassandra 3.4, I can no longer create multiple UDFs with the same name, > but different args/types: > {noformat} > CREATE FUNCTION state_group_and_sum(state map<int, int>, star_rating int) > CALLED ON NULL INPUT > RETURNS map<int, int> > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > CREATE FUNCTION state_group_and_sum(state map<int, smallint>, > star_rating smallint) > CALLED ON NULL INPUT > RETURNS map<int, smallint> > LANGUAGE java > AS 'if (state.get(star_rating) == null) > state.put(star_rating, 1); else state.put(star_rating, ((Integer) > state.get(star_rating)) + 1); return state;'; > {noformat} > Returns to the client: > {noformat} > InvalidRequest: code=2200 [Invalid query] message="Could not compile function > 'simplex.state_group_and_sum' from Java source: > org.apache.cassandra.exceptions.InvalidRequestException: Java source > compilation failed: > Line 1: The method put(Integer, Short) in the type Map<Integer, Short> is not > applicable for the arguments (Short, int) > Line 1: The method put(Integer, Short) in the type Map<Integer, Short> is not > applicable for the arguments (Short, int) > Line 1: Cannot cast from Short to Integer > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shogo Hoshii updated CASSANDRA-5977: Flags: Patch > Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Alyssa Kwan >Priority: Minor > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch, trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shogo Hoshii updated CASSANDRA-5977: Attachment: trunk-tablestats.patch Re-uploaded because the previous one included unnecessary diffs. > Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Reporter: Alyssa Kwan > Priority: Minor > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch, trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192816#comment-15192816 ] Shogo Hoshii commented on CASSANDRA-5977: - Hello, I added a -F option to the tablestats (cfstats) command so it can output its result as JSON or YAML. Without the -F option, the output is the same as in previous versions. Could someone review the attached patch? I worked on the trunk branch. Thank you > Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Reporter: Alyssa Kwan > Priority: Minor > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)
[ https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shogo Hoshii updated CASSANDRA-5977:
Attachment: tablestats_sample_result.txt, tablestats_sample_result.yaml, tablestats_sample_result.json, trunk-tablestats.patch

- trunk-tablestats.patch: adds the -F option to choose JSON or YAML format
- tablestats_sample_result.json: sample result displayed by `nodetool tablestats -F json`
- tablestats_sample_result.yaml: sample result displayed by `nodetool tablestats -F yaml`
- tablestats_sample_result.txt: sample result displayed by `nodetool tablestats`

> Structure for cfstats output (JSON, YAML, or XML) > - > > Key: CASSANDRA-5977 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5977 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Reporter: Alyssa Kwan > Priority: Minor > Attachments: tablestats_sample_result.json, > tablestats_sample_result.txt, tablestats_sample_result.yaml, > trunk-tablestats.patch > > > nodetool cfstats should take a --format arg that structures the output in > JSON, YAML, or XML. This would be useful for piping into another script that > can easily parse this and act on it. It would also help those of us who use > things like MCollective gather aggregate stats across clusters/nodes. > Thoughts? I can submit a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
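The point of the `-F json` output is that another script can parse it instead of scraping the plain-text table. As a dependency-free sketch of such a consumer (the sample payload and the `read_count` field name are assumptions, since the patch's actual JSON schema is not shown here; a real script would feed in the command's stdout and use a proper JSON library such as Jackson rather than a regex):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TablestatsSmokeTest {
    // Extract one numeric field from JSON text such as the output of
    // `nodetool tablestats -F json`. The regex approach keeps this sketch
    // self-contained; it only handles simple integer-valued fields.
    static long readCount(String json) {
        Matcher m = Pattern.compile("\"read_count\"\\s*:\\s*(\\d+)").matcher(json);
        if (!m.find()) throw new IllegalArgumentException("read_count not found");
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Stand-in for the command output; field name is hypothetical.
        String sample = "{\"keyspace1\": {\"table1\": {\"read_count\": 42}}}";
        System.out.println("read_count = " + readCount(sample));
    }
}
```

This is exactly the kind of aggregation-across-nodes use case the ticket description mentions: structured output makes the extraction step trivial and stable across releases.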