[
https://issues.apache.org/jira/browse/CASSANDRA-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114218#comment-14114218
]
Brandon Williams commented on CASSANDRA-7239:
---------------------------------------------
Here's one way to repro fairly easily: stick an assert/throw in
org.apache.cassandra.db.DataTracker.spaceReclaimed after it decrements, then do
some stress inserts. Took around 4M for me to trigger it:
{noformat}
INFO 19:26:13 Enqueuing flush of Standard1: 87030528 (33%) on-heap, 0 (0%)
off-heap
INFO 19:26:13 Writing Memtable-Standard1@1502715954(24933920 serialized bytes,
566680 ops, 33%/0% of on/off-heap limit)
INFO 19:26:15 Completed flushing
/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-14-Data.db
(31847416 bytes) for commitlog position
ReplayPosition(segmentId=1409253765886, position=18443872)
INFO 19:26:15 Enqueuing flush of compactions_in_progress: 1334 (0%) on-heap, 0
(0%) off-heap
INFO 19:26:15 Writing Memtable-compactions_in_progress@1683103686(144
serialized bytes, 9 ops, 0%/0% of on/off-heap limit)
INFO 19:26:15 Completed flushing
/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-6-Data.db
(167 bytes) for commitlog position ReplayPosition(segmentId=1409253765887,
position=857003)
INFO 19:26:15 Compacting
[SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-12-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-13-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-11-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-14-Data.db')]
INFO 19:26:18 Enqueuing flush of Standard1: 87028992 (33%) on-heap, 0 (0%)
off-heap
INFO 19:26:18 Writing Memtable-Standard1@1478442071(24930840 serialized bytes,
566610 ops, 33%/0% of on/off-heap limit)
INFO 19:26:20 Completed flushing
/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-16-Data.db
(31843482 bytes) for commitlog position
ReplayPosition(segmentId=1409253765887, position=21153003)
INFO 19:26:23 Enqueuing flush of Standard1: 87039744 (33%) on-heap, 0 (0%)
off-heap
INFO 19:26:23 Writing Memtable-Standard1@708372436(24935900 serialized bytes,
566725 ops, 33%/0% of on/off-heap limit)
INFO 19:26:25 Completed flushing
/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-17-Data.db
(31849945 bytes) for commitlog position
ReplayPosition(segmentId=1409253765888, position=23869152)
INFO 19:26:28 Enqueuing flush of Standard1: 87031296 (33%) on-heap, 0 (0%)
off-heap
INFO 19:26:28 Writing Memtable-Standard1@1583340894(24930840 serialized bytes,
566610 ops, 33%/0% of on/off-heap limit)
INFO 19:26:29 Enqueuing flush of compactions_in_progress: 180 (0%) on-heap, 0
(0%) off-heap
INFO 19:26:29 Writing Memtable-compactions_in_progress@1333345401(0 serialized
bytes, 1 ops, 0%/0% of on/off-heap limit)
INFO 19:26:29 Completed flushing
/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-7-Data.db
(42 bytes) for commitlog position ReplayPosition(segmentId=1409253765890,
position=1491303)
INFO 19:26:30 Completed flushing
/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-18-Data.db
(31843482 bytes) for commitlog position
ReplayPosition(segmentId=1409253765889, position=26577952)
INFO 19:26:30 Compacted 4 sstables to
[/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-15,].
127,390,788 bytes to 127,390,788 (~100% of original) in 15,199ms =
7.993244MB/s. 453,348 total partitions merged to 453,348. Partition merge
counts were {1:453348, }
INFO 19:26:33 Enqueuing flush of Standard1: 87026688 (33%) on-heap, 0 (0%)
off-heap
INFO 19:26:33 Writing Memtable-Standard1@159807468(24931940 serialized bytes,
566635 ops, 33%/0% of on/off-heap limit)
INFO 19:26:34 Completed flushing
/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-19-Data.db
(31844887 bytes) for commitlog position
ReplayPosition(segmentId=1409253765890, position=29289191)
INFO 19:26:34 Enqueuing flush of compactions_in_progress: 1334 (0%) on-heap, 0
(0%) off-heap
INFO 19:26:34 Writing Memtable-compactions_in_progress@1430076737(144
serialized bytes, 9 ops, 0%/0% of on/off-heap limit)
INFO 19:26:34 Completed flushing
/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-8-Data.db
(172 bytes) for commitlog position ReplayPosition(segmentId=1409253765891,
position=7686763)
INFO 19:26:34 Compacting
[SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-5-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-8-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-6-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-7-Data.db')]
INFO 19:26:34 Compacting
[SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-17-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-19-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-16-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1-b8aa5ae02ee811e4a834517bcdb23258/Keyspace1-Standard1-ka-18-Data.db')]
ERROR 19:26:34 Exception in thread Thread[NonPeriodicTasks:1,5,main]
java.lang.RuntimeException: disk space is negative
at
org.apache.cassandra.db.DataTracker.spaceReclaimed(DataTracker.java:422)
~[main/:na]
at
org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:96)
~[main/:na]
at
org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:655)
~[main/:na]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
~[na:1.7.0_60]
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
~[na:1.7.0_60]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
~[na:1.7.0_60]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
~[na:1.7.0_60]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_60]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_60]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_60]
INFO 19:26:34 Compacted 4 sstables to
[/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-9,].
423 bytes to 172 (~40% of original) in 51ms = 0.003216MB/s. 4 total
partitions merged to 1. Partition merge counts were {1:2, 2:1, }
{noformat}
And indeed nodetool is reporting negative load:
{noformat}
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID
Rack
UN 10.208.35.225 -481904132 bytes 256 100.0%
38dd30d1-de65-43dd-9bcb-b0cd018f050c rack1
{noformat}
So it looks like we have some kind of race between decrementing and
incrementing.
> Nodetool Status Reports Negative Load With VNodes Disabled
> ----------------------------------------------------------
>
> Key: CASSANDRA-7239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7239
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: 1000 Nodes EC2 m1.large ubuntu 12.04
> Reporter: Russell Alexander Spitzer
> Priority: Minor
> Fix For: 2.1.1
>
> Attachments: nodetool.png, opscenter.png
>
>
> When I run stress on a large cluster without vnodes (num_token =1 initial
> token set) The loads reported by nodetool status are negative, or become
> negative after stress is run.
> {code}
> UN 10.97.155.31 -447426217 bytes 1 0.2%
> 8d40568c-044c-4753-be26-4ab62710beba rack1
>
> UN 10.9.132.53 -447342449 bytes 1 0.2%
> 58e7f255-803d-493b-a19e-58137466fb78 rack1
>
> UN 10.37.151.202 -447298672 bytes 1 0.2%
> ba29b1f1-186f-45d0-9e59-6a528db8df5d rack1
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)