[ 
https://issues.apache.org/jira/browse/CASSANDRA-18176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687171#comment-17687171
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-18176 at 2/10/23 4:39 PM:
-----------------------------------------------------------------------------

As part of any backport work, could you take a look at why we aren't seeing 
the strong loop detection errors anywhere in our CI or, if we are, why they 
aren't being surfaced? Alternatively, perhaps just create a dedicated strong 
loop test that runs the detector at least once after running some basic 
tasks?

This is a pretty serious issue, as it completely disables all of our leak 
detection, which is obviously very important. So we want to catch mistakes 
here as early as possible.


> Merged SSTable files not reclaimed by OS
> ----------------------------------------
>
>                 Key: CASSANDRA-18176
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18176
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction
>            Reporter: Pere Balaguer
>            Priority: Normal
>             Fix For: 4.0.x
>
>
> (EDIT: Looks like this is masked by the lack of backport of CASSANDRA-17205 
> just on the 4.0 line - will backport that and block here on it)
> After upgrading to Cassandra 4.x (4.0.1 and 4.0.5) we've noticed that, at 
> times, the disk space of sstables deleted after a compaction doesn't get 
> reclaimed by the OS until the Cassandra process is restarted (which kind of 
> points at some sort of resource leak). I do not recall this happening in 
> Cassandra 3, at least not to such a degree.
> We've seen the behavior in multiple clusters with different schemas, access 
> patterns and consistency levels at somewhat "random" points in time. The only 
> interesting thing is that there were active repair sessions at the time 
> affecting the node, keyspace and table.
> {noformat}
> $ date +%Y-%m-%d
> 2023-01-17
> $ nodetool version
> ReleaseVersion: 4.0.5
> {noformat}
> {noformat}
> $ lsof +L1 | grep cassandra | grep myawesomecluster | wc -l
> 2772
> $ lsof +L1 | grep cassandra | grep myawesomecluster | tail -n1
> java      59003 cassandra *979u   REG  253,8        10     0 1208053768 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Digest.crc32
>  (deleted)
> {noformat}
> {noformat}
> $ grep 2274426 /cassandra/systemlog.log
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,969 BigTableZeroCopyWriter.java:203 - Writing component DATA to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Data.db
>  length 2.900KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,969 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Data.db
>  length 2.900KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,970 BigTableZeroCopyWriter.java:203 - Writing component 
> PRIMARY_INDEX to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Index.db
>  length 3.739KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,970 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Index.db
>  length 3.739KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,970 BigTableZeroCopyWriter.java:203 - Writing component STATS to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Statistics.db
>  length 5.062KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,970 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Statistics.db
>  length 5.062KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,971 BigTableZeroCopyWriter.java:203 - Writing component 
> COMPRESSION_INFO to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-CompressionInfo.db
>  length 0.054KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,971 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-CompressionInfo.db
>  length 0.054KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,971 BigTableZeroCopyWriter.java:203 - Writing component FILTER to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Filter.db
>  length 0.031KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,971 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Filter.db
>  length 0.031KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,972 BigTableZeroCopyWriter.java:203 - Writing component SUMMARY to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Summary.db
>  length 0.436KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,972 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Summary.db
>  length 0.436KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,972 BigTableZeroCopyWriter.java:203 - Writing component DIGEST to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Digest.crc32
>  length 0.010KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,972 BigTableZeroCopyWriter.java:213 - Block Writing component to 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Digest.crc32
>  length 0.010KiB
> INFO  [Stream-Deserializer-/10.214.79.62:randomport-b904af67] 2023-01-15 
> 13:06:24,974 SSTableReaderBuilder.java:351 - Opening 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big
>  (2.900KiB)
> INFO  [CompactionExecutor:54141] 2023-01-15 13:06:24,978 
> CompactionTask.java:150 - Compacting (6a3e8320-94d5-11ed-b2c8-7b967b642f39) 
> [/cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274422-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274411-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274423-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274410-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274412-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274413-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274414-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274419-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274425-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274424-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274418-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274421-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274420-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274416-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274417-big-Data.db:level=0,
>  
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274415-big-Data.db:level=0,
>  ]
> INFO  [NonPeriodicTasks:1] 2023-01-15 13:06:25,254 SSTable.java:111 - 
> Deleting sstable: 
> /cassandra-data/data/myawesomecluster/schema_one-randomchecksum/nb-2274426-big
> {noformat}
> {noformat}
> $ nodetool compactionhistory | grep '2023-01-15T13:06' 
> 75032a40-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:43.044 53958242   52853746   {1:478098, 2:11313, 3:1765, 
> 4:287, 5:185, 6:88, 7:39, 8:22, 9:13, 10:7, 11:3, 14:1}
> 6a4539e0-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:25.022 107577     30110      {1:25, 2:66, 3:124, 4:16, 5:66, 
> 6:15, 9:1}
> 6a1b43b0-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:24.747 91018      29868      {1:16, 2:127, 3:148, 4:5, 5:7, 
> 6:6, 7:3, 8:1, 10:1}
> 6a063510-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:24.609 767        366        {2:2}
> 6a0523a0-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:24.602 87020      27917      {1:73, 2:38, 3:96, 4:15, 5:67, 
> 6:1}
> 69c5a9a0-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_one          
> 2023-01-15T13:06:24.186 45345      25814      {1:117, 2:150, 3:3}
> 6956bb30-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_two          
> 2023-01-15T13:06:23.459 8925831    8904158    {1:102662, 2:120}
> 5d886c40-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_two          
> 2023-01-15T13:06:03.652 8901698    8901269    {1:102606, 2:9}
> 5d02c180-94d5-11ed-b2c8-7b967b642f39 myawesomecluster    schema_two          
> 2023-01-15T13:06:02.776 8976310    8900384    {1:102844, 2:324}
> 5bf9dd00-94d5-11ed-b2c8-7b967b642f39 system_distributed repair_history        
> 2023-01-15T13:06:01.040 4848124    4847506    {4:2}
> 5bc27950-94d5-11ed-b2c8-7b967b642f39 system_distributed parent_repair_history 
> 2023-01-15T13:06:00.677 351877     351135     {1:2383, 2:1}
> {noformat}
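
For anyone who wants to repeat the `lsof +L1` check from the description without lsof, here is a minimal sketch that reads the symlinks under /proc/<pid>/fd directly. It is not part of the report or of Cassandra. It assumes a Linux host, where the kernel appends " (deleted)" to the link target of an unlinked file that is still open. The function names and the `path_filter` parameter are hypothetical, chosen only for illustration.

```python
import os
import sys
from typing import Iterable, List, Tuple

# Linux marks unlinked-but-still-open files this way in /proc/<pid>/fd targets.
DELETED_SUFFIX = " (deleted)"


def deleted_targets(links: Iterable[Tuple[str, str]],
                    path_filter: str = "") -> List[Tuple[str, str]]:
    """Return (fd, path) pairs whose target is marked deleted.

    Only targets containing path_filter are kept; the default empty
    filter keeps every deleted target.
    """
    result = []
    for fd, target in links:
        if target.endswith(DELETED_SUFFIX) and path_filter in target:
            # Strip the marker so the caller sees the original path.
            result.append((fd, target[: -len(DELETED_SUFFIX)]))
    return result


def read_fd_links(pid: int) -> List[Tuple[str, str]]:
    """Read the fd symlinks of a process (needs permission to inspect it)."""
    fd_dir = f"/proc/{pid}/fd"
    links = []
    for fd in os.listdir(fd_dir):
        try:
            links.append((fd, os.readlink(os.path.join(fd_dir, fd))))
        except OSError:
            # The descriptor may have been closed between listdir and readlink.
            continue
    return links


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python deleted_fds.py <pid> [path-substring]
    pid = int(sys.argv[1])
    substring = sys.argv[2] if len(sys.argv) > 2 else ""
    for fd, path in deleted_targets(read_fd_links(pid), substring):
        print(fd, path)
```

Run it with the Cassandra process id and a path substring such as the keyspace directory name. A count that keeps growing between compactions would be consistent with the behavior described above. A file whose real name ends in " (deleted)" would be reported as a false positive.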


