[jira] [Commented] (CASSANDRA-16681) org.apache.cassandra.utils.memory.LongBufferPoolTest - tests are flaky

Piotr Kolaczkowski (Jira) Tue, 31 May 2022 02:21:05 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-16681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544259#comment-17544259
 ]


Piotr Kolaczkowski commented on CASSANDRA-16681:
------------------------------------------------

> Code analysis is simpler, as less is changed. 

How adding a new parameter + modifying all callers is "less change" than a 
single `Thread.currentThread() == owningThread` check that is explicit and 
guaranteed to do the right thing?

The thread check was the only thing needed to correct for that particular 
problem described in point 1. 

Anyway, I'm fine as long as we still keep an `assert Thread.currentThread() == 
owningThread` in that place, to make it obvious and fail fast in case someone 
gets the boolean flag wrong.

The fact it was broken and multiple people didn't see it for many months 
already proves the property of "only one thread can enter here" is totally not 
obvious.

And now it would be even harder when we have one more parameter, because now 
the reader of the code would have to analyze all the callers who call put if 
they set this flag properly.

 

> The public API is not modified, it is entirely internal to `BufferPool`.

Technically you're right, but this is one of the main APIs of a major internal 
component, which in itself is quite complex and has many callers.

LocalPool's put is called from the main API of the BufferPool as well  as 
indirectly from chunks when they get recycled, including recursive call from 
other put (you can have put twice in a call trace). 

 

> No, this would behave correctly as we would be invoking it on the tinyPool 
> that we own, and so we would pass `isOwner = true`

But what about the already evicted tiny buffers? Owner == null for those.

 

> Lazy recycling can introduce more pathological behaviour, particularly for 
> `TinyPool` as we may have large allocations that have all been freed, and a 
> single tiny slither in a number of tiny pools that prevent them from being 
> released. If these are themselves part of mostly free _chunks_ we could end 
> up exhausting our global pool entirely without any of it being in use.

Indeed, good point.

> org.apache.cassandra.utils.memory.LongBufferPoolTest - tests are flaky
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-16681
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16681
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CI
>            Reporter: Ekaterina Dimitrova
>            Assignee: Piotr Kolaczkowski
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x, 4.x
>
>         Attachments: 0001-Fix-memoryInUse-counter-in-BufferPool.patch, 
> 0002-Multiple-fixes-in-BufferPool-and-LongBufferPoolTest.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Jenkins history:
> [https://jenkins-cm4.apache.org/job/Cassandra-4.0/50/testReport/junit/org.apache.cassandra.utils.memory/LongBufferPoolTest/testPoolAllocateWithRecyclePartially/history/]
> Fails being run in a loop in CircleCI:
> https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/844/workflows/945011f4-00ac-4678-89f6-5c0db0a40169/jobs/5008
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (CASSANDRA-16681) org.apache.cassandra.utils.memory.LongBufferPoolTest - tests are flaky

Reply via email to