[ https://issues.apache.org/jira/browse/CASSANDRA-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537777#comment-14537777 ]
Stefania commented on CASSANDRA-8897:
-------------------------------------

Hi [~benedict], I am still working on tests, but if in the meantime you want to take a look at the code, here is what I did:

bq. currently this means we slice a 64Kb block into 1Kb units. To allocate smaller buffers, we can create a microChunk from an allocation within a chunk (at the localPool level), from which we can serve smaller requests (which could be served in multiples of 16 bytes, so we get finer granularity again). This could also help us avoid the problem of wastage if we were to, say, allocate a 64/32K buffer when we still had 16K spare in the current chunk, since we could convert the remainder into a microChunk for serving any small requests.

As discussed, I've paused on this in favor of a separate ticket.

bq. We could safely and cheaply assert the buffer has not already been freed

I've added assertions on the bits we are about to set in order to check this. The attachment is now replaced atomically; otherwise we get bad deallocations in Ref when multiple threads try to deallocate the same buffer at once, and the assertions on the bits could fail (I have a unit test where multiple threads release the same buffer).

bq. We could consider making this fully concurrent, dropping the normalFree and atomicFree, and just using the bitmap for determining its current status via a CAS. I was generally hoping to avoid introducing extra concurrency on the critical path, but we could potentially have two paths, one for concurrent and one for non-concurrent access, and introduce a flag so that any concurrent free on a non-concurrent path would fail. With or without this, though, I like the increased simplicity of only relying on the bitmap, since that means only a handful of lines of code to understand the memory management.

This is done, but, as previously discussed, there is still an assumption that only one thread can allocate a buffer from a given chunk at any one time. This is presently true, and it allows a simplification inside get(): we can CAS in a loop without changing the candidate, merely asserting that no one else has taken the candidate bits (see the first sketch below).

bq. We could consider making the chunks available for reallocation before they are fully free, since there's no difference between a partially or fully free chunk now for allocation purposes

This is also done. There is no longer a guarantee that a chunk in the global pool can allocate a buffer of a given size, since we recycle chunks before they are fully free. Therefore, the local pool keeps a deque of chunks and checks whether any of them can serve a buffer; if none can, it asks the global pool for a buffer directly and then takes ownership of the parent chunk. This way we avoid checking up front whether a chunk has enough space. The local pool recycles a chunk if it is not the head of the deque, as long as it is the owner of the chunk (see the second sketch below). The deque is not strictly necessary; it is just a small step towards supporting allocation across a range of sizes, as needed by CASSANDRA-8630.

Also, following your suggestions in the code, I added one configuration property to determine whether we can allocate on the heap once the pool is exhausted, and one flag to disable the pool entirely ({{-Dcassandra.test.disable_buffer_pool=true}}), the latter for use in tests. The long stress burn test has been added as well, though I may change it slightly tomorrow.
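To make the bitmap discussion concrete, here is a minimal sketch of the idea, not the actual patch: the chunk's entire allocation state is a single long bitmap (a set bit marks a free unit), {{free()}} asserts the bits it is about to set are currently clear so double frees trip an assertion, and {{get()}} relies on the single-allocator assumption to retry the same candidate on a CAS failure. All names here are illustrative.

{code:java}
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

// Illustrative sketch only: a 64-unit chunk whose entire state is a single
// long bitmap, where a set bit marks a free allocation unit.
class Chunk
{
    private volatile long freeSlots = -1L; // all units free initially
    private static final AtomicLongFieldUpdater<Chunk> freeSlotsUpdater =
            AtomicLongFieldUpdater.newUpdater(Chunk.class, "freeSlots");

    // Free path: may be called concurrently, so CAS the bits back in, and
    // assert the bits we are about to set are currently clear -- if they are
    // not, the buffer has already been freed.
    void free(long slotBits)
    {
        while (true)
        {
            long cur = freeSlots;
            assert (cur & slotBits) == 0 : "buffer already freed";
            if (freeSlotsUpdater.compareAndSet(this, cur, cur | slotBits))
                return;
        }
    }

    // Allocation path: assumes only one thread allocates from a given chunk
    // at a time, so no one else can take our candidate; a CAS failure can
    // only mean a concurrent free set some *other* bits, hence we retry with
    // the same candidate and merely assert it is still free.
    long get(int units)
    {
        int index = findFreeRun(freeSlots, units);
        if (index < 0)
            return 0L; // no contiguous free run in this chunk
        long candidate = mask(units) << index;
        while (true)
        {
            long cur = freeSlots;
            assert (cur & candidate) == candidate : "candidate bits taken by another allocator";
            if (freeSlotsUpdater.compareAndSet(this, cur, cur & ~candidate))
                return candidate;
        }
    }

    private static long mask(int units)
    {
        return units == 64 ? -1L : (1L << units) - 1;
    }

    // Find the lowest run of `units` consecutive set (free) bits, or -1.
    private static int findFreeRun(long bitmap, int units)
    {
        long m = mask(units);
        for (int i = 0; i + units <= 64; i++)
            if ((bitmap & (m << i)) == (m << i))
                return i;
        return -1;
    }
}
{code}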
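Similarly, a sketch of the local pool path described above; {{Chunk}} and {{GlobalPool}} here are hypothetical stand-ins for the real API, not the patch itself:

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical interfaces standing in for the patch's actual API.
interface Chunk
{
    ByteBuffer get(int size); // null if the chunk cannot serve the request
}

interface GlobalPool
{
    ByteBuffer get(int size);        // null if the global pool is exhausted
    Chunk parentChunk(ByteBuffer b); // the chunk the buffer was sliced from
    void recycle(Chunk chunk);
}

class LocalPool
{
    private final ArrayDeque<Chunk> chunks = new ArrayDeque<>();
    private final GlobalPool global;

    LocalPool(GlobalPool global)
    {
        this.global = global;
    }

    // Try the chunks we already own; if none can serve the request, ask the
    // global pool for a buffer directly and take ownership of its parent
    // chunk. This avoids checking up front whether a chunk has enough space,
    // which we cannot guarantee once chunks are recycled before they are
    // fully free.
    ByteBuffer get(int size)
    {
        for (Chunk chunk : chunks)
        {
            ByteBuffer buffer = chunk.get(size);
            if (buffer != null)
                return buffer;
        }
        ByteBuffer buffer = global.get(size);
        if (buffer == null)
            return null; // pool exhausted; caller may fall back to the heap
        chunks.addFirst(global.parentChunk(buffer));
        return buffer;
    }

    // A chunk may be handed back for reallocation before it is fully free,
    // provided this pool owns it and it is not the head of the deque.
    void maybeRecycle(Chunk chunk)
    {
        if (chunk != chunks.peekFirst() && chunks.remove(chunk))
            global.recycle(chunk);
    }
}
{code}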
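And a sketch of how the two knobs might be read; only the {{cassandra.test.disable_buffer_pool}} property name comes from this comment, the heap-fallback property name is made up for illustration:

{code:java}
public class BufferPoolConfig
{
    // Disables the pool entirely, for use in tests, e.g.
    //   java -Dcassandra.test.disable_buffer_pool=true ...
    public static final boolean DISABLED =
            Boolean.getBoolean("cassandra.test.disable_buffer_pool");

    // Whether to fall back to on-heap allocation once the pool is exhausted
    // (hypothetical property name).
    public static final boolean ALLOCATE_ON_HEAP_WHEN_EXHAUSTED =
            Boolean.getBoolean("cassandra.buffer_pool.allocate_on_heap_when_exhausted");
}
{code}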
> Remove FileCacheService, instead pooling the buffers
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8897
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8897
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> After CASSANDRA-8893, a RAR will be a very lightweight object and will not
> need caching, so we can eliminate this cache entirely. Instead we should have
> a pool of buffers that are page-aligned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)