[ https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15340371#comment-15340371 ]
Joshua McKenzie commented on CASSANDRA-10202:
---------------------------------------------

bq. I can assure you the prior implementation was no less custom
I (probably a bit too snarkily) was alluding to that when I stated "which is saying something".

bq. by gaining this it becomes a target for criticism
Honestly, having repeatedly run into races and timing issues with tests and changes for CDC, the segment allocation logic in the CommitLog is a target for criticism in my mind, as the trade-off between complexity and value gained from this implementation falls on the side of "not worth it" to me, specifically w/regards to new file allocation and swapping.

bq. on that front I think your arguments are pretty fundamentally flawed
To enumerate them, my arguments are:
* I'm -1 on us including our own implementation of a concurrent linked list unless we can strongly justify the inclusion of that complexity.
** I stand by this. While I don't love what we *have* from a subtlety / side-effect management perspective, at least it's been in there for a while and had some bugs flushed out.
* We have to maintain this code
** This is less an argument and more something I think we need to remind ourselves of when debating adding a new, custom implementation of a relatively statefully complex customized collection to the code-base.
* I find this container even more unnecessarily complex to reason about than our current CommitLogSegmentManager.advanceAllocatingFrom
** Key here is "unnecessarily complex", and I mean this specifically w/regards to the segment allocation logic. Going back and looking at the #'s from CASSANDRA-3578, it's clear there's a marked improvement in CPU utilization and ops/sec throughput; however, I'm skeptical as to how much of that is due to the logic surrounding new segment creation and signalling vs. the multi-threaded CommitLogSegment.Allocation logic.
* This reminds me a lot of CASSANDRA-7282 where we're taking out a battle-tested concurrent collection in favor of writing our own from scratch
** I don't mean to pick at old wounds or re-start old battles, but the marginal gains in performance we get from these changes seem *heavily* outweighed by the developer man-hours that go into maintaining them and fixing the subtle and complex bugs that come along with these types of implementations.

bq. the project is still ideologically opposed to custom algorithms
I think it's less that the project is ideologically opposed to custom algorithms, and more that specific vocal people (I am completely guilty of this) are very complexity averse in a code-base of this size and existing complexity, and want very strong justifications for decisions that appear to be adding more complexity than the perceived performance benefits they grant.

> simplify CommitLogSegmentManager
> --------------------------------
>
>                 Key: CASSANDRA-10202
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Jonathan Ellis
>            Assignee: Branimir Lambov
>            Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the
> old recycling design.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)