On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <beggles...@apple.com> wrote:
> I think we’re getting a little ahead of ourselves talking about DI
> frameworks. Before that even becomes something worth talking about, we’d
> need to have made serious progress on un-spaghettifying Cassandra in the
> first place. It’s an extremely tall order. Adding a DI framework right now
> would be like throwing gasoline on a raging tire fire.
>
> Removing singletons seems to come up every 6-12 months, and usually
> abandoned once people figure out how difficult they are to remove properly.
> I do think removing them *should* be a long term goal, but we really need
> something more immediately actionable. Otherwise, nothing’s going to
> happen, and we’ll be having this discussion again in a year or so when
> everyone’s angry that Cassandra 5.0 still isn’t ready for production, a
> year after it’s release.
>
> That said, the reason singletons regularly get brought up is because doing
> extensive testing of anything in Cassandra is pretty much impossible, since
> the code is basically this big web of interconnected global state. Testing
> anything in isolation can’t be done, which, for a distributed database, is
> crazy. It’s a chronic problem that handicaps our ability to release a
> stable database.
>
> At this point, I think a more pragmatic approach would be to draft and
> enforce some coding standards that can be applied in day to day development
> that drive incremental improvement of the testing and testability of the
> project. What should be tested, how it should be tested. How to write new
> code that talks to the rest of Cassandra and is testable. How to fix bugs
> in old code in a way that’s testable. We should also have some guidelines
> around refactoring the wildly untested sections, how to get started, what
> to do, what not to do, etc.
>
> Thoughts?

To make the conversation practical.
There is one class I personally really want to refactor so it can be tested:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java

There is little coverage here. Questions like:

- What errors cause the connection to restart?
- When are undroppable messages dropped?
- What happens when the queue fills up?
- The infamous

      throw new AssertionError(ex);

  (which probably bubbles up to nowhere).
- What does the COALESCED strategy do in case XYZ?
- A nifty label (wow, a label, you just never see those much!):

      outer:
      while (!isStopped)

- Comments pointing to JIRAs that probably are not explicitly tested:

      // If we haven't retried this message yet, put it back on the queue to retry after re-connecting.
      // See CASSANDRA-5393 and CASSANDRA-12192.

If I were to undertake this cleanup, would there actually be support? I.e., is this going to turn into an "it ain't broke, don't fix it" thing, or a "we don't want to change stuff just to add tests" thing? Like, will someone pledge to agree it's kinda wonky and merge the effort in less than a year's time?
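
For illustration only, here is a rough sketch of the kind of isolated test the retry comment quoted above would seem to call for. Every name in it (RetryOnceSketch, Msg, Sender, Connection) is hypothetical and none of it is taken from OutboundTcpConnection. It only assumes that the socket write could be put behind a small interface so a test can inject failures, and that a message is re-queued once and dropped on the second failure.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: none of these types exist in Cassandra.
public class RetryOnceSketch {

    /** A queued message that remembers whether it has already been retried. */
    static final class Msg {
        final String payload;
        final boolean retried;

        Msg(String payload, boolean retried) {
            this.payload = payload;
            this.retried = retried;
        }
    }

    /** Stand-in for the socket write, so a test can inject failures. */
    interface Sender {
        void send(String payload) throws IOException;
    }

    static final class Connection {
        final Queue<Msg> backlog = new ArrayDeque<>();
        final List<String> dropped = new ArrayList<>();
        private final Sender sender;

        Connection(Sender sender) {
            this.sender = sender;
        }

        void enqueue(String payload) {
            backlog.add(new Msg(payload, false));
        }

        /** Sends the head of the backlog; on failure re-queues it once, then drops it. */
        void processOne() {
            Msg m = backlog.poll();
            if (m == null) {
                return;
            }
            try {
                sender.send(m.payload);
            } catch (IOException e) {
                if (!m.retried) {
                    backlog.add(new Msg(m.payload, true)); // first failure: retry later
                } else {
                    dropped.add(m.payload);                // second failure: give up
                }
            }
        }
    }

    public static void main(String[] args) {
        // A sender that always fails, simulating a broken connection.
        Connection c = new Connection(p -> { throw new IOException("simulated failure"); });
        c.enqueue("m1");

        c.processOne();
        if (c.backlog.size() != 1 || !c.dropped.isEmpty()) {
            throw new AssertionError("first failure should re-queue the message");
        }

        c.processOne();
        if (!c.backlog.isEmpty() || c.dropped.size() != 1) {
            throw new AssertionError("second failure should drop the message");
        }
        System.out.println("retry-once behaviour holds in this sketch");
    }
}
```

The point of the sketch is only that once the I/O is injectable, "what happens on the second failure?" becomes a few lines of test rather than a question nobody can answer.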