As someone who spent a lot of time looking at the singletons topic in the past, Blake brings a great perspective here. Figuring out and communicating how best to test with the system we have (and of course incrementally making that system easier to work with/test) seems like an achievable goal.
On Fri, Mar 17, 2017 at 10:17 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote: > On Fri, Mar 17, 2017 at 12:33 PM, Blake Eggleston <beggles...@apple.com> > wrote: > > > I think we’re getting a little ahead of ourselves talking about DI > > frameworks. Before that even becomes something worth talking about, we’d > > need to have made serious progress on un-spaghettifying Cassandra in the > > first place. It’s an extremely tall order. Adding a DI framework right > now > > would be like throwing gasoline on a raging tire fire. > > > > Removing singletons seems to come up every 6-12 months, and usually > > abandoned once people figure out how difficult they are to remove > properly. > > I do think removing them *should* be a long term goal, but we really need > > something more immediately actionable. Otherwise, nothing’s going to > > happen, and we’ll be having this discussion again in a year or so when > > everyone’s angry that Cassandra 5.0 still isn’t ready for production, a > > year after it’s release. > > > > That said, the reason singletons regularly get brought up is because > doing > > extensive testing of anything in Cassandra is pretty much impossible, > since > > the code is basically this big web of interconnected global state. > Testing > > anything in isolation can’t be done, which, for a distributed database, > is > > crazy. It’s a chronic problem that handicaps our ability to release a > > stable database. > > > > At this point, I think a more pragmatic approach would be to draft and > > enforce some coding standards that can be applied in day to day > development > > that drive incremental improvement of the testing and testability of the > > project. What should be tested, how it should be tested. How to write new > > code that talks to the rest of Cassandra and is testable. How to fix bugs > > in old code in a way that’s testable. We should also have some guidelines > > around refactoring the wildly untested sections, how to get started, what > > to do, what not to do, etc. > > > > Thoughts? > > > To make the conversation practical. There is one class I personally really > want to refactor so it can be tested: > > https://github.com/apache/cassandra/blob/trunk/src/java/ > org/apache/cassandra/net/OutboundTcpConnection.java > > There is little coverage here. Questions like: > what errors cause the connection to restart? > when are undropable messages are dropped? > what happens when the queue fills up? > Infamous throw new AssertionError(ex); (which probably bubble up to > nowhere) > what does the COALESCED strategy do in case XYZ. > A nifty label (wow a label you just never see those much!) > outer: > while (!isStopped) > > Comments to jira's that probably are not explicitly tested: > // If we haven't retried this message yet, put it back on the queue to > retry after re-connecting. > // See CASSANDRA-5393 and CASSANDRA-12192. > > If I were to undertake this cleanup, would there actually be support? IE if > this going to turn into an "it aint broken. don't fix it thing" or a "we > don't want to change stuff just to add tests" . Like will someone pledge to > agree its kinda wonky and merge the effort in < 1 years time? >