Refactoring cassandra service package

Simon Chemouil Tue, 03 Jun 2014 01:54:07 -0700

Hi,

I'm new to Cassandra and felt like exploring and hacking on the code. I
was surprised to see the usage of so many mutable/global state statics
all over the service package (basically global variables/singletons).


While I understand it can be practical to work with singletons, and that
in any case I'm not sure multi-tenant Cassandra (as in two different
Cassandra instances within the same process) would make sense at all (or
even work considering there is some native access going on with JNA), I
find static state can easily lead to tangled 'spaghetti' code (accessing
the singletons from anywhere, even where one shouldn't), and in general
it ties the code to the VM instance, rather than to the class.

I tried to find if it was an actual design choice, but from my
understanding this is more something inherited from the early Cassandra
times at Facebook. I just found this thread[1] pointing to issue
CASSANDRA-741 (slightly more limited scope) that was marked as WONTFIX
because no one took it (but still marked as open for contribution). The
current code conventions also don't mention the usage of singletons
except by stating:  "Do not extract interfaces (or abstract classes)
unless you actually need multiple implementations of it" (switching to a
"service"-style design doesn't require passing interfaces but it's
highly encouraged to help testability).

So, I'd like to try to make this refactoring happen and remove all (or
most) mutable static state. It would be an easy way in for me in
Cassandra's internals (maybe to contribute further). I think it would
help testing (ability to unit test components without going to the
storage for instance) and in general modernize the code. It would also
make hacking on Cassandra easier because people could pick different
pieces without pulling the whole thing.

It would definitely break backwards compatibility with current Java code
that directly embeds Cassandra / uses it as a library, but I would keep
the same abstraction so the refactoring would be easy. In any case,
backwards compatibility can be broken by many more changes than just
refactoring, and once this is done it will be easier to deal with
backwards compatibility.

Obviously all ".instance" fields would be gone, and I'd try to fix
potential cyclic class dependencies and generally make sure classes
dependencies form a direct acyclic graph with CassandraDaemon as its
root. The basic idea is to have each 'service' component require all its
service dependencies in their constructor (and keeping them as a final
field), rather than getting them via the global namespace (singleton
instances).

If I had it my way, I'd probably use a dependency injection framework,
namely Dagger which is as far as I knpw the lightest Java DI framework
actively developed (jointly developed by Square and Google's Java team
responsible for Guice & Guava), which has a neat compile-time annotation
processor that detects missing dependencies early on. It works with both
Android and J2SE and is very fast, simple and light (65kB vs 710kB for
Guice).

So, the question is: would you guys accept such a patch? I'd rather not
do the work if it has no chance of being merged upstream :).

Cheers,

-- 
Simon


[1]
http://grokbase.com/t/cassandra/dev/107xr48hek/creating-two-instances-in-code

Refactoring cassandra service package

Reply via email to