> > But your boss seems rather to be criticizing the fact that our system
> > is made of components. In software engineering, this is usually
> > considered a strength. As to 'roles', one of the bigtable author's
> > argues that a cluster of master and slaves makes for simpler systems
> > [1].
>
> I definitely agree with you. However, my boss considers the simplicity
> from the users' viewpoint. More components make the system more complex
> for users.

Who are the users? Are they deploying the software and responsible for maintaining backend databases? Or are there backend developers, frontend developers, operations, etc.?

In my experience, the "users" are generally writing the applications and not maintaining databases. And in the case of HBase, as has been said already on this thread, users generally have an easier time with the data and consistency models.

Above all, I think the point made by Stack earlier is extremely relevant. Are you using HDFS already? Do you have needs for ZK? If you do, HBase is an additional piece of this stack and generally fits in nicely. From an admin/ops POV, the learning curve is minimal once you are familiar with these other systems. And even if you aren't already using Hadoop, might you in the future?

If you don't and never will, then the single-component nature of Cassandra may be more appealing.

Also, vector clocks are nice but are still a distributed algorithm. We've been doing lots of work benchmarking and optimizing increments recently, pushing extremely high throughput on relatively small clusters. I would not expect to be able to achieve this level of performance or concurrency with any kind of per-counter distribution. Certainly not while providing the strict atomicity and consistency guarantees that HBase provides.

I've never implemented counters w/ vector clocks so I could be wrong. But I do know that I could explain how we implement counters in a performant, consistent, atomic way and you wouldn't have to reach for Wikipedia once ;)

JG
