Thinking N>1

Stefano Mazzocchi Mon, 08 Oct 2007 12:50:47 -0700

[apologies for the cross-posting]

There is a trend emerging in the IT space about system architectures: if
your software system is not designed with N>1 in mind from the start
(where N is the number of coordinated instances of the software running
on shared-nothing machines), it's going to be a problem later.


Google, Yahoo and Amazon are famous for their N>1 architectural philosophy.

Typical web site infrastructures are heavily multi-tiered and
horizontally scalable in several of these tiers, but the data-management
layer is notoriously N=1, at least in principle, and it's pretty much a
given that today's growing pains in web infrastructure scalability are
centered on the data-management layer (which is normally an RDBMS).

We (SIMILE) are currently developing a system that tries really hard to
be N>1-friendly but uses Sesame HTTP Sails as the persistent
data-management layer and memcached as a way to avoid querying the
triple store unless absolutely necessary. My queries are normally very
simple, stuff like "s ?p ?o" or "s p ?o" or "?s p o", which, in fact,
don't need a triple store at all: a simple key-value store (such as
Amazon Dynamo or CouchDB) would do just fine.
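
A minimal sketch of how those three patterns could be served by plain
key lookups, assuming one index entry is written per access pattern. The
class name, the key layout and the in-memory dict standing in for the
remote key-value store are all hypothetical; this is not how Sesame,
Dynamo or CouchDB are implemented.

```python
from collections import defaultdict


class KeyValueTripleIndex:
    """Serves "s ?p ?o", "s p ?o" and "?s p o" with one key lookup each."""

    def __init__(self):
        # A dict stands in for the remote key-value store (assumption).
        self.store = defaultdict(set)

    def add(self, s, p, o):
        # One write per access pattern, so every read is a single get.
        self.store[("s", s)].add((p, o))    # serves "s ?p ?o"
        self.store[("sp", s, p)].add(o)     # serves "s p ?o"
        self.store[("po", p, o)].add(s)     # serves "?s p o"

    def predicates_and_objects(self, s):
        return sorted(self.store.get(("s", s), set()))

    def objects(self, s, p):
        return sorted(self.store.get(("sp", s, p), set()))

    def subjects(self, p, o):
        return sorted(self.store.get(("po", p, o), set()))
```

The trade-off in this sketch is write amplification: each triple costs
three writes so that each of the simple reads costs exactly one lookup.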

But there are times (rare, but important) when a slightly more complex
query is required in order to avoid making hundreds of key-value calls.

Right now, under development and with practically zero load, the HTTP
Sail performance is completely reasonable (especially since memcached or
local stores handle most of the load anyway) but I'm growing more and
more concerned about relying on a data-management tier that is, in fact,
designed with an N=1 architecture in mind.
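
The caching arrangement described above is the usual cache-aside
pattern. A minimal sketch, assuming a dict in place of a real memcached
client and a hypothetical `backend_query` callable in place of the HTTP
Sail call (neither name comes from Sesame or memcached):

```python
class CachedStore:
    """Cache-aside: ask the cache first, hit the triple store only on a miss."""

    def __init__(self, cache, backend_query):
        self.cache = cache                  # dict standing in for memcached
        self.backend_query = backend_query  # hypothetical triple-store call
        self.backend_calls = 0              # counts how often the N=1 tier is hit

    def query(self, pattern):
        if pattern in self.cache:
            return self.cache[pattern]
        # Cache miss: this is the only path that touches the backend.
        self.backend_calls += 1
        result = self.backend_query(pattern)
        self.cache[pattern] = result
        return result

    def invalidate(self, pattern):
        # Call after a write so that stale results are not served.
        self.cache.pop(pattern, None)
```

The cache absorbs repeated reads, but every miss and every write still
lands on the single backend, which is where the N=1 concern comes from.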

Sure, I could use a MySQL store instead of a native store, in case the
native store performance turns out to be suboptimal... or we could write
a BerkeleyDB-based native store to squeeze out some more performance...
or we could add more processors to the machine... but if your web site
starts to grow, it grows quadratically, and there is no way you can
scale hardware on a single machine quadratically (without having your
own infrastructural costs grow even more than that!).

According to Amazon's Dynamo paper[1], the 'quality of service'
requirement for the persistent data-management layer is "99.9% of the
requests have to be answered in less than 300ms with a load of 500
requests/sec".

Obviously, we don't have such a high requirement for quality of service,
but I would very much like to have "95% of the requests have to be
answered in less than 300ms with a load of 30 req/sec", which is a *lot*
more feasible in real life but still incredibly problematic with an N=1
architectural vision (especially for fault-tolerance).
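
To make that target concrete, here is a minimal sketch of how the
latency half of such a requirement could be checked against measured
request times. The function name and its parameters are hypothetical,
and it only covers the latency percentile; the samples would have to be
collected while the system is actually serving about 30 req/sec.

```python
def meets_qos(latencies_ms, percent=95, threshold_ms=300):
    """Return True if at least `percent`% of samples are under threshold_ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for ms in latencies_ms if ms < threshold_ms)
    # Compare fast/total >= percent/100 without dividing.
    return fast * 100 >= percent * len(latencies_ms)
```

For example, 95 requests at 100ms plus 5 at 900ms would pass, while 94
fast requests plus 6 slow ones would not.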

So, the trend is to move the higher level data management and semantics
completely to the application level and to rely on fast, massively
scalable and completely decentralized and self-managing key-value
'clouds' that expose a super simple "get/set/delete" hashtable-like API,
even as a web service.
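
A minimal sketch of what such a "get/set/delete" API might look like
when keys are spread over N machines with a hash ring. This is an
illustration only: the class and method names are hypothetical, each
node is just an in-process dict, and replication, failure handling and
node membership changes are all left out.

```python
import bisect
import hashlib


class HashRingStore:
    """Hashtable-like API over N nodes; each key lives on exactly one node."""

    def __init__(self, node_names, points_per_node=64):
        if not node_names:
            raise ValueError("need at least one node")
        # Each dict stands in for one shared-nothing machine (assumption).
        self.nodes = {name: {} for name in node_names}
        ring = []
        for name in node_names:
            # Several points per node smooth out the key distribution.
            for i in range(points_per_node):
                ring.append((self._hash(f"{name}#{i}"), name))
        ring.sort()
        self.ring = ring
        self.positions = [pos for pos, _ in ring]

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self.positions, self._hash(key))
        return self.ring[i % len(self.ring)][1]

    def get(self, key):
        return self.nodes[self.node_for(key)].get(key)

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def delete(self, key):
        self.nodes[self.node_for(key)].pop(key, None)
```

The point of the ring is that the caller never needs to know N: adding
a machine changes which node owns some keys, not the API.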

My bet is that we'll see a lot of such "dynamo"-like systems emerging in
the future, more or less easy to maintain and to manage, more or less
reliable, more or less written in widely known languages (Dynamo is
written in Java) or powerful but largely unknown ones (CouchDB is
written in Erlang).

The question then becomes: what about triple stores?

One of the reasons why triple stores are appealing as a data-management
tier in a web application is that they favor data-centric incremental
development: when you start prototyping, it's practically impossible to
know what your relational data model is going to look like once your web
site goes into production. Data-first data-management approaches (triple
stores, key-value stores, OODBMS) are much more natural in following the
evolution of a prototype than structure-first data-management approaches
(current-generation SQL-based RDBMS).

But unlike OODBMS (then) and triple stores (now), key-value stores are
the only ones that focus on delivering performance more than on
delivering RDBMS-like functionality.

Let's face it: RDF is nothing but the good old entity-relationship model
(which is the base of any relational database) with URIs sprinkled on
top. And if it were possible to 'scale' an implementation of the
entity-relationship model without requiring you to 'compile' its
structure/schema into the database, it would already exist.

Instead, the trend is to go with key-value pairs, using map/reduce jobs
to 'precompile' the queries, and to keep N>1 firmly in mind.

[Column stores could be seen as a bunch of clever disk-I/O-influenced
optimizations of the above... but they still feel N=1 deep in their
souls, which concerns me]

                                  - o -

But here is what I think: key-value stores are great, simple to use and
refreshingly scalable, should the need emerge. If there were an open
source Dynamo today, I would probably use it.

But there isn't... and I wonder: how hard would it be to adapt the Sesame
implementation to start considering itself an N>1 application? Could we
obtain the "95% req under 300ms with 30 req/sec" QoS with "s p ?o" queries?

I don't care if SPARQL is slow, I'll cache the results or throw new
silicon at it... but how hard would it be to make Sesame feel as
refreshing and comforting as Dynamo does (scalability and QoS-wise), at
least for the very basic data-management functionality?

Because it would be a huge win: for simple queries, it would perform
just like a key-value store; for more complicated queries, it would be
slower but still scalable.

Yes, I perfectly understand that distribution means graph clustering,
minimum cuts, distributed transactions and all of that tarpit that sank
most of the advancements in RDBMS technology over the last 20 years...
but what if SPARQL queries are "injected" ahead of time and computed
with map/reduce jobs? And if a query is not injected ahead of time, such
'exploratory' queries will be slow. Who cares, so be it.
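
A minimal sketch of what "precompiling" one injected query with a
map/reduce job could look like, run here in a single process. The query
is a hypothetical two-pattern join ("?person ex:worksAt ?org . ?org
ex:locatedIn ?city"); the predicates, function names and result key
layout are all made up for illustration, and no real map/reduce
framework or SPARQL engine is involved.

```python
from collections import defaultdict


def map_triple(triple):
    """Map: key each relevant triple by the join variable ?org."""
    s, p, o = triple
    if p == "ex:worksAt":
        yield o, ("person", s)
    elif p == "ex:locatedIn":
        yield s, ("city", o)


def reduce_org(org, values):
    """Reduce: pair every person at this org with every city of this org."""
    people = [v for tag, v in values if tag == "person"]
    cities = [v for tag, v in values if tag == "city"]
    for person in people:
        for city in cities:
            yield person, city


def precompute(triples):
    """Run the job and return key-value pairs ready to load into a store."""
    groups = defaultdict(list)
    for triple in triples:                      # map phase
        for key, value in map_triple(triple):
            groups[key].append(value)           # shuffle: group by ?org
    results = defaultdict(set)
    for org, values in groups.items():          # reduce phase
        for person, city in reduce_org(org, values):
            results[("works_in_city", person)].add(city)
    return dict(results)
```

Once the results are loaded into the key-value tier, answering the
injected query for one person is a single get on ("works_in_city",
person); only queries nobody registered ahead of time pay the slow path.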

What do you think?

-- 
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
-------------------------------------------------------------------
