Also, whatever is chosen needs to scale to millions of documents, and
I wonder about an embedded DB doing that. I also have a hard time
believing that both a DB w/ millions of docs and Solr can live on the
same machine, which is presumably what an embedded DB must do.
Presumably, it also needs to be able to be replicated, right?
H2 can take millions of docs. In any case, as long as we rely only on JDBC,
SQL scaling and replication become a standard, known, solved problem.
I've had over 10 million 'docs' in embedded Derby... it didn't break a
sweat. I don't think the embedded part is much of a hindrance... you're in
the same JVM, so you have those limitations, but otherwise it's mostly
the same as non-embedded...
If done right, it's also easy to abstract out so that switching from
embedded to non-embedded is very easy.
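
A rough sketch of that abstraction, assuming plain JDBC with Derby: the
class name DocStore and its methods are hypothetical, not anything that
exists in Solr. Only the JDBC URL differs between the two modes, so callers
never need to know which one they got. The matching Derby driver jar still
has to be on the classpath before open() will work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper: the only place that knows whether the store is embedded. */
final class DocStore {
    private final String jdbcUrl;

    private DocStore(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    /** Embedded Derby: the database runs inside this JVM and is created if missing. */
    static DocStore embedded(String dbName) {
        return new DocStore("jdbc:derby:" + dbName + ";create=true");
    }

    /** Derby network server on another machine (1527 is Derby's default port). */
    static DocStore remote(String host, int port, String dbName) {
        return new DocStore("jdbc:derby://" + host + ":" + port + "/" + dbName);
    }

    String jdbcUrl() {
        return jdbcUrl;
    }

    /** Callers use the returned Connection the same way in either mode. */
    Connection open() throws SQLException {
        return DriverManager.getConnection(jdbcUrl);
    }
}
```

Moving the DB off the Solr machine would then be a one-line change from
DocStore.embedded("docstore") to DocStore.remote("dbhost", 1527, "docstore").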
- Mark