we are currently evaluating how to integrate a Couchbase NoSQL database [1] into a
sling resource tree. as a starting point i took a closer look at the MongoDB
resource provider [2], because the concept is quite similar.

some thoughts on this:

1. what is the status of the mongodb provider? is anyone already using it in
production? looking at the code, the CRUD handling does not seem to be
thread-safe, because it relies on non-synchronized hash maps.

2. how to map resource URLs to NoSQL: the mongodb provider has a syntax like:
<root_path>/<collection>/<custom_path>
where root_path and the mongodb database name are configurable via OSGi
(multiple entry points possible), collection has to match an existing
collection in mongodb, and the remaining path is mapped to a property of a
document in that collection.
i wonder if this is the best solution, the collection path part seems too 
restrictive to me (fails if the collection does not exist). i would favor 
specifying both root_path and collection via osgi allowing entry points with an 
unconstrained tree hierarchy below.
couchbase, for example, has no such collection concept; it only has
"buckets", which are comparable to the mongodb "database".
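
to illustrate the alternative: a minimal sketch of such an entry point, where
both root_path and the bucket/collection come from configuration and
everything below the root is an unconstrained path. the class and method
names (EntryPointMapping, toDocumentKey) are made up for this mail and are not
part of the existing mongodb provider.

```java
import java.util.Optional;

/** Hypothetical helper: maps a resource path to a document key below a configured root. */
final class EntryPointMapping {
    private final String rootPath; // assumed to come from OSGi configuration
    private final String bucket;   // bucket or collection name, also from configuration

    EntryPointMapping(String rootPath, String bucket) {
        // Normalize by dropping a trailing slash, so "/content/nosql/" equals "/content/nosql".
        this.rootPath = rootPath.endsWith("/")
                ? rootPath.substring(0, rootPath.length() - 1)
                : rootPath;
        this.bucket = bucket;
    }

    String bucket() {
        return bucket;
    }

    /** Returns the path relative to the root, or empty if the path is outside this entry point. */
    Optional<String> toDocumentKey(String resourcePath) {
        if (resourcePath.equals(rootPath)) {
            return Optional.of("/"); // the entry point itself
        }
        if (resourcePath.startsWith(rootPath + "/")) {
            return Optional.of(resourcePath.substring(rootPath.length()));
        }
        return Optional.empty();
    }
}
```

with this approach no path segment has to match an existing collection, so a
lookup can only miss because the document itself does not exist.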

3. the resource provider mixes two concerns: the in-memory CRUD handling, which
keeps maps of changed/deleted resources, and the mapping to the NoSQL
structure. if these two aspects were separated, the former could be reused for
all NoSQL databases, while the latter would be responsible only for the flat
resource-to-document mapping and would differ for each NoSQL database.
bonus: the thread-safety of the CRUD handling has to be implemented only once,
not once for each resource provider.
additional logic like mapping typed values to strings, generic value map
implementations, automatic tree creation etc. could be shared between all NoSQL
providers.
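
a rough sketch of how this split could look. all names (NoSqlAdapter,
ChangeTracker, InMemoryAdapter) are hypothetical and not taken from the
mongodb provider; the in-memory adapter only stands in for a real database,
and the thread-safety here is plain coarse-grained synchronization, which is
just one possible choice.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical database-specific part: flat path-to-document mapping only. */
interface NoSqlAdapter {
    Map<String, Object> get(String path); // null if no document exists
    void store(String path, Map<String, Object> properties);
    void remove(String path);
}

/** Hypothetical shared part: tracks uncommitted changes for any adapter. */
final class ChangeTracker {
    private final NoSqlAdapter adapter;
    private final Map<String, Map<String, Object>> changed = new HashMap<>();
    private final Set<String> deleted = new HashSet<>();

    ChangeTracker(NoSqlAdapter adapter) {
        this.adapter = adapter;
    }

    // All methods are synchronized so that both collections are always updated together.
    synchronized void put(String path, Map<String, Object> properties) {
        deleted.remove(path);
        changed.put(path, new HashMap<>(properties));
    }

    synchronized void delete(String path) {
        changed.remove(path);
        deleted.add(path);
    }

    /** Pending changes win over the stored state. */
    synchronized Map<String, Object> get(String path) {
        if (deleted.contains(path)) {
            return null;
        }
        Map<String, Object> pending = changed.get(path);
        return pending != null ? pending : adapter.get(path);
    }

    /** Pushes all pending changes to the database and clears them. */
    synchronized void commit() {
        for (String path : deleted) {
            adapter.remove(path);
        }
        for (Map.Entry<String, Map<String, Object>> entry : changed.entrySet()) {
            adapter.store(entry.getKey(), entry.getValue());
        }
        deleted.clear();
        changed.clear();
    }

    /** Discards all pending changes. */
    synchronized void revert() {
        deleted.clear();
        changed.clear();
    }
}

/** Stand-in adapter for illustration; a real one would talk to couchbase, mongodb etc. */
final class InMemoryAdapter implements NoSqlAdapter {
    private final Map<String, Map<String, Object>> documents = new ConcurrentHashMap<>();

    public Map<String, Object> get(String path) {
        return documents.get(path);
    }

    public void store(String path, Map<String, Object> properties) {
        documents.put(path, properties);
    }

    public void remove(String path) {
        documents.remove(path);
    }
}
```

a couchbase or mongodb provider would then only implement NoSqlAdapter, and
the change tracking would be written and tested once.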

4. an open point is whether to support binary data as well, or to leave it out 
in the first phase. storing binary data may be problematic for some NoSQL 
databases, requiring a separate storage concept for this. the mongodb resource 
provider currently does not support binary data.

5. there were plans to create a SOLR sling resource provider [3][4], which goes
roughly in the same direction, but it seems nothing came of it.

WDYT?

stefan

[1] http://www.couchbase.com
[2] https://svn.apache.org/repos/asf/sling/trunk/contrib/extensions/mongodb
[3] https://issues.apache.org/jira/browse/SLING-2795
[4] http://apache-sling.73963.n3.nabble.com/GSoC-2013-Apache-Solr-backend-for-Apache-Sling-tt4023347.html
