Hello, I was wondering whether it’d make sense to normalise the RDB Document Store schema - get rid of the JSON/JSOP concatenated strings and store each key/value in a separate database row. Something like this:
id STRING
key STRING
revision STRING (nullable)
value (LONG) STRING
modcount INTEGER
The id+key+revision would make a unique primary key.
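A minimal sketch of how this schema could be declared, using Python's sqlite3 module as a stand-in for the real RDB. The table name "nodes" and the column types are assumptions made for illustration. One deviation from the draft above: the SQL standard requires primary key columns to be NOT NULL, so the sketch stores an empty string instead of NULL for rows without a revision. That is only one possible workaround, not part of the proposal.

```python
import sqlite3

# Hypothetical table name; SQLite stands in for the real database.
# "key" and "value" are quoted because they are reserved words in some databases.
DDL = """
CREATE TABLE nodes (
    id       TEXT    NOT NULL,
    "key"    TEXT    NOT NULL,
    revision TEXT    NOT NULL DEFAULT '',  -- assumption: '' replaces NULL
    "value"  TEXT,
    modcount INTEGER NOT NULL,
    PRIMARY KEY (id, "key", revision)
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# List the column names the database actually created.
columns = [row[1] for row in conn.execute("PRAGMA table_info(nodes)")]
print(columns)
```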
So, an example node from the DocumentMK documentation [1]:
{
"_id" : "1:/node",
"_deleted" : {
"r13f3875b5d1-0-1" : "false"
},
"_lastRev" : {
"r0-0-1" : "r13f3875b5d1-0-1"
},
"_modified" : NumberLong(274208361),
"_modCount" : NumberLong(1),
"_children" : Boolean(true),
"_revisions" : {
"r13f3875b5d1-0-1" : "c"
}
}
would transform into the following database rows:
(id, key, revision, value, modcount)
("1:/node", "_deleted", "r13f3875b5d1-0-1", "false", 1)
("1:/node", "_lastRev", "r0-0-1", "r13f3875b5d1-0-1", 1)
("1:/node", "_modified", null, "274208361", 1)
("1:/node", "_children", null, "true", 1)
("1:/node", "_revisions", "r13f3875b5d1-0-1", "c", 1)
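The mapping from the document to the rows above can be sketched as follows. The helper names document_to_rows and to_string are made up for this example, and the document is written as a plain Python dict. The _id and _modCount fields do not get rows of their own, since they are carried by the id and modcount columns.

```python
def to_string(value):
    """Render a scalar the way the example rows do (booleans in lower case)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def document_to_rows(doc):
    """Flatten a document into (id, key, revision, value, modcount) tuples."""
    doc_id = doc["_id"]
    modcount = doc["_modCount"]
    rows = []
    for key, value in doc.items():
        if key in ("_id", "_modCount"):
            continue  # stored in the id and modcount columns instead
        if isinstance(value, dict):
            # Revisioned property: one row per revision.
            for revision, rev_value in value.items():
                rows.append((doc_id, key, revision, to_string(rev_value), modcount))
        else:
            # Plain property: no revision.
            rows.append((doc_id, key, None, to_string(value), modcount))
    return rows


example = {
    "_id": "1:/node",
    "_deleted": {"r13f3875b5d1-0-1": "false"},
    "_lastRev": {"r0-0-1": "r13f3875b5d1-0-1"},
    "_modified": 274208361,
    "_modCount": 1,
    "_children": True,
    "_revisions": {"r13f3875b5d1-0-1": "c"},
}

rows = document_to_rows(example)
for row in rows:
    print(row)
```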
Creating a new document would require batching a few INSERTs. Updating a
document would combine INSERTs (for the new properties) and UPDATEs (for the
modified ones). Each update would end with a modcount increment for all rows
related to the given document. Fetching a document would require reading all
rows for the given id. I think all of these reads and writes can be done in
batches, so we’ll end up with a single database call anyway.
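To make the create/update/fetch flow concrete, here is a rough sketch, again with sqlite3 standing in for the real database. The function names, the "nodes" table, the empty-string marker for "no revision", and the second revision id and new _modified value in the usage part are all assumptions for illustration. Each write runs in one transaction, but the sketch issues several statements per update, so it does not demonstrate a true single round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: '' in the revision column means "no revision".
conn.execute("""
    CREATE TABLE nodes (
        id TEXT NOT NULL, "key" TEXT NOT NULL,
        revision TEXT NOT NULL DEFAULT '',
        "value" TEXT, modcount INTEGER NOT NULL,
        PRIMARY KEY (id, "key", revision))
""")


def create_document(conn, rows):
    """Insert all rows of a new document in one transaction."""
    with conn:
        conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?, ?)", rows)


def update_document(conn, doc_id, changes):
    """Apply (key, revision, value) changes, then bump modcount on every row."""
    with conn:
        current = conn.execute(
            "SELECT MAX(modcount) FROM nodes WHERE id = ?", (doc_id,)
        ).fetchone()[0]
        new_modcount = (current or 0) + 1
        for key, revision, value in changes:
            cur = conn.execute(
                'UPDATE nodes SET "value" = ? '
                'WHERE id = ? AND "key" = ? AND revision = ?',
                (value, doc_id, key, revision),
            )
            if cur.rowcount == 0:  # property did not exist yet, so insert it
                conn.execute(
                    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?)",
                    (doc_id, key, revision, value, new_modcount),
                )
        conn.execute(
            "UPDATE nodes SET modcount = ? WHERE id = ?", (new_modcount, doc_id)
        )


def fetch_document(conn, doc_id):
    """Read every row for the id and rebuild the nested document."""
    doc = {"_id": doc_id}
    query = 'SELECT "key", revision, "value", modcount FROM nodes WHERE id = ?'
    for key, revision, value, modcount in conn.execute(query, (doc_id,)):
        doc["_modCount"] = modcount
        if revision == "":
            doc[key] = value
        else:
            doc.setdefault(key, {})[revision] = value
    return doc


create_document(conn, [
    ("1:/node", "_deleted", "r13f3875b5d1-0-1", "false", 1),
    ("1:/node", "_lastRev", "r0-0-1", "r13f3875b5d1-0-1", 1),
    ("1:/node", "_modified", "", "274208361", 1),
    ("1:/node", "_children", "", "true", 1),
    ("1:/node", "_revisions", "r13f3875b5d1-0-1", "c", 1),
])
# Hypothetical follow-up change: one modified property, one new revision entry.
update_document(conn, "1:/node", [
    ("_modified", "", "274208400"),
    ("_revisions", "r13f3875b5d2-0-1", "c"),
])
doc = fetch_document(conn, "1:/node")
print(doc)
```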
Advantages I can see here are:
* no need to parse/serialize JSON and JSOP strings (less load on the Oak
instance),
* no need to periodically compact the JSOP strings,
* more granular updates are possible, we can properly implement all the
UpdateOp cases,
* we can make better use of the database features, as the DBE is now aware of
the document's internal structure (it’s not a blob anymore). E.g. we can fetch
only a few properties.
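As an illustration of the last point, fetching only selected properties becomes a plain WHERE clause. This is a sketch under the same assumptions as the draft schema (a hypothetical "nodes" table, sqlite3 as the database, '' meaning "no revision"); fetch_properties is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE nodes (
        id TEXT NOT NULL, "key" TEXT NOT NULL,
        revision TEXT NOT NULL DEFAULT '',
        "value" TEXT, modcount INTEGER NOT NULL,
        PRIMARY KEY (id, "key", revision))
""")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?, ?)", [
    ("1:/node", "_deleted", "r13f3875b5d1-0-1", "false", 1),
    ("1:/node", "_lastRev", "r0-0-1", "r13f3875b5d1-0-1", 1),
    ("1:/node", "_modified", "", "274208361", 1),
    ("1:/node", "_children", "", "true", 1),
    ("1:/node", "_revisions", "r13f3875b5d1-0-1", "c", 1),
])


def fetch_properties(conn, doc_id, keys):
    """Return (key, revision, value) rows for the requested properties only."""
    placeholders = ", ".join("?" for _ in keys)
    sql = (
        'SELECT "key", revision, "value" FROM nodes '
        f'WHERE id = ? AND "key" IN ({placeholders}) '
        'ORDER BY "key", revision'
    )
    return conn.execute(sql, [doc_id, *keys]).fetchall()


subset = fetch_properties(conn, "1:/node", ["_modified", "_children"])
print(subset)
```

The rest of the document (here _deleted, _lastRev and _revisions) is never read or parsed for this query.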
To me, such a design looks more natural and RDB-native. The schema is just a
draft and I’m probably missing something, but I wanted to ask for general
feedback on this approach. WDYT?
Regards,
Tomek
--
Tomek Rękawek | Adobe Research | www.adobe.com
