Hi Michael,

> The trivial fix is to use DOCID/REVISIONID as DOC_KEY.
Yes, that’s definitely one way to address storage of edit conflicts. I think there are other, more compact representations that we can explore if we have this “exploded” data model where each scalar value maps to an individual KV pair. E.g., if you have two large revisions of a document that differ in only one field, it is possible to write down a model where both revisions share all the rest of the KV pairs, and there’s a special flag in the value of the conflicted path which indicates that an edit branch occurred there. I guess we’ll

> I'm assuming the process will flatten the key paths of the document into an
> array and then request the value of each key as multiple parallel queries
> against FDB at once

Ah, I think this is not one of Ilya’s assumptions. He’s trying to design a model which allows the retrieval of a document with a single range read, which is a good goal in my opinion. I do think a small number of parallel reads can be OK, e.g. retrieving some database-level mapping information in parallel to the encoded document. We should try to avoid serializing reads, though, and issuing a separate read for every field of a document would place an unnecessarily heavy load on the system.

> Assuming it only does "prefix" and not "segment", then I don't think this
> will help because the DOCID for each key in JSON_PATH will be different,
> making the "prefix" to each path across different documents distinct.

I’m not sure I follow you here, or perhaps we have different understandings of the proposal. When I’m reading a document in this model I’m retrieving a set of keys that all share the same {DOCID}. Moreover, if I’ve got e.g. an array sitting in some deeply nested part of the document, the entire path doc.foo.bar.baz.myarray is common to every element of the array, so it’s actually quite a nice case for prefix elision.

> I think the answer is assuming every document modification can upload in
> multiple txns.

I would like to avoid this if possible.
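To make the “exploded” model concrete, here is a minimal sketch of flattening a JSON document into per-scalar KV pairs keyed by (DOCID, JSON_PATH...) and reading it back with a single prefix scan, standing in for an FDB tuple-layer range read. All the names here (flatten, read_doc, the sample document) are hypothetical illustrations, not part of the actual proposal, and a plain in-memory dict stands in for the FDB keyspace:

```python
def flatten(docid, node, path=()):
    """Yield ((docid, *path), scalar) for every leaf value in the document."""
    if isinstance(node, dict):
        for k, v in node.items():
            yield from flatten(docid, v, path + (k,))
    elif isinstance(node, list):
        for i, v in enumerate(node):
            yield from flatten(docid, v, path + (i,))
    else:
        yield (docid,) + path, node

# Hypothetical document; a sorted dict stands in for the FDB keyspace.
kv = dict(flatten("doc1", {"foo": {"bar": {"myarray": [1, 2, 3]}}, "x": "y"}))

def read_doc(kv, docid):
    """One logical 'range read': every key sharing the (docid,) prefix."""
    return {k: v for k, v in sorted(kv.items()) if k[0] == docid}
```

Every key for "doc1" shares the ("doc1",) prefix, and the array elements additionally share the longer ("doc1", "foo", "bar", "myarray") prefix, which is the elision-friendly case described above.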
It adds a lot of extra complexity (the subspace-with-atomic-rename dance, for example), and I think CouchDB should be focused on use cases that do fit within the 10 MB / 5 second limit.

Cheers, Adam