Not sure I follow. If a transaction needs the full revision tree for a single 
document it can retrieve that with a single range read for the (“_meta”, DocID) 
prefix.
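A minimal sketch of that read (the in-memory dict stands in for FoundationDB, whose tuple layer preserves the same key ordering; the value layout follows the (VersionstampForCurrentRev, [ParentRev, …]) shape proposed below):

```python
def range_read(store, prefix):
    """Return all KVs whose key starts with `prefix`, in key order --
    the moral equivalent of a single get_range over that prefix."""
    return [(k, v) for k, v in sorted(store.items()) if k[:len(prefix)] == prefix]

# Two edit branches for one document: branch key -> (versionstamp, ancestor revs).
# Key shape ("_meta", DocID, NotDeleted, RevPosition, RevHash) is from this thread;
# the hashes and versionstamps are made up for illustration.
store = {
    ("_meta", "foo", 1, 3, "abc"): ("vs1", ["2-ppp", "1-qqq"]),
    ("_meta", "foo", 1, 2, "def"): ("vs2", ["1-qqq"]),
    ("_meta", "bar", 1, 1, "zzz"): ("vs3", []),
}

# One prefix read retrieves every branch of "foo"'s revision tree.
branches = range_read(store, ("_meta", "foo"))
for key, value in branches:
    print(key, value)
```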

Adam

> On Feb 8, 2019, at 6:35 PM, Paul Davis <paul.joseph.da...@gmail.com> wrote:
> 
> Ah, that all sounds good. The only thing I'm not initially seeing as
> obvious is how we look up a revision path to extend during replication
> when the previous revision may be anywhere in the list of $revs_limit
> revisions. Feels like there might be some sort of trickery to do that
> efficiently. Although it may also be good enough to just issue
> $revs_limit lookups in parallel given that we're maxed out on either
> $revs_limit or 2*$revs_limit if we have to check for both deleted and
> not.
> 
> On Fri, Feb 8, 2019 at 10:22 AM Adam Kocoloski <kocol...@apache.org> wrote:
>> 
>> Totally unacceptable! ;)  In fact some key bits of that model got dispersed 
>> into at least two separate emails so you’re likely not the only one. I’ll 
>> restate here:
>> 
>> The size limits in FoundationDB preclude us from storing the entire key tree 
>> as a single value; in pathological situations the tree could exceed 100KB. 
>> Rather, I think it would make sense to store each edit *branch* as a 
>> separate KV. We stem the branch long before it hits the value size limit, 
>> and in the happy case of no edit conflicts this means we store the edit 
>> history metadata in a single KV. It also means that we can apply an 
>> interactive edit without retrieving the entire conflicted revision tree; we 
>> need only retrieve and modify the single branch against which the edit is 
>> being applied. The downside is that we may duplicate historical revision 
>> identifiers shared by multiple edit branches, but I think this is a 
>> worthwhile tradeoff.
>> 
>> I’d also ensure that a document update only needs to read the edit branch KV 
>> against which the update is being applied, and it can read that branch 
>> immediately knowing only the content of the edit that is being attempted 
>> (i.e., it does not need to read the current version of the document itself).
>> 
>> I think we achieve these goals with a separate subspace (maybe “_meta”?) for 
>> the revision trees, with keys and values that look like
>> 
>> (“_meta”, DocID, NotDeleted, RevPosition, RevHash) = 
>> (VersionstampForCurrentRev, [ParentRev, GrandparentRev, …])
>> 
>> Some notes:
>> 
>> - including NotDeleted in the key ensures that we can efficiently accept an 
>> upload of a new document with the same ID when all previous edit branches 
>> have been deleted; i.e. we can construct a key selector which automatically 
>> tells us there are no deleted=false edit branches
>> - access to VersionstampForCurrentRev ensures we can clear the old entry in 
>> the by_seq space during the update
>> - I need to remind myself how we handle an edit attempt which supplies a 
>> _rev representing a deleted leaf. Do we fail that as a conflict? That would 
>> be the natural thing to do here; otherwise we’re forced to check both 
>> deleted=false and deleted=true keys
>> - the keys can be made to naturally sort so that the winning revision sorts 
>> last, but I don’t believe that’s a requirement here like it is for the 
>> actual document data space
>> 
>> Cheers, Adam
>> 
>>> On Feb 8, 2019, at 8:59 AM, Paul Davis <paul.joseph.da...@gmail.com> wrote:
>>> 
>>>> I’m relatively happy with the revision history data model at this point.
>>> 
>>> I forgot to make a note, but which of the various models are you
>>> referring to by "revision history data model". There's been so many
>>> without firm names that my brain is having a hard time parsing that
>>> one.
>>> 
>>> On Thu, Feb 7, 2019 at 9:35 PM Adam Kocoloski <kocol...@apache.org> wrote:
>>>> 
>>>> Bob, Garren, Jan - heard you loud and clear, K.I.S.S. I do think it’s a 
>>>> bit “simplistic” to exclusively choose simplicity over performance and 
>>>> storage density. We’re (re)building a database here, one that has some 
>>>> users with pretty demanding performance and scalability requirements. And 
>>>> yes, we should certainly be testing and measuring. Kyle and team are 
>>>> setting up infrastructure in IBM land to help with that now, but I also 
>>>> believe we can design the data model and architecture with a basic 
>>>> performance model of FoundationDB in mind:
>>>> 
>>>> - reads cost 1ms
>>>> - short range reads are the same cost as a single lookup
>>>> - reads of independent parts of the keyspace can be parallelized for cheap
>>>> - writes are zero-cost until commit time
>>>> 
>>>> We ought to be able to use these assumptions to drive some decisions about 
>>>> data models ahead of any end-to-end performance test.
>>>> 
>>>> If there are specific elements of the edit conflicts management where you 
>>>> think greater simplicity is warranted, let’s get those called out. Ilya 
>>>> noted (correctly, in my opinion) that the term sharing stuff is one of 
>>>> those items. It’s relatively complex, potentially a performance hit, and 
>>>> only saves on storage density in the corner case of lots of edit 
>>>> conflicts. That’s a good one to drop.
>>>> 
>>>> I’m relatively happy with the revision history data model at this point. 
>>>> Hopefully folks find it easy to grok, and it’s efficient for both reads 
>>>> and writes. It costs some extra storage for conflict revisions compared to 
>>>> the current tree representation (up to 16K per edit branch, with default 
>>>> _revs_limit) but knowing what we know about the performance death spiral 
>>>> for wide revision trees today I’ll happily make a storage vs. performance 
>>>> tradeoff here :)
>>>> 
>>>> Setting the shared term approach aside, I’ve still been mulling over the 
>>>> key structure for the actual document data:
>>>> 
>>>> -  I thought about trying to construct a special _conflicts subspace, but 
>>>> I don’t like that approach because the choice of a “winning" revision can 
>>>> flip back and forth very quickly with concurrent writers to different edit 
>>>> branches. I think we really want to have a way for revisions to naturally 
>>>> sort themselves so the winner is the first or last revision in a list.
>>>> 
>>>> - Assuming we’re using key paths of the form (docid, revision-ish, path, 
>>>> to, field), the goal here is to find an efficient way to get the last key 
>>>> with prefix “docid” (assuming winner sorts last), and then all the keys 
>>>> that share the same (docid, revision-ish) prefix as that one. I see two 
>>>> possible approaches so far, neither perfect:
>>>> 
>>>> Option 1: Execute a get_key() operation with a key selector that asks for 
>>>> the last key less than “docid\xFF” (again assuming winner sorts last), and 
>>>> then do a get_range_startswith() request setting the streaming mode to 
>>>> “want_all” and the prefix to the docid plus whatever revision-ish we found 
>>>> from the get_key() request. This is two roundtrips instead of one, but it 
>>>> always retrieves exactly the right set of keys, and the second step is 
>>>> executed as fast as possible.
>>>> 
>>>> Option 2: Jump straight to get_range_startswith() request using only 
>>>> “docid” as the prefix, then cancel the iteration once we reach a revision 
>>>> not equal to the first one we see. We might transfer too much data, or we 
>>>> might end up doing multiple roundtrips if the default “iterator” streaming 
>>>> mode sends too little data to start (I haven’t checked what the default 
>>>> iteration block is there), but in the typical case of zero edit conflicts 
>>>> we have a good chance of retrieving the full document in one roundtrip.
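A sketch of Option 1 against an in-memory sorted key list (the `/`-joined byte keys and the helper names here are illustrative stand-ins for the FoundationDB key selector and get_range_startswith() calls):

```python
import bisect

# Assumed, illustrative key shape: docid / revision-ish / field path,
# with the winning revision-ish sorting last.
keys = sorted([
    b"foo/1-abc/owner",
    b"foo/1-def/active",
    b"foo/1-def/owner/name",   # 1-def sorts after 1-abc, so it is the "winner"
])

def get_key_last_less_than(keys, upper):
    """Key-selector equivalent: last key strictly less than `upper`."""
    i = bisect.bisect_left(keys, upper)
    return keys[i - 1] if i > 0 else None

def get_range_startswith(keys, prefix):
    return [k for k in keys if k.startswith(prefix)]

# Round trip 1: the last key under the docid carries the winning revision-ish.
last = get_key_last_less_than(keys, b"foo\xff")
winner_prefix = b"/".join(last.split(b"/")[:2])  # docid + revision-ish

# Round trip 2: fetch exactly the winner's keys.
doc_keys = get_range_startswith(keys, winner_prefix)
print(doc_keys)
```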
>>>> 
>>>> I don’t have a good sense of which option wins out here from a performance 
>>>> perspective, but they’re both operating on the same data model so easy 
>>>> enough to test the alternatives. The important bit is getting the 
>>>> revision-ish things to sort correctly. I think we can do that by 
>>>> generating something like
>>>> 
>>>> revision-ish = NotDeleted/1bit : RevPos : RevHash
>>>> 
>>>> with some suitable order-preserving encoding on the RevPos integer.
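For illustration, one possible encoding (the field names are from this thread, but the 1-byte flag and 4-byte big-endian RevPos widths are a guess):

```python
import struct

def revision_ish(not_deleted: bool, rev_pos: int, rev_hash: bytes) -> bytes:
    # 1-byte flag: deleted (0) sorts before not-deleted (1), so live branches win;
    # 4-byte big-endian RevPos: longer edit branches sort later;
    # raw RevHash bytes: larger hashes sort later, matching CouchDB's tiebreak.
    return bytes([1 if not_deleted else 0]) + struct.pack(">I", rev_pos) + rev_hash

revs = [
    revision_ish(True, 3, b"\xaa"),   # live, pos 3
    revision_ish(True, 10, b"\x01"),  # live, pos 10: beats pos 3 despite the hash
    revision_ish(False, 12, b"\xff"), # deleted: always loses to a live branch
]
# The largest key is what a reverse limit=1 range read would return.
winner = max(revs)
print(winner == revision_ish(True, 10, b"\x01"))
```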
>>>> 
>>>> Apologies for the long email. Happy for any comments, either here or over 
>>>> on IRC. Cheers,
>>>> 
>>>> Adam
>>>> 
>>>>> On Feb 7, 2019, at 4:52 PM, Robert Newson <rnew...@apache.org> wrote:
>>>>> 
>>>>> I think we should choose simple. We can then see if performance is too 
>>>>> low or storage overhead too high and then see what we can do about it.
>>>>> 
>>>>> B.
>>>>> 
>>>>> --
>>>>> Robert Samuel Newson
>>>>> rnew...@apache.org
>>>>> 
>>>>> On Thu, 7 Feb 2019, at 20:36, Ilya Khlopotov wrote:
>>>>>> We cannot do the simple thing if we want to support sharing of JSON terms.
>>>>>> I think if we want the simplest path we should move sharing out of
>>>>>> scope. The problem with sharing is that we need to know the location of
>>>>>> shared terms when we write. This means that we have to read the full
>>>>>> document on every write. There might be tricks to replace the full-document
>>>>>> read with some sort of hierarchical signature or sketch of a document.
>>>>>> However these tricks do not fall into simplest solution category. We
>>>>>> need to choose the design goals:
>>>>>> - simple
>>>>>> - performance
>>>>>> - reduced storage overhead
>>>>>> 
>>>>>> best regards,
>>>>>> iilyak
>>>>>> 
>>>>>> On 2019/02/07 12:45:34, Garren Smith <gar...@apache.org> wrote:
>>>>>>> I’m also in favor of keeping it really simple and then testing and
>>>>>>> measuring it.
>>>>>>> 
>>>>>>> What is the best way to measure that we have something that works? I’m 
>>>>>>> not
>>>>>>> sure just relying on our current tests will prove that? Should we define
>>>>>>> and build some more complex situations e.g docs with lots of conflicts 
>>>>>>> or
>>>>>>> docs with wide revisions and make sure we can solve for those?
>>>>>>> 
>>>>>>> On Thu, Feb 7, 2019 at 12:33 PM Jan Lehnardt <j...@apache.org> wrote:
>>>>>>> 
>>>>>>>> I’m also very much in favour with starting with the simplest thing that
>>>>>>>> can possibly work and doesn’t go against the advertised best practices 
>>>>>>>> of
>>>>>>>> FoundationDB. Let’s get that going and get a feel for how it all works
>>>>>>>> together, before trying to optimise things we can’t measure yet.
>>>>>>>> 
>>>>>>>> Best
>>>>>>>> Jan
>>>>>>>> —
>>>>>>>> 
>>>>>>>>> On 6. Feb 2019, at 16:58, Robert Samuel Newson <rnew...@apache.org>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> With the Redwood storage engine under development and with prefix
>>>>>>>> elision part of its design, I don’t think we should get too hung up on
>>>>>>>> adding complications and indirections in the key space just yet. We 
>>>>>>>> haven’t
>>>>>>>> written a line of code or run any tests, this is premature 
>>>>>>>> optimisation.
>>>>>>>>> 
>>>>>>>>> I’d like to focus on the simplest solution that yields all required
>>>>>>>> properties. We can embellish later (if warranted).
>>>>>>>>> 
>>>>>>>>> I am intrigued by all the ideas that might allow us cheaper inserts 
>>>>>>>>> and
>>>>>>>> updates than the current code where there are multiple edit branches 
>>>>>>>> in the
>>>>>>>> stored document.
>>>>>>>>> 
>>>>>>>>> B.
>>>>>>>>> 
>>>>>>>>>> On 6 Feb 2019, at 02:18, Ilya Khlopotov <iil...@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>> While reading Adam's proposal I came to realize that: we don't have 
>>>>>>>>>> to
>>>>>>>> calculate winning revision at read time.
>>>>>>>>>> Since FDB's transactions are atomic we can calculate it when we 
>>>>>>>>>> write.
>>>>>>>> This means we can just write latest values into separate range. This 
>>>>>>>> makes
>>>>>>>> lookup of latest version fast.
>>>>>>>>>> Another realization is if we want to share values for some json paths
>>>>>>>> we would have to introduce a level of indirection.
>>>>>>>>>> Below is the data model inspired by Adam's idea to share json_paths.
>>>>>>>> In this model the json_path is stored in the revision where it was 
>>>>>>>> first
>>>>>>>> added (we call that revision an owner of a json_path). The values for
>>>>>>>> json_path key can be scalar values, parts of scalar values or pointers 
>>>>>>>> to
>>>>>>>> owner location.
>>>>>>>>>> The below snippets are sketches of transactions.
>>>>>>>>>> The transactions will include updates to other keys as needed
>>>>>>>> (`external_size`, `by_seq` and so on).  The revision tree management 
>>>>>>>> is not
>>>>>>>> covered yet.
>>>>>>>>>> The `rev -> vsn` indirection is not strictly required. It is added
>>>>>>>> because it saves some space since `rev` is a long string and `vsn` is 
>>>>>>>> FDB
>>>>>>>> versionstamp of fixed size.
>>>>>>>>>> 
>>>>>>>>>> - `{NS} / {docid} / _by_rev / {rev} = vsn`
>>>>>>>>>> - `{NS} / {docid} / _used_by / {json_path} / {another_vsn} = NIL`
>>>>>>>>>> - `{NS} / {docid} / _data / {json_path} = latest_value | part`
>>>>>>>>>> - `{NS} / {docid} / {vsn} / _data / {json_path} = value | part |
>>>>>>>> {another_vsn}`
>>>>>>>>>> 
>>>>>>>>>> ```
>>>>>>>>>> write(txn, doc_id, prev_rev, json):
>>>>>>>>>> txn.add_write_conflict_key("{NS} / {doc_id} / _rev")
>>>>>>>>>> rev = generate_new_rev()
>>>>>>>>>> txn["{NS} / {docid} / _by_rev / {rev}"] = vsn
>>>>>>>>>> for every json_path in flattened json
>>>>>>>>>> - {NS} / {docid} / _used_by / {json_path} / {another_vsn} = NIL
>>>>>>>>>> if rev is HEAD:
>>>>>>>>>>   # this range contains values for all json paths for the latest
>>>>>>>> revision (read optimization)
>>>>>>>>>>   - {NS} / {docid} / _data / {json_path} = latest_value | part
>>>>>>>>>> - {NS} / {docid} / {vsn} / _data / {json_path} = value | part |
>>>>>>>> {another_vsn}
>>>>>>>>>> txn["{NS} / {doc_id} / _rev"] = rev
>>>>>>>>>> 
>>>>>>>>>> get_current(txn, doc_id):
>>>>>>>>>> # there is no sharing of json_paths in this range (read optimization)
>>>>>>>>>> txn.get_range("{NS} / {docid} / _data / 0x00", "{NS} / {docid} / 
>>>>>>>>>> _data
>>>>>>>> / 0xFF" )
>>>>>>>>>> 
>>>>>>>>>> get_revision(txn, doc_id, rev):
>>>>>>>>>> vsn = txn["{NS} / {docid} / _by_rev / {rev}"]
>>>>>>>>>> json_paths = txn.get_range("{NS} / {docid} / {vsn} / _data / 0x00",
>>>>>>>> "{NS} / {docid} / {vsn} / _data / 0xFF" )
>>>>>>>>>> for every json_path in json_paths:
>>>>>>>>>> if value has type vsn:
>>>>>>>>>>   another_vsn = value
>>>>>>>>>>      value = txn["{NS} / {docid} / {another_vsn} / _data /
>>>>>>>> {json_path}"]
>>>>>>>>>> result[json_path] = value
>>>>>>>>>> 
>>>>>>>>>> delete_revision(txn, doc_id, rev):
>>>>>>>>>> vsn = txn["{NS} / {docid} / _by_rev / {rev}"]
>>>>>>>>>> json_paths = txn.get_range("{NS} / {docid} / {vsn} / _data / 0x00",
>>>>>>>> "{NS} / {docid} / {vsn} / _data / 0xFF" )
>>>>>>>>>> for every json_path in json_paths:
>>>>>>>>>> if value has type vsn:
>>>>>>>>>>  # remove reference to deleted revision from the owner
>>>>>>>>>>   del txn[{NS} / {docid} / _used_by / {json_path} / {vsn}]
>>>>>>>>>> # check if deleted revision of json_path is not used by anything else
>>>>>>>>>> if txn.get_range("{NS} / {docid} / _used_by / {json_path} / {vsn}",
>>>>>>>> limit=1) == []:
>>>>>>>>>>   del txn["{NS} / {docid} / {vsn} / _data / {json_path}"]
>>>>>>>>>> if vsn is HEAD:
>>>>>>>>>>   copy range for winning revision into "{NS} / {docid} / _data /
>>>>>>>> {json_path}"
>>>>>>>>>> ```
>>>>>>>>>> 
>>>>>>>>>> best regards,
>>>>>>>>>> iilyak
>>>>>>>>>> 
>>>>>>>>>> On 2019/02/04 23:22:09, Adam Kocoloski <kocol...@apache.org> wrote:
>>>>>>>>>>> I think it’s fine to start a focused discussion here as it might 
>>>>>>>>>>> help
>>>>>>>> inform some of the broader debate over in that thread.
>>>>>>>>>>> 
>>>>>>>>>>> As a reminder, today CouchDB writes the entire body of each document
>>>>>>>> revision on disk as a separate blob. Edit conflicts that have common 
>>>>>>>> fields
>>>>>>>> between them do not share any storage on disk. The revision tree is 
>>>>>>>> encoded
>>>>>>>> into a compact format and a copy of it is stored directly in both the 
>>>>>>>> by_id
>>>>>>>> tree and the by_seq tree. Each leaf entry in the revision tree contains 
>>>>>>>> a
>>>>>>>> pointer to the position of the associated doc revision on disk.
>>>>>>>>>>> 
>>>>>>>>>>> As a further reminder, CouchDB 2.x clusters can generate edit 
>>>>>>>>>>> conflict
>>>>>>>> revisions just from multiple clients concurrently updating the same
>>>>>>>> document in a single cluster. This won’t happen when FoundationDB is
>>>>>>>> running under the hood, but users who deploy multiple CouchDB or 
>>>>>>>> PouchDB
>>>>>>>> servers and replicate between them can of course still produce 
>>>>>>>> conflicts
>>>>>>>> just like they could in CouchDB 1.x, so we need a solution.
>>>>>>>>>>> 
>>>>>>>>>>> Let’s consider the two sub-topics separately: 1) storage of edit
>>>>>>>> conflict bodies and 2) revision trees
>>>>>>>>>>> 
>>>>>>>>>>> ## Edit Conflict Storage
>>>>>>>>>>> 
>>>>>>>>>>> The simplest possible solution would be to store each document
>>>>>>>> revision separately, like we do today. We could store document bodies 
>>>>>>>> with
>>>>>>>> (“docid”, “revid”) as the key prefix, and each transaction could clear 
>>>>>>>> the
>>>>>>>> key range associated with the base revision against which the edit is 
>>>>>>>> being
>>>>>>>> attempted. This would work, but I think we can try to be a bit more 
>>>>>>>> clever
>>>>>>>> and save on storage space given that we’re splitting JSON documents 
>>>>>>>> into
>>>>>>>> multiple KV pairs.
>>>>>>>>>>> 
>>>>>>>>>>> One thought I’d had is to introduce a special enum Value which
>>>>>>>> indicates that the subtree “beneath” the given Key is in conflict. For
>>>>>>>> example, consider the documents
>>>>>>>>>>> 
>>>>>>>>>>> {
>>>>>>>>>>> “_id”: “foo”,
>>>>>>>>>>> “_rev”: “1-abc”,
>>>>>>>>>>> “owner”: “alice”,
>>>>>>>>>>> “active”: true
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> and
>>>>>>>>>>> 
>>>>>>>>>>> {
>>>>>>>>>>> “_id”: “foo”,
>>>>>>>>>>> “_rev”: “1-def”,
>>>>>>>>>>> “owner”: “bob”,
>>>>>>>>>>> “active”: true
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> We could represent these using the following set of KVs:
>>>>>>>>>>> 
>>>>>>>>>>> (“foo”, “active”) = true
>>>>>>>>>>> (“foo”, “owner”) = kCONFLICT
>>>>>>>>>>> (“foo”, “owner”, “1-abc”) = “alice”
>>>>>>>>>>> (“foo”, “owner”, “1-def”) = “bob”
>>>>>>>>>>> 
>>>>>>>>>>> This approach also extends to conflicts where the two versions have
>>>>>>>> different data types. Consider a more complicated example where bob 
>>>>>>>> dropped
>>>>>>>> the “active” field and changed the “owner” field to an object:
>>>>>>>>>>> 
>>>>>>>>>>> {
>>>>>>>>>>> “_id”: “foo”,
>>>>>>>>>>> “_rev”: “1-def”,
>>>>>>>>>>> “owner”: {
>>>>>>>>>>> “name”: “bob”,
>>>>>>>>>>> “email”: “b...@example.com”
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> Now the set of KVs for “foo” looks like this (note that a missing
>>>>>>>> field needs to be handled explicitly):
>>>>>>>>>>> 
>>>>>>>>>>> (“foo”, “active”) = kCONFLICT
>>>>>>>>>>> (“foo”, “active”, “1-abc”) = true
>>>>>>>>>>> (“foo”, “active”, “1-def”) = kMISSING
>>>>>>>>>>> (“foo”, “owner”) = kCONFLICT
>>>>>>>>>>> (“foo”, “owner”, “1-abc”) = “alice”
>>>>>>>>>>> (“foo”, “owner”, “1-def”, “name”) = “bob”
>>>>>>>>>>> (“foo”, “owner”, “1-def”, “email”) = “b...@example.com”
>>>>>>>>>>> 
>>>>>>>>>>> I like this approach for the common case where documents share most 
>>>>>>>>>>> of
>>>>>>>> their data in common but have a conflict in a very specific field or 
>>>>>>>> set of
>>>>>>>> fields.
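A sketch of that flattening (kCONFLICT and kMISSING here are string sentinels, and merge() is illustrative rather than CouchDB's actual implementation; it reproduces the scalar-conflict example above):

```python
kCONFLICT, kMISSING = "kCONFLICT", "kMISSING"

def flatten(value, prefix=()):
    """Flatten one JSON document into (key-path, scalar) pairs."""
    if isinstance(value, dict):
        pairs = []
        for k, v in value.items():
            pairs.extend(flatten(v, prefix + (k,)))
        return pairs
    return [(prefix, value)]

def merge(docid, revs):
    """revs: {rev_id: body}. Fields shared by every revision are stored once;
    a conflicted field gets a kCONFLICT marker plus one sub-key per revision."""
    per_rev = {r: dict(flatten(body)) for r, body in revs.items()}
    paths = sorted({p for flat in per_rev.values() for p in flat})
    kvs = {}
    for p in paths:
        vals = {r: flat.get(p, kMISSING) for r, flat in per_rev.items()}
        if len(set(vals.values())) == 1:
            kvs[(docid,) + p] = next(iter(vals.values()))
        else:
            kvs[(docid,) + p] = kCONFLICT
            for r, v in vals.items():
                kvs[(docid,) + p + (r,)] = v
    return kvs

kvs = merge("foo", {
    "1-abc": {"owner": "alice", "active": True},
    "1-def": {"owner": "bob", "active": True},
})
for k in sorted(kvs):
    print(k, "=", kvs[k])
```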
>>>>>>>>>>> 
>>>>>>>>>>> I’ve encountered one important downside, though: an edit that
>>>>>>>> replicates in and conflicts with the entire document can cause a bit 
>>>>>>>> of a
>>>>>>>> data explosion. Consider a case where I have 10 conflicting versions 
>>>>>>>> of a
>>>>>>>> 100KB document, but the conflicts are all related to a single scalar 
>>>>>>>> value.
>>>>>>>> Now I replicate in an empty document, and suddenly I have a kCONFLICT 
>>>>>>>> at
>>>>>>>> the root. In this model I now need to list out every path of every one 
>>>>>>>> of
>>>>>>>> the 10 existing revisions and I end up with a 1MB update. Yuck. That’s
>>>>>>>> technically no worse in the end state than the “zero sharing” case 
>>>>>>>> above,
>>>>>>>> but one could easily imagine overrunning the transaction size limit 
>>>>>>>> this
>>>>>>>> way.
>>>>>>>>>>> 
>>>>>>>>>>> I suspect there’s a smart path out of this. Maybe the system 
>>>>>>>>>>> detects a
>>>>>>>> “default” value for each field and uses that instead of writing out the
>>>>>>>> value for every revision in a conflicted subtree. Worth some 
>>>>>>>> discussion.
>>>>>>>>>>> 
>>>>>>>>>>> ## Revision Trees
>>>>>>>>>>> 
>>>>>>>>>>> In CouchDB we currently represent revisions as a hash history tree;
>>>>>>>> each revision identifier is derived from the content of the revision
>>>>>>>> including the revision identifier of its parent. Individual edit 
>>>>>>>> branches
>>>>>>>> are bounded in *length* (I believe the default is 1000 entries), but 
>>>>>>>> the
>>>>>>>> number of edit branches is technically unbounded.
>>>>>>>>>>> 
>>>>>>>>>>> The size limits in FoundationDB preclude us from storing the entire
>>>>>>>> key tree as a single value; in pathological situations the tree could
>>>>>>>> exceed 100KB. Rather, I think it would make sense to store each edit
>>>>>>>> *branch* as a separate KV. We stem the branch long before it hits the 
>>>>>>>> value
>>>>>>>> size limit, and in the happy case of no edit conflicts this means we 
>>>>>>>> store
>>>>>>>> the edit history metadata in a single KV. It also means that we can 
>>>>>>>> apply
>>>>>>>> an interactive edit without retrieving the entire conflicted revision 
>>>>>>>> tree;
>>>>>>>> we need only retrieve and modify the single branch against which the 
>>>>>>>> edit
>>>>>>>> is being applied. The downside is that we duplicate historical revision
>>>>>>>> identifiers shared by multiple edit branches, but I think this is a
>>>>>>>> worthwhile tradeoff.
>>>>>>>>>>> 
>>>>>>>>>>> I would furthermore try to structure the keys so that it is possible
>>>>>>>> to retrieve the “winning” revision in a single limit=1 range query. 
>>>>>>>> Ideally
>>>>>>>> I’d like to provide the following properties:
>>>>>>>>>>> 
>>>>>>>>>>> 1) a document read does not need to retrieve the revision tree at 
>>>>>>>>>>> all,
>>>>>>>> just the winning revision identifier (which would be stored with the 
>>>>>>>> rest
>>>>>>>> of the doc)
>>>>>>>>>>> 2) a document update only needs to read the edit branch of the
>>>>>>>> revision tree against which the update is being applied, and it can 
>>>>>>>> read
>>>>>>>> that branch immediately knowing only the content of the edit that is 
>>>>>>>> being
>>>>>>>> attempted (i.e., it does not need to read the current version of the
>>>>>>>> document itself).
>>>>>>>>>>> 
>>>>>>>>>>> So, I’d propose a separate subspace (maybe “_meta”?) for the 
>>>>>>>>>>> revision
>>>>>>>> trees, with keys and values that look like
>>>>>>>>>>> 
>>>>>>>>>>> (“_meta”, DocID, IsDeleted, RevPosition, RevHash) = [ParentRev,
>>>>>>>> GrandparentRev, …]
>>>>>>>>>>> 
>>>>>>>>>>> The inclusion of IsDeleted, RevPosition and RevHash in the key 
>>>>>>>>>>> should
>>>>>>>> be sufficient (with the right encoding) to create a range query that
>>>>>>>> automatically selects the “winner” according to CouchDB’s arcane rules,
>>>>>>>> which are something like
>>>>>>>>>>> 
>>>>>>>>>>> 1) deleted=false beats deleted=true
>>>>>>>>>>> 2) longer paths (i.e. higher RevPosition) beat shorter ones
>>>>>>>>>>> 3) RevHashes with larger binary values beat ones with smaller values
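Those rules can be written down as a plain comparator (illustrative only; the proposal is that the FDB key encoding produce this same order, so a limit=1 range read returns the winner directly):

```python
def winner_key(rev):
    deleted, rev_pos, rev_hash = rev
    # rule 1: not-deleted beats deleted; rule 2: higher RevPosition wins;
    # rule 3: the larger RevHash binary value wins the tiebreak
    return (not deleted, rev_pos, rev_hash)

# (deleted, RevPosition, RevHash) triples; hashes are made up for illustration.
branches = [
    (False, 4, b"\x10"),  # live, pos 4
    (False, 4, b"\xfe"),  # live, pos 4, larger hash
    (True, 9, b"\xff"),   # deleted: loses despite the longest path
]
print(max(branches, key=winner_key))  # the live branch with the larger hash
```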
>>>>>>>>>>> 
>>>>>>>>>>> ===========
>>>>>>>>>>> 
>>>>>>>>>>> OK, that’s all on this topic from me for now. I think this is a
>>>>>>>> particularly exciting area where we start to see the dividends of 
>>>>>>>> splitting
>>>>>>>> up data into multiple KV pairs in FoundationDB :) Cheers,
>>>>>>>>>>> 
>>>>>>>>>>> Adam
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On Feb 4, 2019, at 2:41 PM, Robert Newson <rnew...@apache.org> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> This one is quite tightly coupled to the other thread on data 
>>>>>>>>>>>> model,
>>>>>>>> should we start much conversation here before that one gets closer to a
>>>>>>>> solution?
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Robert Samuel Newson
>>>>>>>>>>>> rnew...@apache.org
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, 4 Feb 2019, at 19:25, Ilya Khlopotov wrote:
>>>>>>>>>>>>> This is a beginning of a discussion thread about storage of edit
>>>>>>>>>>>>> conflicts and everything which relates to revisions.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Professional Support for Apache CouchDB:
>>>>>>>> https://neighbourhood.ie/couchdb-support/
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>> 
>> 
