Hi all,

I recently started implementing _active_tasks for our fdb development
branch. At first, I thought it would be trivial, but technical limitations
have led me to modify our response as an interim solution. I'd like to get
more feedback on this solution and start a discussion on a more
accurate/correct solution going forward.

*Problem:*
Most active tasks rely upon a "Total" value to determine progress. That
value is computed by `count_changes_since/2`:
https://github.com/apache/couchdb/blob/master/src/couch/src/couch_db_engine.erl#L634-L652

I cannot think of an efficient way of implementing this on top of fdb.
Paul has probably thought about this more
deeply during the initial layer design phase, but I may have missed some of
those discussions.

Since Couch 2.0, our update_seq string has had a snapshot of the number of
changes prepended. That prefix does not exist in the fdb-layer branch either.

Ultimately, there is no way to calculate the total number of changes for
a given update_seq.

*Proposed Solution:*
We simply send out the versionstamp of the db sequence we are trying to
reach, and the current versionstamp. So the responses look something like:

[
    {
        "node": "node1@127.0.0.1",
        "pid": "<0.622.0>",
        "changes_done": 199,
        "current_version_stamp": "8131141649532-198",
        "database": "testdb",
        "db_version_stamp": "8131141649532-999",
        "design_document": "_design/example",
        "started_on": 1594703583,
        "type": "indexer",
        "updated_on": 1594703586
    }
]

[
    {
        "node": "node1@127.0.0.1",
        "pid": "<0.1030.0>",
        "changes_done": 1000,
        "current_version_stamp": "8131194932916-999",
        "database": "testdb",
        "db_version_stamp": "8131194932916-999",
        "design_document": "_design/example",
        "started_on": 1594703636,
        "type": "indexer",
        "updated_on": 1594703665
    }
]

The user can utilize the changes_done (this is just a running counter for
that task process), current_version_stamp, and db_version_stamp to gauge if
the task is making progress.
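
As a rough illustration of how a client might use these three fields, here
is a minimal Python sketch. It is not part of the PR. The helper names
(parse_stamp, is_caught_up, made_progress) are hypothetical, and it assumes
the stamps keep the "<number>-<number>" shape shown in the sample responses
above and can be compared numerically part by part.

```python
from typing import Tuple


def parse_stamp(stamp: str) -> Tuple[int, int]:
    # Assumption: stamps look like "8131141649532-198", as in the samples.
    version, _, batch = stamp.partition("-")
    return int(version), int(batch)


def is_caught_up(task: dict) -> bool:
    # The task has reached the db sequence it was started against.
    current = parse_stamp(task["current_version_stamp"])
    target = parse_stamp(task["db_version_stamp"])
    return current >= target


def made_progress(earlier: dict, later: dict) -> bool:
    # Compare two polls of the same task (same pid) taken some time apart.
    if later["changes_done"] > earlier["changes_done"]:
        return True
    return (parse_stamp(later["current_version_stamp"])
            > parse_stamp(earlier["current_version_stamp"]))
```

A client would poll _active_tasks twice, match entries by "pid", and call
made_progress on each pair. Note that this only says whether the task is
moving, not what fraction of the work is done.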

My concern is that this is a breaking change for users that rely on the
"total_changes" and "progress" fields.

I've opened a PR for this and have gotten good feedback on some
implementation details but would love to get consensus on the response
format: https://github.com/apache/couchdb/pull/3003

*Moving Forward:*
I've read a few foundationdb forum posts, and the topic of "Can I get the
changes to the DB, given a versionstamp?" has been discussed a few times.
I'm not sure it will be done on the fdb end anytime soon. I briefly
considered adding another b-tree in memory, but that seems overkill just
for this Total feature.

Thanks,

Tony
