On Fri, Nov 13, 2015 at 1:28 PM, Giovanni Lenzi <[email protected]> wrote:
>> No, what's slow is gathering all the stats, especially in a cluster. The
>> db_name you can get from req.userCtx without problem.
>>
>
> Does req.userCtx currently also contain db_name? I thought it held only
> user data (username and roles). Are you saying it is possible to fetch
> db_name alone, or are you forced to fetch the entire set?
>
not db_name exactly, but:

    "userCtx": {
        "db": "mailbox",
        "name": "Mike",
        "roles": [
            "user"
        ]
    }
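So inside an update function you can already read the db name from
req.userCtx. A minimal sketch (not code from the thread; the function and
field names are mine, the userCtx fields match the reply above):

```javascript
// Hypothetical update handler: tags the document with the db name and
// user taken from req.userCtx, as returned in the snippet above.
var tagWithContext = function (doc, req) {
  doc = doc || { _id: req.uuid };        // create the doc if it doesn't exist
  doc.db = req.userCtx.db;               // e.g. "mailbox"
  doc.updated_by = req.userCtx.name;     // e.g. "Mike"
  return [doc, "stored in " + req.userCtx.db];
};
```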
>> > Also, I was wondering how heavy it would be to include some kind of
>> > machine identifier (hostname or IP address of the machine running
>> > CouchDB) inside the request object?
>>
>> What is the use case for this? Technically, req.headers['Host'] points
>> at the requested CouchDB.
>>
>> > Or, to make it even more flexible: how heavy would it be to include
>> > a configuration parameter inside the request object?
>> >
>> > That could be of great help in some N-node master-master redundant
>> > database configurations, to let only one node (the write node) handle
>> > some specific background action.
>>
>> Can you describe this problem a little bit more? How would this
>> configuration parameter be used, and what would it be?
>>
>>
> Ok, let's consider a 2-node setup with master-master replication and a
> round-robin load balancer in front. In normal conditions, with
> master-master replication you can balance both read and write requests
> across every node, right?
>
> Now, suppose we also need backend services (email, SMS, payments) via
> some plugin or node.js process (like triggerjob). These react to the
> database _changes feed, execute some background task, and then update the
> same document with a COMPLETED state. The drawback is that, in an N-node
> configuration, every node executes the same background tasks (2 or N
> emails are sent instead of 1, 2 payment transactions instead of 1, and
> so on).
>
> Ok, you may say, with haproxy you can balance only reads (GET, HEAD) and
> use one node only for writes. But what if the write node goes down? I
> would no longer be able to write, only read.
>
> BUT we can probably do better. Let's step back to balancing both reads
> and writes. If we had a way to specify, in the update function itself,
> which node is in charge of executing those tasks, they would then be
> executed only once! A trivial but efficient solution that comes to my
> mind is: let the backend task be handled by the node that received the
> write request. If the update function knows some kind of machine
> identifier (or a configuration parameter set up previously), it could
> mark the task in the document itself with the name of the machine
> responsible for its execution. The plugin or node.js process would then
> execute only the tasks allocated to it, simply by using a filtered
> _changes request with its own node name.
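A rough sketch of what that filtered _changes request could look like as a
design-doc filter function — the `node` query parameter and the `executor`
and `state` field names are my assumptions for illustration, not an existing
CouchDB feature:

```javascript
// Hypothetical filters section of a design document. Each worker would
// follow GET /db/_changes?filter=tasks/by_node&node=<its-own-name> and
// see only the tasks the receiving node stamped with its name.
var filters = {
  by_node: function (doc, req) {
    return doc.type === "task" &&
           doc.state !== "COMPLETED" &&          // skip finished tasks
           doc.executor === req.query.node;      // only this node's tasks
  }
};
```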
>
> This solution has the benefit of letting system administrators run N
> identical nodes (same data, same ddocs and configuration; only the node
> name differs) that balance read and write requests as well as backend
> task processing. You could then scale out by simply spawning a new node
> from the same Amazon AMI, for example.
>
> Am I missing something?
That's what 2.0 is going to solve (:
For 1.x I would use the following configuration:
db1 --- /_changes --\
db2 --- /_changes ---> notification-process -> notification-db
dbN --- /_changes --/
In the notification db you store all the tasks that need to be done and
those already done. Since your db1, db2, dbN are in sync, their changes
feeds will eventually produce similar events, which you'll have to filter
using your notification-db data.
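The dedup step in that notification-process could be sketched like this — a
plain in-memory object stands in for notification-db here, just to
illustrate the claim-before-execute idea (in practice you'd write the claim
to notification-db and rely on its conflict handling):

```javascript
// Before executing a task seen on any node's _changes feed, try to claim
// it in the shared store; only the first claim wins, so the task runs once
// even though db1..dbN all emit the same change.
function claimTask(store, taskId) {
  if (store[taskId]) {
    return false;                         // already claimed: skip duplicate
  }
  store[taskId] = { state: "claimed", at: Date.now() };
  return true;                            // this process executes the task
}

// The same change arrives from two feeds; only the first claim succeeds:
var seen = {};
claimTask(seen, "task-42");  // true  -> send the email once
claimTask(seen, "task-42");  // false -> duplicate event, do nothing
```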
--
,,,^..^,,,