nickimho edited a comment on issue #2329: Add option to enforce fetching data
from local shards, instead of from shards on remote nodes (if data present on
local node)
URL: https://github.com/apache/couchdb/issues/2329#issuecomment-573249534
We tested some behavior on a 3-zone cluster (5 nodes per zone, n=3, q=1, and
placement puts one copy in each of zones {a,b,c}). We used network impairment
tools so that there is a 60ms round-trip delay (RTD) between zones. We used
CouchDB 2.3.1.
1. Terms/Definition
a. Client – This is the host that initiates the query to
couchdb's port 5984
b. Couchdb_QUERY_NODE – This is the couchdb node in the cluster
that receives the database query from Client on port 5984. This node may or may
NOT be a node that holds a shard for the database.
c. Couchdb_METALOOKUP_NODE – This is the couchdb node that
Couchdb_QUERY_NODE queries for some meta info (not sure what it is).
Couchdb_METALOOKUP_NODE is a node in Couchdb_DATA_NODES.
i. The selection of Couchdb_METALOOKUP_NODE
is based on the "by_range" key in couchdb:5986/dbs/mydb. The first node in the
array is picked.
d. Couchdb_DATA_NODES – This is the set of couchdb nodes
that actually hold a copy of the database asked by the query.
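The selection rule in 1.c.i can be sketched in a few lines of Python. This is only a model of what we observed, not CouchDB's own code. The document below is a trimmed, hypothetical version of what couchdb:5986/dbs/mydb returns, and the node names are made up.

```python
# Hypothetical, trimmed shard-map document shaped like couchdb:5986/dbs/mydb.
# The real document carries more keys; node names here are invented.
shard_map = {
    "_id": "mydb",
    "by_range": {
        "00000000-ffffffff": [
            "couchdb@a1.example",
            "couchdb@b1.example",
            "couchdb@c1.example",
        ],
    },
}


def first_node_per_range(doc):
    """Return {range: first listed node}, i.e. the observed metalookup target."""
    return {rng: nodes[0] for rng, nodes in doc["by_range"].items() if nodes}


def data_nodes(doc):
    """Return the sorted set of nodes that hold any shard of the database."""
    return sorted({n for nodes in doc["by_range"].values() for n in nodes})


print(first_node_per_range(shard_map))
print(data_nodes(shard_map))
```

With q=1 there is a single range, so one node ends up as the Couchdb_METALOOKUP_NODE for every query into mydb.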
2. General data flow we observed:
a. General data flow for doc query:
i. Client -> Couchdb_QUERY_NODE:5984
ii. If Couchdb_QUERY_NODE is NOT in
Couchdb_DATA_NODES, Couchdb_QUERY_NODE -> Couchdb_METALOOKUP_NODE:11500
1. This selection is
deterministic, based on 1.c.i. Suppose the Couchdb_DATA_NODES entry in zonea
for mydb is first in the "by_range" key; it will always be queried for this
phase. This gives queries into mydb from zonec and zoneb an additional 60ms RTD
network delay compared to zonea.
iii. Couchdb_QUERY_NODE -> “three
Couchdb_DATA_NODES”:11500
1. Once enough
Couchdb_DATA_NODES (default read quorum is 2 when n=3) return data, this
phase stops
iv. Couchdb_QUERY_NODE->Client with
query result
b. A view query largely follows the same flow as a doc query, except for
the following:
i. Couchdb_QUERY_NODE seems to cache
the view definition/metadata
1. During the first query to
/mydb/_view/myview, it retrieves the view doc following the 2.a process
a. Subsequent queries to
/mydb/_view/myview bypass this.
2. When Couchdb_QUERY_NODE actually
retrieves the myview result, it seems to ONLY query the Couchdb_DATA_NODES in
the SAME zone as itself. This is good, as it saves bandwidth between zones for
large results.
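The doc-query flow in 2.a can be turned into a rough delay estimate. The sketch below is a model under stated assumptions, not a measurement: same-zone delay is treated as 0ms, the metalookup (2.a.ii) and the data fetch (2.a.iii) are assumed to run one after the other, and the read quorum is assumed to be a simple majority (which gives 2 for n=3). The function names are ours.

```python
INTER_ZONE_RTD_MS = 60  # from the test setup described above
INTRA_ZONE_RTD_MS = 0   # assumption: same-zone delay is negligible


def rtd(zone_a, zone_b):
    """Round-trip delay in ms between two zones under the assumptions above."""
    return INTRA_ZONE_RTD_MS if zone_a == zone_b else INTER_ZONE_RTD_MS


def read_quorum(n):
    """Assumed majority quorum; yields 2 for n=3."""
    return n // 2 + 1


def doc_read_delay(query_zone, data_zones, query_node_holds_shard=False):
    """Estimated network delay (ms) for steps 2.a.ii and 2.a.iii.

    data_zones lists the zone of each data node in "by_range" order,
    so data_zones[0] is where the metalookup goes.
    """
    # 2.a.ii: skipped when the query node is itself a data node.
    meta = 0 if query_node_holds_shard else rtd(query_zone, data_zones[0])
    # 2.a.iii: the phase ends once the r-th fastest data node has answered.
    delays = sorted(rtd(query_zone, z) for z in data_zones)
    r = read_quorum(len(data_zones))
    return meta + delays[r - 1]


by_range_zones = ["a", "b", "c"]  # zonea listed first, as in 2.a.ii.1
for zone in by_range_zones:
    print(zone, doc_read_delay(zone, by_range_zones))
```

Under these assumptions a doc read through a non-data node costs about 60ms from zonea and about 120ms from zoneb or zonec, which matches the extra 60ms described in 2.a.ii.1.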
We haven't tested attachment retrieval yet, but it seems to me that it should
follow the same view query logic as in 2.b.2, if it does not already. We will
try to test this some time next week.
Also, not sure if this needs to be a different ticket, but we would really
like to see 2.a.ii.1 optimized so that it queries the
Couchdb_METALOOKUP_NODE in its local zone first. Currently, we plan to work
around this by changing the "by_range" order in 5986/dbs/mydb to favor the
primary zone for our service.
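The planned workaround amounts to reordering each "by_range" array so that the node in the primary zone comes first. A minimal sketch of that reordering, assuming a hypothetical node-to-zone mapping and made-up node names, could look like this. It only computes the reordered document; writing it back to 5986/dbs/mydb is not shown.

```python
import copy


def prefer_zone(doc, zone_of, primary_zone):
    """Return a copy of doc with primary-zone nodes first in each range."""
    out = copy.deepcopy(doc)
    for rng, nodes in out["by_range"].items():
        # sorted() is stable: False (primary zone) sorts before True, and
        # the relative order of the remaining nodes is preserved.
        out["by_range"][rng] = sorted(
            nodes, key=lambda node: zone_of.get(node) != primary_zone
        )
    return out


# Hypothetical inputs: node names and the zone mapping are invented.
shard_map = {
    "_id": "mydb",
    "by_range": {
        "00000000-ffffffff": [
            "couchdb@a1.example",
            "couchdb@b1.example",
            "couchdb@c1.example",
        ],
    },
}
zone_of = {
    "couchdb@a1.example": "a",
    "couchdb@b1.example": "b",
    "couchdb@c1.example": "c",
}

reordered = prefer_zone(shard_map, zone_of, primary_zone="c")
print(reordered["by_range"]["00000000-ffffffff"])
```

This helps only the primary zone: queries arriving in the other two zones would still pay the inter-zone delay for the metalookup step.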