[GitHub] [couchdb] nickimho edited a comment on issue #2329: Add option to enforce fetching data from local shards, instead of from shards on on remote nodes(if data present on local node)

2020-01-14 Thread GitBox
nickimho edited a comment on issue #2329: Add option to enforce fetching data 
from local shards, instead of from shards on on remote nodes(if data present on 
local node)
URL: https://github.com/apache/couchdb/issues/2329#issuecomment-574383847
 
 
   @kocolosk 
   
   We brought up a test environment, rechecked the behavior, and updated 2.a.ii in 
https://github.com/apache/couchdb/issues/2329#issuecomment-573249534 
to match: the Couchdb_METALOOKUP_NODE lookup is skipped if Couchdb_QUERY_NODE 
holds the shard. 
   
   We also reconfirmed that the Couchdb_METALOOKUP_NODE lookup is based on 
"by_range".


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [couchdb] nickimho edited a comment on issue #2329: Add option to enforce fetching data from local shards, instead of from shards on on remote nodes(if data present on local node)

2020-01-14 Thread GitBox
nickimho edited a comment on issue #2329: Add option to enforce fetching data 
from local shards, instead of from shards on on remote nodes(if data present on 
local node)
URL: https://github.com/apache/couchdb/issues/2329#issuecomment-573249534
 
 
   We tested some behavior on a 3-zone cluster (5 nodes per zone, n=3, q=1, 
with placement of one copy in each of zones a, b, and c). We use network 
impairment tools so that there is a 60 ms RTD between zones. We used 
CouchDB 2.3.1.
   
1. Terms/Definitions
   a. Client - The host that initiates the query to couchdb's port 5984.
   b. Couchdb_QUERY_NODE - The couchdb node in the cluster that receives the 
      database query from Client on port 5984. This node may or may NOT be a 
      node that holds a shard for the database.
   c. Couchdb_METALOOKUP_NODE - The couchdb node that Couchdb_QUERY_NODE 
      queries for some meta info (not sure what it is). 
      Couchdb_METALOOKUP_NODE is a node in Couchdb_DATA_NODES.
      i. The selection of Couchdb_METALOOKUP_NODE is based on the "by_range" 
         key in couchdb:5986/dbs/mydb. The first one in the array is picked.
   d. Couchdb_DATA_NODES - The set of couchdb nodes that actually hold a copy 
      of the database asked for by the query.
2. General data flow we observed:
   a. General data flow for a doc query:
      i. Client -> Couchdb_QUERY_NODE:5984
      ii. If Couchdb_QUERY_NODE is NOT in Couchdb_DATA_NODES, 
          Couchdb_QUERY_NODE -> Couchdb_METALOOKUP_NODE:11500
          1. This selection is deterministic, based on 1.c.i. Suppose the 
             Couchdb_DATA_NODES member in zonea for mydb is first in the 
             "by_range" key; it will always be queried for this phase. This 
             gives queries into mydb from zoneb and zonec an additional 60 ms 
             RTD network delay compared to zonea.
      iii. Couchdb_QUERY_NODE -> "three Couchdb_DATA_NODES":11500
          1. Once enough Couchdb_DATA_NODES (default read quorum is 2 when 
             n=3) return data, this phase stops.
      iv. Couchdb_QUERY_NODE -> Client with the query result
   b. A view query largely follows the same flow as a doc query, except for 
      the following:
      i. Couchdb_QUERY_NODE seems to cache the view definition/metadata.
         1. During the first query to /mydb/_view/myview, it retrieves the 
            view doc following the 2.a process.
            a. Subsequent queries to /mydb/_view/myview bypass this.
         2. When Couchdb_QUERY_NODE actually retrieves the myview result, it 
            seems to ONLY query the Couchdb_DATA_NODES in the SAME zone as 
            itself. This is good, as it saves bandwidth between zones for 
            large results.
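
As a rough illustration of the delays described in 2.a, here is a toy latency model in Python. It is only a sketch under assumptions that are not confirmed above: the lookup phase (2.a.ii) and the quorum read (2.a.iii) are taken to run one after the other, intra-zone RTD is taken as 0 ms, and the function and parameter names are made up for this example.

```python
# Toy model of the doc-read flow in 2.a. All names and the 0 ms intra-zone
# figure are assumptions for illustration; only the 60 ms inter-zone RTD
# and the read quorum of 2 come from the test description.
INTER_ZONE_RTD_MS = 60
INTRA_ZONE_RTD_MS = 0


def rtd_ms(zone_x, zone_y):
    """Round-trip delay between two zones in this model."""
    return INTRA_ZONE_RTD_MS if zone_x == zone_y else INTER_ZONE_RTD_MS


def doc_read_latency_ms(query_zone, holds_shard, by_range_zones, read_quorum=2):
    """Estimate network delay for one doc read.

    by_range_zones lists the zones of the data nodes in "by_range" order.
    """
    # Phase 2.a.ii: skipped when the query node holds the shard itself;
    # otherwise the first "by_range" entry is always the lookup target.
    lookup = 0 if holds_shard else rtd_ms(query_zone, by_range_zones[0])
    # Phase 2.a.iii: the read completes once read_quorum replies arrive,
    # i.e. at the read_quorum-th fastest data node.
    replies = sorted(rtd_ms(query_zone, zone) for zone in by_range_zones)
    return lookup + replies[read_quorum - 1]


zones = ["a", "b", "c"]  # zonea's data node is first in "by_range"
print(doc_read_latency_ms("a", False, zones))  # 60
print(doc_read_latency_ms("b", False, zones))  # 120
print(doc_read_latency_ms("b", True, zones))   # 60
```

In this model a non-data query node in zoneb or zonec pays 60 ms more than one in zonea, which matches the extra delay noted in 2.a.ii.1.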
   
   We haven't tested attachment retrieval yet, but it seems to me that it 
should follow the same view query logic as in 2.b.i.2, if it doesn't already. 
We will try to test this some time next week. 
   
   Also, not sure if this needs to be a different ticket, but we would really 
like to see 2.a.ii.1 optimized so that it queries a Couchdb_METALOOKUP_NODE 
in its local zone first. Currently, we plan to work around this by changing 
the "by_range" order in 5986/dbs/mydb to favor the primary zone for our 
service. 
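
The "by_range" reordering we have in mind could look roughly like the Python sketch below. It only rearranges an in-memory copy of the shard map doc; reading the doc from 5986/dbs/mydb and writing it back are not shown. The function name and the node names are hypothetical, and we have not confirmed that editing the order this way is supported.

```python
import copy


def prefer_nodes_in_by_range(shard_map, preferred_nodes):
    """Return a copy of the shard map doc with preferred nodes listed first.

    shard_map is the parsed JSON doc from 5986/dbs/mydb; only the
    "by_range" lists are reordered, everything else is left untouched.
    """
    updated = copy.deepcopy(shard_map)
    for shard_range, nodes in updated.get("by_range", {}).items():
        # sorted() is stable, so nodes keep their relative order within
        # the preferred group and within the remaining group.
        updated["by_range"][shard_range] = sorted(
            nodes, key=lambda node: node not in preferred_nodes
        )
    return updated


# Hypothetical node names, one per zone.
shard_map = {
    "_id": "mydb",
    "by_range": {
        "00000000-ffffffff": [
            "couchdb@a1.example",
            "couchdb@b1.example",
            "couchdb@c1.example",
        ]
    },
}
reordered = prefer_nodes_in_by_range(shard_map, {"couchdb@c1.example"})
print(reordered["by_range"]["00000000-ffffffff"][0])  # couchdb@c1.example
```

Note that this favors one zone for every query node, so it helps only the primary zone; the other zones would still pay the inter-zone delay in 2.a.ii.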
   




[GitHub] [couchdb] nickimho edited a comment on issue #2329: Add option to enforce fetching data from local shards, instead of from shards on on remote nodes(if data present on local node)

2020-01-10 Thread GitBox
nickimho edited a comment on issue #2329: Add option to enforce fetching data 
from local shards, instead of from shards on on remote nodes(if data present on 
local node)
URL: https://github.com/apache/couchdb/issues/2329#issuecomment-573271980
 
 
   @kocolosk 
   
   Actually, the commit you provided 
(https://github.com/apache/couchdb/blob/dd1b2817bbf7a0efce858414310a0c822ce89468/src/fabric/src/fabric_util.erl#L93-L105) 
is from 2019-04-03. The build we are using, CouchDB 2.3.1, was released in 
Feb, right? So maybe we just need a newer build =)
   
   I just looked it up, and this was the build we used:
   
   rpm -qi couchdb
   Name: couchdb
   Version : 2.3.1
   Release : 1.el7
   Architecture: x86_64
   Install Date: Mon 25 Mar 2019 05:57:48 PM GMT
   Group   : Applications/Databases
   Size: 43583049
   License : Apache License v2.0
   Signature   : (none)
   Source RPM  : couchdb-2.3.1-1.el7.src.rpm
   Build Date  : Mon 11 Mar 2019 10:33:54 PM GMT
   Build Host  : d6c906bce27a
   Relocations : /opt/couchdb
   Packager: CouchDB Developers 
   Vendor  : The Apache Software Foundation
   URL : https://couchdb.apache.org/
   Summary : RESTful document oriented database
   

