[
https://issues.apache.org/jira/browse/SOLR-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275194#comment-16275194
]
Varun Thacker commented on SOLR-11412:
--------------------------------------
cdcr-api.adoc
I think we can remove "Enable Buffer Response", "Disable Buffer Response",
"CDCR Start Response", "CDCR Stop Response"
The one-line explanation doesn't add any value in my opinion.
For "OPS Response" maybe all we should say is : "Provides the average number of
operations as a sum and broken down by adds/deletes"
For "ERRORS Response" we should say "Provides the number of consecutive errors
encountered by the replicator thread, the number of bad requests or internal
errors since the start of the replication process, and a list of the last
errors encountered ordered by timestamp." The first part of the description
seems unnecessary to me.
Also both OPS and ERRORS should really be exposed via the metrics API. I'll
file a separate Jira for this.
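For anyone who wants to look at the two responses while reviewing the wording, here is a minimal sketch of how the request URLs could be built. It assumes the `/cdcr` handler path and the `action` parameter as described on the CDCR API page; the helper name, host, and collection name are hypothetical and only for illustration.

```python
from urllib.parse import urlencode


def cdcr_action_url(solr_base_url: str, collection: str, action: str) -> str:
    """Build the request URL for one CDCR API action on a collection.

    Hypothetical helper: the host and collection are placeholders, and the
    /cdcr path is assumed to be where the CDCR request handler is registered.
    """
    query = urlencode({"action": action, "wt": "json"})
    return f"{solr_base_url.rstrip('/')}/{collection}/cdcr?{query}"


if __name__ == "__main__":
    # Print the URLs for the two monitoring actions discussed above.
    for action in ("OPS", "ERRORS"):
        print(cdcr_action_url("http://localhost:8983/solr", "collection1", action))
```

The sketch only builds the URLs; fetching them with any HTTP client against a Source collection would show the actual response shape that the descriptions need to match.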
CDCR Architecture Page:
"The data changes can be replicated in near real-time (with a small delay) or
could be scheduled to be sent at longer intervals to the Target data center" :
"The data changes can be replicated at a configurable amount of time"
Should Source and Target start with a capital letter?
"Since this is a full copy of the entire index, network bandwidth should be
considered." : What value does this line add to the user?
"CDCR can "bootstrap" the collection to the Target data center. Since this is a
full copy of the entire index, network bandwidth should be considered. Of
course both Source and Target collections may be empty to start." - Remove this
part?
The fifth paragraph ( "Replication supports both..." ) basically explains this
in a better fashion
"The directional nature of the implementation implies a "push" model from the
Source collection to the Target collection. Therefore, the Source configuration
must be able to "see" the ZooKeeper ensemble in the Target cluster. The
ZooKeeper ensemble is provided configured in the Source’s solrconfig.xml file."
I feel we can remove this entire para and just add a line to the 3rd para where
we mention it's push based. Here's a stab at an updated 3rd para
"Each shard leader in the Source data center is responsible for replicating its
updates to the corresponding leader in the Target data center. This is a push
model and the source data center must be able to connect to the target
ZooKeeper. Shard leaders in the Target data center will replicate the changes
to their own replicas as normal SolrCloud updates."
"CDCR can be configured to replicate from one collection to a second collection
within the same cluster. That is a specialized scenario not covered in this
Guide." : Does this point have any value? I'm +0 on removing it
From "Figure 1. Uni-Directional Data Flow" : "Since leaders may ...
Firewalls, ACL rules, etc., must be configured to allow this." I feel like the
ACL part could be mentioned in the architecture overview and removed from
here.
"With bi-directional updates, indexing and querying " : I guess the only hard
requirement is indexing. Querying doesn't have any impact in this design, i.e.
it's the same as uni-directional
"Updates sent from Source data center to Target is not propagated back to
Source when bi-directional updates are configured" : This is what point 7 talks
about so maybe remove this
CDCR Configuration :
"<!-- optional -->" : We recommend everyone to disable buffering. Let's remove
this comment
From the "Initial Startup" section
"Sync the index directories from the Source collection to Target collection
across to the corresponding shard nodes. rsync works well for this" till the
end of the section : Seems like a lot of info or notes which are already known?
"ZooKeeper Settings"
800 is a typo? We say we want to set 200 but use
"Cross Data Center Replication Operations" : Should talk about how to update a
schema. I'll add some docs after this commit
> Documentation changes for SOLR-11003: Bi-directional CDCR support
> -----------------------------------------------------------------
>
> Key: SOLR-11412
> URL: https://issues.apache.org/jira/browse/SOLR-11412
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: CDCR, documentation
> Reporter: Amrit Sarkar
> Assignee: Varun Thacker
> Attachments: CDCR_bidir.png, SOLR-11412-split.patch,
> SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch, SOLR-11412.patch,
> SOLR-11412.patch
>
>
> Since SOLR-11003 (bi-directional CDCR scenario support) is reaching its
> conclusion, the relevant documentation changes need to be made.