Funny, I took a different approach to the same monitoring problem.
Each document has a published_timestamp field set when it is generated. The
schema has an indexed_timestamp field with a default of NOW. I wrote some
Python to get the set of nodes in the collection, query each one, then report
Actually, I wrote a service that calls the Collections API CLUSTERSTATUS
action, then adds data for each replica by calling the Core Admin STATUS
action:
https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-STATUS
My service fills in the index information to give more data.
This returns the
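A minimal sketch of that merge step, assuming the JSON shapes that
CLUSTERSTATUS and Core Admin STATUS return with wt=json. The function names
are hypothetical, not taken from the service described above, and the HTTP
calls are left out so the logic can be read on its own.

```python
def iter_replicas(cluster_status, collection):
    """Yield (shard, replica_name, replica_info) from a parsed CLUSTERSTATUS body."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    for shard_name, shard in sorted(shards.items()):
        for replica_name, info in sorted(shard["replicas"].items()):
            yield shard_name, replica_name, info


def core_status_url(replica_info):
    """Build the Core Admin STATUS URL for one replica (base_url ends in /solr)."""
    base = replica_info["base_url"].rstrip("/")
    return f"{base}/admin/cores?action=STATUS&core={replica_info['core']}&wt=json"


def merge_index_info(replica_info, core_status):
    """Return a copy of replica_info with the core's "index" section attached.

    core_status is the parsed body fetched from core_status_url(replica_info).
    A missing core yields an empty dict rather than an error.
    """
    merged = dict(replica_info)
    core = replica_info["core"]
    merged["index"] = core_status.get("status", {}).get(core, {}).get("index", {})
    return merged
```

The merged "index" section carries fields such as numDocs, maxDoc, version
and current, which makes it easy to compare replicas of the same shard.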
Oh, those logs probably reflect the update job that runs every 15 minutes
if there are updates, typically 1 or 2 changes. Thanks for the info.
On Wed, May 24, 2017 at 10:37 AM, Erick Erickson
wrote:
> By default, enough closed log files will be kept to hold the last 100
> documents indexed. This
By default, enough closed log files will be kept to hold the last 100
documents indexed. This is for "peer sync" purposes. Say replica1 goes
offline for a bit. When it comes back online, if it's fallen behind by
no more than 100 docs, the docs are replayed from another replica's
tlog.
Having such
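As a rough illustration of the peer sync rule described above, here is a
deliberately simplified model. Real Solr compares recent update versions
rather than a bare document count, and falls back to full index replication
when peer sync is not possible. The function name and return values are
made up for this sketch; the default of 100 matches the figure quoted above.

```python
def recovery_mode(docs_behind, tlog_window=100):
    """Pick a recovery path for a replica that missed docs_behind updates.

    Simplified: peer sync (replay from another replica's tlog) if the
    replica is within the retained window, otherwise full replication.
    """
    if docs_behind < 0:
        raise ValueError("docs_behind must be non-negative")
    return "peer_sync" if docs_behind <= tlog_window else "full_replication"
```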
The tlog sizes are strange.
In the case of the collection where we had issues with the replicas, the
tlog sizes are 740 bytes and 938 bytes on the target side and the same on
the source side. There are a lot of them on the source side. When do tlog
files get deleted?
On Tue, May 23, 2017 at 12:52
I wouldn't rely on the "current" flag in the admin UI as an indicator.
As long as your numDocs and the like match I'd say it's a UI issue.
Best,
Erick
On Wed, May 24, 2017 at 8:15 AM, Webster Homer wrote:
> We see data in the target clusters. CDCR replication is working. We first
> noticed the c
We see data in the target clusters. CDCR replication is working. We first
noticed the current=false flag on the target replicas, but since I started
looking I see it on the source too.
I have removed the IgnoreCommitOptimizeUpdateProcessorFactory from our
update processor chain. I did two data lo
This is all quite strange. Optimize (BTW, it's rarely
necessary/desirable on an index that changes, despite its name)
shouldn't matter here. CDCR forwards the raw documents to the target
cluster.
Ample time indeed. With a soft commit of 15 seconds, that's your
window (with some slop for how long C
We see a pretty consistent issue where the replicas show in the admin
console as not current, indicating that our auto commit isn't committing. In
one case we loaded the data to the source, CDCR replicated it to the
targets and we see the source and the target as having current = false. It
is search
You can ping individual replicas by addressing a specific replica
and setting distrib=false, something like:
http://SOLR_NODE:port/solr/collection1_shard1_replica1/query?distrib=false&q=..
But one thing to check first is that you've committed. I'd:
1> turn off indexing on the source c
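The per-replica check suggested above can be sketched as follows. This
assumes each core answers on the /select handler (the URL above uses /query)
and that indexing has stopped and a commit has been issued first. The helper
names are hypothetical; fetch is injectable so the comparison logic can be
tried without a running cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def replica_query_url(base_url, core, q="*:*"):
    """Build a non-distributed count query against one replica's core."""
    params = urlencode({"q": q, "rows": 0, "distrib": "false", "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def fetch_num_found(url):
    """Run the query and return numFound from the JSON response."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["response"]["numFound"]


def count_per_replica(base_urls_by_core, fetch=fetch_num_found):
    """Map each core name to the document count it reports on its own."""
    return {
        core: fetch(replica_query_url(base, core))
        for core, base in base_urls_by_core.items()
    }


def replicas_agree(counts):
    """True when every replica of a shard reports the same count."""
    return len(set(counts.values())) <= 1
```

Pass only the cores that belong to one shard to count_per_replica, since
different shards are expected to hold different numbers of documents.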
I have a SolrCloud collection with 2 shards and 4 replicas. The replicas
for shard 1 have different numbers of records, so different queries will
return different numbers of records.
I am not certain how this occurred; it happened in a collection that was a
CDCR target.
Is there a way to limit a