Hi,
In our Solr 7.4 cluster, we have noticed that some replicas of some of our
Collections are out of sync: the slave replica has more records than the
leader.
This results in a different number of records on subsequent queries against
the same Collection. Commit is also not helping in
ler-19-thread-1) [ x:###]
> o.a.s.u.LoggingInfoStream [MS][commitScheduler-19-thread-1]: too many
> merges; stalling...
> 2020-05-03 16:31:31.402 INFO (Lucene Merge Thread #55) [ x:###]
> o.a.s.u.LoggingInfoStream [SM][Lucene Merge Thread #55]: 1291879 msec to
> merge do
, you could reduce some of the disk pressure if you can
> put your
> tlogs on another drive, don’t know if that’s possible. Ditto the Solr logs.
>
> Beyond that, it may be a matter of increasing the hardware. You’re really
> indexing
> 120K records/second ((1 leader + 2 followers) * 40K)
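The aggregate figure quoted above can be sanity-checked with a few lines; the rates are the ones from the message itself, nothing here is Solr-specific:

```python
# Cluster-wide write load implied by the message above:
# every document is indexed once on the leader and once on each follower.
ingest_rate = 40_000       # docs/sec sent by the client
copies = 1 + 2             # 1 leader + 2 NRT followers
cluster_write_rate = ingest_rate * copies
print(cluster_write_rate)  # total docs/sec actually indexed across the cluster
```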
Hi,
We have a 10-node (150G RAM, 1TB SAS HDD, 32 cores) Solr 8.5.1 cluster with
50 shards, rf 2 (NRT replicas), and 7B docs. We have 5 ZooKeeper nodes, 2 of
them running on the same nodes where Solr is running. Our use case requires
continuous ingestion (mostly updates). If we ingest at 40k records per sec, after
n10/ =>
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://x.x.x.24:8983/solr/collection_4_shard3_replica_n10/:
null
On Tue, Aug 11, 2020 at 2:08 AM Anshuman Singh
wrote:
> Just to give you an idea, this is how we are ingesting:
>
> {&
What are your settings for hard/soft commit?
>
> For the shard going to recovery - do you have a log entry or something?
>
> What is the Solr version?
>
> How do you setup ZK?
>
> > Am 10.08.2020 um 16:24 schrieb Anshuman Singh:
> >
> > Hi,
> >
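For reference, the hard/soft commit intervals asked about above live in solrconfig.xml; a common heavy-indexing shape looks like this (the interval values are purely illustrative, not a recommendation for this cluster):

```xml
<!-- solrconfig.xml: illustrative values only -->
<autoCommit>
  <maxTime>60000</maxTime>            <!-- hard commit every 60s: flush + fsync segments -->
  <openSearcher>false</openSearcher>  <!-- don't open a new searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>300000</maxTime>           <!-- soft commit every 5 min: controls visibility -->
</autoSoftCommit>
```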
Hi,
We have a SolrCloud cluster with 10 nodes. We have 6B records ingested in
the Collection. Our use case requires atomic updates ("inc") on 5 fields.
Now almost 90% of our documents are atomic updates, and as soon as we start our
ingestion pipelines, multiple shards start going into recovery, sometimes
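For context, an atomic "inc" update sends only the uniqueKey plus the fields being changed; a minimal sketch of building such a payload (the field names here are hypothetical, not from the message above):

```python
import json

# Atomic update: only the uniqueKey and the fields to modify are sent.
# {"inc": n} tells Solr to add n to the field's existing value.
def make_inc_update(doc_id, counters):
    """Build one Solr atomic-update document incrementing the given fields."""
    doc = {"id": doc_id}
    for field, amount in counters.items():
        doc[field] = {"inc": amount}
    return doc

# Hypothetical fields, for illustration only.
payload = json.dumps([make_inc_update("doc-1", {"view_count": 1, "byte_count": 512})])
print(payload)
```

The payload would be POSTed to the collection's /update handler as JSON.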
Hi,
Which file system would be better for Solr, ext4 or XFS?
Regards,
Anshuman
Hi,
I'm using Solr 7.4 and I want to create collections in my cluster such that
no two replicas are assigned to the same rack.
I read about Rule-based Replica Placement
https://lucene.apache.org/solr/guide/7_4/rule-based-replica-placement.html.
What I gather is that I have to create a tag/snitch
Hi,
We missed the fact that case-insensitive search doesn't work with the
field type "string". We have 3B docs indexed and we cannot reindex the data.
Since schema changes require reindexing, is there any other way to
achieve case-insensitive search on string fields?
Regards,
Anshuman
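For new data, the usual pattern is a separate lowercased field populated via copyField; note that already-indexed documents would still need reindexing to populate it, which is exactly the constraint above. A sketch of the managed-schema additions (the field and type names are hypothetical):

```xml
<!-- A string-like type that lowercases the whole value as one token -->
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="name_ci" type="string_ci" indexed="true" stored="false"/>
<copyField source="name" dest="name_ci"/>
```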
Hi,
I'm using Solr 7.4.0 and I want to export 4TB of data from our current Solr
cluster to a different cluster. The new cluster has twice as many nodes as
the current cluster, and I want the data to be distributed among all
the nodes. Is this possible with the Backup/Restore feature
I was reading about in-place updates
https://lucene.apache.org/solr/guide/7_4/updating-parts-of-documents.html.
In my use case I have to update only the field "LASTUPDATETIME"; all other
fields stay the same. Updates are very frequent and I can't bear the cost of
deleted docs.
If I provide all the fields,
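For reference, Solr only performs a true in-place update (no document rewrite, so no deleted doc) when the target field is single-valued, numeric, non-indexed, non-stored, and has docValues. A schema sketch for such a field (the plong type is an assumption about how LASTUPDATETIME is stored):

```xml
<!-- In-place-updatable field: docValues only, not indexed, not stored -->
<field name="LASTUPDATETIME" type="plong" indexed="false" stored="false"
       docValues="true" multiValued="false"/>
```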
We are running a test case, ingesting 2B records in a collection in 24 hrs.
This collection is spread across 10 solr nodes with a replication factor of
2.
We are noticing many replicas going into recovery while indexing, and it is
degrading indexing performance.
We are observing errors like:
important portions of your index. If the OS memory isn't large
> enough, the additional I/O pressure from merging may be enough to start
> your system swapping which is A Bad Thing.
>
> See:
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> for how Lucene uses MMapDir
each one becomes 10G. What happens when the 11th
> segment is created and it’s 100M? Do you rewrite one of the 10G segments
> just
> to add 100M? Your problem gets worse, not better.
>
>
> Best,
> Erick
>
> > On Jun 5, 2020, at 1:41 AM, Anshuman Singh
> wrote:
d the
> updates.
> And yes, that requires reading the old segment.
> It is common to allow multiple segments when you update often,
> so updating does not interfere with reading the index too often.
>
>
> > On 4 Jun 2020, at 14:08, Anshuman Singh
> wrote:
> >
I noticed that while indexing, when a commit happens, there is a high disk-read
load from Solr. The problem is that this hurts search performance whenever
index data needed by a query has to be loaded from disk, as the disk read
speed is not great and the whole index is not cached in RAM.
When no
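Part of what this message describes is expected behavior: each commit that opens a new searcher discards the old caches, and autowarming plus cold caches trigger index reads from disk. The relevant solrconfig.xml knobs look roughly like this (sizes are illustrative; the cache class is solr.CaffeineCache on Solr 8.x, older releases use solr.FastLRUCache):

```xml
<query>
  <!-- Autowarm a few entries so new searchers don't start completely cold;
       large autowarmCount values make commits slower and more read-heavy. -->
  <filterCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="16"/>
  <queryResultCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="16"/>
</query>
```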
a bunch of them
> in order to not get fooled by hitting, say, your queryResultCache. I
> had one client who “stress tested” with the same query and was
> getting 3ms response times because, after the first one, they never
> needed to do any searching at all, everything was
your use-case requires 100K rows, you should be using streaming or
> cursorMark. While that won't make the end-to-end time shorter, it won't
> put such a strain on the system.
>
> Best,
> Erick
>
> > On May 27, 2020, at 10:38 AM, Anshuman Singh
> wrote:
> >
> >
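The cursorMark suggestion above works by sorting on the uniqueKey field, starting with cursorMark=* and passing each response's nextCursorMark back on the next request. A sketch of building such a request URL (collection name and uniqueKey "id" are assumptions):

```python
from urllib.parse import urlencode

# cursorMark paging: requires a sort that includes the uniqueKey (here "id").
# The first request uses cursor="*"; subsequent requests pass the
# nextCursorMark value returned by the previous response.
def cursor_query(base_url, q, rows, cursor="*"):
    params = {"q": q, "rows": rows, "sort": "id asc", "cursorMark": cursor}
    return base_url + "/select?" + urlencode(params)

url = cursor_query("http://localhost:8983/solr/test", "*:*", 1000)
print(url)
```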
I have a Solr cloud setup (Solr 7.4) with a collection "test" having two
shards on two different nodes. There are 4M records equally distributed
across the shards.
If I query the collection like below, it is slow.
http://localhost:8983/solr/test/select?q=*:*&rows=10
QTime: 6930
If I query a
Suppose I have two phone numbers P1 and P2, and the number of records with
P1 is X while the number with P2 is 2X. If I query for R rows
for P1 and for P2, the QTime for P2 is higher. I am not specifying any
sort parameter, and the number of rows I'm asking for is the same in both the