What Walter said. Although with Solr 7.6, unless you specify maxSegments
explicitly,
you won’t create segments over the default 5G maximum.
And if you have in the past specified maxSegments so you have segments over 5G,
optimize (again without specifying maxSegments) will do a “singleton merge”
From that short description, you should not be running optimize at all.
Just stop doing it. It doesn’t make that big a difference.
It may take your indexes a few weeks to get back to a normal state after the
forced merges.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood
Thank you David, Walt, Eric.
1. The first time the bloated index was generated, there was no disk space issue: one copy
of the index is 1/6 of disk capacity. We ran out of disk capacity after more than 2
bloated copies had accumulated.
2. Solr was upgraded from 5.*. In 5.*, more than 5
segments is causing performance is
It Depends (tm).
As of Solr 7.5, optimize is different. See:
https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
So, assuming you have _not_ specified maxSegments=1, any very large
segment (near 5G) that has _zero_ deleted documents won’t be merged.
So there are two scenarios:
For a full forced merge (mistakenly named “optimize”), the worst case disk space
is 3X the size of the index. It is common to need 2X the size of the index.
When I worked on Ultraseek Server 20+ years ago, it had the same merge behavior.
I implemented a disk space check that would refuse to merge
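A disk space check of the kind described above could be sketched as follows. This is only an illustration, not the Ultraseek or Solr implementation: the function names and the `headroom` parameter are hypothetical, and `headroom=2.0` is an assumption that reads the 3X worst case as "the existing index plus two more copies' worth of free space".

```python
import os
import shutil


def index_size_bytes(index_dir: str) -> int:
    """Sum the sizes of all regular files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def enough_space_for_merge(index_bytes: int, free_bytes: int,
                           headroom: float = 2.0) -> bool:
    """True if free space covers `headroom` extra copies of the index.

    headroom=2.0 is an assumed reading of the 3X worst case: the index
    that already exists plus two more copies while the merge runs.
    """
    return free_bytes >= headroom * index_bytes


def safe_to_force_merge(index_dir: str, headroom: float = 2.0) -> bool:
    """Compare the index size with the free space on its filesystem."""
    free = shutil.disk_usage(index_dir).free
    return enough_space_for_merge(index_size_bytes(index_dir), free, headroom)
```

A caller would skip the forced merge whenever `safe_to_force_merge` returns False, rather than start it and run out of disk partway through.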
I can't give you a 100% true answer, but I've experienced this, and what
"seemed" to happen to me was that the optimize would start, and that would
drive the size up threefold, and if you run out of disk space in the process,
the optimize will quit, since it can't optimize, and leave the live index
pieces in
When the optimize command is issued, the expectation once the
optimization process completes is that the index size either decreases or at most remains
the same. In a Solr 7.6 cluster with 50-plus shards, when the optimize command is issued,
some of the shards' transient or older segment files are not de
On 2/6/2014 4:00 AM, Shawn Heisey wrote:
I would not recommend it, but if you know for sure that your
infrastructure can handle it, then you should be able to optimize them
all at once by sending parallel optimize requests with distrib=false
directly to the Solr cores that hold the shard replicas
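The parallel approach described above might look roughly like this in Python. It is a sketch under assumptions: the core URLs are hypothetical placeholders, and the `optimize=true` and `distrib=false` parameters are taken from the messages in this thread rather than checked against any particular Solr version.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(core_url: str) -> str:
    """Build a non-distributed optimize request for one core."""
    params = urlencode({"optimize": "true", "distrib": "false"})
    return f"{core_url.rstrip('/')}/update?{params}"


def http_get_status(url: str) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(url) as response:
        return response.status


def optimize_all(core_urls, send=http_get_status):
    """Send one optimize request per core, all at the same time.

    `send` is injectable so the fan-out can be exercised without a
    running Solr instance.
    """
    urls = [optimize_url(u) for u in core_urls]
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        # pool.map returns results in the same order as the input URLs.
        return dict(zip(urls, pool.map(send, urls)))
```

Every shard then merges at once, so the caveat above still applies: only do this if the infrastructure can absorb the combined disk and I/O load.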
On 2/5/2014 11:20 PM, Sesha Sendhil Subramanian wrote:
> I am running solr cloud with 10 shards. I do a batch indexing once everyday
> and once indexing is done I call optimize.
>
> I see that optimize happens on each shard one at a time and not in
> parallel. Is it possible for the optimize to ha
Hi,
I am running solr cloud with 10 shards. I do a batch indexing once everyday
and once indexing is done I call optimize.
I see that optimize happens on each shard one at a time and not in
parallel. Is it possible for the optimize to happen in parallel? Each shard
is on a separate box.
Thanks
S
: Subject: "optimize" index : impact on performance
: References: <1375381044900-4082026.p...@n3.nabble.com>
: In-Reply-To: <1375381044900-4082026.p...@n3.nabble.com>
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new d
Hi,
We already did some benchmarks during optimize and we haven't noticed a big
impact on overall performance of search. The benchmarks' results were almost
the same with vs. without running optimization. We have enough free RAM for the
two OS disk caches during optimize (15 GB represents the
Hi,
[I am sending again my message to the mailing list, as well as Shawn's reply.
Thanks Shawn for your explanations]
We are trying to improve the performance of our Solr Search application in
terms of QPS (queries per second).
We tuned SOLR settings (e.g. mergeFactor=3), launched several ben
On 8/2/2013 8:13 AM, Anca Kopetz wrote:
Then we optimized the index to 1 segment / 0 deleted docs and we got
+40% of QPS compared to the previous test.
Therefore we thought of optimizing the index every two hours, as our
index is evolving due to frequent commits (every 30 minutes) and thus
the p
Hi,
We are trying to improve the performance of our Solr Search application in
terms of QPS (queries per second).
We tuned SOLR settings (e.g. mergeFactor=3), launched several benchmarks and
had better performance results, but still unsatisfactory for our traffic volume.
Then we optimized the
No, you didn't miss anything. The comment at Lucene Revolution was more
along the lines that optimize didn't actually improve much *absent* deletes.
Plus, on a significant size corpus, the doc frequencies won't change that
much by deleting documents, but that's a case-by-case thing
Best
Erick
On
What you can try is maxSegments=2 or more as a 'partial' optimize:
"If the index is so large that optimizes are taking longer than desired
or using more disk space during optimization than you can spare,
consider adding the maxSegments parameter to the optimize command. In
the XML message, this
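As a rough illustration of a 'partial' optimize, the request URL can be built with maxSegments as a query parameter. This is a sketch: the helper name is hypothetical, the base URL is the one used elsewhere in this thread, and it uses the URL-parameter form rather than the XML message mentioned in the quote.

```python
from urllib.parse import urlencode


def partial_optimize_url(core_url: str, max_segments: int) -> str:
    """Build an optimize URL that merges down to at most max_segments."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{core_url.rstrip('/')}/update?{params}"


if __name__ == "__main__":
    print(partial_optimize_url("http://localhost:8983/solr/core1", 2))
```

A larger maxSegments value generally means less merging work and less temporary disk usage than a full optimize to one segment.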
Huh? That's something new to me. Optimize removes documents that have been
flagged for deletion. For relevancy it's important those are removed, because
document frequencies are not updated for deletes.
Did I miss something?
> For what it's worth, the Solr class instructor at the Lucene Revoluti
For what it's worth, the Solr class instructor at the Lucene Revolution
conference recommended *against* optimizing, and instead suggested to just
let the merge factor do its job.
On Thu, Nov 4, 2010 at 2:55 PM, Shawn Heisey wrote:
> On 11/4/2010 7:22 AM, stockiii wrote:
>
>> how can i start an
On 11/4/2010 7:22 AM, stockiii wrote:
How can I start an optimize using DIH, but NOT after a delta- or
full-import?
I'm not aware of a way to do this with DIH, though there might be
something I'm not aware of. You can do it with an HTTP POST. Here's
how to do it with curl:
/usr/bin/c
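The curl command above is cut off in this archive, so the following does not reproduce it. It is a separate Python sketch of the same idea, an HTTP POST carrying an `<optimize/>` update message. The core URL and function names are assumptions for illustration.

```python
from urllib.request import Request, urlopen


def build_optimize_request(core_url: str) -> Request:
    """Build (but do not send) a POST with an <optimize/> XML message."""
    return Request(
        url=f"{core_url.rstrip('/')}/update",
        data=b"<optimize/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send_optimize(core_url: str) -> int:
    """Send the optimize request and return the HTTP status code."""
    with urlopen(build_optimize_request(core_url)) as response:
        return response.status
```

Because this is an ordinary HTTP call, it can be scheduled independently of any import, for example from cron.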
Ha
Now I feel stupid !!
I had a misspell in the data path and you were correct.
Can I ask, Erick, was the command correct though?
Thank you
Lee
On 2 Mar 2010, at 13:54, Erick Erickson wrote:
> My very first guess would be that you're removing an index that isn't
> the one your SOLR configurati
My very first guess would be that you're removing an index that isn't
the one your SOLR configuration points at.
Second guess would be that your browser is caching the results of
your first query and not going to SOLR at all. Stranger things have
happened .
Third guess is you've mis-identified th
Hi All
Is there a post request method to clean the index?
I have removed my index folder and restarted Solr and it's still showing
documents in the stats.
I have run this post request:
http://localhost:8983/solr/core1/update?optimize=true
I get no errors but the stats still show my 4 doc
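For context on the question above: optimize merges segments and drops documents already marked as deleted, but it does not remove live documents. A common way to empty an index over HTTP is a delete-by-query on `*:*` followed by a commit. The sketch below assumes the core URL from the message and uses hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

DELETE_ALL = b"<delete><query>*:*</query></delete>"


def build_delete_all_request(core_url: str) -> Request:
    """Build a POST that deletes every document and commits the change."""
    query = urlencode({"commit": "true"})
    return Request(
        url=f"{core_url.rstrip('/')}/update?{query}",
        data=DELETE_ALL,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def delete_all(core_url: str) -> int:
    """Send the delete-all request and return the HTTP status code."""
    with urlopen(build_delete_all_request(core_url)) as response:
        return response.status
```

If the stats still show documents after this, the running core is probably reading a different data directory than the one that was cleared.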
On Sat, Mar 28, 2009 at 7:38 AM, Otis Gospodnetic
wrote:
>
> Hi,
>
> Answers inlined.
>
>
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> - Original Message
>> We have a distributed Solr system (2-3 boxes with each running 2
>> instances of Solr and each Solr instan
Thanks Otis. This is very useful. I'll try all your suggestions and
post my findings (and improvements).
Thanks,
-vivek
On Fri, Mar 27, 2009 at 7:08 PM, Otis Gospodnetic
wrote:
>
> Hi,
>
> Answers inlined.
>
>
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> - Original Me
Hi,
Answers inlined.
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> We have a distributed Solr system (2-3 boxes with each running 2
> instances of Solr and each Solr instance can write to multiple cores).
Is this really optimal? How many CPU co
Hi,
We have a distributed Solr system (2-3 boxes with each running 2
instances of Solr and each Solr instance can write to multiple cores).
Our use case is high index volume - we can get up to 100 million
records (1 record = 500 bytes) per day, but very low query traffic
(only administrators may
While we're on the subject of optimizing: Are there any benefits to optimizing
an index before merging it into another index?
Thanks,
Stu
-Original Message-
From: Mike Klaas <[EMAIL PROTECTED]>
Sent: Wed, August 8, 2007 5:16 pm
To: solr-user@lucene.apache.org
Subject: R
On 8-Aug-07, at 2:09 PM, Jae Joo wrote:
How about standformat optimizion?
Jae
Optimized indexes are always faster at query time than their non-
optimized counterparts. Sometimes significantly so.
-Mike
How about standformat optimizion?
Jae
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Wednesday, August 08, 2007 5:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Optimize index
On 8/8/07, Jae Joo <[EMAIL PROTECTED]> wrote:
&g
On 8/8/07, Jae Joo <[EMAIL PROTECTED]> wrote:
> So, is compound index faster at query time?
Slower (but very slightly). A little less concurrency under heavy load.
-Yonik
So, is compound index faster at query time?
Jae
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Wednesday, August 08, 2007 4:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Optimize index
On 8/8/07, Jae Joo <[EMAIL PROTECTED]>
On 8/8/07, Jae Joo <[EMAIL PROTECTED]> wrote:
> Does anyone know how to optimize the index and what the difference is between
> compound format and stand format?
Compound index format squishes almost all the files of a segment into
a single file. It's slower at index time.
-Yonik
You optimize by sending a command to the SOLR update
handler. I'm not sure about the different index formats though..
++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
+-
Does anyone know how to optimize the index and what the difference is between
compound format and stand format?
Thanks,
Jae Joo