Hey folks

An update! Sadly, not the "All fixed" I'd hoped for :(

We tried moving the boost files to a RAM disk to maximise their
availability and minimise the issue. It definitely improved things
somewhat, but we're still well below the performance we want/need, i.e.
what we currently get from 8.3.1.

We even tried moving all Solr files to the RAM disk, but performance was
still down in the ~80%s rather than the high 90%s.

So, back to the drawing board. Having run out of other avenues to
investigate, I tried going version-by-version from 8.3 to see where the
problem is introduced.

8.4 and 8.5 both upgraded without issue. With 8.6 we started seeing the
problem, and it remains all the way up to 8.10. I'm not seeing anything in
the 8.6 changes that jumps out at me as a possible cause - does anybody
have any ideas what it might be about this upgrade that introduces the
issue?
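
In case it's useful to anyone following along: one thing still worth
ruling out is the sorting idea from the Stack Overflow link Charlie posted
(quoted below). This is only a minimal sketch, not something I've confirmed
helps on 8.6+. It assumes the boost file uses the usual key=value line
format and that plain string order on the key is the right sort order for
our id field; the helper names are made up for illustration.

```python
from pathlib import Path


def sort_boost_lines(lines):
    """Return key=value lines sorted by key; blank lines are dropped."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        entries.append((key, value))
    # Assumption: plain string order on the key is what we want.
    # Python compares str by code point, which matches UTF-8 byte order.
    entries.sort(key=lambda kv: kv[0])
    return [f"{k}={v}" for k, v in entries]


def sort_boost_file(src, dst):
    """Write a key-sorted copy of src to dst (hypothetical helper)."""
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    sorted_lines = sort_boost_lines(lines)
    Path(dst).write_text("\n".join(sorted_lines) + "\n", encoding="utf-8")
```

The idea would be to run sort_boost_file on a copy of external_boostvalue,
swap the sorted copy in, and repeat the post-replication test to see
whether the recovery time changes.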

On Thu, 28 Oct 2021 at 15:16, Charlie Hull <[email protected]>
wrote:

> Thanks Dominic. I'm guessing that something in the replication
> invalidates caching of these files, and once they're in memory again
> everything is fine, although I don't know how this might have changed.
>
> I found this interesting snippet about ExternalFileField performance
> being improved by sorting that might be related, but then again it's
> pretty old.
>
> https://stackoverflow.com/questions/29470458/solr-external-file-field-performance-issue
> . I also note that my ex-colleague Alan did improve EFF performance a
> while ago https://issues.apache.org/jira/browse/SOLR-3985 . Everything
> I've read including from Issuu
> https://engineering.issuu.com/2013/03/11/how-search-at-issuu-actually-works
> implies that EFF isn't particularly performant anyway. There doesn't
> seem to have been any activity around EFF between those versions apart
> from some doc fixes
>
> https://issues.apache.org/jira/browse/SOLR-14968?jql=text%20~%20%22externalfilefield%22
>
>
> Hope some of these links help you track down the problem!
>
> Best
>
> Charlie
>
> On 26/10/2021 15:31, Dominic Humphries wrote:
> > No problem, I've been trying to get my head around how it all works
> > myself!
> >
> > As per
> > https://solr.apache.org/guide/8_9/working-with-external-files-and-processes.html
> > our schema defines a field type:
> >      <fieldType name="fileboost" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField"/>
> > which is then used to define a field:
> >      <field name="boostvalue" type="fileboost"/>
> > which pulls data from a file, external_boostvalue, living in $SOLR_HOME/data
> >
> > This is used to set a boost value that increases the visibility of some
> > search results.
> >
> > Setting this file to be empty completely removes the performance hit we see
> > taking several minutes to resolve after each replication. But we do need
> > the functionality still, and I'm unclear on why this is an issue for 8.9
> > when it wasn't for 8.3
> >
> > Hope this clarifies the problem!
> >
> > Dominic
> >
> > On Mon, 25 Oct 2021 at 19:03, Charlie Hull <[email protected]> wrote:
> >
> >> Hi Dominic,
> >>
> >> Could you clarify what you mean by boost files in this context? Just
> >> curious....
> >>
> >> Charlie
> >>
> >> On 25/10/2021 17:11, Dominic Humphries wrote:
> >>> Performance with the replica pulling from 8.3.1 was actually worse. And
> >>> looking at the data in the databases and the boost file contents, I'm
> >>> dubious it's a problem of incompatible boost files. I think the performance
> >>> of importing/applying the boosts really is what's responsible for the issue
> >>> we see. Not sure what else to test to verify or disprove this..
> >>>
> >>> On Mon, 25 Oct 2021 at 14:56, Dominic Humphries <[email protected]> wrote:
> >>>> I think I found it!
> >>>>
> >>>> I didn't realise, but we have boost files for the core I'm testing and the
> >>>> boost is applied after replication! Setting the contents of the files to
> >>>> empty completely removes the post-replication performance problem we were
> >>>> seeing.
> >>>>
> >>>> So now my question becomes "Why is boosting taking so much longer for the
> >>>> upgrade?"
> >>>>
> >>>> Since the upgrade has its own independent set of data, I'm wondering if
> >>>> it's as simple as the IDs it's trying to boost don't exist and it takes
> >>>> longer to find out an item is missing than it does to find one that does? I
> >>>> believe I can point an 8.9.0 follower at an 8.3.1 leader, that seems like
> >>>> the next logical step - if there's no performance hit when it has the same
> >>>> data as the 8.3.1 replica, then that's almost certainly the problem.
> >>>>
> >>>> Fingers crossed!
> >>>>
> >>>> On Sun, 24 Oct 2021 at 10:26, Deepak Goel <[email protected]> wrote:
> >>>>
> >>>>> There could be some testing and cooling happening post-replication. will
> >>>>> have to dig a bit more into the code.
> >>>>>
> >>>>> Deepak
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 21, 2021 at 9:57 PM Dominic Humphries
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> One more tidbit: I just tried leaving replication off for a few hours and
> >>>>>> then triggering a "big" replication run so I could see the distinct stages.
> >>>>>>      - Beginning replication didn't cause any performance degradation.
> >>>>>>      - Several minutes of downloading the replication files saw no
> >>>>>>      degradation
> >>>>>>      - Only after downloading had completed did we start to see performance
> >>>>>>      issues in our tests
> >>>>>>      - But we saw the "number of docs/timestamp of latest file" both jump
> >>>>>>      almost immediately after downloading completed and never move again
> >>>>>>      - But the performance degradation continued for about seven more minutes
> >>>>>>      even though replication was clearly finished at this point
> >>>>>>
> >>>>>>
> >>>>>> Is there some kind of re-indexing optimization thing that solr can run
> >>>>>> post-replication? At this point it's about my only remaining suspect..
> >>>>>>
> >> --
> >> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> >> <www.o19s.com>
> >> Founding member of The Search Network <https://thesearchnetwork.com/>
> >> and co-author of Searching the Enterprise
> >> <https://opensourceconnections.com/about-us/books-resources/>
> >> tel/fax: +44 (0)8700 118334
> >> mobile: +44 (0)7767 825828
> >>
> >> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> >> Amtsgericht Charlottenburg | HRB 230712 B
> >> Geschäftsführer: John M. Woodell | David E. Pugh
> >> Finanzamt: Berlin Finanzamt für Körperschaften II
> >>
>
