Sure, I can do that. Let me create an index with a few million docs, call
RTG a few million times against it, and compare the timings between 7.x and
8.x. I assume this should be sufficient (?)
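
A minimal sketch of such a timing harness, assuming each Solr version is
reachable over HTTP and exposes the standard /get (real-time get) handler.
The base URLs, the collection name "rtgtest" and the helper names are
hypothetical and only there for illustration. It is single-threaded, so it
measures per-request latency and not behaviour under concurrent load.

```python
import statistics
import time
import urllib.parse
import urllib.request


def rtg_url(base_url, collection, doc_id):
    """Build a real-time get URL for one document id."""
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json"})
    return f"{base_url}/solr/{collection}/get?{query}"


def http_fetch(url):
    """Fetch the URL and return the raw response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_rtg(base_url, collection, doc_ids, fetch=http_fetch,
             clock=time.perf_counter):
    """Issue one RTG request per id; return per-request latencies in seconds."""
    latencies = []
    for doc_id in doc_ids:
        url = rtg_url(base_url, collection, doc_id)
        start = clock()
        fetch(url)
        latencies.append(clock() - start)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies to count, total, median and ~p99."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "count": len(ordered),
        "total_s": sum(ordered),
        "median_s": statistics.median(ordered),
        "p99_s": ordered[p99_index],
    }


def compare(doc_ids, collection="rtgtest"):
    """Run the same ids against two instances (hypothetical ports)."""
    targets = {
        "7.x": "http://localhost:8983",  # assumed address of the 7.x node
        "8.x": "http://localhost:8984",  # assumed address of the 8.x node
    }
    return {
        name: summarize(time_rtg(url, collection, doc_ids))
        for name, url in targets.items()
    }
```

Calling compare() with the same list of ids on both versions gives numbers
that can be put side by side. Committing before the run should keep the
lookups from being served out of the transaction log, so that the
index lookup path is the one being timed.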

On Wed, May 31, 2023 at 5:19 PM Jan Høydahl <jan....@cominvent.com> wrote:

> Would be nice to determine whether RTG is orders of magnitude slower in
> 8.x than 7.x and is the main culprit.  Then we could isolate the testing to
> RTG only and not involve Atomic Update?
>
> Jan
>
> > 31. mai 2023 kl. 21:33 skrev Rahul Goswami <rahul196...@gmail.com>:
> >
> > I don’t have any nested documents. And the results are consistent across
> > multiple runs. I tried looking for similar issues in the mailing list but
> > couldn’t find anything relevant. So if you do happen to find any JIRAs
> > addressing it, that would be really helpful (thanks!).
> >
> > To Jan’s question about RTG taking more time in Solr 8.x, I can say with
> > good certainty that this is the case. Although it does look into the
> > transaction logs first, thread dumps suggest that it is the next phase
> > (when it doesn't find the doc in the tlog) that is time-consuming.
> > It tries to look up the document via the current searcher
> > (searcher.getFirstMatch()). Proceeding further down the stack, it is this
> > call where many threads are spending time:
> >
> >
> > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnum.java#L485
> >
> > Although this call is the same in 7.7.2 and 8.11.1, quite likely
> > something changed in Lucene's FST.java which is causing the slowness. I am
> > trying to dig further and might also ask folks on the Lucene mailing list.
> > Thanks.
> >
> >
> >
> > On Wed, May 31, 2023 at 11:36 AM Srijan <shree...@gmail.com> wrote:
> >
> >> I would love some profiling as well. I know 8.8 or 8.9 had some performance
> >> problems with atomic update but this was later addressed. I can't find the
> >> JIRA at the moment though. Also I am on 8.11.1 and atomic update is not an
> >> issue for me.
> >>
> >> By the way, do you happen to have nested docs?
> >>
> >>
> >> On Wed, May 31, 2023, 11:20 Jan Høydahl <jan....@cominvent.com> wrote:
> >>
> >>> Hi
> >>>
> >>> MMap is most important for searching. Indexing bypasses the cache by using
> >>> direct IO.
> >>>
> >>> I have noticed slow real time get on Solr 8.x during atomic update myself.
> >>> A comparison with profiling would be interesting. RTG gets the
> >>> document from the transaction log, I believe? Could there be some RTG changes
> >>> in 8.x that caused such a slowdown?
> >>>
> >>> Jan Høydahl
> >>>
> >>>> 31. mai 2023 kl. 16:57 skrev Rahul Goswami <rahul196...@gmail.com>:
> >>>>
> >>>> Thanks for the response Shawn. We are using Windows server with pretty huge
> >>>> indexes (multiple TB cores). With Mmap, I have observed that the machine
> >>>> just completely freezes with high CPU and memory usage to a point where it
> >>>> becomes impossible to even connect to it. SimpleFS works out well for us in
> >>>> this case.
> >>>>
> >>>> As noted in my first email, even with SimpleFS, Solr 7 completes the crawl
> >>>> in nearly 1/5th the time taken in Solr 8. Hence there should be something
> >>>> OUTSIDE the directory factory in the code which is causing this.
> >>>>
> >>>> Thanks,
> >>>> Rahul
> >>>>
> >>>>
> >>>>> On Tue, May 30, 2023 at 10:47 PM Shawn Heisey <apa...@elyograg.org> wrote:
> >>>>>
> >>>>>> On 5/30/23 15:34, Rahul Goswami wrote:
> >>>>>> Environment details:
> >>>>>> - Java 11 on Windows server
> >>>>>> - Xms1536m Xmx3072m
> >>>>>> - Indexing client code running 15 parallel threads indexing in batches of 1000
> >>>>>> - using SimpleFSDirectoryFactory (since Mmap doesn't quite work
> >>>>>>   well on Windows for our index sizes which commonly run north of 1 TB)
> >>>>>
> >>>>> Don't change the directoryFactory.  You *WANT* Solr to use MMAP for your
> >>>>> indexes.  Not using MMAP is likely to slow things down considerably.
> >>>>> MMAP should work just fine on 64-bit Windows with 64-bit Java.  Which of
> >>>>> course requires 64-bit hardware.
> >>>>>
> >>>>> 32 bit systems and software cannot properly deal with data larger than
> >>>>> about 2GB.
> >>>>>
> >>>>> Thanks,
> >>>>> Shawn
> >>>>>
> >>>
> >>
>
>
