Lucene (and thus Solr) does not split segments.

The closest thing to it would be a shard split with the "rewrite" method,
which builds the sub-shard indexes from scratch.
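
For reference, a shard split like that is requested through the Collections
API (action=SPLITSHARD, with splitMethod set to rewrite or link). Below is a
minimal sketch in Python that only builds the request URL. The base URL
http://localhost:8983/solr, the collection name my_collection, the shard name
shard1, and the helper name build_splitshard_url are all assumptions made up
for illustration, not anything from this thread.

```python
from urllib.parse import urlencode


def build_splitshard_url(base_url, collection, shard, split_method="rewrite"):
    """Build a Collections API SPLITSHARD URL (hypothetical helper).

    base_url, collection and shard are placeholders supplied by the caller;
    nothing is sent over the network here.
    """
    if split_method not in ("rewrite", "link"):
        raise ValueError("split_method must be 'rewrite' or 'link'")
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "splitMethod": split_method,
    }
    # urlencode keeps the dict's insertion order, so the URL is deterministic.
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local node and made-up collection/shard names.
    print(build_splitshard_url("http://localhost:8983/solr", "my_collection", "shard1"))
```

Note that this splits a shard into sub-shards; it changes the shard layout and
does not resize individual segments inside an existing shard.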

On Tue, Dec 10, 2024 at 11:18 AM ufuk yılmaz <uyil...@vivaldi.net.invalid>
wrote:

> Correct me if I'm wrong, but I don't see an upside to splitting an already
> merged segment. AFAIK a single big segment helps with queries and multiple
> small segments help with indexing, but that benefit applies only during
> indexing. Splitting an already merged index shouldn't make indexing faster,
> because the hard work was finished long ago.
>
> About your original question: I'm sorry that I can't say yes or no without
> testing and confirming, but I'd be surprised if Solr went back to a merged
> index and split it. That would be like fragmenting an already defragmented
> hard drive. From the documentation:
>
> “If creating a new segment would cause the number of lowest level segments
> to exceed the mergeFactor value, then all those segments are merged
> together to form a single large segment.”
>
> What I understand from the above is that Solr keeps trying to merge them
> into one; you can only set how many segments will trigger the initial merge.
> If the limit is low, merges are triggered frequently, which puts additional
> load on the node during heavy indexing.
>
> -ufuk
>
> —
>
> > On Dec 11, 2024, at 0:02, Kevin Liang (BLOOMBERG/ 919 3RD A) <
> klian...@bloomberg.net> wrote:
> >
> > Yes - I am wondering if Solr will split segments when the config is updated
> > to reduce the max segment size.
> >
> > From: users@solr.apache.org At: 12/09/24 20:53:10 UTC-5:00
> > To: users@solr.apache.org
> > Subject: Re: Merge Policy Max Segment Size + Reindexing
> >
> > Do you mean whether Solr will go back to the old, already merged segments
> > and split them after the change?
> >
> > —
> >
> >>> On Dec 10, 2024, at 2:27, Kevin Liang (BLOOMBERG/ 919 3RD A)
> >> <klian...@bloomberg.net> wrote:
> >>
> >> Sorry, I didn't word my original message very well. The situation is that
> >> we have a larger max segment size currently configured (let's say 10GB) and
> >> we want to update the config to reduce the max segment size (let's say 5GB).
> >> Will we need to reindex for that reduction to take effect? I have only seen
> >> code/explanations around merging (i.e. joining two segments into larger
> >> segments) but never anything about segment splitting (which I imagine
> >> doesn't happen?)
> >>
> >> From: users@solr.apache.org At: 12/09/24 12:22:38 UTC-5:00
> >> To: users@solr.apache.org
> >> Subject: Re: Merge Policy Max Segment Size + Reindexing
> >>
> >> As far as I know it does not require reindexing. But I am not sure whether
> >> the change will take effect if no NRT indexing is happening.
> >>
> >> This is a good read:
> >>
> >> https://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
> >>
> >>
> >> Thanks,
> >> Ramesh
> >>
> >>> On Mon, Dec 9, 2024 at 11:57 AM Kevin Liang (BLOOMBERG/ 919 3RD A) <
> >>> klian...@bloomberg.net> wrote:
> >>>
> >>> Hello,
> >>>
> >>> We've configured a merge policy factory such that max segment size is
> >>> greater than the default. If we update the configuration and restart the
> >>> cloud, will the Lucene segments automatically be split to meet the smaller
> >>> max segment size? Or will this require a full reindex of the data?
> >>>
> >>> -Kevin Liang
> >>
> >>
> >> --
> >> Thanks,
> >> Ramesh
> >>
> >>
> >
> >
>
>
