Lazily creating the VersionInfo buckets feels like an unnecessary optimization to me. I don't think a default of 65k version buckets makes sense; reducing that default to 1k or so should suffice. The historical reasons for setting it to 65k are documented somewhere in Jira and have more to do with improving leader-to-follower throughput. But those who are concerned about that can switch to TLOG replicas, and I guess they then won't be adversely affected? Maybe Tim Potter (who increased the default to 65k) remembers more of the context. Having said that, I don't oppose that change per se.
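To make the bucket-count trade-off concrete, here is a minimal sketch of lock striping by document id. It is an illustration only, not Solr's VersionInfo: the class and method names are hypothetical, and it assumes "1k" and "65k" mean the powers of two 1024 and 65536. Fewer buckets means less memory allocated up front, but more unrelated ids sharing (and contending on) the same lock.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical illustration of lock striping; this is NOT Solr's VersionInfo.
final class StripedVersionBuckets {
    private final ReentrantLock[] buckets;

    StripedVersionBuckets(int numBuckets) {
        if (numBuckets <= 0 || Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a positive power of two");
        }
        // Eager allocation: every bucket exists up front, whether it is ever used or not.
        buckets = new ReentrantLock[numBuckets];
        for (int i = 0; i < numBuckets; i++) {
            buckets[i] = new ReentrantLock();
        }
    }

    // Map an id hash to a bucket; the mask works because the size is a power of two,
    // and it also yields a non-negative index for negative hashes.
    int bucketIndex(int idHash) {
        return idHash & (buckets.length - 1);
    }

    // Run an update while holding the lock of the bucket that owns this id.
    // Updates to ids in different buckets can proceed concurrently.
    <T> T withBucketLock(String docId, Supplier<T> update) {
        ReentrantLock lock = buckets[bucketIndex(docId.hashCode())];
        lock.lock();
        try {
            return update.get();
        } finally {
            lock.unlock();
        }
    }
}
```

With `new StripedVersionBuckets(1024)` the constructor allocates 1024 lock objects; with 65536 it allocates 64 times as many, which is the cost that either a smaller default or lazy creation would avoid.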

On Thu, 19 Oct, 2023, 7:28 pm David Smiley, <david.w.smi...@gmail.com> wrote:

> While we touch VersionInfo in https://github.com/apache/solr/pull/2021 I
> see no class javadocs on it. It's really frustrating; we're left to
> wonder/guess the sorts of things I ask in this email. Could someone
> knowledgeable (like Mark) care to suggest javadocs for this class? A
> couple sentences is better than nothing.
>
> Houston: Can you (or anyone) please elaborate on an example where the
> _version_ field is needed for TLOG non-leader to gain leadership and retain
> data integrity? I naively think the leader shares its updates with a
> follower sequentially as it happens.
>
> ~ David
>
>
> On Wed, Oct 18, 2023 at 9:45 AM Houston Putman <houstonput...@gmail.com>
> wrote:
>
> > I believe its still useful for TLOG replicas as well. When they gain
> > leadership, and they replay the TLOG which could have the same issues that
> > non leader NRT replicas have.
> >
> > - Houston
> >
> > On Wed, Oct 18, 2023 at 8:26 AM David Smiley <david.w.smi...@gmail.com>
> > wrote:
> >
> > > Thank you both. It helps to know that "_version"_ is for, I would say
> > > succinctly, "NRT replication". I mean; that deserves to be said internally
> > > in some places!
> > > Might it be advantageous to imagine it being optional for non-NRT
> > > replicas? I'm not sure if it saves anything or reduces complexity
> > > anywhere.
> > > Related question: Is the VersionInfo (with its striped VersionBucket
> > > locks) related to this -- is it a vestige of "_version_" or is it for
> > > something else? If it isn't for something else, then I could imagine it
> > > being omitted for non-NRT; maybe a dummy implementation. BTW Bruno opened
> > > an issue/PR on it yesterday --
> > > https://issues.apache.org/jira/browse/SOLR-17036
> > >
> > > ~ David
> > >
> > >
> > > On Wed, Oct 18, 2023 at 1:41 AM Ishan Chattopadhyaya <
> > > ichattopadhy...@gmail.com> wrote:
> > >
> > > > Fyi, SOLR-5944, is unreadable, but introduced the concept of previous
> > > > version or something like that.
> > > >
> > > > On Wed, 18 Oct, 2023, 10:35 am Mark Miller, <markrmil...@gmail.com>
> > > > wrote:
> > > >
> > > > > The primary reason is as Ishan says - so that update reorders from leader
> > > > > to replica can be handled in both normal and failure cases.
> > > > >
> > > > > It’s also true that a part of the reason that the per document, NRT design,
> > > > > with versions, was chosen was a desire to support per document optimistic
> > > > > concurrency.
> > > > >
> > > > > On Tue, Oct 17, 2023 at 11:37 PM Ishan Chattopadhyaya <
> > > > > ichattopadhy...@gmail.com> wrote:
> > > > >
> > > > > > Also DBQs use the version field to ensure they are applied correctly, even
> > > > > > if a DBQ is reordered
> > > > > >
> > > > > > On Wed, 18 Oct, 2023, 10:05 am Ishan Chattopadhyaya, <
> > > > > > ichattopadhy...@gmail.com> wrote:
> > > > > >
> > > > > > > To ensure reordered updates are processed properly from leader to other
> > > > > > > replicas in NRT replication mode.
> > > > > > >
> > > > > > > On Wed, 18 Oct, 2023, 9:55 am David Smiley, <dsmi...@apache.org> wrote:
> > > > > > >
> > > > > > > > Question: Does the _version_ field have a purpose other than for "atomic
> > > > > > > > updates"?
> > > > > > > > I know SolrCloud and/or having an UpdateLog insists on it. But I don't
> > > > > > > > know if it's for that feature alone, or for additional non-obvious internal
> > > > > > > > workings of SolrCloud. Mostly I'm just asking to have a deeper
> > > > > > > > understanding; the field doesn't bother me. If someone knows of any docs
> > > > > > > > on it or old interesting JIRAs to read, I'd appreciate it.
> > > > > > > >
> > > > > > > > ~ David Smiley
> > > > > > > > Apache Lucene/Solr Search Developer
> > > > > > > > http://www.linkedin.com/in/davidwsmiley
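
PS: As a rough illustration of the reordering point made in the thread (a per-document version lets a replica handle updates that arrive out of order), here is a minimal sketch. It is not Solr's update processor; the class, method names, and in-memory maps are hypothetical, and it only assumes that the leader assigns increasing version numbers per document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical replica-side check; NOT Solr's actual update processing code.
final class ReplicaVersionCheck {
    private final Map<String, Long> latestVersion = new HashMap<>();
    private final Map<String, String> docs = new HashMap<>();

    // Apply an update forwarded by the leader.
    // Returns true if applied, false if dropped because a newer (or equal)
    // version of the same document has already been seen.
    synchronized boolean applyFromLeader(String id, long version, String body) {
        Long seen = latestVersion.get(id);
        if (seen != null && seen >= version) {
            return false; // stale update arrived late: ignore it
        }
        latestVersion.put(id, version);
        docs.put(id, body);
        return true;
    }

    synchronized String get(String id) {
        return docs.get(id);
    }
}
```

If version 2 of a document arrives before version 1, the replica applies version 2 and then drops version 1, so it ends up with the same final state as the leader.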