Thanks for the responses and advice. Un-deprecating sounds great: it solves our issue and gives us the flexibility to choose different strategies for dealing with it (soft/hard limits, etc.). I created LUCENE-9680 <https://issues.apache.org/jira/browse/LUCENE-9680> to track this, and I'll have a patch ready by the beginning of next week.

Best,
Oren

P.S.: getFieldNames was deprecated after SOLR-12368 <https://issues.apache.org/jira/browse/SOLR-12368> made in-place DV updates easier for fields that didn't exist.

On Tue, Jan 19, 2021 at 7:42 AM Michael McCandless <[email protected]> wrote:

> I think it makes sense to un-deprecate that API (why did we deprecate
> it?), but I'm not sure IW should be in the business of soft/hard limits on
> field count?
>
> I agree such limits make sense if the integrity of the index is at risk,
> e.g. IW does enforce a max number of unique documents in one index.
>
> But for number of fields, as long as we expose the API, then the layer
> above Lucene can handle soft/hard limits, notifying the user correctly,
> rejecting updates, etc.?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Jan 14, 2021 at 5:36 PM Marcus Eagan <[email protected]>
> wrote:
>
>> I like Oren's idea and Simon's proposal of unlimited by default but
>> configurable.
>> Marcus
>>
>> On Thu, Jan 14, 2021 at 12:16 AM Simon Willnauer <
>> [email protected]> wrote:
>>
>>> I personally have pretty positive experience with what I call
>>> soft limits. At elastic we use them all over the place to catch issues when
>>> a user likely misconfigures something or if there is likely an issue on the
>>> user's end.
>>> I think it would help to have an option on the IW that allows limiting the
>>> number of fields. We can even extract a general limits object with total num
>>> docs etc. if we want. We can still set stuff to unlimited by default.
>>>
>>> WDYT
>>>
>>> Sent from a mobile device
>>>
>>> On 14. Jan 2021, at 06:36, David Smiley <[email protected]> wrote:
>>>
>>>
>>> I don't like the idea of IndexWriter limiting field names, but I do like
>>> the idea of un-deprecating that method, which appeared to have a trivial
>>> implementation. Try commenting on the issue of its deprecation, which
>>> has various watchers, to get their attention.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Wed, Jan 13, 2021 at 5:02 PM Oren Ovadia
>>> <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I work on Lucene at MongoDB.
>>>>
>>>> I would like to limit the number of fields in an index to prevent
>>>> tenants from causing a mapping explosion.
>>>>
>>>> Since IndexWriter.getFieldNames has been deprecated
>>>> <https://issues.apache.org/jira/browse/LUCENE-8909>, there is no way
>>>> to do this without using a reader (which comes with a set of problems
>>>> regarding flush/commit rates).
>>>>
>>>> Would love to add to Lucene the ability to have IndexWriters limiting
>>>> the number of fields. Curious to hear your thoughts.
>>>>
>>>> Thanks,
>>>> Oren
>>>>
>>>>
>>
>> --
>> Marcus Eagan
>>
>>
