It's also worth investigating whether to deprecate the stats component. I believe JSON facets cover that functionality as well. Unfortunately, switching over will be painful for users.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jan 22, 2021 at 1:14 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:

> Personally I'd love to see us stop maintaining the duplicated code of
> the underlying implementations. I wouldn't mind losing the legacy
> syntax as well - I'll take a clear, verbose API over a less-clear,
> concise one any day. But I'm probably a minority there.
>
> Either way I agree with Michael when he said above that the first step
> would have to be a parity investigation for features and performance.
>
> Best,
>
> Jason
>
> On Fri, Jan 22, 2021 at 10:05 AM Michael Gibney
> <mich...@michaelgibney.net> wrote:
> >
> > I agree it would make long-term sense to consolidate the backend
> > implementation. I think leaving the "classic" user-facing facet API (with
> > JSON Facet module as a backend) would be a good idea. Either way, I think a
> > first step would be checking for parity between existing backend
> > implementations -- possibly in terms of features [1], but certainly in
> > terms of performance for common use cases [2].
> >
> > I think removal of the "classic" user-facing API would cause a lot of
> > consternation in the user community. I can even see a
> > non-backward-compatibility argument for preserving the "classic"
> > user-facing API: it's simpler for simple use cases. _If_ the ultimate goal
> > is removal of the "classic" user-facing API (not presuming that it is),
> > that approach could be facilitated in the short term by enticing users
> > towards "JSON Facet" API ... basically with a "feature freeze" on the
> > legacy implementation. No new features [3], no new optimizations [4] for
> > "classic"; concentrate such efforts on JSON Facet. This seems to already be
> > the de facto case, but it could be a more intentional decision -- e.g. in
> > [3] it's straightforward to extend the proposed "facet cache" to the
> > "classic" impl ... but I could see an argument for intentionally not doing
> > so.
> >
> > Robert, I think your concerns about UninvertedField could be addressed
> > by the `uninvertible="false"` property (currently defaults to "true" for
> > backward compatibility iiuc; but could default to "false", or at least
> > provide the ability to set the default for all fields to "false" at node
> > level solr.xml? -- I know I've wished for the latter!). Also fwiw I'm not
> > aware of any JSON Facet processors that work with string values in RAM ...
> > I do think all JSON Facet processors use OrdinalMap now, where relevant.
> >
> > [1] https://issues.apache.org/jira/browse/SOLR-14921
> > [2] https://issues.apache.org/jira/browse/SOLR-14764
> > [3] https://issues.apache.org/jira/browse/SOLR-13807
> > [4] https://issues.apache.org/jira/browse/SOLR-10732
> >
> > On Fri, Jan 22, 2021 at 12:46 AM Robert Muir <rcm...@gmail.com> wrote:
> >>
> >> Do these two options conflate concerns of input format vs. actual
> >> algorithm? That was always my disappointment.
> >>
> >> I feel like the java apis are off here at the lower level, and it
> >> hurts the user.
> >> I don't talk about the input format from the user, instead I mean the
> >> execution of the faceting query.
> >>
> >> IMO: building top-level caches (e.g. uninvertedfield) or
> >> on-the-fly-caches (e.g. fieldcache) is totally trappy already.
> >> But with the uninvertedfield of json facets it does its own thing,
> >> even if you went thru the trouble to enable docvalues at index time:
> >> that's sad.
> >>
> >> the code by default should not give the user jvm
> >> heap/garbage-collector hell. If you want to do that to yourself, for a
> >> totally static index, IMO that should be opt-in.
> >>
> >> But for the record, it is no longer just two shitty choices like
> >> "top-level vs per-segment". There are different field types, e.g.
> >> numeric types where the per-segment approach works efficiently.
> >> Then you have the strings, but there is a newish middle ground for
> >> Strings: OrdinalMap (lucene Multi* interfaces do it) which builds
> >> top-level integer structures to speed up string-faceting, but doesn't
> >> need *string values* in ram.
> >> It is just integers and mostly compresses as deltas. Adrien compresses
> >> the shit out of it.
> >>
> >> So I'd hate for the user to lose the option here of using docvalues to
> >> keep faceting out of heap memory, which should not be hassling them
> >> already in 2021.
> >> Maybe better to refactor the code such that all these concerns aren't
> >> unexpectedly tied together.
> >>
> >> On Thu, Jan 21, 2021 at 10:08 PM David Smiley <dsmi...@apache.org> wrote:
> >> >
> >> > There's a JIRA issue about this from 5 years ago:
> >> > https://issues.apache.org/jira/browse/SOLR-7296
> >> > I don't recall seeing any resistance to the idea of having the JSON
> >> > Faceting module act as a back-end to the front-end (API surface) of Solr's
> >> > common/classic/original/whatever faceting API. I don't think that simple
> >> > API should go away; its strength is simple/common cases that are
> >> > comparatively verbose in the JSON one.
> >> >
> >> > ~ David Smiley
> >> > Apache Lucene/Solr Search Developer
> >> > http://www.linkedin.com/in/davidwsmiley
> >> >
> >> >
> >> > On Thu, Jan 21, 2021 at 9:57 PM Marcus Eagan <marcusea...@gmail.com> wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> Sorry to spam the list. I am querying the list in such quick
> >> >> succession because of a realization I came to while on Twitter. Is it time
> >> >> to deprecate the Legacy Facet API?
> >> >>
> >> >> I understood in the past that they behaved slightly differently.
> >> >> Now, I'm wondering if it makes sense to keep the legacy facets package as
> >> >> it adds a burden of maintenance to the project. If some activists really
> >> >> want it, I will abandon the effort. If the interest is very light, I
> >> >> suppose they can package it up in a plugin.
> >> >> In fact, I would help if they run into trouble and I am able to help.
> >> >>
> >> >> Anyway, let me know what you think. If it's a good idea, I will head
> >> >> over to the chopping block.
> >> >>
> >> >> --
> >> >> Marcus Eagan
> >> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org