[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192918#comment-17192918 ]
Michael McCandless commented on LUCENE-9444: -------------------------------------------- Overall the patch looks good! Some small feedback: * Shouldn't {{FacetLabelReader}} be public so callers can interact with it? This is the danger of putting unit tests in the same package as the classes they are testing... * I think {{FacetLabelReader}} is not thread safe (since {{decodedOrds}} is reused)? So, each thread must pull its own instance? Can you add javadocs explaining the thread safety? * Instead of returning {{null}} if caller calls {{next()}} when {{hasNext()}} had returned {{false}}, can you throw an exception? * I think the caller must provide {{docId}} in non-decreasing order every time? Can you 1) add unit test confirming you get an exception if you break that? And 2) advertise this requirement in the javadocs (for both {{lookupLabels}} methods)? * Could you add {{@lucene.experimental}} to the class level javadocs? This notifies the caller that these APIs are allowed to suddenly change, even in feature release. * Shouldn't the comparison be {{== INVALID_ORDINAL}} not {{<=}}? * I think {{hasNext}} might incorrectly return {{true}} when there are in fact no more labels, for the variant of {{lookupLabels}} taking a {{facetDimension}}, when all remaining ordinals are from a different dimension? Maybe 1) add a unit test showing this issue (failing on the current patch), then 2) fix it, and 3) confirm the test then passes? {{Iterator}} is annoying because of this {{hasNext}}/{{next}} dynamic! * Maybe, instead of {{Iterator}}, we should simply return {{List<FacetLabel>}}? Or, maybe we just do {{nextFacetLabel}} (and it can return {{null}} when done)? > Need an API to easily fetch facet labels for a field in a document > ------------------------------------------------------------------ > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet > Affects Versions: 8.6 > Reporter: Ankur > Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org