[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249980#comment-17249980 ] Ankur commented on LUCENE-9444: --- [~mikemccand], Sorry for the late response. Yes we can resolve this one now. > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0) > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206314#comment-17206314 ] Michael McCandless commented on LUCENE-9444: [~goankur] can we re-resolve this one now? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17204908#comment-17204908 ] ASF subversion and git services commented on LUCENE-9444: - Commit cc341292d2f2dd4bd0a6ccbd379424d8c2f23644 in lucene-solr's branch refs/heads/branch_8x from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc34129 ] LUCENE-9444: Improve test coverage for TaxonomyFacetLabels (#1928) Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17204906#comment-17204906 ] ASF subversion and git services commented on LUCENE-9444: - Commit 2e2161b0e09808b1856c746a6748875c395b855e in lucene-solr's branch refs/heads/master from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2e2161b ] LUCENE-9444: Improve test coverage for TaxonomyFacetLabels (#1928) Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 5h 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203405#comment-17203405 ] ASF subversion and git services commented on LUCENE-9444: - Commit acce3c15b4a1dcdb69301d06e4ab9bda3d461d58 in lucene-solr's branch refs/heads/branch_8x from Michael McCandless [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=acce3c1 ] LUCENE-9444: woops, cannot use Set.of until JDK 9 > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Fix For: master (9.0), 8.7 > > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203367#comment-17203367 ] ASF subversion and git services commented on LUCENE-9444: - Commit d71923065dba0b6fe275cb60bd5cffb017509827 in lucene-solr's branch refs/heads/branch_8x from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d719230 ] LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field (#1893) LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field so such fields do not also have to be redundantly stored in the index. Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203368#comment-17203368 ] ASF subversion and git services commented on LUCENE-9444: - Commit d71923065dba0b6fe275cb60bd5cffb017509827 in lucene-solr's branch refs/heads/branch_8x from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d719230 ] LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field (#1893) LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field so such fields do not also have to be redundantly stored in the index. Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203369#comment-17203369 ] ASF subversion and git services commented on LUCENE-9444: - Commit cc31f167519920f85a1ebcc6fc494f2823ab3a06 in lucene-solr's branch refs/heads/branch_8x from Michael McCandless [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc31f16 ] LUCENE-9444: add CHANGES.txt entry > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203366#comment-17203366 ] ASF subversion and git services commented on LUCENE-9444: - Commit 98a49ed18d9f258823d3395b3c84c893c35d818b in lucene-solr's branch refs/heads/master from Michael McCandless [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=98a49ed ] LUCENE-9444: add CHANGES.txt entry > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203282#comment-17203282 ] ASF subversion and git services commented on LUCENE-9444: - Commit 24aadc220ba9578f581637b9fd0e7e973d46426c in lucene-solr's branch refs/heads/master from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=24aadc2 ] LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field (#1893) LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field so such fields do not also have to be redundantly stored in the index. Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203281#comment-17203281 ] ASF subversion and git services commented on LUCENE-9444: - Commit 24aadc220ba9578f581637b9fd0e7e973d46426c in lucene-solr's branch refs/heads/master from goankur [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=24aadc2 ] LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field (#1893) LUCENE-9444: add utility class to retrieve facet labels from the taxonomy index for a facet field so such fields do not also have to be redundantly stored in the index. Co-authored-by: Ankur Goel > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 4h 20m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529 ] Ankur commented on LUCENE-9444: --- [~mikemccand], I incorporated the code review feedback and * Replaced {{assert} with {{IllegalArgumentException}} for invalid inputs * Added javadoc notes explaining that returned _FacetLabels_ may not be in the same order in which they were indexed. * Enhanced {{TestTaxonomyFacetCount.testRandom()}} method to exercise the API to get facet labels for each matching document. * Here is the updated PR - https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199374#comment-17199374 ] Michael McCandless commented on LUCENE-9444: Thanks [~goankur]; I'll have a look! > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198780#comment-17198780 ] Ankur commented on LUCENE-9444: --- Thanks [~mikemccand] for making those changes. I made a couple of minor edits * Fixed typo in javadoc * Replaced {code:java} if (parentOrd == INVALID_ORDINAL) { throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension); }{code} with single line {code:java} assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension; {code} in method {code:java} public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException{code} * Created a pull request as you suggested in one of your earlier comments :) ** [https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a] Can you take a look and see if it's ready to be committed ? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.patch, > LUCENE-9444.v2.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193350#comment-17193350 ] Ankur commented on LUCENE-9444: --- Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review feed. The patch * Makes {{FacetLabelReader}} public as suggested. * Adds javadoc to explaining that {{FacetLabelReader}} is not thread-safe. * Eliminates {{Iterator}} and replaces {{lookupLabels()}} methods with {{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels exist for input docId. * Adds {{@lucene.experimental}} to class level javadocs. * Enhances {{TestTaxonomyLabels.testBasic()}} method to check that fetching FacetLabels in decreasing docId order throws {{AssertionError.}} > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch, LUCENE-9444.v2.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192918#comment-17192918 ] Michael McCandless commented on LUCENE-9444: Overall the patch looks good! Some small feedback: * Shouldn't {{FacetLabelReader}} be public so callers can interact with it? This is the danger of putting unit tests in the same package as the classes they are testing... * I think {{FacetLabelReader}} is not thread safe (since {{decodedOrds}} is reused)? So, each thread must pull its own instance? Can you add javadocs explaining the thread safety? * Instead of returning {{null}} if caller calls {{next()}} when {{hasNext()}} had returned {{false}}, can you throw an exception? * I think the caller must provide {{docId}} in non-decreasing order every time? Can you 1) add unit test confirming you get an exception if you break that? And 2) advertise this requirement in the javadocs (for both {{lookupLabels}} methods)? * Could you add {{@lucene.experimental}} to the class level javadocs? This notifies the caller that these APIs are allowed to suddenly change, even in feature release. * Shouldn't the comparison be {{== INVALID_ORDINAL}} not {{<=}}? * I think {{hasNext}} might incorrectly return {{true}} when there are in fact no more labels, for the variant of {{lookupLabels}} taking a {{facetDimension}}, when all remaining ordinals are from a different dimension? Maybe 1) add a unit test showing this issue (failing on the current patch), then 2) fix it, and 3) confirm the test then passes? {{Iterator}} is annoying because of this {{hasNext}}/{{next}} dynamic! * Maybe, instead of {{Iterator}}, we should simply return {{List}}? Or, maybe we just do {{nextFacetLabel}} (and it can return {{null}} when done)? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192473#comment-17192473 ] Michael McCandless commented on LUCENE-9444: [~goankur] maybe try GitHub PR instead? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190342#comment-17190342 ] Ankur commented on LUCENE-9444: --- Patch has been available for 1+ day, not sure why automated patch testing has picked it up yet. > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618 ] Ankur commented on LUCENE-9444: --- Here is a patch that adds a new utility class - `TaxonomyFacetLabels` with the method - `getFacetLabelReader(LeafReaderContext)` that returns an instance of nested class - `FacetLabelReader`. `FacetLabelReader` uses an instance of `OrdinalsSegmentReader` to fetch and decode ordinals for the input docid into a reusable buffer and returns an `Iterator` that uses `TaxonomyReader` instance to lookup `FacetLabels`. The patch also adds a new test case `TestTaxonomyLabels` demonstrating the usage. > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > Labels: facet > Attachments: LUCENE-9444.patch > > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172213#comment-17172213 ] Michael McCandless commented on LUCENE-9444: {quote}Should this class extends {{DocIdSetIterator}} to allow intersection with another {{DocIdSetIterator}} created from {{FacetsCollector.MatchingDoc.bits}} ? {quote} I think so? {quote}Making {{dim}} part of ctor feels a bit restrictive, how about providing 2 separate APIs, one that accepts dimension and another that does not ? {quote} Maybe the method that produces the iterator could optionally take a {{dim}} to filter for only those labels under that dimension? {quote}how about returning a {{java.util.Iterator}} instead of {{FacetLabel[]}} ? {quote} +1 > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171759#comment-17171759 ] Ankur commented on LUCENE-9444: --- > ...and it returns another class for actually iterating over the >{{FacetLabel}} for each document in that segment? Should this class extends {{DocIdSetIterator}} to allow intersection with another {{DocIdSetIterator}} created from {{FacetsCollector.MatchingDoc.bits}} ? Making {{dim}} part of ctor feels a bit restrictive, how about providing 2 separate APIs, one that accepts dimension and another that does not ? > Instead of FacetLabel[] maybe ... how about returning a {{java.util.Iterator}} instead of {{FacetLabel[]}} ? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171518#comment-17171518 ] Michael McCandless commented on LUCENE-9444: Actually, I think a dedicated helper class makes sense? I like that approach. Because, if we add that 3rd {{BinaryDocValues}} argument then the caller needs to figure out how to create that themselves? The ctor of this helper class could take a {{FacetsConfig}} and a {{dim}} and it'd look up which indexed field has the encoded {{BinaryDocValues}}, and then provide a method where you could pass a {{LeafReaderContext}} and it returns another class for actually iterating over the {{FacetLabel}} for each document in that segment? Perhaps even {{dim}} should be optional so you could efficiently make a single pass over all {{FacetLabel}} for that document. Instead of {{FacetLabel[]}} maybe we could do something similar to how {{SortedSetDocValues}} iterates over multiple {{int}} ordinals per document, with a per-document {{nextOrd}} iterator? So we could have a {{nextFacetLabel}} method letting you step through all {{FacetLabel}} for the document? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215 ] Ankur commented on LUCENE-9444: --- Thanks for your response [~mikemccand] Yes, having _dim_ as additional parameter makes sense. Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation - *_TaxonomyFacetsLabels_* of abstract class *_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* which will then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim).* That way a caller does not need to know if a _*BinaryDocValues*_ field existed at all. The downside is that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext._* But thinking more, I feel its simpler to pass BinaryDocValues iterator as a 3rd argument to *getLabels().* In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[]. One last thing, should we make the API _*static*_ ? The proposed API signature would look like this {{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}} > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169367#comment-17169367 ] Michael McCandless commented on LUCENE-9444: Yeah +1 for some simple sugar API here. We want to make it easy for users to get back their originally indexed facet labels. Maybe {{String[] getLabels(int docId, String dim)}} so the caller can filter to a specific dimension (facet field)? But one tricky part of this is, under the hood, this API must pull a {{BinaryDocValues}} iterator, {{advanceExact}} to that {{docId}}, retrieve the {{byte[]}} and then decode the ordinals. Should we maybe allow/require caller to pass in the {{BinaryDocValues}} for reuse? Another trickiness is what to do with hierarchical fields? Would we return a {{String}} that joined with the standard separator character? Maybe we should return {{FacetLabel[]}} instead? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169319#comment-17169319 ] Ankur commented on LUCENE-9444: --- Any thoughts folks ? > Need an API to easily fetch facet labels for a field in a document > -- > > Key: LUCENE-9444 > URL: https://issues.apache.org/jira/browse/LUCENE-9444 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.6 >Reporter: Ankur >Priority: Major > > A facet field may be included in the list of fields whose values are to be > returned for each hit. > In order to get the facet labels for each hit we need to > # Create an instance of _DocValuesOrdinalsReader_ and invoke > _getReader(LeafReaderContext context)_ method to obtain an instance of > _OrdinalsSegmentReader()_ > # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then > used to fetch and decode the binary payload in the document's BinaryDocValues > field. This provides the ordinals that refer to facet labels in the > taxonomy.** > # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be > returned. > > Ideally there should be a simple API - *String[] getLabels(docId)* that hides > all the above details and gives us the string labels. This can be part of > *TaxonomyFacets* but that's just one idea. > I am opening this issue to get community feedback and suggestions. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org