[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559 ]

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:06 AM:

Thanks [~mikemccand] for merging the [PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} did not exercise the API to get facet labels for specific dimension - {{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I added the required changes in [PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files]

Re-opening the issue so that you can take a look.

> Need an API to easily fetch facet labels for a field in a document
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 8.6
> Reporter: Ankur
> Priority: Major
> Labels: facet
> Fix For: master (9.0), 8.7
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, LUCENE-9444.v2.patch
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> A facet field may be included in the list of fields whose values are to be returned for each hit.
> In order to get the facet labels for each hit we need to
> # Create an instance of _DocValuesOrdinalsReader_ and invoke _getReader(LeafReaderContext context)_ method to obtain an instance of _OrdinalsSegmentReader()_
> # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then used to fetch and decode the binary payload in the document's BinaryDocValues field. This provides the ordinals that refer to facet labels in the taxonomy.
> # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be returned.
>
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides all the above details and gives us the string labels. This can be part of *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
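The three manual steps in the issue description (read a document's ordinals, then resolve each ordinal to a label path) can be sketched in plain Java. This is a simplified model and not Lucene code: the arrays `TAXONOMY` and `DOC_ORDINALS` and the class name are hypothetical stand-ins for the taxonomy index and the per-document doc values, and `getLabels(docId)` only mirrors the shape of the API requested in the issue.

```java
import java.util.ArrayList;
import java.util.List;

class FacetLabelLookupSketch {
    // Hypothetical stand-in for the taxonomy: array index = ordinal, value = label path.
    static final String[][] TAXONOMY = {
        {},                        // ordinal 0: root
        {"Author"},                // ordinal 1
        {"Author", "Lisa"},        // ordinal 2
        {"Publish Date"},          // ordinal 3
        {"Publish Date", "2010"},  // ordinal 4
    };

    // Hypothetical stand-in for the ordinals decoded from each document's doc values.
    static final int[][] DOC_ORDINALS = {
        {2, 4},  // doc 0
        {2},     // doc 1
        {},      // doc 2 has no facet labels
    };

    // Analogue of resolving one ordinal to its label path.
    static String getPath(int ordinal) {
        return String.join("/", TAXONOMY[ordinal]);
    }

    // Analogue of the requested convenience method: hides the ordinal lookup.
    static String[] getLabels(int docId) {
        List<String> labels = new ArrayList<>();
        for (int ordinal : DOC_ORDINALS[docId]) {
            labels.add(getPath(ordinal));
        }
        return labels.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints "Author/Lisa, Publish Date/2010"
        System.out.println(String.join(", ", getLabels(0)));
    }
}
```

In the real module the ordinals are read per segment and decoded from a binary payload, which is the part a convenience API would hide from callers.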
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559 ]

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:05 AM:

Thanks [~mikemccand] for merging the [PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} did not exercise the API to get facet labels for specific dimension - {{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I added the required changes in [PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files]

Re-opening the issue so that you can take a look.

was (Author: goankur):
Thanks [~mikemccand] for merging the [PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} did not exercise the API to get facet labels for specific dimension - {{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I made changes in [PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files]

Can you please take a look?
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529 ]

Ankur edited comment on LUCENE-9444 at 9/25/20, 5:50 PM:

Thanks [~mikemccand], I incorporated the code review feedback and
* Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs
* Added javadoc notes explaining that returned _FacetLabels_ may not be in the same order in which they were indexed.
* Enhanced {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API to get facet labels for each matching document.
* Here is the updated PR - https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

was (Author: goankur):
Thanks [~mikemccand], I incorporated the code review feedback and
* Replaced {{assert} with {{IllegalArgumentException}} for invalid inputs
* Added javadoc notes explaining that returned _FacetLabels_ may not be in the same order in which they were indexed.
* Enhanced {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API to get facet labels for each matching document.
* Here is the updated PR - https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae
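The switch from {{assert}} to {{IllegalArgumentException}} matters because Java assertions are disabled unless the JVM is started with `-ea`, so an `assert` on caller input is silently skipped in most production runs. The sketch below shows the always-on form. The class, the `ordinalFor` lookup and its data are hypothetical and are not taken from the patch; only the `INVALID_ORDINAL` name and the message text follow the thread.

```java
class ArgumentCheckSketch {
    static final int INVALID_ORDINAL = -1;

    // Hypothetical lookup: only the "Author" dimension is known in this toy example.
    static int ordinalFor(String facetDimension) {
        return "Author".equals(facetDimension) ? 1 : INVALID_ORDINAL;
    }

    // Validates caller input with an exception, which fires whether or not -ea is set.
    static int checkedOrdinal(String facetDimension) {
        int ordinal = ordinalFor(facetDimension);
        if (ordinal == INVALID_ORDINAL) {
            throw new IllegalArgumentException(
                "Category ordinal not found for facet dimension: " + facetDimension);
        }
        return ordinal;
    }

    public static void main(String[] args) {
        System.out.println(checkedOrdinal("Author"));
        try {
            checkedOrdinal("Publisher");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An `assert` remains a reasonable choice for internal invariants that callers cannot violate; the exception is the safer fit for arguments that come from outside the class.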
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529 ]

Ankur edited comment on LUCENE-9444 at 9/23/20, 4:34 AM:

Thanks [~mikemccand], I incorporated the code review feedback and
* Replaced {{assert} with {{IllegalArgumentException}} for invalid inputs
* Added javadoc notes explaining that returned _FacetLabels_ may not be in the same order in which they were indexed.
* Enhanced {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API to get facet labels for each matching document.
* Here is the updated PR - https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

was (Author: goankur):
[~mikemccand], I incorporated the code review feedback and
* Replaced {{assert} with {{IllegalArgumentException}} for invalid inputs
* Added javadoc notes explaining that returned _FacetLabels_ may not be in the same order in which they were indexed.
* Enhanced {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API to get facet labels for each matching document.
* Here is the updated PR - https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198780#comment-17198780 ]

Ankur edited comment on LUCENE-9444 at 9/19/20, 6:30 PM:

Thanks [~mikemccand] for making those changes. I made a couple of minor edits to _TaxonomyFacetLabels.java_
* Removed the reference to \{@link java.util.Iterator} as it is no longer used.
* Fixed typo in javadoc.
* Replaced
{code:java}
if (parentOrd == INVALID_ORDINAL) {
  throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension);
}{code}
with single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
{code}
in method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException{code}
* Created a pull request as you suggested in one of your earlier comments :)
** [https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see if it's ready to be committed?

was (Author: goankur):
Thanks [~mikemccand] for making those changes. I made a couple of minor edits
* Fixed typo in javadoc
* Replaced
{code:java}
if (parentOrd == INVALID_ORDINAL) {
  throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension);
}{code}
with single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
{code}
in method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException{code}
* Created a pull request as you suggested in one of your earlier comments :)
** [https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see if it's ready to be committed?
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193350#comment-17193350 ]

Ankur edited comment on LUCENE-9444 at 9/10/20, 5:06 PM:

Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review feedback. The patch
* Makes {{FacetLabelReader}} public as suggested.
* Adds javadoc explaining why {{FacetLabelReader}} is not thread-safe.
* Eliminates {{Iterator}} and replaces {{lookupLabels()}} methods with {{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels exist for input docId.
* Adds {{@lucene.experimental}} to class level javadocs.
* Enhances {{TestTaxonomyLabels.testBasic()}} method to check that fetching FacetLabels in decreasing docId order throws {{AssertionError}}.

was (Author: goankur):
Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review feedback. The patch
* Makes {{FacetLabelReader}} public as suggested.
* Adds javadoc explaining that {{FacetLabelReader}} is not thread-safe.
* Eliminates {{Iterator}} and replaces {{lookupLabels()}} methods with {{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels exist for input docId.
* Adds {{@lucene.experimental}} to class level javadocs.
* Enhances {{TestTaxonomyLabels.testBasic()}} method to check that fetching FacetLabels in decreasing docId order throws {{AssertionError}}.
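The {{nextFacetLabel()}} contract described in this comment (one label per call, {{null}} once a document is exhausted, docIds supplied in non-decreasing order) can be modelled with a small stateful reader. This is a toy sketch, not the patch's implementation: the class name and the in-memory `labelsByDoc` array are hypothetical, and throwing `IllegalArgumentException` for out-of-order docIds is a choice made for this example.

```java
class FacetLabelReaderSketch {
    // Hypothetical in-memory data: labelsByDoc[docId] holds that document's labels.
    private final String[][] labelsByDoc;
    private int currentDocId = -1;  // last docId seen
    private int currentPos = 0;     // position of the next label for currentDocId

    FacetLabelReaderSketch(String[][] labelsByDoc) {
        this.labelsByDoc = labelsByDoc;
    }

    // Returns the next label for docId, or null when no more labels exist for it.
    String nextFacetLabel(int docId) {
        if (docId < currentDocId) {
            throw new IllegalArgumentException(
                "docs out of order: previous docId=" + currentDocId + " current docId=" + docId);
        }
        if (docId != currentDocId) {
            // Moving forward to a new document resets the cursor.
            currentDocId = docId;
            currentPos = 0;
        }
        String[] labels = labelsByDoc[docId];
        return currentPos < labels.length ? labels[currentPos++] : null;
    }

    public static void main(String[] args) {
        FacetLabelReaderSketch reader = new FacetLabelReaderSketch(
            new String[][] {{"Author/Lisa", "Publish Date/2010"}, {}});
        for (String label = reader.nextFacetLabel(0); label != null; label = reader.nextFacetLabel(0)) {
            System.out.println(label);
        }
    }
}
```

The mutable cursor fields (`currentDocId`, `currentPos`) are shared by every call, which illustrates why a reader of this shape cannot be used from several threads at once.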
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190342#comment-17190342 ]

Ankur edited comment on LUCENE-9444 at 9/3/20, 6:09 PM:

Patch has been available for 1+ day, not sure why automated patch testing has not picked it up yet.

was (Author: goankur):
Patch has been available for 1+ day, not sure why automated patch testing has picked it up yet.
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618 ]

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:35 PM:

Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an instance of nested class {{FacetLabelReader}}. It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals for input docid into a reusable buffer and returns an {{Iterator}} that uses {{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the usage.
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618 ]

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:33 PM:

Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an instance of nested class {{FacetLabelReader}}. It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals for input docid into a reusable buffer and returns an {{Iterator}} that uses {{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the usage.

was (Author: goankur):
Here is a patch that adds a new utility class __TaxonomyFacetLabels__ with a single method {code:java} getFacetLabelReader(LeafReaderContext){code} that returns an instance of nested class {code:java} FacetLabelReader{code} It uses an instance of {code:java} OrdinalsSegmentReader{code} to fetch and decode ordinals for the input docid into a reusable buffer and returns an {code:java} Iterator{code} that uses {code:java} TaxonomyReader{code} to lookup {code:java} FacetLabels{code} for each ordinal.

The patch also adds a new test case {code:java} TestTaxonomyLabels{code} demonstrating the usage.
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618 ]

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:28 PM:

Here is a patch that adds a new utility class __TaxonomyFacetLabels__ with a single method {code:java} getFacetLabelReader(LeafReaderContext){code} that returns an instance of nested class {code:java} FacetLabelReader{code} It uses an instance of {code:java} OrdinalsSegmentReader{code} to fetch and decode ordinals for the input docid into a reusable buffer and returns an {code:java} Iterator{code} that uses {code:java} TaxonomyReader{code} to lookup {code:java} FacetLabels{code} for each ordinal.

The patch also adds a new test case {code:java} TestTaxonomyLabels{code} demonstrating the usage.

was (Author: goankur):
Here is a patch that adds a new utility class {code:java} {code} _TaxonomyFacetLabels_with a single method {code:java} getFacetLabelReader(LeafReaderContext){code} that returns an instance of nested class {code:java} FacetLabelReader{code} It uses an instance of {code:java} OrdinalsSegmentReader{code} to fetch and decode ordinals for the input docid into a reusable buffer and returns an {code:java} Iterator{code} that uses {code:java} TaxonomyReader{code} to lookup {code:java} FacetLabels{code} for each ordinal.

The patch also adds a new test case {code:java} TestTaxonomyLabels{code} demonstrating the usage.
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618 ]

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:27 PM:

Here is a patch that adds a new utility class {code:java} {code} _TaxonomyFacetLabels_with a single method {code:java} getFacetLabelReader(LeafReaderContext){code} that returns an instance of nested class {code:java} FacetLabelReader{code} It uses an instance of {code:java} OrdinalsSegmentReader{code} to fetch and decode ordinals for the input docid into a reusable buffer and returns an {code:java} Iterator{code} that uses {code:java} TaxonomyReader{code} to lookup {code:java} FacetLabels{code} for each ordinal.

The patch also adds a new test case {code:java} TestTaxonomyLabels{code} demonstrating the usage.

was (Author: goankur):
Here is a patch that adds a new utility class - `TaxonomyFacetLabels` with the method - `getFacetLabelReader(LeafReaderContext)` that returns an instance of nested class - `FacetLabelReader`.

`FacetLabelReader` uses an instance of `OrdinalsSegmentReader` to fetch and decode ordinals for the input docid into a reusable buffer and returns an `Iterator` that uses `TaxonomyReader` instance to lookup `FacetLabels`.

The patch also adds a new test case `TestTaxonomyLabels` demonstrating the usage.
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215 ]

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:41 AM:

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues dv)}}

was (Author: goankur):

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 8.6
> Reporter: Ankur
> Priority: Major
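The lookup discussed in the comment above (per-document ordinals, filtered by _dim_, resolved to hierarchical label paths) can be illustrated with a small toy model. This is only a sketch under stated assumptions: {{FacetLabelSketch}} and its two maps are hypothetical stand-ins, not Lucene classes, and it is not the implementation that was merged. The {{taxonomy}} map plays the role of TaxonomyReader.getPath(ord), and {{ordinalsByDoc}} plays the role of the ordinals decoded from the BinaryDocValues field. Each label is returned as a String[] path, which is why a flat String[] of labels would lose information for hierarchical fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of the ordinal-to-label lookup; none of these types are Lucene classes. */
public class FacetLabelSketch {

    /** Hypothetical stand-in for the taxonomy: ordinal -> label path, e.g. ["Author", "Bob"]. */
    private final Map<Integer, String[]> taxonomy;

    /** Hypothetical stand-in for the per-document ordinals decoded from doc values. */
    private final Map<Integer, int[]> ordinalsByDoc;

    public FacetLabelSketch(Map<Integer, String[]> taxonomy, Map<Integer, int[]> ordinalsByDoc) {
        this.taxonomy = taxonomy;
        this.ordinalsByDoc = ordinalsByDoc;
    }

    /** Returns the label paths of docId whose first component equals dim. */
    public List<String[]> getLabels(int docId, String dim) {
        List<String[]> labels = new ArrayList<>();
        // Fetch the ordinals stored for this document (empty if it has none).
        int[] ordinals = ordinalsByDoc.getOrDefault(docId, new int[0]);
        for (int ord : ordinals) {
            // Resolve each ordinal to its full path and keep only the requested dimension.
            String[] path = taxonomy.get(ord);
            if (path != null && path.length > 0 && path[0].equals(dim)) {
                labels.add(path);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        Map<Integer, String[]> taxonomy = Map.of(
                1, new String[] {"Author", "Bob"},
                2, new String[] {"Publish Date", "2010", "10"});
        Map<Integer, int[]> ordinalsByDoc = Map.of(0, new int[] {1, 2});

        FacetLabelSketch sketch = new FacetLabelSketch(taxonomy, ordinalsByDoc);
        for (String[] path : sketch.getLabels(0, "Publish Date")) {
            System.out.println(String.join("/", path)); // prints "Publish Date/2010/10"
        }
    }
}
```

In this model the caller never touches ordinals directly, which is the convenience the issue asks for. Whether the per-segment iterator is passed in by the caller or held by the helper object is the design question the comment weighs.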
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215 ]

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:40 AM:

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

was (Author: goankur):

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 8.6
> Reporter: Ankur
> Priority: Major
[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document
[ https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215 ]

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:38 AM:

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

was (Author: goankur):

Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class *_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, which would then be used to instantiate and reuse the BinaryDocValues iterator between multiple calls to *getLabels(docId, dim)*. That way a caller does not need to know whether a _*BinaryDocValues*_ field exists at all. The downside is that the caller will need to create a new instance of *_TaxonomyFacetsLabels_* for each different *_LeafReaderContext_*.

But thinking more, I feel it's simpler to pass the BinaryDocValues iterator as a 3rd argument to *getLabels()*.

In order to take care of hierarchical fields, I think it makes sense to return FacetLabel[] instead of String[].

One last thing, should we make the API _*static*_?

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 8.6
> Reporter: Ankur
> Priority: Major