[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-28 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559
 ] 

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:06 AM:
--

Thanks [~mikemccand] for merging 
[PR-1893|https://github.com/apache/lucene-solr/pull/1893/files].

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} 
did not exercise the API to get facet labels for a specific dimension, 
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}}, so I added the 
required changes in 
[PR-1928|https://github.com/apache/lucene-solr/pull/1928/files].

Re-opening the issue so that you can take a look.


was (Author: goankur):
Thanks [~mikemccand] for merging the 
[PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}}  
did not exercise the API to get facet labels for specific dimension -  
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I added the 
required changes in 
[PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files]

 

Re-opening the issue so that you can take a look.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Fix For: master (9.0), 8.7
>
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke its 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader_.
>  # Use the _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method to 
> fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly, use _TaxonomyReader.getPath(ord)_ to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* - that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  
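
The lookup described in the issue can be sketched with plain Java stand-ins. This is only an illustration under stated assumptions: the class name, the {{TAXONOMY}} array, and the {{ORDINALS_BY_DOC}} map are hypothetical and replace the Lucene readers named above; no Lucene API is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Stand-in sketch of the ordinal-to-label lookup; none of these types are Lucene classes. */
class FacetLabelLookupSketch {

  /** Hypothetical taxonomy: the array index is the ordinal, the value is the label path. */
  static final String[][] TAXONOMY = {
    {},                       // ordinal 0: root
    {"Author"},               // ordinal 1: a dimension
    {"Author", "Lisa"},       // ordinal 2
    {"Publish Date"},         // ordinal 3: a dimension
    {"Publish Date", "2010"}  // ordinal 4
  };

  /** Hypothetical per-document ordinals, standing in for the decoded doc values payload. */
  static final Map<Integer, int[]> ORDINALS_BY_DOC = Map.of(
      0, new int[] {2, 4},
      1, new int[] {4});

  /** Reads the ordinals stored for docId, then resolves each ordinal to its label path. */
  static List<String[]> getLabels(int docId) {
    List<String[]> labels = new ArrayList<>();
    for (int ord : ORDINALS_BY_DOC.getOrDefault(docId, new int[0])) {
      labels.add(TAXONOMY[ord]);
    }
    return labels;
  }

  public static void main(String[] args) {
    for (String[] path : getLabels(0)) {
      System.out.println(String.join("/", path));
    }
  }
}
```

A caller only passes a docId and gets label paths back, which is the level of simplicity the issue asks for.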



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-28 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559
 ] 

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:05 AM:
--

Thanks [~mikemccand] for merging 
[PR-1893|https://github.com/apache/lucene-solr/pull/1893/files].

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} 
did not exercise the API to get facet labels for a specific dimension, 
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}}, so I added the 
required changes in 
[PR-1928|https://github.com/apache/lucene-solr/pull/1928/files].

Re-opening the issue so that you can take a look.


was (Author: goankur):
Thanks [~mikemccand] for merging the 
[PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}}  
did not exercise the API to get facet labels for specific dimension -  
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I made changes 
in [PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files] Can you 
please take a look ?

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Fix For: master (9.0), 8.7
>
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-25 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529
 ] 

Ankur edited comment on LUCENE-9444 at 9/25/20, 5:50 PM:
-

Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs.
 * Added javadoc notes explaining that the returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced the {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API 
to get facet labels for each matching document.
 * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae
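
The difference between the two validation styles can be sketched as follows. Java only evaluates {{assert}} statements when assertions are enabled (for example with the {{-ea}} JVM flag), while an explicit exception is always thrown. The class and method names below are hypothetical, {{INVALID_ORDINAL}} is a local stand-in constant, and the message text is reused from this thread purely for illustration; this is not the code from the PR.

```java
/** Sketch contrasting the two validation styles; all names here are hypothetical. */
class ArgumentCheckSketch {

  /** Local stand-in for an "ordinal not found" marker. */
  static final int INVALID_ORDINAL = -1;

  /** Checked only when the JVM runs with assertions enabled; otherwise the bad value passes through. */
  static int checkWithAssert(int parentOrd, String facetDimension) {
    assert parentOrd != INVALID_ORDINAL
        : "Category ordinal not found for facet dimension: " + facetDimension;
    return parentOrd;
  }

  /** Always checked, regardless of JVM flags. */
  static int checkWithException(int parentOrd, String facetDimension) {
    if (parentOrd == INVALID_ORDINAL) {
      throw new IllegalArgumentException(
          "Category ordinal not found for facet dimension: " + facetDimension);
    }
    return parentOrd;
  }

  public static void main(String[] args) {
    try {
      checkWithException(INVALID_ORDINAL, "Author");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Because invalid input can come from any caller in production, where assertions are typically off, the exception-based check is the one that reliably reports it.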

 


was (Author: goankur):
Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert} with {{IllegalArgumentException}}  for invalid inputs
 * Added javadoc notes explaining that returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced {{TestTaxonomyFacetCount.testRandom()}} method  to exercise the API 
to get facet labels for each matching document. 
  * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-22 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529
 ] 

Ankur edited comment on LUCENE-9444 at 9/23/20, 4:34 AM:
-

Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs.
 * Added javadoc notes explaining that the returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced the {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API 
to get facet labels for each matching document.
 * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 


was (Author: goankur):
[~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert} with {{IllegalArgumentException}}  for invalid inputs
 * Added javadoc notes explaining that returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced {{TestTaxonomyFacetCount.testRandom()}} method  to exercise the API 
to get facet labels for each matching document. 
  * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-19 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198780#comment-17198780
 ] 

Ankur edited comment on LUCENE-9444 at 9/19/20, 6:30 PM:
-

Thanks [~mikemccand] for making those changes. I made a couple of minor edits 
to _TaxonomyFacetLabels.java_
 * Removed the reference to \{@link java.util.Iterator} as it is no longer used.
 * Fixed a typo in the javadoc.
 * Replaced 
{code:java}
if (parentOrd == INVALID_ORDINAL) {
  throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension);
}{code}
with a single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;{code}
 in method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException{code}

 * Created a pull request as you suggested in one of your earlier comments :)
 ** 
[https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see if it's ready to be committed?


was (Author: goankur):
Thanks [~mikemccand] for making those changes. I made a couple of minor edits
 * Fixed typo in javadoc
 * Replaced 
{code:java}
 if (parentOrd == INVALID_ORDINAL) {
throw new AssertionError("Root ordinal not found for facet dimension: " 
+ facetDimension);
  }{code}
with single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet 
dimension: " + facetDimension; {code}
 in method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws 
IOException{code}

 * Created a pull request as you suggested in one of your earlier comments :)
 ** 
[https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see  if it's ready to be committed ?

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-10 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193350#comment-17193350
 ] 

Ankur edited comment on LUCENE-9444 at 9/10/20, 5:06 PM:
-

Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review 
feedback. The patch
 * Makes {{FacetLabelReader}} public as suggested.
 * Adds javadoc explaining why {{FacetLabelReader}} is not thread-safe.
 * Eliminates {{Iterator}} and replaces the {{lookupLabels()}} methods with 
{{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels 
exist for the input docId.
 * Adds {{@lucene.experimental}} to the class-level javadocs.
 * Enhances the {{TestTaxonomyLabels.testBasic()}} method to check that fetching 
FacetLabels in decreasing docId order throws {{AssertionError}}.
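
The calling pattern that results from these changes can be sketched with a plain Java stand-in. Everything here is an assumption made for illustration: the class name, the map of per-document labels, and the use of {{String}} in place of {{FacetLabel}}. The comment above describes an {{AssertionError}} for decreasing docIds; this sketch swaps in an unconditional {{IllegalArgumentException}} so that its behavior does not depend on JVM flags.

```java
import java.util.Map;

/** Stand-in sketch of an iterator-free label reader; not the class from the patch. */
class NextLabelSketch {
  private final Map<Integer, String[]> labelsByDoc; // hypothetical per-document labels
  private int currentDocId = -1;                    // last docId that was requested
  private int pos = 0;                              // cursor into the current document's labels

  NextLabelSketch(Map<Integer, String[]> labelsByDoc) {
    this.labelsByDoc = labelsByDoc;
  }

  /** Returns the next label for docId, or null once that document's labels are exhausted. */
  String nextFacetLabel(int docId) {
    if (docId < currentDocId) {
      // Assumption: thrown unconditionally here instead of via an assert.
      throw new IllegalArgumentException("docIds must be supplied in increasing order");
    }
    if (docId != currentDocId) { // moved on to a new document: restart the cursor
      currentDocId = docId;
      pos = 0;
    }
    String[] labels = labelsByDoc.getOrDefault(docId, new String[0]);
    return pos < labels.length ? labels[pos++] : null;
  }

  public static void main(String[] args) {
    NextLabelSketch reader = new NextLabelSketch(Map.of(
        0, new String[] {"Author/Lisa", "Publish Date/2010"},
        1, new String[] {"Author/Bob"}));
    for (int docId = 0; docId < 2; docId++) {
      String label;
      while ((label = reader.nextFacetLabel(docId)) != null) {
        System.out.println(docId + ": " + label);
      }
    }
  }
}
```

The mutable cursor state is also why a reader like this cannot be shared between threads without external synchronization.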


was (Author: goankur):
Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review 
feed. The patch
 * Makes {{FacetLabelReader}} public as suggested.
 * Adds javadoc to explaining that {{FacetLabelReader}} is not thread-safe.
 * Eliminates {{Iterator}} and replaces {{lookupLabels()}} methods with 
{{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels 
exist for input docId.
 * Adds {{@lucene.experimental}} to class level javadocs.
 * Enhances {{TestTaxonomyLabels.testBasic()}} method to check that fetching 
FacetLabels in decreasing docId order throws {{AssertionError.}}

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.v2.patch
>
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-03 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190342#comment-17190342
 ] 

Ankur edited comment on LUCENE-9444 at 9/3/20, 6:09 PM:


Patch has been available for 1+ day, not sure why automated patch testing has 
not picked it up yet.


was (Author: goankur):
Patch has been available for 1+ day, not sure why automated patch testing has 
picked it up yet.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch
>
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:35 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
 with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns 
an instance of the nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode the ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to look up and return {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 
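
The buffer-plus-iterator design described above can be sketched with plain Java stand-ins. This is an illustration only, not the patch: the class name, the {{TAXONOMY}} and {{ORDINALS_BY_DOC}} arrays, and the use of {{String}} in place of a label type are all assumptions.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Stand-in sketch of the reusable-buffer iterator design; all names are hypothetical. */
class BufferedLabelIteratorSketch {
  /** Hypothetical taxonomy: the array index is the ordinal. */
  static final String[] TAXONOMY = {
    "", "Author", "Author/Lisa", "Publish Date", "Publish Date/2010"
  };
  /** Hypothetical ordinals per document: the array index is the docId. */
  static final int[][] ORDINALS_BY_DOC = {{2, 4}, {4}};

  private int[] buffer = new int[4]; // reused across documents
  private int length;                // number of valid entries in buffer

  /** Copies the ordinals for docId into the shared buffer, growing it only when needed. */
  private void decode(int docId) {
    int[] ords = ORDINALS_BY_DOC[docId];
    if (ords.length > buffer.length) {
      buffer = new int[ords.length];
    }
    System.arraycopy(ords, 0, buffer, 0, ords.length);
    length = ords.length;
  }

  /** Returns an iterator that resolves each buffered ordinal to its label on demand. */
  Iterator<String> lookupLabels(int docId) {
    decode(docId);
    return new Iterator<String>() {
      private int pos = 0;

      @Override
      public boolean hasNext() {
        return pos < length;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return TAXONOMY[buffer[pos++]];
      }
    };
  }

  public static void main(String[] args) {
    BufferedLabelIteratorSketch reader = new BufferedLabelIteratorSketch();
    for (int docId = 0; docId < ORDINALS_BY_DOC.length; docId++) {
      Iterator<String> labels = reader.lookupLabels(docId);
      while (labels.hasNext()) {
        System.out.println(docId + ": " + labels.next());
      }
    }
  }
}
```

One consequence of sharing the buffer is that an iterator is only valid until the next {{lookupLabels()}} call, since that call overwrites the ordinals the iterator reads from.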


was (Author: goankur):
Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

 The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch
>
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:33 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

 The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class __TaxonomyFacetLabels__ with a 
single method
{code:java}
getFacetLabelReader(LeafReaderContext){code}
that returns an instance of nested class

 

 
{code:java}
FacetLabelReader{code}
 

It uses an instance of
{code:java}
OrdinalsSegmentReader{code}
to fetch and decode ordinals for the input docid into a reusable buffer and 
returns an
{code:java}
Iterator{code}
that uses
{code:java}
TaxonomyReader{code}
to lookup
{code:java}
FacetLabels{code}
 for each ordinal.

 

The patch also adds a new test case
{code:java}
TestTaxonomyLabels{code}
demonstrating the usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch
>
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:28 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class 
{code:java}

{code}
_TaxonomyFacetLabels_with a single method
{code:java}
getFacetLabelReader(LeafReaderContext){code}
that returns an instance of nested class

 

 
{code:java}
FacetLabelReader{code}
 

It uses an instance of
{code:java}
OrdinalsSegmentReader{code}
to fetch and decode ordinals for the input docid into a reusable buffer and 
returns an
{code:java}
Iterator{code}
that uses
{code:java}
TaxonomyReader{code}
to lookup
{code:java}
FacetLabels{code}
 for each ordinal.

 

The patch also adds a new test case
{code:java}
TestTaxonomyLabels{code}
demonstrating the usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch
>
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:27 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class - `TaxonomyFacetLabels` with the 
method - `getFacetLabelReader(LeafReaderContext)` that returns an instance of 
nested class - `FacetLabelReader`.

`FacetLabelReader` uses an instance of `OrdinalsSegmentReader` to fetch and 
decode ordinals for the input docid into a reusable buffer and returns an 
`Iterator` that uses `TaxonomyReader` instance to lookup `FacetLabels`.

The patch also adds a new test case `TestTaxonomyLabels` demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch
>
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.**
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  
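
For readers unfamiliar with the three manual steps listed in the description, the following is a rough, self-contained sketch of what a caller has to do today. It deliberately avoids the Lucene classes: {{ManualFacetLabelLookup}}, {{decodeOrdinals}}, {{labelsForDoc}} and the map-backed taxonomy are hypothetical stand-ins. The payload encoding used here (sorted ordinals stored as deltas, each delta as a variable-length integer with the high bit marking continuation bytes) is an assumption of the sketch and may not match what {{DocValuesOrdinalsReader}} actually decodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in for the manual lookup; not the real Lucene API. */
class ManualFacetLabelLookup {

  /**
   * Stand-in for step 2: decode the ordinals stored in a document's binary payload.
   * Assumed encoding: deltas between sorted ordinals, 7 bits per byte, most
   * significant group first, high bit set on every byte except the last of a number.
   */
  static List<Integer> decodeOrdinals(byte[] payload) {
    List<Integer> ordinals = new ArrayList<>();
    int value = 0;
    int prev = 0;
    for (byte b : payload) {
      if (b >= 0) {
        // High bit clear: this byte completes the current delta.
        prev += (value << 7) | b;
        ordinals.add(prev);
        value = 0;
      } else {
        // High bit set: accumulate 7 more bits and keep reading.
        value = (value << 7) | (b & 0x7F);
      }
    }
    return ordinals;
  }

  /**
   * Stand-in for steps 1 and 3: pick the document's payload from per-segment
   * storage, decode it, and resolve each ordinal through the taxonomy.
   */
  static List<String> labelsForDoc(byte[][] payloadsByDoc, int docId, Map<Integer, String> taxonomy) {
    List<String> labels = new ArrayList<>();
    for (int ordinal : decodeOrdinals(payloadsByDoc[docId])) {
      labels.add(taxonomy.get(ordinal));
    }
    return labels;
  }

  public static void main(String[] args) {
    Map<Integer, String> taxonomy = new HashMap<>();
    taxonomy.put(2, "Author/Bob");
    taxonomy.put(3, "Publish Date/2010");
    taxonomy.put(200, "Tag/lucene");
    // Ordinals 2, 3, 200 stored as deltas 2, 1, 197; 197 needs two bytes.
    byte[][] payloadsByDoc = {{2, 1, (byte) 0x81, 0x45}};
    System.out.println(labelsForDoc(payloadsByDoc, 0, taxonomy));
  }
}
```

Every caller that wants labels for a hit has to repeat this decode-then-resolve sequence, which is the boilerplate the requested API is meant to hide.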



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:41 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing 
a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class 
*_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, 
which would then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know whether a _*BinaryDocValues*_ field exists at all. The downside is 
that the caller would need to create a new instance of *_TaxonomyFacetsLabels_* 
for each different *_LeafReaderContext._*

But thinking about it more, I feel it's simpler to pass the BinaryDocValues 
iterator as a third argument to *getLabels().* To take care of hierarchical 
fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this:

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues dv)}}
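
As a rough illustration of what such a method would do with the _dim_ argument and why an array of label objects suits hierarchical fields better than plain strings, here is a small self-contained sketch. It is not the proposed implementation: {{GetLabelsSketch}} and its nested {{Label}} class are hypothetical stand-ins, and the last two parameters replace the {{BinaryDocValues}} iterator and the taxonomy with plain arrays so the example runs without Lucene.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for the proposed method; not the real Lucene API. */
class GetLabelsSketch {

  /** Hypothetical label: components[0] is the dimension, the rest is the path below it. */
  static final class Label {
    final String[] components;

    Label(String... components) {
      this.components = components;
    }

    @Override
    public String toString() {
      return String.join("/", components);
    }
  }

  /**
   * Returns the labels of one document that belong to the requested dimension.
   * ordinalsByDoc and taxonomy are stand-ins for the doc values and the taxonomy reader.
   */
  static Label[] getLabels(int docId, String dim, int[][] ordinalsByDoc, Label[] taxonomy) {
    List<Label> matches = new ArrayList<>();
    for (int ordinal : ordinalsByDoc[docId]) {
      Label label = taxonomy[ordinal];
      // Keep only labels whose first component is the requested dimension.
      if (label.components.length > 0 && label.components[0].equals(dim)) {
        matches.add(label);
      }
    }
    return matches.toArray(new Label[0]);
  }

  public static void main(String[] args) {
    Label[] taxonomy = {
      new Label(), new Label("Author", "Bob"), new Label("Publish Date", "2010", "10")
    };
    int[][] ordinalsByDoc = {{1, 2}};
    for (Label label : getLabels(0, "Publish Date", ordinalsByDoc, taxonomy)) {
      System.out.println(label);
    }
  }
}
```

Returning label objects keeps every level of a hierarchical path available to the caller, whereas a flat String would force a choice of separator.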

 





[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:40 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing 
a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class 
*_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, 
which would then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know whether a _*BinaryDocValues*_ field exists at all. The downside is 
that the caller would need to create a new instance of *_TaxonomyFacetsLabels_* 
for each different *_LeafReaderContext._*

But thinking about it more, I feel it's simpler to pass the BinaryDocValues 
iterator as a third argument to *getLabels().* To take care of hierarchical 
fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this:

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 





[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:38 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing 
a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class 
*_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, 
which would then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know whether a _*BinaryDocValues*_ field exists at all. The downside is 
that the caller would need to create a new instance of *_TaxonomyFacetsLabels_* 
for each different *_LeafReaderContext._*

But thinking about it more, I feel it's simpler to pass the BinaryDocValues 
iterator as a third argument to *getLabels().* To take care of hierarchical 
fields, I think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this:

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 


was (Author: goankur):
Thanks for your response [~mikemccand]

Yes, having _dim_ as an additional parameter makes sense.

Regarding the _*BinaryDocValues*_ iterator, initially I was thinking of providing 
a concrete implementation, *_TaxonomyFacetsLabels_*, of the abstract class 
*_TaxonomyFacets_* and adding a constructor that accepts a *_LeafReaderContext_*, 
which would then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know whether a _*BinaryDocValues*_ field exists at all. The downside is 
that the caller would need to create a new instance of *_TaxonomyFacetsLabels_* 
for each different *_LeafReaderContext._*

But thinking about it more, I feel it's simpler to pass the BinaryDocValues 
iterator as a third argument to *getLabels().*

To take care of hierarchical fields, I think it makes sense to return 
FacetLabel[] instead of String[].

One last thing: should we make the API _*static*_?

The proposed API signature would look like this:

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 
