[jira] [Commented] (LUCENE-10134) TestSortedSetDocValuesFacets fails with Bits shared between threads

2021-09-30 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423047#comment-17423047
 ] 

Ankur commented on LUCENE-10134:


[~dweiss]

Thanks for cleaning up the resource leaks on error. Here is a PR with the fix.

[https://github.com/apache/lucene/pull/345/files]

The issue did not surface in my local testing using `./gradlew precommit` and 
`./gradlew test`.

I wonder if I am missing a test step that could have helped catch this error 
during development.

Was this error produced by the randomized tests that run in the Lucene nightly 
benchmarks?

 

> TestSortedSetDocValuesFacets fails with Bits shared between threads
> ---
>
> Key: LUCENE-10134
> URL: https://issues.apache.org/jira/browse/LUCENE-10134
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Reporter: Dawid Weiss
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro:
> {code}
> gradlew -p lucene\facet test -Ptests.seed=5E8A2F2BBCCBDF1B
> {code}
> {code}
> org.apache.lucene.facet.sortedset.TestSortedSetDocValuesFacets > testCountAll FAILED
> java.lang.AssertionError: Bits are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[TEST-TestSortedSetDocValuesFacets.testCountAll-seed#[5E8A2F2BBCCBDF1B],5,TGRP-TestSortedSetDocValuesFacets] and consumed in Thread[TestIndexSearcher-2-thread-1,5,TGRP-TestSortedSetDocValuesFacets].
>     at __randomizedtesting.SeedInfo.seed([5E8A2F2BBCCBDF1B:FFB0A39BEF474EA3]:0)
>     at org.apache.lucene.index.AssertingLeafReader.assertThread(AssertingLeafReader.java:43)
>     at org.apache.lucene.index.AssertingLeafReader$AssertingBits.get(AssertingLeafReader.java:1374)
>     at org.apache.lucene.facet.FacetUtils$1.doNext(FacetUtils.java:62)
>     at org.apache.lucene.facet.FacetUtils$1.nextDoc(FacetUtils.java:70)
>     at org.apache.lucene.facet.sortedset.ConcurrentSortedSetDocValuesFacetCounts$CountOneSegment.call(ConcurrentSortedSetDocValuesFacetCounts.java:260)
>     at org.apache.lucene.facet.sortedset.ConcurrentSortedSetDocValuesFacetCounts$CountOneSegment.call(ConcurrentSortedSetDocValuesFacetCounts.java:159)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
>     at java.base/java.lang.Thread.run(Thread.java:832)
> {code}
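
The assertion in the trace above enforces a thread-confinement rule: the object that answers "is this document live?" has to be acquired by the same thread that later reads it. The sketch below illustrates that rule in plain Java. It is not the code from the PR and it does not use Lucene; ThreadConfinedBits and ThreadConfinedBitsDemo are hypothetical stand-ins invented for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical stand-in for an asserting bits wrapper; not Lucene's class. */
final class ThreadConfinedBits {
  private final Thread owner = Thread.currentThread(); // thread that acquired the bits
  private final boolean[] bits;

  ThreadConfinedBits(boolean[] bits) {
    this.bits = bits;
  }

  boolean get(int index) {
    if (Thread.currentThread() != owner) {
      throw new IllegalStateException("Bits consumed outside the acquiring thread");
    }
    return bits[index];
  }
}

final class ThreadConfinedBitsDemo {
  /** Counts set bits on a worker thread, acquiring the bits inside the task itself. */
  static int countOnWorker(Supplier<ThreadConfinedBits> acquire, int length) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Callable<Integer> task = () -> {
        ThreadConfinedBits live = acquire.get(); // acquired in the consuming thread
        int count = 0;
        for (int i = 0; i < length; i++) {
          if (live.get(i)) {
            count++;
          }
        }
        return count;
      };
      return pool.submit(task).get();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  /** Reproduces the failing pattern: acquired by the caller, consumed by a worker. */
  static boolean sharedAcrossThreadsFails(boolean[] data) {
    ThreadConfinedBits acquiredHere = new ThreadConfinedBits(data);
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Callable<Boolean> task = () -> acquiredHere.get(0);
      pool.submit(task).get();
      return false; // no failure observed
    } catch (ExecutionException e) {
      return e.getCause() instanceof IllegalStateException;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdown();
    }
  }
}
```

Passing a supplier rather than an already-acquired object is one way to keep acquisition and consumption on the same thread; other designs are possible.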



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10070) "count all" faceting functionality counts deleted docs for multiple implementations

2021-09-10 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413450#comment-17413450
 ] 

Ankur edited comment on LUCENE-10070 at 9/11/21, 1:12 AM:
--

[~gsmiller] Thanks for taking a look at the above PR. 

I incorporated the feedback in a different PR due to Git-related issues on my 
end.

Please continue the conversation there.

[https://github.com/apache/lucene/pull/293/files]


was (Author: goankur):
[~gsmiller] Thanks for taking a look at the above PR. 

I incorporated your feedback into the changes capture in a different PR due to 
GIT related issues at my end.

Request you to continue the conversation there.

[https://github.com/apache/lucene/pull/293/files]

> "count all" faceting functionality counts deleted docs for multiple 
> implementations
> ---
>
> Key: LUCENE-10070
> URL: https://issues.apache.org/jira/browse/LUCENE-10070
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Reporter: Greg Miller
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A few different {{Facets}} implementations support a "count all" style 
> constructor that allows the user to not pass in a {{FacetsCollector}} 
> instance. It is advertised as equivalent to using a {{FacetsCollector}} 
> populated with a {{MatchAllDocsQuery}}, but more efficient. It looks like, 
> with the exception of {{FastTaxonomyFacetCounts}}, none of the 
> implementations correctly account for deleted documents (have a look at 
> {{FastTaxonomyFacetCounts}} for a correct example that consults "live docs").
> From what I can tell, the affected implementations are:
>  * SortedSetDocValuesFacetCounts
>  * ConcurrentSortedSetDocValuesFacetCounts
>  * LongValueFacetCounts
>  * StringValueFacetCounts
> I'll attach a PR shortly illustrating unit tests I wrote that confirm the bug.
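
The bug described above comes down to a "count all" loop that visits every document in a segment without checking whether it has been deleted. A minimal sketch of the corrected loop follows, in plain Java with no Lucene dependency. CountAllLiveDocsSketch, the int[] of per-document ordinals, and the use of java.util.BitSet for live docs are assumptions made for illustration; the null check mirrors Lucene's convention that null live docs mean the segment has no deletions.

```java
import java.util.BitSet;

final class CountAllLiveDocsSketch {
  /**
   * Counts one ordinal per document across a whole segment, skipping deleted documents.
   * liveDocs == null is treated as "no deletions in this segment".
   */
  static int[] countAll(int[] ordPerDoc, BitSet liveDocs, int numOrds) {
    int[] counts = new int[numOrds];
    for (int doc = 0; doc < ordPerDoc.length; doc++) {
      if (liveDocs != null && !liveDocs.get(doc)) {
        continue; // deleted document: must not contribute to the counts
      }
      counts[ordPerDoc[doc]]++;
    }
    return counts;
  }
}
```

Dropping the liveDocs check from this loop reproduces the reported behaviour: deleted documents are counted as if they were still present.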






[jira] [Comment Edited] (LUCENE-10070) "count all" faceting functionality counts deleted docs for multiple implementations

2021-09-10 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413450#comment-17413450
 ] 

Ankur edited comment on LUCENE-10070 at 9/11/21, 1:11 AM:
--

[~gsmiller] Thanks for taking a look at the above PR. 

I incorporated your feedback into the changes captured in a different PR due to 
Git-related issues on my end.

Request you to continue the conversation there.

[https://github.com/apache/lucene/pull/293/files]


was (Author: goankur):
[~gsmiller] Thanks for taking a look the the above PR. 

I incorporated your feedback into the changes capture in a different PR due to 
GIT related issues at my end.

Request you to continue the conversation there.

https://github.com/apache/lucene/pull/293/files







[jira] [Commented] (LUCENE-10070) "count all" faceting functionality counts deleted docs for multiple implementations

2021-09-10 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413450#comment-17413450
 ] 

Ankur commented on LUCENE-10070:


[~gsmiller] Thanks for taking a look at the above PR. 

I incorporated your feedback into the changes captured in a different PR due to 
Git-related issues on my end.

Request you to continue the conversation there.

https://github.com/apache/lucene/pull/293/files







[jira] [Commented] (LUCENE-10070) "count all" faceting functionality counts deleted docs for multiple implementations

2021-09-03 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409785#comment-17409785
 ] 

Ankur commented on LUCENE-10070:


New PR available - [https://github.com/apache/lucene/pull/282/files]

 







[jira] [Commented] (LUCENE-10070) "count all" faceting functionality counts deleted docs for multiple implementations

2021-09-03 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409708#comment-17409708
 ] 

Ankur commented on LUCENE-10070:


Let me pick it up







[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-12 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398355#comment-17398355
 ] 

Ankur commented on LUCENE-10048:


Thanks for your response [~rcmuir].

Let me try to explain the use case at a high level:
 # An offline (map-reduce style) batch process consumes a set of indexable 
documents.
 # The process also consumes terms and metadata information from external data 
sources.
 # For each indexable document, the batch process computes a set of term-doc 
scores and adds this set to a document field (to be indexed later). 
 # A document will only have a small number of such terms in a field, *fewer 
than 10K*.
 # There could be *many such fields* in a single document populated by 
different offline processes, all of which scale these values arbitrarily (for 
historical reasons) but still make sure a single value fits in 4 bytes.
 # The document also has the usual textual fields (title, description, etc.) for 
which Lucene computes term/field statistics and produces BM25 scores.
 # All of these scores are used by a ranking method.

You are referring to 
[Payloads|https://cwiki.apache.org/confluence/display/LUCENE/Payloads], right? 
It is a viable option but less space efficient (no delta compression) compared 
to storing these values directly as term frequencies.

So, only for fields that are populated by an external process, I am hoping we 
can come up with a mechanism to ignore the overflow checks on term/field 
statistics.

 

> Bypass total frequency check if field uses custom term frequency
> 
>
> Key: LUCENE-10048
> URL: https://issues.apache.org/jira/browse/LUCENE-10048
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Tony Xu
>Priority: Minor
>
> For all fields whose index option is not *IndexOptions.NONE*, there is a 
> check on the per-field total token count (i.e. field length) to ensure we don't 
> index too many tokens. This is done by accumulating the token's 
> *TermFrequencyAttribute.*
>  
> Given that Lucene currently allows a custom term frequency to be attached to 
> each token, and the usage of the frequency can be pretty wild, it is possible 
> to have the following case where the check fails with only a few tokens that 
> have large frequencies. Currently Lucene will skip indexing the whole 
> document.
> *"foo| bar|"*
>  
> What should be the way to inform the indexing chain not to check the field length?
> A related observation: when a custom term frequency is in use, the user is not 
> likely to use the similarity for this field. Maybe we can offer a way to 
> specify that, too?






[jira] [Commented] (LUCENE-10048) Bypass total frequency check if field uses custom term frequency

2021-08-11 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397771#comment-17397771
 ] 

Ankur commented on LUCENE-10048:


@[~rcmuir]

Consider the case where these term-document level scoring factors are computed 
in an offline process, indexed in Lucene and accessed at query time by a 
ranking function that does not rely on Lucene's 
[Scorer|https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/Scorer.html]
 and 
[Similarity|https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/similarities/Similarity.html]
 abstractions.

What is considered reasonable is up to the offline process that serves the 
needs of the ranking function and is outside our control. A single 
term-document scoring factor can still be less than {{Integer.MAX_VALUE}} but 
the sum of all such factors for a document could easily exceed the 
{{Integer.MAX_VALUE}} range.
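
To make the arithmetic concrete, here is a minimal sketch in plain Java of how a handful of in-range scoring factors can push a per-field total past the int range. FieldLengthOverflowSketch is a hypothetical helper written for this illustration; it is not the check that Lucene's indexing chain performs, only the same kind of accumulation.

```java
final class FieldLengthOverflowSketch {
  /** Returns true if summing the per-term frequencies overflows a 32-bit int. */
  static boolean totalOverflowsInt(int[] termFreqs) {
    int total = 0;
    for (int freq : termFreqs) {
      try {
        total = Math.addExact(total, freq); // throws ArithmeticException on int overflow
      } catch (ArithmeticException e) {
        return true;
      }
    }
    return false;
  }
}
```

Two factors are already enough: each one can be below Integer.MAX_VALUE while their sum is not representable as an int.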

Without this our only option (I think) is to use {{BinaryDocValues}} and 
implement mechanisms to serialize/deserialize term-document level scoring 
factors at indexing and searching time ourselves. With this we don't get the 
space efficiencies that come with the use of highly optimized terms dictionary 
and the integer compression techniques used to encode postings data (at least 
not without significant work).

Maybe we can keep the restriction that each custom term frequency be less than 
{{Integer.MAX_VALUE}} but relax the check on the per-field total token count for 
the expert use case?
  

 







[jira] [Commented] (LUCENE-9385) Skip indexing facet drill down terms

2021-03-18 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304324#comment-17304324
 ] 

Ankur commented on LUCENE-9385:
---

Hi Zach,  I am not currently working on it. Feel free to grab it :)

> Skip indexing facet drill down terms
> 
>
> Key: LUCENE-9385
> URL: https://issues.apache.org/jira/browse/LUCENE-9385
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/facet
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
>  Labels: easyfix
>
> FacetsConfig creates index terms from the Facet dimension and path 
> automatically for the purpose of supporting drill-down queries.
> An application that does not need drill-down ends up paying the index cost of 
> the extra terms.
> Ideally an option to skip indexing these drill down terms should be exposed 
> to the application.
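
As a rough illustration of the cost being discussed, the sketch below generates one term per prefix of a facet path and lets the caller switch that off. It is plain Java and does not reproduce FacetsConfig: the class DrillDownTermsSketch, the '/' delimiter, and the boolean switch are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class DrillDownTermsSketch {
  /**
   * Builds one term per path prefix: dim, dim/a, dim/a/b, ...
   * The '/' delimiter and the indexDrillDown switch are illustrative assumptions.
   */
  static List<String> drillDownTerms(String dim, String[] path, boolean indexDrillDown) {
    List<String> terms = new ArrayList<>();
    if (!indexDrillDown) {
      return terms; // application opted out: no extra terms are produced
    }
    StringBuilder prefix = new StringBuilder(dim);
    terms.add(prefix.toString());
    for (String component : path) {
      prefix.append('/').append(component);
      terms.add(prefix.toString());
    }
    return terms;
  }
}
```

With the switch on, a path of depth n adds n + 1 terms per facet value, which is the index cost an application that never drills down would like to avoid.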






[jira] [Comment Edited] (LUCENE-9838) simd version of VectorUtil.dotProduct

2021-03-15 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302157#comment-17302157
 ] 

Ankur edited comment on LUCENE-9838 at 3/16/21, 2:23 AM:
-

This is cool - [~rcmuir]. 

I played with this a little on my MacBook Pro (2019, *Memory*: 32 GB 2667 MHz 
DDR4; *Processor*:  2.6 GHz 6-Core Intel Core i7) after downloading *OpenJDK 
build 16+36-2231* and setting up a standalone [JMH 
benchmark|https://github.com/openjdk/jmh] project.

I copied over the old dotProduct implementation and the new one from your patch 
to _MyBenchmark.java_ in the JMH project space. Here are the results I got:
{code:java}
Benchmark                  (size)   Mode  Cnt    Score   Error   Units
MyBenchmark.dotProductOld      16  thrpt    5   90.896 ± 5.302  ops/us
MyBenchmark.dotProductNew      16  thrpt    5  100.901 ± 5.105  ops/us

MyBenchmark.dotProductOld      32  thrpt    5   53.563 ± 2.378  ops/us
MyBenchmark.dotProductNew      32  thrpt    5   97.610 ± 5.393  ops/us

MyBenchmark.dotProductOld      64  thrpt    5   29.792 ± 1.246  ops/us
MyBenchmark.dotProductNew      64  thrpt    5   73.499 ± 3.640  ops/us

MyBenchmark.dotProductOld     128  thrpt    5   16.906 ± 0.751  ops/us
MyBenchmark.dotProductNew     128  thrpt    5   65.068 ± 3.986  ops/us

MyBenchmark.dotProductOld     256  thrpt    5    8.360 ± 0.125  ops/us
MyBenchmark.dotProductNew     256  thrpt    5   42.595 ± 2.958  ops/us

MyBenchmark.dotProductOld     512  thrpt    5    4.231 ± 0.158  ops/us
MyBenchmark.dotProductNew     512  thrpt    5   26.283 ± 0.640  ops/us

MyBenchmark.dotProductOld    1024  thrpt    5    2.104 ± 0.093  ops/us
MyBenchmark.dotProductNew    1024  thrpt    5   14.389 ± 0.720  ops/us

{code}
 

These benchmarks were run after adding annotations to disable TieredCompilation 
and vector bounds checks. For small vectors (*16 elements*) we see roughly a 
*10%* improvement, but for large vectors (*128 or more* elements) the 
improvement is roughly *_4X or higher_*, from about 3.8X at 128 elements to 
about 6.8X at 1024.
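
The speedup figures quoted above can be recomputed from the throughput scores in the table. The following is a small helper sketch in plain Java, not part of the benchmark itself; the class name DotProductSpeedup is made up for this illustration, and the numbers are copied from the table.

```java
final class DotProductSpeedup {
  // Vector sizes and throughput scores (ops/us) copied from the JMH table above.
  static final int[] SIZES = {16, 32, 64, 128, 256, 512, 1024};
  static final double[] OLD = {90.896, 53.563, 29.792, 16.906, 8.360, 4.231, 2.104};
  static final double[] NEW = {100.901, 97.610, 73.499, 65.068, 42.595, 26.283, 14.389};

  /** Ratio of new to old throughput for the i-th vector size. */
  static double speedup(int i) {
    return NEW[i] / OLD[i];
  }

  public static void main(String[] args) {
    for (int i = 0; i < SIZES.length; i++) {
      System.out.printf("size=%d speedup=%.2fx%n", SIZES[i], speedup(i));
    }
  }
}
```

The ratios ignore the reported error margins, so they should be read as point estimates only.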


> simd version of VectorUtil.dotProduct
> -
>
> Key: LUCENE-9838
> URL: https://issues.apache.org/jira/browse/LUCENE-9838
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9838.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Followup to LUCENE-9837
> Let's explore using JDK 16 vector API to speed this up more. It might be a 
> hassle to try to MR-JAR/package up for users (adding commandline flags and 
> stuff), but it gives good performance.






[jira] [Commented] (LUCENE-9838) simd version of VectorUtil.dotProduct

2021-03-15 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302157#comment-17302157
 ] 

Ankur commented on LUCENE-9838:
---

This is cool - [~rcmuir]. 

I played with this a little on my MacBook Pro (2019, *Memory*: 32 GB 2667 MHz 
DDR4; *Processor*:  2.6 GHz 6-Core Intel Core i7) after downloading [OpenJDK 
build 
16+36-2231|https://download.java.net/java/GA/jdk16/7863447f0ab643c585b9bdebf67c69db/36/GPL/openjdk-16_osx-x64_bin.tar.gz]
 and setting up a standalone [JMH benchmark|https://github.com/openjdk/jmh] 
project.

I copied over the old dotProduct implementation and the new one from your patch 
to _MyBenchmark.java_ in the JMH project space. Here are the results I got:
{code:java}
Benchmark                  (size)   Mode  Cnt    Score   Error   Units
MyBenchmark.dotProductOld      16  thrpt    5   90.896 ± 5.302  ops/us
MyBenchmark.dotProductNew      16  thrpt    5  100.901 ± 5.105  ops/us

MyBenchmark.dotProductOld      32  thrpt    5   53.563 ± 2.378  ops/us
MyBenchmark.dotProductNew      32  thrpt    5   97.610 ± 5.393  ops/us

MyBenchmark.dotProductOld      64  thrpt    5   29.792 ± 1.246  ops/us
MyBenchmark.dotProductNew      64  thrpt    5   73.499 ± 3.640  ops/us

MyBenchmark.dotProductOld     128  thrpt    5   16.906 ± 0.751  ops/us
MyBenchmark.dotProductNew     128  thrpt    5   65.068 ± 3.986  ops/us

MyBenchmark.dotProductOld     256  thrpt    5    8.360 ± 0.125  ops/us
MyBenchmark.dotProductNew     256  thrpt    5   42.595 ± 2.958  ops/us

MyBenchmark.dotProductOld     512  thrpt    5    4.231 ± 0.158  ops/us
MyBenchmark.dotProductNew     512  thrpt    5   26.283 ± 0.640  ops/us

MyBenchmark.dotProductOld    1024  thrpt    5    2.104 ± 0.093  ops/us
MyBenchmark.dotProductNew    1024  thrpt    5   14.389 ± 0.720  ops/us

{code}
 

These benchmarks were run after adding annotations to disable TieredCompilation 
and vector bounds checks. For small vectors (*16 elements*) we see roughly a 
*10%* improvement, but for large vectors (*128 or more* elements) the 
improvement is roughly *_4X or higher_*, from about 3.8X at 128 elements to 
about 6.8X at 1024.







[jira] [Comment Edited] (LUCENE-9838) simd version of VectorUtil.dotProduct

2021-03-15 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302157#comment-17302157
 ] 

Ankur edited comment on LUCENE-9838 at 3/16/21, 2:10 AM:
-

This is cool - [~rcmuir]. 

I played with this a little on my MacBook Pro (2019, *Memory*: 32 GB 2667 MHz 
DDR4; *Processor*:  2.6 GHz 6-Core Intel Core i7) after downloading [OpenJDK 
build 
16+36-2231|https://download.java.net/java/GA/jdk16/7863447f0ab643c585b9bdebf67c69db/36/GPL/openjdk-16_osx-x64_bin.tar.gz]
 and setting up a standalone [JMH benchmark|https://github.com/openjdk/jmh] 
project.

I copied over the old dotProduct implementation and the new one from your patch 
to _MyBenchmark.java_ in the JMH project space. Here are the results I got:
{code:java}
Benchmark                  (size)   Mode  Cnt    Score   Error   Units
MyBenchmark.dotProductOld      16  thrpt    5   90.896 ± 5.302  ops/us
MyBenchmark.dotProductNew      16  thrpt    5  100.901 ± 5.105  ops/us

MyBenchmark.dotProductOld      32  thrpt    5   53.563 ± 2.378  ops/us
MyBenchmark.dotProductNew      32  thrpt    5   97.610 ± 5.393  ops/us

MyBenchmark.dotProductOld      64  thrpt    5   29.792 ± 1.246  ops/us
MyBenchmark.dotProductNew      64  thrpt    5   73.499 ± 3.640  ops/us

MyBenchmark.dotProductOld     128  thrpt    5   16.906 ± 0.751  ops/us
MyBenchmark.dotProductNew     128  thrpt    5   65.068 ± 3.986  ops/us

MyBenchmark.dotProductOld     256  thrpt    5    8.360 ± 0.125  ops/us
MyBenchmark.dotProductNew     256  thrpt    5   42.595 ± 2.958  ops/us

MyBenchmark.dotProductOld     512  thrpt    5    4.231 ± 0.158  ops/us
MyBenchmark.dotProductNew     512  thrpt    5   26.283 ± 0.640  ops/us

MyBenchmark.dotProductOld    1024  thrpt    5    2.104 ± 0.093  ops/us
MyBenchmark.dotProductNew    1024  thrpt    5   14.389 ± 0.720  ops/us

{code}
 

These benchmarks were run after adding annotations to disable TieredCompilation 
and vector bounds checks. For small vectors (*16 elements*) we see roughly a 
*10%* improvement, but for large vectors (*128 or more* elements) the 
improvement is roughly *_4X or higher_*, from about 3.8X at 128 elements to 
about 6.8X at 1024.






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-12-15 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249980#comment-17249980
 ] 

Ankur commented on LUCENE-9444:
---

[~mikemccand], Sorry for the late response. Yes we can resolve this one now.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Fix For: master (9.0)
>
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke its 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader_.
>  # Use the _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method to 
> fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly, use _TaxonomyReader.getPath(ord)_ to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  
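
The three steps in the issue description, and the single call that would hide them, can be sketched with toy stand-ins. Nothing below is a Lucene class: the two maps merely play the roles of the decoded per-document ordinals and of the taxonomy, and the names `FacetLabelLookupSketch` and `getLabels` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy stand-ins only: none of these types are Lucene classes.
final class FacetLabelLookupSketch {
  // Stand-in for the ordinals decoded from a document's doc values (steps 1 and 2).
  private final Map<Integer, int[]> ordinalsByDoc;
  // Stand-in for the taxonomy: ordinal -> label path components (step 3).
  private final Map<Integer, String[]> pathByOrdinal;

  FacetLabelLookupSketch(Map<Integer, int[]> ordinalsByDoc, Map<Integer, String[]> pathByOrdinal) {
    this.ordinalsByDoc = ordinalsByDoc;
    this.pathByOrdinal = pathByOrdinal;
  }

  // Hides the ordinal decoding and the taxonomy lookup behind one call.
  List<String> getLabels(int docId) {
    List<String> labels = new ArrayList<>();
    int[] ordinals = ordinalsByDoc.getOrDefault(docId, new int[0]);
    for (int ord : ordinals) {
      String[] path = pathByOrdinal.get(ord);
      if (path != null) {
        labels.add(String.join("/", path));
      }
    }
    return labels;
  }

  public static void main(String[] args) {
    FacetLabelLookupSketch sketch =
        new FacetLabelLookupSketch(
            Map.of(0, new int[] {1, 2}),
            Map.of(1, new String[] {"Author", "Bob"}, 2, new String[] {"Publish Date", "2010"}));
    System.out.println(sketch.getLabels(0));
  }
}
```

The point of the sketch is the shape of the API: callers pass a document id and get strings back, without touching ordinals or the taxonomy themselves.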






[jira] [Resolved] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-12-15 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur resolved LUCENE-9444.
---
Resolution: Fixed




[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-28 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559
 ] 

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:06 AM:
--

Thanks [~mikemccand] for merging 
[PR-1893|https://github.com/apache/lucene-solr/pull/1893/files].

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} 
did not exercise the API to get facet labels for a specific dimension, 
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}}, so I added the 
required changes in 
[PR-1928|https://github.com/apache/lucene-solr/pull/1928/files].

Re-opening the issue so that you can take a look.


was (Author: goankur):
Thanks [~mikemccand] for merging the 
[PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}}  
did not exercise the API to get facet labels for specific dimension -  
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I added the 
required changes in 
[PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files]

 

Re-opening the issue so that you can take a look.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Fix For: master (9.0), 8.7
>
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-28 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203559#comment-17203559
 ] 

Ankur edited comment on LUCENE-9444 at 9/29/20, 12:05 AM:
--

Thanks [~mikemccand] for merging 
[PR-1893|https://github.com/apache/lucene-solr/pull/1893/files].

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} 
did not exercise the API to get facet labels for a specific dimension, 
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}}, so I added the 
required changes in 
[PR-1928|https://github.com/apache/lucene-solr/pull/1928/files].

Re-opening the issue so that you can take a look.


was (Author: goankur):
Thanks [~mikemccand] for merging the 
[PR-1893.|https://github.com/apache/lucene-solr/pull/1893/files]

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}}  
did not exercise the API to get facet labels for specific dimension -  
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}} so I made changes 
in [PR-1928.|https://github.com/apache/lucene-solr/pull/1928/files] Can you 
please take a look ?




[jira] [Reopened] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-28 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur reopened LUCENE-9444:
---

Thanks [~mikemccand] for merging 
[PR-1893|https://github.com/apache/lucene-solr/pull/1893/files].

I just realized that the changes in {{TestTaxonomyFacetCounts.testRandom()}} 
did not exercise the API to get facet labels for a specific dimension, 
{{TaxonomyFacetLabels.nextFacetLabel(docId, facetDimension)}}, so I made changes 
in [PR-1928|https://github.com/apache/lucene-solr/pull/1928/files]. Can you 
please take a look?




[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-25 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529
 ] 

Ankur edited comment on LUCENE-9444 at 9/25/20, 5:50 PM:
-

Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs.
 * Added javadoc notes explaining that the returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced the {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API 
to get facet labels for each matching document.
 * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae
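
The first bullet matters because a Java {{assert}} is only evaluated when the JVM runs with assertions enabled (`-ea`), while an explicit exception is always thrown. Below is a minimal sketch of that difference, not the code from the PR: the class name, the `ordinalFor` lookup and the sentinel value are assumptions made for illustration.

```java
// Hypothetical sketch of unconditional input validation; not the code from the PR.
final class InputValidationSketch {
  static final int INVALID_ORDINAL = -1; // sentinel value assumed for this sketch

  // Hypothetical lookup: only the dimension "Author" is known here.
  static int ordinalFor(String facetDimension) {
    return "Author".equals(facetDimension) ? 1 : INVALID_ORDINAL;
  }

  // Checked on every call, unlike an assert, which only runs when the JVM is started with -ea.
  static int requireOrdinal(String facetDimension) {
    int ord = ordinalFor(facetDimension);
    if (ord == INVALID_ORDINAL) {
      throw new IllegalArgumentException(
          "Category ordinal not found for facet dimension: " + facetDimension);
    }
    return ord;
  }

  public static void main(String[] args) {
    System.out.println(requireOrdinal("Author"));
    try {
      requireOrdinal("Unknown");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With an {{assert}} in place of the {{if}} block, the call with an unknown dimension would silently return the sentinel in a production JVM that runs without `-ea`.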

 


was (Author: goankur):
Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert} with {{IllegalArgumentException}}  for invalid inputs
 * Added javadoc notes explaining that returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced {{TestTaxonomyFacetCount.testRandom()}} method  to exercise the API 
to get facet labels for each matching document. 
  * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-22 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529
 ] 

Ankur edited comment on LUCENE-9444 at 9/23/20, 4:34 AM:
-

Thanks [~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs.
 * Added javadoc notes explaining that the returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced the {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API 
to get facet labels for each matching document.
 * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 


was (Author: goankur):
[~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert} with {{IllegalArgumentException}}  for invalid inputs
 * Added javadoc notes explaining that returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced {{TestTaxonomyFacetCount.testRandom()}} method  to exercise the API 
to get facet labels for each matching document. 
  * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>



[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-22 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200529#comment-17200529
 ] 

Ankur commented on LUCENE-9444:
---

[~mikemccand], I incorporated the code review feedback and
 * Replaced {{assert}} with {{IllegalArgumentException}} for invalid inputs.
 * Added javadoc notes explaining that the returned _FacetLabels_ may not be in the 
same order in which they were indexed.
 * Enhanced the {{TestTaxonomyFacetCounts.testRandom()}} method to exercise the API 
to get facet labels for each matching document.
 * Here is the updated PR - 
https://github.com/apache/lucene-solr/pull/1893/commits/75ff251ebac9034c93edbb43dcf5d8dd0f1058ae

 




[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-19 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198780#comment-17198780
 ] 

Ankur edited comment on LUCENE-9444 at 9/19/20, 6:30 PM:
-

Thanks [~mikemccand] for making those changes. I made a couple of minor edits 
to _TaxonomyFacetLabels.java_
 * Removed the reference to \{@link java.util.Iterator} as it is no longer used.
 * Fixed typo in javadoc.
 * Replaced 
{code:java}
if (parentOrd == INVALID_ORDINAL) {
  throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension);
}
{code}
with the single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
{code}
in the method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException
{code}

 * Created a pull request as you suggested in one of your earlier comments :)
 ** 
[https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see if it's ready to be committed?


was (Author: goankur):
Thanks [~mikemccand] for making those changes. I made a couple of minor edits
 * Fixed typo in javadoc
 * Replaced 
{code:java}
 if (parentOrd == INVALID_ORDINAL) {
throw new AssertionError("Root ordinal not found for facet dimension: " 
+ facetDimension);
  }{code}
with single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet 
dimension: " + facetDimension; {code}
 in method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws 
IOException{code}

 * Created a pull request as you suggested in one of your earlier comments :)
 ** 
[https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see  if it's ready to be committed ?

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.patch, 
> LUCENE-9444.v2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-19 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198780#comment-17198780
 ] 

Ankur commented on LUCENE-9444:
---

Thanks [~mikemccand] for making those changes. I made a couple of minor edits
 * Fixed typo in javadoc
 * Replaced 
{code:java}
if (parentOrd == INVALID_ORDINAL) {
  throw new AssertionError("Root ordinal not found for facet dimension: " + facetDimension);
}
{code}
with the single line
{code:java}
assert parentOrd != INVALID_ORDINAL : "Category ordinal not found for facet dimension: " + facetDimension;
{code}
in the method
{code:java}
public FacetLabel nextFacetLabel(int docId, String facetDimension) throws IOException
{code}

 * Created a pull request as you suggested in one of your earlier comments :)
 ** 
[https://github.com/apache/lucene-solr/pull/1893/commits/bf8eaf98901cbe83f23067bea90dfb2f3102603a]

Can you take a look and see if it's ready to be committed?




[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-10 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193350#comment-17193350
 ] 

Ankur edited comment on LUCENE-9444 at 9/10/20, 5:06 PM:
-

Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review 
feedback. The patch
 * Makes {{FacetLabelReader}} public as suggested.
 * Adds javadoc explaining why {{FacetLabelReader}} is not thread-safe.
 * Eliminates {{Iterator}} and replaces the {{lookupLabels()}} methods with 
{{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels 
exist for the input docId.
 * Adds {{@lucene.experimental}} to the class-level javadocs.
 * Enhances the {{TestTaxonomyLabels.testBasic()}} method to check that fetching 
FacetLabels in decreasing docId order throws {{AssertionError}}.
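
The calling pattern described in the bullets (call {{nextFacetLabel()}} until it returns {{null}}, and only move forward through docIds) can be sketched as follows. This is a toy stand-in under stated assumptions, not the {{FacetLabelReader}} from the patch: labels come from an in-memory map, the class name is made up, and the error is thrown explicitly so the check does not depend on the `-ea` flag.

```java
import java.util.List;
import java.util.Map;

// Toy stand-in, not the FacetLabelReader from the patch: labels come from an in-memory map.
final class ForwardOnlyLabelReaderSketch {
  private final Map<Integer, List<String>> labelsByDoc;
  private int currentDocId = -1;
  private int nextIndex = 0;

  ForwardOnlyLabelReaderSketch(Map<Integer, List<String>> labelsByDoc) {
    this.labelsByDoc = labelsByDoc;
  }

  // Returns the next label for docId, or null once that document's labels are exhausted.
  String nextFacetLabel(int docId) {
    if (docId < currentDocId) {
      // Thrown directly here (an assumption of this sketch) so the check runs without -ea.
      throw new AssertionError(
          "docs out of order: previous docId=" + currentDocId + ", current docId=" + docId);
    }
    if (docId != currentDocId) {
      currentDocId = docId; // moving forward resets the per-document cursor
      nextIndex = 0;
    }
    List<String> labels = labelsByDoc.getOrDefault(docId, List.of());
    return nextIndex < labels.size() ? labels.get(nextIndex++) : null;
  }

  public static void main(String[] args) {
    ForwardOnlyLabelReaderSketch reader =
        new ForwardOnlyLabelReaderSketch(Map.of(0, List.of("Author/Bob", "Publish Date/2010")));
    String label;
    while ((label = reader.nextFacetLabel(0)) != null) {
      System.out.println(label);
    }
  }
}
```

The mutable cursor (`currentDocId`, `nextIndex`) is also what makes a reader of this shape unsafe to share between threads.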


was (Author: goankur):
Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review 
feed. The patch
 * Makes {{FacetLabelReader}} public as suggested.
 * Adds javadoc to explaining that {{FacetLabelReader}} is not thread-safe.
 * Eliminates {{Iterator}} and replaces {{lookupLabels()}} methods with 
{{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels 
exist for input docId.
 * Adds {{@lucene.experimental}} to class level javadocs.
 * Enhances {{TestTaxonomyLabels.testBasic()}} method to check that fetching 
FacetLabels in decreasing docId order throws {{AssertionError.}}

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.v2.patch
>
>



[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-09 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193350#comment-17193350
 ] 

Ankur commented on LUCENE-9444:
---

Thanks [~mikemccand]. I uploaded a new patch that incorporates the code review 
feedback. The patch
 * Makes {{FacetLabelReader}} public as suggested.
 * Adds javadoc explaining that {{FacetLabelReader}} is not thread-safe.
 * Eliminates {{Iterator}} and replaces the {{lookupLabels()}} methods with 
{{nextFacetLabel()}} methods that just return {{null}} if no more FacetLabels 
exist for the input docId.
 * Adds {{@lucene.experimental}} to the class-level javadocs.
 * Enhances the {{TestTaxonomyLabels.testBasic()}} method to check that fetching 
FacetLabels in decreasing docId order throws {{AssertionError}}.




[jira] [Updated] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-09 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9444:
--
Attachment: LUCENE-9444.v2.patch

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch, LUCENE-9444.v2.patch






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-03 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190342#comment-17190342
 ] 

Ankur edited comment on LUCENE-9444 at 9/3/20, 6:09 PM:


Patch has been available for 1+ day, not sure why automated patch testing has 
not picked it up yet.


was (Author: goankur):
Patch has been available for 1+ day, not sure why automated patch testing has 
picked it up yet.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-03 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190342#comment-17190342
 ] 

Ankur commented on LUCENE-9444:
---

Patch has been available for 1+ day, not sure why automated patch testing has 
not picked it up yet.

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:35 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
 with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns 
an instance of nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

 The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:33 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}}
with a single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}. 

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for input docid into a reusable buffer and returns an {{Iterator}} that uses 
{{TaxonomyReader}} to lookup and return {{FacetLabels}} for each ordinal.

 The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that 
uses {{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:28 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that 
uses {{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that 
uses {{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur edited comment on LUCENE-9444 at 9/2/20, 6:27 PM:


Here is a patch that adds a new utility class {{TaxonomyFacetLabels}} with a 
single method {{getFacetLabelReader(LeafReaderContext)}} that returns an 
instance of nested class {{FacetLabelReader}}.

It uses an instance of {{OrdinalsSegmentReader}} to fetch and decode ordinals 
for the input docid into a reusable buffer and returns an {{Iterator}} that 
uses {{TaxonomyReader}} to lookup {{FacetLabels}} for each ordinal.

The patch also adds a new test case {{TestTaxonomyLabels}} demonstrating the 
usage. 


was (Author: goankur):
Here is a patch that adds a new utility class - `TaxonomyFacetLabels` with the 
method - `getFacetLabelReader(LeafReaderContext)` that returns an instance of 
nested class - `FacetLabelReader`.

`FacetLabelReader` uses an instance of `OrdinalsSegmentReader` to fetch and 
decode ordinals for the input docid into a reusable buffer and returns an 
`Iterator` that uses `TaxonomyReader` instance to lookup `FacetLabels`.

The patch also adds a new test case `TestTaxonomyLabels` demonstrating the 
usage. 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-09-02 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189618#comment-17189618
 ] 

Ankur commented on LUCENE-9444:
---

Here is a patch that adds a new utility class - `TaxonomyFacetLabels` with the 
method - `getFacetLabelReader(LeafReaderContext)` that returns an instance of 
nested class - `FacetLabelReader`.

`FacetLabelReader` uses an instance of `OrdinalsSegmentReader` to fetch and 
decode ordinals for the input docid into a reusable buffer and returns an 
`Iterator` that uses `TaxonomyReader` instance to lookup `FacetLabels`.

The patch also adds a new test case `TestTaxonomyLabels` demonstrating the 
usage. 
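
The design described in this comment (decode a document's ordinals into a reusable buffer, then hand back an {{Iterator}} that resolves each ordinal lazily) might look roughly like the sketch below. It is not the patch itself and does not call Lucene: {{LabelIteratorSketch}}, {{lookupLabels}} and the array-based fields are hypothetical names standing in for the segment reader and the taxonomy.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical model of a label reader that reuses one ordinal buffer across documents. */
final class LabelIteratorSketch {
    private final int[][] ordinalsByDoc;   // stand-in for the segment's encoded ordinals
    private final String[] labelByOrdinal; // stand-in for the taxonomy lookup
    private int[] buffer = new int[4];     // reused for every document, grown on demand
    private int length;                    // number of valid entries in the buffer

    LabelIteratorSketch(int[][] ordinalsByDoc, String[] labelByOrdinal) {
        this.ordinalsByDoc = ordinalsByDoc;
        this.labelByOrdinal = labelByOrdinal;
    }

    /** Copies the ordinals for docId into the shared buffer and iterates over labels lazily. */
    Iterator<String> lookupLabels(int docId) {
        int[] ords = ordinalsByDoc[docId];
        if (ords.length > buffer.length) {
            buffer = new int[ords.length];
        }
        System.arraycopy(ords, 0, buffer, 0, ords.length);
        length = ords.length;
        return new Iterator<String>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < length;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // The ordinal is resolved only when the caller asks for the label.
                return labelByOrdinal[buffer[pos++]];
            }
        };
    }
}
```

Because the buffer is shared, an iterator from this sketch is only valid until the next {{lookupLabels}} call; that is the price of avoiding a fresh allocation per document.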

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Updated] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-31 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9444:
--
   Attachment: LUCENE-9444.patch
Lucene Fields: New,Patch Available  (was: New)
   Labels: facet  (was: )
   Status: Open  (was: Open)

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Updated] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-31 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9444:
--
Status: Patch Available  (was: Open)

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>  Labels: facet
> Attachments: LUCENE-9444.patch






[jira] [Commented] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-29 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187105#comment-17187105
 ] 

Ankur commented on LUCENE-9489:
---

Thanks Tomoko Uchida for taking a look. 

It looks like my local git repo was in a weird state where it would not pick up 
the latest updates despite multiple 'git pull --rebase origin master' attempts. 

Doing a fresh git clone solved the problem for me.

Sorry for the false alarm.

> Compilation failure due to broken link reference in javadoc
> ---
>
> Key: LUCENE-9489
> URL: https://issues.apache.org/jira/browse/LUCENE-9489
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: master (9.0)
>Reporter: Ankur
>Priority: Trivial
> Fix For: master (9.0)
>
> Attachments: LUCENE-9489.patch
>
>
> Javadoc for method
> {code:java}
> org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
> IntsRef ordinals){code}
> has a broken link reference causing compilation failure.






[jira] [Updated] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-28 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9489:
--
Description: 
Javadoc for method
{code:java}
org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
IntsRef ordinals){code}
has a broken link reference causing compilation error.

  was:
Javadoc for method
{code:java}
org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
IntsRef ordinals){code}
has a broken link reference in javadoc causing compilation error.


> Compilation failure due to broken link reference in javadoc
> ---
>
> Key: LUCENE-9489
> URL: https://issues.apache.org/jira/browse/LUCENE-9489
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: master (9.0)
>Reporter: Ankur
>Priority: Trivial
> Fix For: master (9.0)
>
> Attachments: LUCENE-9489.patch
>
>
> Javadoc for method
> {code:java}
> org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
> IntsRef ordinals){code}
> has a broken link reference causing compilation error.






[jira] [Updated] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-28 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9489:
--
Status: Patch Available  (was: Open)

> Compilation failure due to broken link reference in javadoc
> ---
>
> Key: LUCENE-9489
> URL: https://issues.apache.org/jira/browse/LUCENE-9489
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: master (9.0)
>Reporter: Ankur
>Priority: Trivial
> Fix For: master (9.0)
>
> Attachments: LUCENE-9489.patch
>
>
> Javadoc for method
> {code:java}
> org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
> IntsRef ordinals){code}
> has a broken link reference in javadoc causing compilation error.






[jira] [Updated] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-28 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9489:
--
Description: 
Javadoc for method
{code:java}
org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
IntsRef ordinals){code}
has a broken link reference causing compilation failure.

  was:
Javadoc for method
{code:java}
org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
IntsRef ordinals){code}
has a broken link reference causing compilation error.


> Compilation failure due to broken link reference in javadoc
> ---
>
> Key: LUCENE-9489
> URL: https://issues.apache.org/jira/browse/LUCENE-9489
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: master (9.0)
>Reporter: Ankur
>Priority: Trivial
> Fix For: master (9.0)
>
> Attachments: LUCENE-9489.patch
>
>
> Javadoc for method
> {code:java}
> org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
> IntsRef ordinals){code}
> has a broken link reference causing compilation failure.






[jira] [Updated] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-28 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9489:
--
   Attachment: LUCENE-9489.patch
Fix Version/s: master (9.0)
Lucene Fields: New,Patch Available  (was: New)
   Status: Open  (was: Open)

> Compilation failure due to broken link reference in javadoc
> ---
>
> Key: LUCENE-9489
> URL: https://issues.apache.org/jira/browse/LUCENE-9489
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: master (9.0)
>Reporter: Ankur
>Priority: Trivial
> Fix For: master (9.0)
>
> Attachments: LUCENE-9489.patch
>
>
> Javadoc for method
> {code:java}
> org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
> IntsRef ordinals){code}
> has a broken link reference in javadoc causing compilation error.






[jira] [Created] (LUCENE-9489) Compilation failure due to broken link reference in javadoc

2020-08-28 Thread Ankur (Jira)
Ankur created LUCENE-9489:
-

 Summary: Compilation failure due to broken link reference in 
javadoc
 Key: LUCENE-9489
 URL: https://issues.apache.org/jira/browse/LUCENE-9489
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/facet
Affects Versions: master (9.0)
Reporter: Ankur


Javadoc for method
{code:java}
org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader.decode(BytesRef buf, 
IntsRef ordinals){code}
has a broken link reference in javadoc causing compilation error.






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-05 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171759#comment-17171759
 ] 

Ankur commented on LUCENE-9444:
---

>  ...and it returns another class for actually iterating over the 
>{{FacetLabel}} for each document in that segment?

Should this class extend {{DocIdSetIterator}} to allow intersection with 
another {{DocIdSetIterator}} created from {{FacetsCollector.MatchingDocs.bits}}?
Making {{dim}} part of the constructor feels a bit restrictive. How about 
providing two separate APIs, one that accepts a dimension and another that 
does not?

> Instead of FacetLabel[] maybe ...

How about returning a {{java.util.Iterator}} instead of {{FacetLabel[]}}?
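
The two-API idea raised in this comment (one lookup that returns every label of a document and one restricted to a dimension, both returning an {{Iterator}} rather than an array) could be sketched as below. The class {{DimensionFilterSketch}}, the method name {{labels}} and the array-based storage are hypothetical; treating the first path component as the dimension is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model: label paths per document, where the first component is the dimension. */
final class DimensionFilterSketch {
    private final String[][][] pathsByDoc;

    DimensionFilterSketch(String[][][] pathsByDoc) {
        this.pathsByDoc = pathsByDoc;
    }

    /** API without a dimension: iterates over every label path of the document. */
    Iterator<String[]> labels(int docId) {
        return Arrays.asList(pathsByDoc[docId]).iterator();
    }

    /** API with a dimension: iterates only over paths whose first component equals dim. */
    Iterator<String[]> labels(int docId, String dim) {
        List<String[]> matching = new ArrayList<>();
        for (String[] path : pathsByDoc[docId]) {
            if (path.length > 0 && path[0].equals(dim)) {
                matching.add(path);
            }
        }
        return matching.iterator();
    }
}
```

Keeping the dimension as a method argument rather than a constructor argument lets one reader instance serve several dimensions.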


> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:41 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().* In order to take care of hierarchical fields, I 
think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues dv)}}

 


was (Author: goankur):
Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().* In order to take care of hierarchical fields, I 
think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:40 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().* In order to take care of hierarchical fields, I 
think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 


was (Author: goankur):
Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().* In order to take care of hierarchical fields, I 
think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Comment Edited] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur edited comment on LUCENE-9444 at 8/5/20, 1:38 AM:


Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().* In order to take care of hierarchical fields, I 
think it makes sense to return FacetLabel[] instead of String[].

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 


was (Author: goankur):
Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().*

In order to take care of hierarchical fields, I think it makes sense to return 
FacetLabel[] instead of String[].

One last thing, should we make the API _*static*_ ?

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-04 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171215#comment-17171215
 ] 

Ankur commented on LUCENE-9444:
---

Thanks for your response [~mikemccand]

Yes, having _dim_ as additional parameter makes sense.

Regarding _*BinaryDocValues*_ iterator, initially I was thinking of providing a 
concrete implementation - *_TaxonomyFacetsLabels_* of abstract class 
*_TaxonomyFacets_* and add a constructor that accepts a *_LeafReaderContext_* 
which will then be used to instantiate and reuse the BinaryDocValues iterator 
between multiple calls to *getLabels(docId, dim).* That way a caller does not 
need to know if a _*BinaryDocValues*_ field existed at all. The downside is 
that caller will need to create a new instance of *_TaxonomyFacetsLabels_* for 
each different *_LeafReaderContext._*

But thinking more, I feel it's simpler to pass BinaryDocValues iterator as a 3rd 
argument to *getLabels().*

In order to take care of hierarchical fields, I think it makes sense to return 
FacetLabel[] instead of String[].

One last thing, should we make the API _*static*_ ?

The proposed API signature would look like this

{{public static FacetLabel[] getLabels(int docId, String dim, BinaryDocValues)}}

 

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Commented] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-08-01 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169319#comment-17169319
 ] 

Ankur commented on LUCENE-9444:
---

Any thoughts, folks?

> Need an API to easily fetch facet labels for a field in a document
> --
>
> Key: LUCENE-9444
> URL: https://issues.apache.org/jira/browse/LUCENE-9444
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Major
>
> A facet field may be included in the list of fields whose values are to be 
> returned for each hit.
> In order to get the facet labels for each hit we need to
>  # Create an instance of _DocValuesOrdinalsReader_ and invoke 
> _getReader(LeafReaderContext context)_ method to obtain an instance of 
> _OrdinalsSegmentReader()_
>  # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then 
> used to fetch and decode the binary payload in the document's BinaryDocValues 
> field. This provides the ordinals that refer to facet labels in the 
> taxonomy.
>  # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
> returned.
>  
> Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
> all the above details and gives us the string labels. This can be part of 
> *TaxonomyFacets* but that's just one idea.
> I am opening this issue to get community feedback and suggestions.
>  






[jira] [Created] (LUCENE-9444) Need an API to easily fetch facet labels for a field in a document

2020-07-30 Thread Ankur (Jira)
Ankur created LUCENE-9444:
-

 Summary: Need an API to easily fetch facet labels for a field in a 
document
 Key: LUCENE-9444
 URL: https://issues.apache.org/jira/browse/LUCENE-9444
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Affects Versions: 8.6
Reporter: Ankur


A facet field may be included in the list of fields whose values are to be 
returned for each hit.

In order to get the facet labels for each hit we need to
 # Create an instance of _DocValuesOrdinalsReader_ and invoke 
_getReader(LeafReaderContext context)_ method to obtain an instance of 
_OrdinalsSegmentReader()_
 # _OrdinalsSegmentReader.get(int docID, IntsRef ordinals)_ method is then used 
to fetch and decode the binary payload in the document's BinaryDocValues field. 
This provides the ordinals that refer to facet labels in the taxonomy.
 # Lastly TaxonomyReader.getPath(ord) is used to fetch the labels to be 
returned.

 

Ideally there should be a simple API - *String[] getLabels(docId)* that hides 
all the above details and gives us the string labels. This can be part of 
*TaxonomyFacets* but that's just one idea.

I am opening this issue to get community feedback and suggestions.
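
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{OrdinalsSource}} and {{PathLookup}} are hypothetical stand-ins for the per-segment ordinals reader and the taxonomy reader (they are not Lucene classes), so that the sketch compiles on its own. Joining path components with '/' is likewise an arbitrary choice made for display.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Stand-alone sketch of the lookup chain; none of these types are Lucene classes. */
final class FacetLabelLookupSketch {

  /** Hypothetical stand-in for steps 1 and 2: the decoded ordinals of one document. */
  interface OrdinalsSource {
    int[] ordinalsFor(int docId);
  }

  /** Hypothetical stand-in for step 3: maps an ordinal to its label path components. */
  interface PathLookup {
    String[] pathFor(int ordinal);
  }

  private final OrdinalsSource ordinals;
  private final PathLookup taxonomy;

  FacetLabelLookupSketch(OrdinalsSource ordinals, PathLookup taxonomy) {
    this.ordinals = ordinals;
    this.taxonomy = taxonomy;
  }

  /** Hides the three steps behind one call; each label is the path joined with '/'. */
  String[] getLabels(int docId) {
    List<String> labels = new ArrayList<>();
    for (int ord : ordinals.ordinalsFor(docId)) {
      labels.add(String.join("/", taxonomy.pathFor(ord)));
    }
    return labels.toArray(new String[0]);
  }

  /** Builds a tiny in-memory example in which document 0 carries ordinals 1 and 2. */
  static FacetLabelLookupSketch example() {
    Map<Integer, int[]> docToOrds = Map.of(0, new int[] {1, 2});
    Map<Integer, String[]> ordToPath =
        Map.of(1, new String[] {"Author", "Lisa"}, 2, new String[] {"Publish Date", "2020"});
    return new FacetLabelLookupSketch(
        docId -> docToOrds.getOrDefault(docId, new int[0]),
        ord -> ordToPath.get(ord));
  }

  public static void main(String[] args) {
    for (String label : example().getLabels(0)) {
      System.out.println(label);
    }
  }
}
```

A caller then needs only the document id; how the ordinals are stored and decoded stays behind the two interfaces.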

 






[jira] [Commented] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-21 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162422#comment-17162422
 ] 

Ankur commented on LUCENE-9437:
---

Thanks [~mikemccand], I updated the patch fixing javadoc comments as suggested.

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-21 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Status: Patch Available  (was: Open)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-21 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Status: Open  (was: Patch Available)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-21 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Attachment: LUCENE-9437.patch

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-21 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Attachment: (was: LUCENE-9437.patch)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-20 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Description: Visibility of _DocValuesOrdinalsReader.decode(BytesRef, 
IntsRef)_ method is set to 'protected'. This prevents the method from being 
used outside this class in a setting where BinaryDocValues reader is 
instantiated outside the class and binary payload containing ordinals still 
needs to be decoded.  (was: Visibility of 
_DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is set to 
'protected'. This prevents the method from being used outside this class in a 
setting where BinaryDocValues reader is instantiated outside the class and 
binary payload containing ordinals still need to be decoded.)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still needs to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-20 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
Status: Patch Available  (was: Open)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still need to be decoded.






[jira] [Commented] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-20 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161587#comment-17161587
 ] 

Ankur commented on LUCENE-9437:
---

Simple code change to raise the visibility of the 
_DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method from 'protected' to 
'public'. Also added Javadoc.

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still need to be decoded.






[jira] [Updated] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-20 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9437:
--
   Attachment: LUCENE-9437.patch
Lucene Fields: New,Patch Available  (was: New)
Affects Version/s: 8.6
   Status: Open  (was: Open)

> Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly 
> accessible
> -
>
> Key: LUCENE-9437
> URL: https://issues.apache.org/jira/browse/LUCENE-9437
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Ankur
>Priority: Trivial
> Attachments: LUCENE-9437.patch
>
>
> Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is 
> set to 'protected'. This prevents the method from being used outside this 
> class in a setting where BinaryDocValues reader is instantiated outside the 
> class and binary payload containing ordinals still need to be decoded.






[jira] [Created] (LUCENE-9437) Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) method publicly accessible

2020-07-20 Thread Ankur (Jira)
Ankur created LUCENE-9437:
-

 Summary: Make DocValuesOrdinalsReader.decode(BytesRef, IntsRef) 
method publicly accessible
 Key: LUCENE-9437
 URL: https://issues.apache.org/jira/browse/LUCENE-9437
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ankur


Visibility of _DocValuesOrdinalsReader.decode(BytesRef, IntsRef)_ method is set 
to 'protected'. This prevents the method from being used outside this class in 
a setting where BinaryDocValues reader is instantiated outside the class and 
binary payload containing ordinals still need to be decoded.
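
To illustrate the use case, here is a minimal stand-alone sketch of decoding such a payload outside the reader class. It assumes the payload stores sorted ordinals as deltas written as variable-length ints (seven bits per byte, with the high bit set on every byte of a value except the last); that layout is an assumption of this sketch and is not stated in this issue. {{OrdinalPayloadSketch}} and its methods are hypothetical names, and plain arrays are used in place of BytesRef and IntsRef.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Stand-alone sketch; the payload layout is an assumption, not taken from the issue. */
final class OrdinalPayloadSketch {

  /** Writes sorted, non-negative ordinals as deltas in 7-bit groups, most significant first. */
  static byte[] encode(int[] sortedOrdinals) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int prev = 0;
    for (int ord : sortedOrdinals) {
      int delta = ord - prev;
      prev = ord;
      // Count the 7-bit groups needed; an int never needs more than five.
      int groups = 1;
      while (groups < 5 && (delta >>> (7 * groups)) != 0) {
        groups++;
      }
      // Leading groups carry the high bit to signal that more bytes follow.
      for (int g = groups - 1; g > 0; g--) {
        out.write(0x80 | ((delta >>> (7 * g)) & 0x7F));
      }
      out.write(delta & 0x7F);
    }
    return out.toByteArray();
  }

  /** Reverses encode(): accumulates 7-bit groups, then adds the delta to the previous ordinal. */
  static int[] decode(byte[] payload) {
    int[] ords = new int[payload.length]; // upper bound: at most one ordinal per byte
    int count = 0;
    int value = 0;
    int prev = 0;
    for (byte b : payload) {
      if (b >= 0) {
        // High bit clear: this is the last byte of the current delta.
        prev += (value << 7) | b;
        ords[count++] = prev;
        value = 0;
      } else {
        // High bit set: keep accumulating.
        value = (value << 7) | (b & 0x7F);
      }
    }
    return Arrays.copyOf(ords, count);
  }

  public static void main(String[] args) {
    int[] ordinals = {3, 130, 20000};
    System.out.println(Arrays.toString(decode(encode(ordinals))));
  }
}
```

With a publicly accessible decode method, code that obtains the binary payload on its own (for example from a BinaryDocValues instance it created itself) could reuse the existing decoding logic instead of duplicating a loop like the one above.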






[jira] [Updated] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-08 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9392:
--
Attachment: (was: LUCENE-9392.patch)

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
> Attachments: LUCENE-9392.patch
>
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Updated] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-08 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9392:
--
   Attachment: LUCENE-9392.patch
Lucene Fields: New,Patch Available  (was: New)
   Status: Patch Available  (was: Patch Available)

Incorporate code review feedback from Mike McCandless

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
> Attachments: LUCENE-9392.patch, LUCENE-9392.patch
>
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Commented] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-08 Thread Ankur (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128678#comment-17128678
 ] 

Ankur commented on LUCENE-9392:
---

Attached a patch with a fix that raises the visibility of 
FacetsConfig.DELIM_CHAR to public

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
> Attachments: LUCENE-9392.patch
>
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Updated] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-08 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9392:
--
Status: Patch Available  (was: Open)

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
> Attachments: LUCENE-9392.patch
>
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Updated] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-08 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9392:
--
Attachment: LUCENE-9392.patch

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
> Attachments: LUCENE-9392.patch
>
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Created] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-04 Thread Ankur (Jira)
Ankur created LUCENE-9392:
-

 Summary: Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR 
to public 
 Key: LUCENE-9392
 URL: https://issues.apache.org/jira/browse/LUCENE-9392
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 8.5.2
Reporter: Ankur


FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to use 
this delimiter (in a unit test for example) is forced to re-declare it in the 
application code. This can break the application if DELIM_CHAR is changed.
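
A minimal sketch of the problem, using hypothetical stand-in classes rather than FacetsConfig itself: because the delimiter is private, application code must keep its own copy, and the two only agree as long as someone keeps them in sync by hand. The class names, the simplified joining logic, and the delimiter value are assumptions for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for a library class that keeps its delimiter private. */
final class LibraryConfigStandIn {
    // Assumed value, for illustration only.
    private static final char DELIM_CHAR = '\u001F';

    /** Joins a dimension and its path components with the private delimiter. */
    static String pathToString(String dim, String... path) {
        StringBuilder sb = new StringBuilder(dim);
        for (String component : path) {
            sb.append(DELIM_CHAR).append(component);
        }
        return sb.toString();
    }
}

/** Application code that cannot see DELIM_CHAR and so re-declares it. */
final class ApplicationCode {
    // Copy of the library's delimiter; silently wrong if the library changes its value.
    static final char COPIED_DELIM = '\u001F';

    /** Splits a joined term back into its components using the copied delimiter. */
    static String[] splitTerm(String term) {
        return term.split(Pattern.quote(String.valueOf(COPIED_DELIM)));
    }
}
```

With a public constant, `ApplicationCode` could reference the library's field directly and the copy would not be needed.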






[jira] [Updated] (LUCENE-9392) Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public

2020-06-04 Thread Ankur (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated LUCENE-9392:
--
Description: FacetsConfig.DELIM_CHAR is marked as private.  An application 
that wants to use this delimiter (in a unit test for example) is forced to 
re-declare it in the application code. This can break the application if the 
value of DELIM_CHAR is changed in FacetsConfig  (was: FacetsConfig.DELIM_CHAR 
is marked as private.  An application that wants to use this delimiter (in a 
unit test for example) is forced to re-declare it in the application code. This 
can break the application if DELIM_CHAR is changed.)

> Change the visibility of o.a.l.f.FacetsConfig.DELIM_CHAR to public 
> ---
>
> Key: LUCENE-9392
> URL: https://issues.apache.org/jira/browse/LUCENE-9392
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 8.5.2
>Reporter: Ankur
>Priority: Minor
>
> FacetsConfig.DELIM_CHAR is marked as private.  An application that wants to 
> use this delimiter (in a unit test for example) is forced to re-declare it in 
> the application code. This can break the application if the value of 
> DELIM_CHAR is changed in FacetsConfig.






[jira] [Created] (LUCENE-9385) Skip indexing facet drill down terms

2020-05-27 Thread Ankur (Jira)
Ankur created LUCENE-9385:
-

 Summary: Skip indexing facet drill down terms
 Key: LUCENE-9385
 URL: https://issues.apache.org/jira/browse/LUCENE-9385
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/facet
Affects Versions: 8.5.2
Reporter: Ankur


FacetsConfig creates index terms from the Facet dimension and path 
automatically for the purpose of supporting drill-down queries.

An application that does not need drill-down ends up paying the index cost of 
the extra terms.

Ideally an option to skip indexing these drill down terms should be exposed to 
the application.
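
A minimal sketch of what such an option could look like, written as a self-contained stand-in rather than against FacetsConfig. It assumes one drill-down term is generated per prefix of the dimension and path; the flag name `indexDrillDownTerms`, the class name, and the delimiter value are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in showing how a skip option would cut the extra terms. */
final class DrillDownTermsSketch {
    // Assumed delimiter between the dimension and path components.
    private static final char DELIM = '\u001F';

    /**
     * Returns one term per prefix of (dim, path...), or an empty list
     * when the application has opted out of drill-down support.
     */
    static List<String> drillDownTerms(boolean indexDrillDownTerms, String dim, String... path) {
        List<String> terms = new ArrayList<>();
        if (!indexDrillDownTerms) {
            return terms; // nothing extra is added to the index
        }
        StringBuilder sb = new StringBuilder(dim);
        terms.add(sb.toString()); // term for the dimension alone
        for (String component : path) {
            sb.append(DELIM).append(component);
            terms.add(sb.toString()); // term for each deeper prefix
        }
        return terms;
    }
}
```

Under these assumptions, a facet with a two-component path costs three extra terms per document when drill-down is enabled and none when it is skipped.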


