[jira] [Commented] (SOLR-13709) Race condition on core reload while core is still loading?

2019-09-02 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921141#comment-16921141
 ] 

David Smiley commented on SOLR-13709:
-

bq. have the reload operation wait until core loading was complete. I'll give 
that a try with some debugging code in place just to prove the hypothesis. 
Thinking more, it seems like swap, unload, and create should all block until 
the coreContainer has completed loading as well. Actually, it seems like all 
core API commands should wait until after CoreContainer.load() is done.

+1 to all that
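A rough sketch of that idea -- every core admin command blocking until the container has finished loading -- could look like the following. This is a minimal illustration only; the class and method names are hypothetical and not Solr's actual CoreContainer API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Solr's real CoreContainer: core admin commands
// (reload, swap, unload, create) first wait on a latch that load() opens.
class CoreContainerSketch {
    private final CountDownLatch loaded = new CountDownLatch(1);

    void load() {
        // ... discover and register cores here ...
        loaded.countDown(); // signal: container load is complete
    }

    // Every core admin command calls this before touching any core.
    boolean awaitLoaded(long timeoutSeconds) {
        try {
            return loaded.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void reload(String coreName) {
        if (!awaitLoaded(30)) {
            throw new IllegalStateException(
                "container still loading: cannot reload " + coreName);
        }
        // ... actual reload work ...
    }
}
```

The point of the latch is that reload, swap, unload, and create all funnel through the same wait, so none of them can race a container that is still loading.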

> Race condition on core reload while core is still loading?
> --
>
> Key: SOLR-13709
> URL: https://issues.apache.org/jira/browse/SOLR-13709
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Erick Erickson
>Priority: Major
> Attachments: apache_Lucene-Solr-Tests-8.x_449.log.txt
>
>
> A recent jenkins failure from {{TestSolrCLIRunExample}} suggests there may be 
> a race condition when attempting to reload a SolrCore while the core is still 
> in the process of (re)loading, which can leave the SolrCore in an unusable 
> state.
> Details to follow...



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527285002
 
 
   > > https://issues.apache.org/jira/browse/LUCENE-8681
   > 
   > I am not sure I see how that solves the problem?
   > 
   > The core change targeted in this PR is allowing `TOTAL_HITS_THRESHOLD` to 
be counted accurately across all slices. Today, we collect 
`TOTAL_HITS_THRESHOLD` hits per slice, which is not what the API definition 
says. Post this PR, we will collect `TOTAL_HITS_THRESHOLD` in aggregate. 
`TOTAL_HITS_THRESHOLD`'s definition does not guarantee any order of collection 
of hits in the concurrent case -- we inadvertently define one today by 
collecting the threshold number of hits per slice.
   > 
   > RE: proration, I believe that is custom logic that can be added on top 
of this change. In any case, the proration logic also works from a set of 
static values plus fudge factors, so it can go wrong and we might end up 
collecting fewer hits from a more valuable segment. To help prevent this 
scenario, I believe proration might also do well to build upon this PR and use 
the shared counter. But I am unable to see why proration and accurate counting 
across slices are mutually exclusive.
   > 
   > In any case, unlike proration, this PR does not propose any algorithmic 
changes to the way collection is done -- it simply reduces extra work done 
across slices, work that we do not even advertise today, so it might be 
something the user is unaware of.
   
   To summarize my monologue, this PR is aimed at accurately counting hits 
across all slices -- whereas proration targets a different use case: trying 
to "distribute" hits across slices based on some parameters.
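The shared-counter idea in this thread can be illustrated with a small sketch. The class and method names here are invented for illustration and are not the PR's actual code: each slice's collector increments one global counter, so early termination triggers when the aggregate count -- not the per-slice count -- reaches `TOTAL_HITS_THRESHOLD`:

```java
import java.util.concurrent.atomic.AtomicLong;

// Invented illustration of a shared hit counter across slices: collection
// stops once TOTAL hits across all slices reach the threshold, instead of
// each slice collecting the threshold number of hits on its own.
class SharedHitCounter {
    private final long threshold;
    private final AtomicLong hits = new AtomicLong();

    SharedHitCounter(long threshold) { this.threshold = threshold; }

    // Called by each slice's collector per hit; returns true when the
    // aggregate count has reached the threshold, telling that slice to
    // early-terminate.
    boolean countAndCheckTerminate() {
        return hits.incrementAndGet() >= threshold;
    }

    long totalHits() { return hits.get(); }
}
```

With one such counter handed to every per-slice collector, which slices contribute which hits depends on thread timing, exactly as the thread discusses -- only the aggregate count is bounded.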


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[GitHub] [lucene-solr] atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527284505
 
 
   > https://issues.apache.org/jira/browse/LUCENE-8681
   
   I am not sure I see how that solves the problem?
   
   The core change targeted in this PR is allowing `TOTAL_HITS_THRESHOLD` to be 
counted accurately across all slices. Today, we collect `TOTAL_HITS_THRESHOLD` 
hits per slice, which is not what the API definition says. Post this PR, we 
will collect `TOTAL_HITS_THRESHOLD` in aggregate. `TOTAL_HITS_THRESHOLD`'s 
definition does not guarantee any order of collection of hits in the concurrent 
case -- we inadvertently define one today by collecting the threshold number of 
hits per slice.
   
   RE: proration, I believe that is custom logic that can be added on top of 
this change. In any case, the proration logic also works from a set of static 
values plus fudge factors, so it can go wrong and we might end up collecting 
fewer hits from a more valuable segment. To help prevent this scenario, I 
believe proration might also do well to build upon this PR and use the shared 
counter. But I am unable to see why proration and accurate counting across 
slices are mutually exclusive.
   
   In any case, unlike proration, this PR does not propose any algorithmic 
changes to the way collection is done -- it simply reduces extra work done 
across slices, work that we do not even advertise today, so it might be 
something the user is unaware of.





[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921097#comment-16921097
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 3aa3ce7b9b3c0ee1daf533bbda26cefad3de835a in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3aa3ce7 ]

SOLR-13105: Revamp simulations docs 7


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.
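As a small taste of the kind of expression the guide will document, a basic search of a collection might look like the following (the collection and field names below are made up for illustration; see the guide itself for real syntax):

```
search(logs,
       q="level_s:ERROR",
       fl="id, time_dt, message_t",
       sort="time_dt desc",
       rows="10")
```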






[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921093#comment-16921093
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit cc3ed5830156415d445d5d7d6355d822f8b6e229 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc3ed58 ]

SOLR-13105: Revamp simulations docs 6





[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921092#comment-16921092
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 613996202f6e480a4834f91abdc53afa86f16850 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6139962 ]

SOLR-13105: Revamp simulations docs 5





[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921091#comment-16921091
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit ff4404412ee44d14e76e017ba6fec99e1b69310f in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ff44044 ]

SOLR-13105: Revamp simulations docs 4





[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921090#comment-16921090
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 478c8a440a79cf04b19cceb78a79305c08b33ff4 in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=478c8a4 ]

SOLR-13105: Revamp simulations docs 3





[GitHub] [lucene-solr] mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527262380
 
 
   > > > I'm trying to understand the behavior change Lucene users will see 
with this, when using concurrent searching for one query (passing 
`ExecutorService` to `IndexSearcher`):
   > > > It looks like with the change such users will see their search 
terminate precisely when the total collected hits exceeds the limit (1000 by 
default?), versus today where we will try to collect 1000 per segment and then 
reduce that to the top 1000 overall? So this means the results will change 
depending on thread execution/timing?
   > > 
   > > 
   > > Looking at the documentation around `TOTAL_HITS_THRESHOLD`, I see that 
it intends to restrict the number of documents scored in total before the query 
is early terminated. If we do a single-threaded search today, that is the 
behavior we get. However, for concurrent search, we actually look at N * 
`TOTAL_HITS_THRESHOLD`, where N is the number of slices. So I believe we are 
not delivering the advertised behavior for concurrent searches in the status 
quo. This change should fix that.
   > > However, you are correct that thread timing will come into play here -- 
different slices may contribute differently to the overall number of hits. 
However, since we are in any case not scoring all documents, I do not believe 
we offer any guarantees on which documents we return -- even today, the best 
documents might be the ones that just came in and hence are in the last 
segments to be traversed, so they never even get looked at. WDYT?
   > 
   > OK that makes sense @atris -- it seems that which specific top hits you'll 
get back is intentionally not defined in the API and so we have the freedom to 
make improvements like this.
   
   I'm still confused about this change -- wouldn't it be better to e.g. 
pro-rate the topN per segment for the concurrent case 
(https://issues.apache.org/jira/browse/LUCENE-8681) rather than rely on the 
JVM's thread scheduling to determine which `TOTAL_HITS_THRESHOLD` hits are 
collected?
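The pro-rating alternative raised here could be sketched as follows. This is a made-up illustration of the general idea behind LUCENE-8681, not its actual implementation: split the global topN budget across slices in proportion to each slice's document count.

```java
// Invented sketch of per-slice pro-rating: each slice gets a share of the
// global topN budget proportional to its document count (rounded up, so no
// slice's budget is starved to zero by rounding).
class ProRater {
    static int[] proRate(int topN, int[] sliceDocCounts) {
        long totalDocs = 0;
        for (int count : sliceDocCounts) totalDocs += count;
        int[] budgets = new int[sliceDocCounts.length];
        for (int i = 0; i < sliceDocCounts.length; i++) {
            budgets[i] = (int) Math.ceil(
                (double) topN * sliceDocCounts[i] / totalDocs);
        }
        return budgets;
    }
}
```

Unlike the shared counter, these per-slice budgets are fixed up front from static document counts, which is exactly the "fudge factor" concern debated earlier in the thread.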





[jira] [Resolved] (SOLR-13723) JettySolrRunner should support /api/* (the v2 end point)

2019-09-02 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-13723.
---
Fix Version/s: 8.3
   Resolution: Fixed

> JettySolrRunner should support /api/* (the v2 end point)
> 
>
> Key: SOLR-13723
> URL: https://issues.apache.org/jira/browse/SOLR-13723
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Labels: test-framework
> Fix For: 8.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Today we are not able to write proper v2 API testcases because of this






[jira] [Assigned] (SOLR-13723) JettySolrRunner should support /api/* (the v2 end point)

2019-09-02 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-13723:
-

Assignee: Noble Paul




[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921066#comment-16921066
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 897eab85be4158a27934080633f1c496080665fe in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=897eab8 ]

SOLR-13105: Revamp simulations docs 2





[JENKINS] Lucene-Solr-NightlyTests-8.x - Build # 200 - Failure

2019-09-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.x/200/

No tests ran.

Build Log:
[...truncated 25 lines...]
ERROR: Failed to check out http://svn.apache.org/repos/asf/lucene/test-data
org.tmatesoft.svn.core.SVNException: svn: E175002: connection refused by the 
server
svn: E175002: OPTIONS request failed on '/repos/asf/lucene/test-data'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:112)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:96)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:765)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:352)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:340)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.performHttpRequest(DAVConnection.java:910)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:702)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:113)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:1035)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.getLatestRevision(DAVRepository.java:164)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.getRevisionNumber(SvnNgRepositoryAccess.java:119)
at 
org.tmatesoft.svn.core.internal.wc2.SvnRepositoryAccess.getLocations(SvnRepositoryAccess.java:178)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgRepositoryAccess.createRepositoryFor(SvnNgRepositoryAccess.java:43)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgAbstractUpdate.checkout(SvnNgAbstractUpdate.java:831)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:26)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgCheckout.run(SvnNgCheckout.java:11)
at 
org.tmatesoft.svn.core.internal.wc2.ng.SvnNgOperationRunner.run(SvnNgOperationRunner.java:20)
at 
org.tmatesoft.svn.core.internal.wc2.SvnOperationRunner.run(SvnOperationRunner.java:21)
at 
org.tmatesoft.svn.core.wc2.SvnOperationFactory.run(SvnOperationFactory.java:1239)
at org.tmatesoft.svn.core.wc2.SvnOperation.run(SvnOperation.java:294)
at 
hudson.scm.subversion.CheckoutUpdater$SubversionUpdateTask.perform(CheckoutUpdater.java:133)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:176)
at 
hudson.scm.subversion.UpdateUpdater$TaskImpl.perform(UpdateUpdater.java:134)
at 
hudson.scm.subversion.WorkspaceUpdater$UpdateTask.delegateTo(WorkspaceUpdater.java:168)
at 
hudson.scm.SubversionSCM$CheckOutTask.perform(SubversionSCM.java:1041)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:1017)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:990)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3086)
at hudson.remoting.UserRequest.perform(UserRequest.java:212)
at hudson.remoting.UserRequest.perform(UserRequest.java:54)
at hudson.remoting.Request$2.run(Request.java:369)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketConnection.run(SVNSocketConnection.java:57)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
... 4 more
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)

[jira] [Commented] (SOLR-13452) Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921063#comment-16921063
 ] 

ASF subversion and git services commented on SOLR-13452:


Commit 3838894ff7ee2d19f5e7f439be3585e5bef63f98 in lucene-solr's branch 
refs/heads/jira/SOLR-13452_gradle_6 from Mark Robert Miller
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3838894 ]

SOLR-13452: Start improving logging around missingDependencies.


> Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.
> -
>
> Key: SOLR-13452
> URL: https://issues.apache.org/jira/browse/SOLR-13452
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: gradle-build.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I took some things from the great work that Dat did in 
> [https://github.com/apache/lucene-solr/tree/jira/gradle] and took the ball a 
> little further.
>  
> When working with gradle in sub modules directly, I recommend 
> [https://github.com/dougborg/gdub]
> This gradle branch uses the following plugin for version locking, version 
> configuration and version consistency across modules: 
> [https://github.com/palantir/gradle-consistent-versions]
>  
> https://github.com/apache/lucene-solr/tree/jira/SOLR-13452_gradle_6






[JENKINS] Lucene-Solr-SmokeRelease-master - Build # 1440 - Still Failing

2019-09-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1440/

No tests ran.

Build Log:
[...truncated 24456 lines...]
[asciidoctor:convert] asciidoctor: ERROR: about-this-guide.adoc: line 1: 
invalid part, must have at least one section (e.g., chapter, appendix, etc.)
[asciidoctor:convert] asciidoctor: ERROR: solr-glossary.adoc: line 1: invalid 
part, must have at least one section (e.g., chapter, appendix, etc.)
 [java] Processed 2595 links (2121 relative) to 3660 anchors in 260 files
 [echo] Validated Links & Anchors via: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr-ref-guide/bare-bones-html/

-dist-changes:
 [copy] Copying 4 files to 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/package/changes

package:

-unpack-solr-tgz:

-ensure-solr-tgz-exists:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr.tgz.unpacked
[untar] Expanding: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/package/solr-9.0.0.tgz
 into 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/solr/build/solr.tgz.unpacked

generate-maven-artifacts:

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/top-level-ivy-settings.xml



[jira] [Commented] (SOLR-13452) Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921016#comment-16921016
 ] 

ASF subversion and git services commented on SOLR-13452:


Commit 1c02c5971d3736c70a0870bed76f7be8bbb6f2af in lucene-solr's branch 
refs/heads/jira/SOLR-13452_gradle_6 from Mark Robert Miller
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1c02c59 ]

SOLR-13452: Fix up how we make sure buildSrc resources are handled nicely in 
IDE.


> Update the lucene-solr build from Ivy+Ant+Maven (shadow build) to Gradle.
> -
>
> Key: SOLR-13452
> URL: https://issues.apache.org/jira/browse/SOLR-13452
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: gradle-build.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I took some things from the great work that Dat did in 
> [https://github.com/apache/lucene-solr/tree/jira/gradle] and took the ball a 
> little further.
>  
> When working with gradle in sub modules directly, I recommend 
> [https://github.com/dougborg/gdub]
> This gradle branch uses the following plugin for version locking, version 
> configuration and version consistency across modules: 
> [https://github.com/palantir/gradle-consistent-versions]
>  
> https://github.com/apache/lucene-solr/tree/jira/SOLR-13452_gradle_6



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-09-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920996#comment-16920996
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit cd8c74948a1bc129cd1b2a9f730eadbe521300bf in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cd8c749 ]

SOLR-13105: Revamp simulations docs


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas Patch)

2019-09-02 Thread GitBox
diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip 
second grouping step if group.limit is 1 (aka Las Vegas Patch)
URL: https://github.com/apache/lucene-solr/pull/300#discussion_r320021306
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/search/grouping/distributed/responseprocessor/SkipSecondStepSearchGroupShardResponseProcessor.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.search.grouping.distributed.responseprocessor;
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.lucene.search.Sort;
+import org.apache.lucene.search.TotalHits;
+import org.apache.lucene.search.grouping.GroupDocs;
+import org.apache.lucene.search.grouping.SearchGroup;
+import org.apache.lucene.search.grouping.TopGroups;
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.handler.component.ResponseBuilder;
+import org.apache.solr.handler.component.ShardDoc;
+import org.apache.solr.handler.component.ShardRequest;
+import org.apache.solr.handler.component.ShardResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.search.grouping.GroupingSpecification;
+import 
org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer;
+
+public class SkipSecondStepSearchGroupShardResponseProcessor extends 
SearchGroupShardResponseProcessor {
+
+  @Override
+  protected SearchGroupsResultTransformer 
newSearchGroupsResultTransformer(SolrIndexSearcher solrIndexSearcher) {
+return new 
SearchGroupsResultTransformer.SkipSecondStepSearchResultResultTransformer(solrIndexSearcher);
+  }
+
+  @Override
+  protected SearchGroupsContainer newSearchGroupsContainer(ResponseBuilder rb) 
{
+return new 
SkipSecondStepSearchGroupsContainer(rb.getGroupingSpec().getFields());
+  }
+
+  @Override
+  public void process(ResponseBuilder rb, ShardRequest shardRequest) {
+super.process(rb, shardRequest);
+TopGroupsShardResponseProcessor.fillResultIds(rb);
+  }
+
+  protected static class SkipSecondStepSearchGroupsContainer extends 
SearchGroupsContainer {
+
+private final Map<Object, String> docIdToShard = new HashMap<>();
+
+public SkipSecondStepSearchGroupsContainer(String[] fields) {
+  super(fields);
+}
+
+@Override
+public void addSearchGroups(ShardResponse srsp, String field, 
Collection<SearchGroup<BytesRef>> searchGroups) {
+  super.addSearchGroups(srsp, field, searchGroups);
+  for (SearchGroup<BytesRef> searchGroup : searchGroups) {
+assert(srsp.getShard() != null);
+docIdToShard.put(searchGroup.topDocSolrId, srsp.getShard());
+  }
+}
+
+@Override
+public void addMergedSearchGroups(ResponseBuilder rb, String groupField, 
Collection<SearchGroup<BytesRef>> mergedTopGroups ) {
+  // TODO: add comment or javadoc re: why this method is overridden as a 
no-op
+}
+
+@Override
+public void addSearchGroupToShards(ResponseBuilder rb, String groupField, 
Collection<SearchGroup<BytesRef>> mergedTopGroups) {
+  super.addSearchGroupToShards(rb, groupField, mergedTopGroups);
+
+  final GroupingSpecification groupingSpecification = rb.getGroupingSpec();
+  final Sort groupSort = 
groupingSpecification.getGroupSortSpec().getSort();
+
+  GroupDocs<BytesRef>[] groups = new GroupDocs[mergedTopGroups.size()];
+
+  // This is the max score found in any document on any group
+  float maxScore = 0;
+  int index = 0;
+
+  for (SearchGroup<BytesRef> group : mergedTopGroups) {
+maxScore = Math.max(maxScore, group.topDocScore);
+final String shard = docIdToShard.get(group.topDocSolrId);
+assert(shard != null);
+final ShardDoc sdoc = new ShardDoc();
+sdoc.score = group.topDocScore;
+sdoc.id = group.topDocSolrId;
+sdoc.shard = shard;
+
+groups[index++] = new GroupDocs<>(group.topDocScore,
 
 Review comment:
   I'm a bit concerned about using `NaN` because *NaN cannot be compared with 
any floating-point value*; I'll have to double check that we are not sorting on 
that.
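The concern is easy to reproduce in isolation: in Java every ordered comparison involving `NaN` evaluates to false, `Float.compare` imposes an artificial total order that sorts `NaN` above everything (including positive infinity), and `Math.max` propagates `NaN`. A standalone sketch of those three behaviors (a demo only, not part of the patch):

```java
public class NaNCompareDemo {
    // Every ordered comparison with NaN is false; even NaN == NaN is false.
    static boolean anyOrderedComparisonTrue(float x) {
        return x < 0f || x > 0f || x <= 0f || x >= 0f || x == x;
    }

    public static void main(String[] args) {
        float nan = Float.NaN;
        System.out.println(anyOrderedComparisonTrue(nan));                    // false
        // Float.compare defines a total order where NaN sorts above everything.
        System.out.println(Float.compare(nan, Float.POSITIVE_INFINITY) > 0);  // true
        // Math.max propagates NaN, so a running max touched by NaN stays NaN.
        System.out.println(Float.isNaN(Math.max(0f, nan)));                   // true
    }
}
```

So a `NaN` score silently poisons a `Math.max` running maximum, while a comparator built on `Float.compare` would rank it first in descending order; that mismatch is exactly why sorting on such a score deserves the double check.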


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

[GitHub] [lucene-solr] diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas Patch)

2019-09-02 Thread GitBox
diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip 
second grouping step if group.limit is 1 (aka Las Vegas Patch)
URL: https://github.com/apache/lucene-solr/pull/300#discussion_r320020253
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/search/grouping/distributed/responseprocessor/SkipSecondStepSearchGroupShardResponseProcessor.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.search.grouping.distributed.responseprocessor;
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.lucene.search.Sort;
+import org.apache.lucene.search.TotalHits;
+import org.apache.lucene.search.grouping.GroupDocs;
+import org.apache.lucene.search.grouping.SearchGroup;
+import org.apache.lucene.search.grouping.TopGroups;
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.handler.component.ResponseBuilder;
+import org.apache.solr.handler.component.ShardDoc;
+import org.apache.solr.handler.component.ShardRequest;
+import org.apache.solr.handler.component.ShardResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.search.grouping.GroupingSpecification;
+import 
org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer;
+
+public class SkipSecondStepSearchGroupShardResponseProcessor extends 
SearchGroupShardResponseProcessor {
+
+  @Override
+  protected SearchGroupsResultTransformer 
newSearchGroupsResultTransformer(SolrIndexSearcher solrIndexSearcher) {
+return new 
SearchGroupsResultTransformer.SkipSecondStepSearchResultResultTransformer(solrIndexSearcher);
+  }
+
+  @Override
+  protected SearchGroupsContainer newSearchGroupsContainer(ResponseBuilder rb) 
{
+return new 
SkipSecondStepSearchGroupsContainer(rb.getGroupingSpec().getFields());
+  }
+
+  @Override
+  public void process(ResponseBuilder rb, ShardRequest shardRequest) {
+super.process(rb, shardRequest);
+TopGroupsShardResponseProcessor.fillResultIds(rb);
+  }
+
+  protected static class SkipSecondStepSearchGroupsContainer extends 
SearchGroupsContainer {
+
+private final Map<Object, String> docIdToShard = new HashMap<>();
+
+public SkipSecondStepSearchGroupsContainer(String[] fields) {
+  super(fields);
+}
+
+@Override
+public void addSearchGroups(ShardResponse srsp, String field, 
Collection<SearchGroup<BytesRef>> searchGroups) {
+  super.addSearchGroups(srsp, field, searchGroups);
+  for (SearchGroup<BytesRef> searchGroup : searchGroups) {
+assert(srsp.getShard() != null);
+docIdToShard.put(searchGroup.topDocSolrId, srsp.getShard());
+  }
+}
+
+@Override
+public void addMergedSearchGroups(ResponseBuilder rb, String groupField, 
Collection<SearchGroup<BytesRef>> mergedTopGroups ) {
+  // TODO: add comment or javadoc re: why this method is overridden as a 
no-op
+}
+
+@Override
+public void addSearchGroupToShards(ResponseBuilder rb, String groupField, 
Collection<SearchGroup<BytesRef>> mergedTopGroups) {
+  super.addSearchGroupToShards(rb, groupField, mergedTopGroups);
+
+  final GroupingSpecification groupingSpecification = rb.getGroupingSpec();
+  final Sort groupSort = 
groupingSpecification.getGroupSortSpec().getSort();
+
+  GroupDocs<BytesRef>[] groups = new GroupDocs[mergedTopGroups.size()];
+
+  // This is the max score found in any document on any group
+  float maxScore = 0;
+  int index = 0;
+
+  for (SearchGroup<BytesRef> group : mergedTopGroups) {
+maxScore = Math.max(maxScore, group.topDocScore);
+final String shard = docIdToShard.get(group.topDocSolrId);
+assert(shard != null);
+final ShardDoc sdoc = new ShardDoc();
+sdoc.score = group.topDocScore;
+sdoc.id = group.topDocSolrId;
+sdoc.shard = shard;
+
+groups[index++] = new GroupDocs<>(group.topDocScore,
+group.topDocScore,
+new TotalHits(1, TotalHits.Relation.EQUAL_TO), /* we don't know 
the actual number of hits in the group- we set it to 1 as we only keep track of 
the top doc */
+new ShardDoc[] { sdoc }, /* only top doc */
+group.groupValue,
+group.sortValues);
+  }
+  

[GitHub] [lucene-solr] atris commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
atris commented on a change in pull request #823: LUCENE-8939: Introduce Shared 
Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r320019631
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/HitsThresholdChecker.java
 ##
 @@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.LongAdder;
+import java.util.function.BooleanSupplier;
+
+/**
+ * Used for defining custom algorithms to allow searches to early terminate
+ */
+abstract class HitsThresholdChecker implements BooleanSupplier {
+  /**
+   * Implementation of HitsThresholdChecker which allows global hit counting
+   */
+  private static class GlobalHitsThresholdChecker extends HitsThresholdChecker 
{
+private final int totalHitsThreshold;
+private final LongAdder globalHitCount;
 
 Review comment:
   Reverted to AtomicLong per discussion
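For context, the shape of the checker under discussion — a single hit counter shared by all slice collectors, backed by `AtomicLong` as settled on in the review — can be sketched roughly as follows (illustrative names only, not the actual Lucene sources):

```java
import java.util.concurrent.atomic.AtomicLong;

// Rough sketch only: names and structure are illustrative and do not
// mirror the actual HitsThresholdChecker in the Lucene sources.
public class GlobalHitsCheckerSketch {
    private final long totalHitsThreshold;
    private final AtomicLong globalHitCount = new AtomicLong();

    public GlobalHitsCheckerSketch(long totalHitsThreshold) {
        this.totalHitsThreshold = totalHitsThreshold;
    }

    // Every slice's collector reports each collected hit here.
    public void incrementHitCount() {
        globalHitCount.incrementAndGet();
    }

    // Any slice may early-terminate once the global count passes the
    // threshold, instead of each slice counting to the threshold on its own.
    public boolean isThresholdReached() {
        return globalHitCount.get() > totalHitsThreshold;
    }

    public static void main(String[] args) {
        GlobalHitsCheckerSketch checker = new GlobalHitsCheckerSketch(3);
        for (int i = 0; i < 4; i++) checker.incrementHitCount();
        System.out.println(checker.isThresholdReached()); // true: 4 hits > 3
    }
}
```

One trade-off behind the `LongAdder` vs `AtomicLong` debate above: `LongAdder` is cheaper under heavy write contention, while `AtomicLong` gives a single exact value for the frequent threshold read.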


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527203682
 
 
   > > I'm trying to understand the behavior change Lucene users will see with 
this, when using concurrent searching for one query (passing `ExecutorService` 
to `IndexSearcher`):
   > > It looks like with the change such users will see their search precisely 
when the total collected hits exceeds the limit (1000 by default?), versus 
today where we will try to collect 1000 per segment and then reduce that to the 
top 1000 overall? So this means the results will change depending on thread 
execution/timing?
   > 
   > Looking at the documentation around `TOTAL_HITS_THRESHOLD`, I see that it 
intends to restrict the number of documents scored in total before the query is 
early terminated. If we do a single threaded search today, that is the behavior 
we get. However, for concurrent search, we actually look at N * 
`TOTAL_HITS_THRESHOLD`, where N is the number of slices. So, I believe that we 
are not doing the advertised behavior for concurrent searches in the status 
quo. This change should fix that.
   > 
   > However, you are correct that thread timing will come into play here -- 
different slices may have different contributions to the overall number of 
hits. However, since we are anyways not scoring all documents, I do not believe 
we offer any guarantees on the documents that we return -- even today, the best 
documents might be the ones which just came in and hence are on the last 
segments to be traversed, so never even get looked at. WDYT?
   
   OK that makes sense @atris -- it seems that which specific top hits you'll 
get back is intentionally not defined in the API and so we have the freedom to 
make improvements like this.
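For readers following along, the shared-count behavior under discussion -- all per-slice collectors incrementing one global counter so that collection stops once the threshold is crossed in aggregate, not per slice -- can be sketched roughly like this. The class and method names below are hypothetical illustrations, not the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one counter shared by all per-slice collectors, so the
// search terminates once totalHitsThreshold hits are seen in aggregate rather
// than totalHitsThreshold hits per slice.
public class SharedHitCounter {
    private final long totalHitsThreshold;
    private final AtomicLong globalHitCount = new AtomicLong();

    public SharedHitCounter(long totalHitsThreshold) {
        this.totalHitsThreshold = totalHitsThreshold;
    }

    // Each collector (one per slice) calls this for every hit it sees.
    public void incrementHitCount() {
        globalHitCount.incrementAndGet();
    }

    // Collectors consult this to decide whether to stop collecting.
    public boolean isThresholdReached() {
        return globalHitCount.get() > totalHitsThreshold;
    }

    public static void main(String[] args) {
        SharedHitCounter counter = new SharedHitCounter(3);
        for (int i = 0; i < 4; i++) {
            counter.incrementHitCount(); // pretend four slices each found a hit
        }
        System.out.println("threshold reached: " + counter.isThresholdReached()); // prints "threshold reached: true"
    }
}
```

Because each slice races to increment the same counter, which slice contributes the hits that cross the threshold depends on thread timing -- which is exactly why the returned top hits can vary between runs.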





[GitHub] [lucene-solr] mikemccand commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
mikemccand commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r320018343
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/HitsThresholdChecker.java
 ##
 @@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.LongAdder;
+import java.util.function.BooleanSupplier;
+
+/**
+ * Used for defining custom algorithms to allow searches to early terminate
+ */
+abstract class HitsThresholdChecker implements BooleanSupplier {
+  /**
+   * Implementation of HitsThresholdChecker which allows global hit counting
+   */
+  private static class GlobalHitsThresholdChecker extends HitsThresholdChecker {
+    private final int totalHitsThreshold;
+    private final LongAdder globalHitCount;
 
 Review comment:
   Ahh thanks for the clarification @jpountz.





[GitHub] [lucene-solr] diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas Patch)

2019-09-02 Thread GitBox
diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip 
second grouping step if group.limit is 1 (aka Las Vegas Patch)
URL: https://github.com/apache/lucene-solr/pull/300#discussion_r320017507
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/search/grouping/endresulttransformer/GroupedEndResultTransformer.java
 ##
 @@ -75,7 +75,13 @@ public void transform(Map result, ResponseBuilder rb, SolrDocumentSou
       SimpleOrderedMap groupResult = new SimpleOrderedMap<>();
       if (group.groupValue != null) {
         // use createFields so that fields having doc values are also supported
-        List fields = groupField.createFields(group.groupValue.utf8ToString());
+        final String groupValue;
+        if (rb.getGroupingSpec().isSkipSecondGroupingStep()) {
+          groupValue = groupField.getType().indexedToReadable(group.groupValue.utf8ToString());
 
 Review comment:
   Looks good, merged





[GitHub] [lucene-solr] diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip second grouping step if group.limit is 1 (aka Las Vegas Patch)

2019-09-02 Thread GitBox
diegoceccarelli commented on a change in pull request #300: SOLR-11831: Skip 
second grouping step if group.limit is 1 (aka Las Vegas Patch)
URL: https://github.com/apache/lucene-solr/pull/300#discussion_r320017542
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/search/grouping/distributed/shardresultserializer/SearchGroupsResultTransformer.java
 ##
 @@ -142,4 +150,58 @@ private NamedList serializeSearchGroup(Collection> data, S
 return result;
   }
 
+  public static class SkipSecondStepSearchResultResultTransformer extends SearchGroupsResultTransformer {
+
+    private static final String TOP_DOC_SOLR_ID_KEY = "topDocSolrId";
+    private static final String TOP_DOC_SCORE_KEY = "topDocScore";
+    private static final String SORTVALUES_KEY = "sortValues";
+
+    private final SchemaField uniqueField;
+
+    public SkipSecondStepSearchResultResultTransformer(SolrIndexSearcher searcher) {
+      super(searcher);
+      this.uniqueField = searcher.getSchema().getUniqueKeyField();
+    }
+
+    @Override
+    protected Object[] getSortValues(Object groupDocs) {
+      NamedList groupInfo = (NamedList) groupDocs;
+      final ArrayList sortValues = (ArrayList) groupInfo.get(SORTVALUES_KEY);
+      return sortValues.toArray(new Comparable[sortValues.size()]);
+    }
+
+    @Override
+    protected SearchGroup deserializeOneSearchGroup(SchemaField groupField, String groupValue,
+        SortField[] groupSortField, Object rawSearchGroupData) {
+      SearchGroup searchGroup = super.deserializeOneSearchGroup(groupField, groupValue, groupSortField, rawSearchGroupData);
+      NamedList groupInfo = (NamedList) rawSearchGroupData;
+      searchGroup.topDocLuceneId = DocIdSetIterator.NO_MORE_DOCS;
+      searchGroup.topDocScore = (float) groupInfo.get(TOP_DOC_SCORE_KEY);
+      searchGroup.topDocSolrId = groupInfo.get(TOP_DOC_SOLR_ID_KEY);
+      return searchGroup;
+    }
+
+    @Override
+    protected Object serializeOneSearchGroup(SortField[] groupSortField, SearchGroup searchGroup) {
+      Document luceneDoc = null;
+      /** Use the lucene id to get the unique solr id so that it can be sent to the federator.
+       * The lucene id of a document is not unique across all shards i.e. different documents
+       * in different shards could have the same lucene id, whereas the solr id is guaranteed
+       * to be unique so this is what we need to return to the federator
+       **/
+      try {
+        luceneDoc = searcher.doc(searchGroup.topDocLuceneId, Collections.singleton(uniqueField.getName()));
 
 Review comment:
   Looks good, merged





[jira] [Commented] (LUCENE-8860) LatLonShapeBoundingBoxQuery could make more decisions on inner nodes

2019-09-02 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920960#comment-16920960
 ] 

Adrien Grand commented on LUCENE-8860:
--

Why is it a problem? If the query polygon fully contains all left edges of the 
MBRs on a given node then we would know it intersects all indexed triangles, 
just like when the query is a box?
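The geometric fact behind this is that a triangle touches all four edges of its minimum bounding rectangle, so a query shape that fully contains any one edge of an MBR must intersect the triangle inside it. A minimal sketch of that test for an axis-aligned query box and the MBR's left edge (hypothetical helper names, not the Lucene code):

```java
public class MbrEdgeCheck {
    // True if the axis-aligned box [qMinX,qMaxX] x [qMinY,qMaxY] contains
    // the vertical segment x = segX, y in [segMinY, segMaxY].
    static boolean boxContainsVerticalSegment(double qMinX, double qMaxX,
                                              double qMinY, double qMaxY,
                                              double segX, double segMinY, double segMaxY) {
        return segX >= qMinX && segX <= qMaxX
            && segMinY >= qMinY && segMaxY <= qMaxY;
    }

    // If the query box contains the left edge of a triangle's MBR, the
    // triangle must touch that edge (by definition of an MBR) and hence
    // intersect the query -- no triangle decoding or point-in-triangle
    // computation is needed.
    static boolean intersectsViaLeftEdge(double qMinX, double qMaxX,
                                         double qMinY, double qMaxY,
                                         double mbrMinX, double mbrMinY, double mbrMaxY) {
        return boxContainsVerticalSegment(qMinX, qMaxX, qMinY, qMaxY, mbrMinX, mbrMinY, mbrMaxY);
    }

    public static void main(String[] args) {
        // Query box [0,10] x [0,10]; MBR left edge at x=2 spanning y in [3,7].
        System.out.println(intersectsViaLeftEdge(0, 10, 0, 10, 2, 3, 7)); // prints "true"
    }
}
```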

> LatLonShapeBoundingBoxQuery could make more decisions on inner nodes
> 
>
> Key: LUCENE-8860
> URL: https://issues.apache.org/jira/browse/LUCENE-8860
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: fig1.png, fig2.png, fig3.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently LatLonShapeBoundingBoxQuery with the INTERSECTS relation only 
> returns CELL_INSIDE_QUERY if the query contains ALL minimum bounding 
> rectangles of the indexed triangles.
> I think we could return CELL_INSIDE_QUERY if the box contains either of the 
> edges of all MBRs of indexed triangles since triangles are guaranteed to 
> touch all edges of their MBR by definition. In some cases this would help 
> save decoding triangles and running costly point-in-triangle computations.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)




[jira] [Commented] (SOLR-13733) add more class-level javadocs for public org.apache.solr.metrics classes

2019-09-02 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920959#comment-16920959
 ] 

Christine Poerschke commented on SOLR-13733:


Attached patch is partial e.g. for 
[SolrRrdBackend.java|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/metrics/rrd/SolrRrdBackend.java#L32]
 I couldn't easily see what a good description might be and 
[JmxMetricsReporter.java|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/metrics/reporters/jmx/JmxMetricsReporter.java]
 has some inner {{public interface ...}} classes/interfaces which don't yet 
have javadocs.

> add more class-level javadocs for public org.apache.solr.metrics classes
> 
>
> Key: SOLR-13733
> URL: https://issues.apache.org/jira/browse/SOLR-13733
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-13733.patch
>
>
> Most public {{org.apache.solr.metrics}} classes already have class-level 
> javadocs and adding javadocs for the few that don't have them yet could be a 
> step towards {{ant precommit}} requiring(\*) javadocs (class-level only, for 
> public classes, for org.apache.solr.metrics) selectively on an opt-in basis.
> (\*) = e.g. via {{solr/build.xml}} content like this:
> {code}
>  dir="${javadoc.dir}/solr-core/org/apache/solr/metrics" level="class"/>
> {code}






[jira] [Commented] (LUCENE-8961) CheckIndex: pre-exorcise document id salvage

2019-09-02 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920958#comment-16920958
 ] 

Adrien Grand commented on LUCENE-8961:
--

This feels too unsafe to me for CheckIndex. For instance, what if idField is 
the corrupt field? You could end up with missing ids or the wrong ids. I'm fine 
with adding more information to the CheckIndex status in order to make it 
easier to do this kind of hack on top of CheckIndex, but I'd like to keep 
CheckIndex something that is rock solid.

> CheckIndex: pre-exorcise document id salvage
> 
>
> Key: LUCENE-8961
> URL: https://issues.apache.org/jira/browse/LUCENE-8961
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-8961.patch
>
>
> The 
> [CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java]
>  tool supports the exorcising of corrupt segments from an index.
> This ticket proposes to add an extra option which could first be used to 
> potentially salvage the document ids of the segment(s) about to be exorcised. 
> Re-ingestion for those documents could then be arranged so as to repair the 
> data damage caused by the exorcising.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13733) add more class-level javadocs for public org.apache.solr.metrics classes

2019-09-02 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-13733:
---
Summary: add more class-level javadocs for public org.apache.solr.metrics 
classes  (was: add class-level javadocs for public org.apache.solr.metrics 
classes)

> add more class-level javadocs for public org.apache.solr.metrics classes
> 
>
> Key: SOLR-13733
> URL: https://issues.apache.org/jira/browse/SOLR-13733
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-13733.patch
>
>
> Most public {{org.apache.solr.metrics}} classes already have class-level 
> javadocs and adding javadocs for the few that don't have them yet could be a 
> step towards {{ant precommit}} requiring(\*) javadocs (class-level only, for 
> public classes, for org.apache.solr.metrics) selectively on an opt-in basis.
> (\*) = e.g. via {{solr/build.xml}} content like this:
> {code}
>  dir="${javadoc.dir}/solr-core/org/apache/solr/metrics" level="class"/>
> {code}






[jira] [Updated] (SOLR-13733) add class-level javadocs for public org.apache.solr.metrics classes

2019-09-02 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-13733:
---
Attachment: SOLR-13733.patch

> add class-level javadocs for public org.apache.solr.metrics classes
> ---
>
> Key: SOLR-13733
> URL: https://issues.apache.org/jira/browse/SOLR-13733
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-13733.patch
>
>
> Most public {{org.apache.solr.metrics}} classes already have class-level 
> javadocs and adding javadocs for the few that don't have them yet could be a 
> step towards {{ant precommit}} requiring(\*) javadocs (class-level only, for 
> public classes, for org.apache.solr.metrics) selectively on an opt-in basis.
> (\*) = e.g. via {{solr/build.xml}} content like this:
> {code}
>  dir="${javadoc.dir}/solr-core/org/apache/solr/metrics" level="class"/>
> {code}






[jira] [Created] (SOLR-13733) add class-level javadocs for public org.apache.solr.metrics classes

2019-09-02 Thread Christine Poerschke (Jira)
Christine Poerschke created SOLR-13733:
--

 Summary: add class-level javadocs for public 
org.apache.solr.metrics classes
 Key: SOLR-13733
 URL: https://issues.apache.org/jira/browse/SOLR-13733
 Project: Solr
  Issue Type: Task
Reporter: Christine Poerschke


Most public {{org.apache.solr.metrics}} classes already have class-level 
javadocs and adding javadocs for the few that don't have them yet could be a 
step towards {{ant precommit}} requiring(\*) javadocs (class-level only, for 
public classes, for org.apache.solr.metrics) selectively on an opt-in basis.

(\*) = e.g. via {{solr/build.xml}} content like this:
{code}

{code}






[GitHub] [lucene-solr] jpountz commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
jpountz commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319987093
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/HitsThresholdChecker.java
 ##
 @@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.LongAdder;
+import java.util.function.BooleanSupplier;
+
+/**
+ * Used for defining custom algorithms to allow searches to early terminate
+ */
+abstract class HitsThresholdChecker implements BooleanSupplier {
+  /**
+   * Implementation of HitsThresholdChecker which allows global hit counting
+   */
+  private static class GlobalHitsThresholdChecker extends HitsThresholdChecker {
+    private final int totalHitsThreshold;
+    private final LongAdder globalHitCount;
 
 Review comment:
   AtomicLong would probably perform better. LongAdder usually performs better 
to accumulate counts, but the fact that we frequently retrieve the count, which 
is slower with LongAdder than AtomicLong, probably kills this benefit.
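The tradeoff described here (LongAdder is cheaper to increment under contention, AtomicLong is cheaper to read) can be illustrated with a minimal micro-example. This is a sketch of the two APIs' semantics, not a benchmark and not the Lucene code:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class CounterDemo {
    // AtomicLong: a single CAS-updated cell; get() is one volatile read,
    // so frequent reads (as in a per-hit threshold check) are cheap.
    static long countWithAtomicLong(int increments) {
        AtomicLong count = new AtomicLong();
        for (int i = 0; i < increments; i++) {
            count.incrementAndGet();
        }
        return count.get();
    }

    // LongAdder: striped cells reduce CAS contention on increment, but
    // sum() must walk every cell, making frequent reads comparatively slow.
    static long countWithLongAdder(int increments) {
        LongAdder count = new LongAdder();
        for (int i = 0; i < increments; i++) {
            count.increment();
        }
        return count.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomicLong(1000)); // prints "1000"
        System.out.println(countWithLongAdder(1000));  // prints "1000"
    }
}
```

Since a shared hits-threshold checker reads the count on essentially every hit, the read-heavy access pattern is what tips the balance toward AtomicLong.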





[GitHub] [lucene-solr] jpountz commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
jpountz commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r320005574
 
 

 ##
 File path: 
lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java
 ##
 @@ -191,32 +194,38 @@ public static TopScoreDocCollector create(int numHits, int totalHitsThreshold) {
 * objects.
 */
   public static TopScoreDocCollector create(int numHits, ScoreDoc after, int totalHitsThreshold) {
+    return create(numHits, after, HitsThresholdChecker.create(totalHitsThreshold));
+  }
+
+  public static TopScoreDocCollector create(int numHits, ScoreDoc after, HitsThresholdChecker hitsThresholdChecker) {
 
 Review comment:
   similar comment here





[GitHub] [lucene-solr] jpountz commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
jpountz commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r320004946
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
 ##
 @@ -467,9 +467,19 @@ public TopDocs searchAfter(ScoreDoc after, Query query, int numHits) throws IOEx
 
     final CollectorManager manager = new CollectorManager() {
 
+      private HitsThresholdChecker hitsThresholdChecker;
       @Override
       public TopScoreDocCollector newCollector() throws IOException {
-        return TopScoreDocCollector.create(cappedNumHits, after, TOTAL_HITS_THRESHOLD);
+
+        if (hitsThresholdChecker == null) {
+          if (executor == null || leafSlices.length <= 1) {
+            hitsThresholdChecker = HitsThresholdChecker.create(TOTAL_HITS_THRESHOLD);
+          } else {
+            hitsThresholdChecker = HitsThresholdChecker.createShared(TOTAL_HITS_THRESHOLD);
+          }
+        }
 
 Review comment:
   can we create it eagerly instead of lazily?





[GitHub] [lucene-solr] jpountz commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
jpountz commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r320004644
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
 ##
 @@ -382,8 +383,19 @@ public static TopFieldCollector create(Sort sort, int numHits, int totalHitsThre
 * @return a {@link TopFieldCollector} instance which will sort the results by
 * the sort criteria.
 */
+  public static TopFieldCollector create(Sort sort, int numHits, FieldDoc after, int totalHitsThreshold) {
+    if (totalHitsThreshold < 0) {
+      throw new IllegalArgumentException("totalHitsThreshold must be >= 0, got " + totalHitsThreshold);
+    }
+
+    return create(sort, numHits, after, HitsThresholdChecker.create(totalHitsThreshold));
+  }
+
+  /**
+   * Same as above with an additional parameter to allow passing in the threshold checker
+   */
   public static TopFieldCollector create(Sort sort, int numHits, FieldDoc after,
-      int totalHitsThreshold) {
+      HitsThresholdChecker hitsThresholdChecker) {
 
 Review comment:
   I'd have a preference for not exposing `HitsThresholdChecker` in the 
user-facing API and instead providing users with a factory method for a 
collector manager that can be used with multiple threads, e.g. `public static 
CollectorManager createSharedManager(sort, numHits, 
after, totalHitsThreshold)`.





[GitHub] [lucene-solr] jpountz commented on a change in pull request #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
jpountz commented on a change in pull request #823: LUCENE-8939: Introduce 
Shared Count Early Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#discussion_r319961589
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/HitsThresholdChecker.java
 ##
 @@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.concurrent.atomic.LongAdder;
+import java.util.function.BooleanSupplier;
+
+/**
+ * Used for defining custom algorithms to allow searches to early terminate
+ */
+abstract class HitsThresholdChecker implements BooleanSupplier {
 
 Review comment:
   I don't think we actually need to implement BooleanSupplier? Maybe remove it 
from the implemented interfaces, which will in turn help use a more explicit 
method name?





[jira] [Commented] (SOLR-13717) Distributed Grouping breaks multi valued 'fl' param

2019-09-02 Thread Lucene/Solr QA (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920927#comment-16920927
 ] 

Lucene/Solr QA commented on SOLR-13717:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  2m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 58m  
0s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13717 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979151/SOLR-13717.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP 
Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 1862ffd |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | LTS |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/544/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/544/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Distributed Grouping breaks multi valued 'fl' param
> ---
>
> Key: SOLR-13717
> URL: https://issues.apache.org/jira/browse/SOLR-13717
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-13717.patch, SOLR-13717.patch
>
>
> Co-worker discovered a bug with (distributed) grouping when multiple {{fl}} 
> params are specified.
> {{StoredFieldsShardRequestFactory}} has very (old and) brittle code that 
> assumes there will be 0 or 1 {{fl}} params in the original request that it 
> should inspect to see if it needs to append (via string concat) the uniqueKey 
> field onto in order to collate the returned stored fields into their 
> respective (grouped) documents -- and then ignores any additional {{fl}} 
> params that may exist in the original request when it does so.
> The net result is that only the uniqueKey field and whatever fields _are_ 
> specified in the first {{fl}} param specified are fetched from each shard and 
> ultimately returned.
> The only workaround is to replace multiple {{fl}} params with a single {{fl}} 
> param containing a comma seperated list of the requested fields.
> 
> Bug is trivial to reproduce with {{bin/solr -e cloud -noprompt}} by comparing 
> these requests which should all be equivilent...
> {noformat}
> $ bin/post -c gettingstarted -out yes example/exampledocs/books.csv
> ...
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true=true=author,name,id=*:*=true=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0812521390",
> "name":["The Black Company"],
> "author":["Glen Cook"]}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>  

No BadApple report this week

2019-09-02 Thread Erick Erickson
I’ll probably just continue to gather Hoss’ rollups each week, but until we get 
the jenkins stuff back running it’s probably not worth the effort.



Can we change forceMerge to not need as much disk space?

2019-09-02 Thread Erick Erickson
It’s always bothered me that optimize/forceMerge needs 100% of the disk space. 
I’ve recently been wondering whether that’s absolutely necessary, especially 
now that forceMerge respects the max segment size.

I HAVE NOT looked at the code closely, so this is mostly theory for someone to 
shoot down before diving in at all.

I’ve seen some situations where optimizing makes a radical difference. For 
instance, the time it takes for the Terms component to return is essentially 
linear to the number of segments. An artificially bad case to be sure, but 
still. We’re talking the difference between 17 seconds and sub-second here. A 
large index to be sure…

Anyway, it occurred to me that once a max-sized segment is created, _if_ we 
write the segments_n file out with the current state of the index, we could 
freely delete the segments that were merged into the new one. With 300G indexes 
(which I see regularly in the field, even multiple ones per node that size), 
this could result in substantial disk savings.

Off the top of my head, I can see some concerns:
1> we’d have to open new searchers every time we wrote the segments_n file to 
release file handles on the old segments
2> coordinating multiple merge threads
3> maxMergeAtOnceExplicit could mean unnecessary thrashing/opening searchers 
(could this be deprecated?)
4> Don’t quite know what to do if maxSegments is 1 (or other very low number).

Something like this would also pave the way for “background optimizing”. 
Instead of a monolithic forceMerge, I can envision a process whereby we created 
a low-level task that merged one max-sized segment at a time, came up for air 
and reopened searchers then went back in and merged the next one. With its own 
problems about coordinating ongoing updates, but that’s another discussion ;).

There’s lots of details to work out, throwing this out for discussion. I can 
raise a JIRA if people think the idea has legs.

Erick
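To make the disk-space argument above concrete, here is a back-of-envelope model (plain Java, not Lucene code; the functions and the batching heuristic are invented for illustration) comparing the peak extra disk a monolithic forceMerge needs against the proposed scheme that writes a new segments_N and deletes source segments after each max-sized segment is completed:

```java
import java.util.List;

public class ForceMergeDiskModel {

    // Classic forceMerge: every source segment stays on disk until the whole
    // merged result exists, so the peak extra space is the full index size.
    static long peakExtraClassic(List<Long> segmentSizes) {
        return segmentSizes.stream().mapToLong(Long::longValue).sum();
    }

    // Proposed scheme: only the segments feeding the current max-sized target
    // must coexist with their merged copy; once segments_N is rewritten they
    // are deleted, so the peak extra space is roughly one max segment size.
    static long peakExtraIncremental(List<Long> segmentSizes, long maxSegmentSize) {
        long worst = 0, batch = 0;
        for (long size : segmentSizes) {
            if (batch + size > maxSegmentSize) { // current target segment is "full"
                worst = Math.max(worst, batch);  // batch merged, then sources deleted
                batch = 0;
            }
            batch += size;
        }
        return Math.max(worst, batch);
    }

    public static void main(String[] args) {
        List<Long> segments = List.of(2L, 3L, 4L, 5L, 6L); // 20 units total
        System.out.println("classic peak extra:     " + peakExtraClassic(segments));
        System.out.println("incremental peak extra: " + peakExtraIncremental(segments, 5L));
    }
}
```

For a 300G index the difference between "all of it again" and "one max-sized segment" is exactly the substantial saving described above.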






[jira] [Commented] (LUCENE-8961) CheckIndex: pre-exorcise document id salvage

2019-09-02 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920917#comment-16920917
 ] 

Christine Poerschke commented on LUCENE-8961:
-

Attached outline work-in-progress patch:
* a new {{-skipCheckIntegrity}} option would allow the tool to proceed past the 
initial integrity checks (which would fail e.g. due to footer checksum failure)
* a new {{-idField F}} option would identify the field from which terms are to 
be salvaged
* the salvaged terms are currently printed to std.err (and this obviously would 
have to be changed somehow)

> CheckIndex: pre-exorcise document id salvage
> 
>
> Key: LUCENE-8961
> URL: https://issues.apache.org/jira/browse/LUCENE-8961
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-8961.patch
>
>
> The 
> [CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java]
>  tool supports the exorcising of corrupt segments from an index.
> This ticket proposes to add an extra option which could first be used to 
> potentially salvage the document ids of the segment(s) about to be exorcised. 
> Re-ingestion for those documents could then be arranged so as to repair the 
> data damage caused by the exorcising.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)




[jira] [Updated] (LUCENE-8961) CheckIndex: pre-exorcise document id salvage

2019-09-02 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated LUCENE-8961:

Attachment: LUCENE-8961.patch

> CheckIndex: pre-exorcise document id salvage
> 
>
> Key: LUCENE-8961
> URL: https://issues.apache.org/jira/browse/LUCENE-8961
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: LUCENE-8961.patch
>
>
> The 
> [CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java]
>  tool supports the exorcising of corrupt segments from an index.
> This ticket proposes to add an extra option which could first be used to 
> potentially salvage the document ids of the segment(s) about to be exorcised. 
> Re-ingestion for those documents could then be arranged so as to repair the 
> data damage caused by the exorcising.






[jira] [Created] (LUCENE-8961) CheckIndex: pre-exorcise document id salvage

2019-09-02 Thread Christine Poerschke (Jira)
Christine Poerschke created LUCENE-8961:
---

 Summary: CheckIndex: pre-exorcise document id salvage
 Key: LUCENE-8961
 URL: https://issues.apache.org/jira/browse/LUCENE-8961
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Christine Poerschke


The 
[CheckIndex|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java]
 tool supports the exorcising of corrupt segments from an index.

This ticket proposes to add an extra option which could first be used to 
potentially salvage the document ids of the segment(s) about to be exorcised. 
Re-ingestion for those documents could then be arranged so as to repair the 
data damage caused by the exorcising.






[GitHub] [lucene-solr] iverase opened a new pull request #851: LUCENE-8960: Add LatLonDocValuesPointInPolygonQuery

2019-09-02 Thread GitBox
iverase opened a new pull request #851: LUCENE-8960: Add 
LatLonDocValuesPointInPolygonQuery
URL: https://github.com/apache/lucene-solr/pull/851
 
 
   Adds a new query that iterates over LatLonPoint docValues and test if the 
point is within a provided polygon. In addition, `LatLonPointInPolygonQuery` is 
rewritten so it can be used efficiently with the new add query using 
`IndexOrDocValuesQuery`.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Created] (LUCENE-8960) Add LatLonDocValuesPointInPolygonQuery

2019-09-02 Thread Ignacio Vera (Jira)
Ignacio Vera created LUCENE-8960:


 Summary: Add LatLonDocValuesPointInPolygonQuery
 Key: LUCENE-8960
 URL: https://issues.apache.org/jira/browse/LUCENE-8960
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ignacio Vera


Currently LatLonDocValuesField contain queries for bounding box and circle. 
This issue adds a polygon query as well.







[jira] [Commented] (SOLR-13717) Distributed Grouping breaks multi valued 'fl' param

2019-09-02 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920856#comment-16920856
 ] 

Christine Poerschke commented on SOLR-13717:


Code change looks good to me.

Attached patch has a potential alternative for the test change, using a {{for 
(boolean withFL : new boolean[] \{true, false\})}} loop to reduce code 
duplication associated with the "special check: same query, but empty fl" test 
portion.


> Distributed Grouping breaks multi valued 'fl' param
> ---
>
> Key: SOLR-13717
> URL: https://issues.apache.org/jira/browse/SOLR-13717
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-13717.patch, SOLR-13717.patch
>
>
> Co-worker discovered a bug with (distributed) grouping when multiple {{fl}} 
> params are specified.
> {{StoredFieldsShardRequestFactory}} has very (old and) brittle code that 
> assumes there will be 0 or 1 {{fl}} params in the original request that it 
> should inspect to see if it needs to append (via string concat) the uniqueKey 
> field onto in order to collate the returned stored fields into their 
> respective (grouped) documents -- and then ignores any additional {{fl}} 
> params that may exist in the original request when it does so.
> The net result is that only the uniqueKey field and whatever fields _are_ 
> specified in the first {{fl}} param specified are fetched from each shard and 
> ultimately returned.
> The only workaround is to replace multiple {{fl}} params with a single {{fl}} 
> param containing a comma separated list of the requested fields.
> 
> Bug is trivial to reproduce with {{bin/solr -e cloud -noprompt}} by comparing 
> these requests which should all be equivalent...
> {noformat}
> $ bin/post -c gettingstarted -out yes example/exampledocs/books.csv
> ...
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=author,name,id&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0812521390",
> "name":["The Black Company"],
> "author":["Glen Cook"]}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354",
> "name":["Foundation"],
> "author":["Isaac Asimov"]}]
>   }}]}}}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=author&fl=name,id&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0812521390",
> "author":["Glen Cook"]}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354",
> "author":["Isaac Asimov"]}]
>   }}]}}}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=id&fl=author&fl=name&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0553573403"}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354"}]
>   }}]}}}
> {noformat}






[jira] [Updated] (SOLR-13717) Distributed Grouping breaks multi valued 'fl' param

2019-09-02 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-13717:
---
Attachment: SOLR-13717.patch

> Distributed Grouping breaks multi valued 'fl' param
> ---
>
> Key: SOLR-13717
> URL: https://issues.apache.org/jira/browse/SOLR-13717
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Major
> Attachments: SOLR-13717.patch, SOLR-13717.patch
>
>
> Co-worker discovered a bug with (distributed) grouping when multiple {{fl}} 
> params are specified.
> {{StoredFieldsShardRequestFactory}} has very (old and) brittle code that 
> assumes there will be 0 or 1 {{fl}} params in the original request that it 
> should inspect to see if it needs to append (via string concat) the uniqueKey 
> field onto in order to collate the returned stored fields into their 
> respective (grouped) documents -- and then ignores any additional {{fl}} 
> params that may exist in the original request when it does so.
> The net result is that only the uniqueKey field and whatever fields _are_ 
> specified in the first {{fl}} param specified are fetched from each shard and 
> ultimately returned.
> The only workaround is to replace multiple {{fl}} params with a single {{fl}} 
> param containing a comma separated list of the requested fields.
> 
> Bug is trivial to reproduce with {{bin/solr -e cloud -noprompt}} by comparing 
> these requests which should all be equivalent...
> {noformat}
> $ bin/post -c gettingstarted -out yes example/exampledocs/books.csv
> ...
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=author,name,id&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0812521390",
> "name":["The Black Company"],
> "author":["Glen Cook"]}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354",
> "name":["Foundation"],
> "author":["Isaac Asimov"]}]
>   }}]}}}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=author&fl=name,id&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0812521390",
> "author":["Glen Cook"]}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354",
> "author":["Isaac Asimov"]}]
>   }}]}}}
> $ curl 
> 'http://localhost:8983/solr/gettingstarted/query?omitHeader=true&indent=true&fl=id&fl=author&fl=name&q=*:*&group=true&group.field=genre_s'
> {
>   "grouped":{
> "genre_s":{
>   "matches":10,
>   "groups":[{
>   "groupValue":"fantasy",
>   "doclist":{"numFound":8,"start":0,"maxScore":1.0,"docs":[
>   {
> "id":"0553573403"}]
>   }},
> {
>   "groupValue":"scifi",
>   "doclist":{"numFound":2,"start":0,"docs":[
>   {
> "id":"0553293354"}]
>   }}]}}}
> {noformat}






[jira] [Commented] (LUCENE-8956) QueryRescorer sort optimization

2019-09-02 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920853#comment-16920853
 ] 

Adrien Grand commented on LUCENE-8956:
--

I was thinking that we could verify that we have the right hits by rescoring 
twice, once with topN=random().nextInt(numDocs) like in your patch, and another 
time with topN=numDocs, then make sure that the first topN hits are the same in 
both cases (CheckHits#checkEquals might help).
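The invariant being suggested, that sorting only topN must yield the same leading hits as sorting everything, can be sketched in plain Java (this is an illustration of the property, not the actual QueryRescorer code):

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class TopNEquivalence {

    // Partial selection: keep only the n largest scores in a min-heap,
    // mirroring "sort only topN" instead of sorting the whole array.
    static double[] topN(double[] scores, int n) {
        PriorityQueue<Double> heap = new PriorityQueue<>(); // min-heap
        for (double s : scores) {
            heap.offer(s);
            if (heap.size() > n) heap.poll(); // evict the current smallest
        }
        double[] out = new double[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) out[i] = heap.poll();
        return out; // descending order
    }

    // Full-sort baseline: sort everything, then truncate to the first n.
    static double[] fullSortTopN(double[] scores, int n) {
        double[] copy = scores.clone();
        Arrays.sort(copy); // ascending
        double[] out = new double[Math.min(n, copy.length)];
        for (int i = 0; i < out.length; i++) out[i] = copy[copy.length - 1 - i];
        return out;
    }

    public static void main(String[] args) {
        double[] scores = {0.3, 2.5, 1.1, 0.9, 4.2, 3.3};
        // Both strategies must agree on the leading hits.
        System.out.println(Arrays.equals(topN(scores, 3), fullSortTopN(scores, 3)));
    }
}
```

A randomized test would do the same comparison with topN=random().nextInt(numDocs) against topN=numDocs, as described above.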


> QueryRescorer sort optimization
> ---
>
> Key: LUCENE-8956
> URL: https://issues.apache.org/jira/browse/LUCENE-8956
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/query/scoring
>Reporter: Paul Sanwald
>Priority: Minor
> Attachments: LUCENE-8956.patch
>
>
> This patch addresses a TODO in QueryRescorer: We should not sort the full 
> array of the results returned from rescoring, but rather only topN, when topN 
> is less than total hits.
>  
> Made this optimization with some suggestions from [~jpountz] and [~jimczi], 
> this is my first lucene patch submission.






[GitHub] [lucene-solr] atris commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods

2019-09-02 Thread GitBox
atris commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods
URL: https://github.com/apache/lucene-solr/pull/816#issuecomment-527143844
 
 
   > Sorry for the lag, I was off for the last 5 weeks.
   
   No sweat, welcome back!
   
   > I see the CHANGES entry is in 9.0 for now, maybe we should get it in 8.x 
too to keep backports clean?
   
   Sure, I will add a CHANGES entry for 8.x (whichever x is the latest)
   





[GitHub] [lucene-solr] jpountz commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods

2019-09-02 Thread GitBox
jpountz commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods
URL: https://github.com/apache/lucene-solr/pull/816#issuecomment-527139265
 
 
   Sorry for the lag, I was off for the last 5 weeks. I'll merge the change 
soon. I see the CHANGES entry is in 9.0 for now, maybe we should get it in 8.x 
too to keep backports clean?





[jira] [Updated] (SOLR-6930) Provide "Circuit Breakers" For Expensive Solr Queries

2019-09-02 Thread Dr Oleg Savrasov (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dr Oleg Savrasov updated SOLR-6930:
---
Attachment: SOLR-6930.patch

> Provide "Circuit Breakers" For Expensive Solr Queries
> -
>
> Key: SOLR-6930
> URL: https://issues.apache.org/jira/browse/SOLR-6930
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Mike Drob
>Priority: Major
> Attachments: SOLR-6930.patch, SOLR-6930.patch, SOLR-6930.patch, 
> SOLR-6930.patch, SOLR-6930.patch
>
>
> Ref: 
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html
> ES currently allows operators to configure "circuit breakers" to preemptively 
> fail queries that are estimated too large rather than allowing an OOM 
> Exception to happen. We might be able to do the same thing.
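The circuit-breaker idea quoted above boils down to a cheap admission check before query execution. A minimal sketch in plain Java (all names and the cost model are invented for illustration; this is not Solr's actual API or the attached patch):

```java
public class QueryCircuitBreaker {

    private final long limitBytes;

    QueryCircuitBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Toy cost model: rows requested times an estimated per-doc footprint.
    // A real estimator would account for field data, facets, sorting, etc.
    static long estimateCost(int rows, long bytesPerDoc) {
        return (long) rows * bytesPerDoc;
    }

    // Trip the breaker (reject the query) when the estimate exceeds the
    // configured limit, instead of executing it and risking an OOM.
    boolean allow(long estimatedBytes) {
        return estimatedBytes <= limitBytes;
    }

    public static void main(String[] args) {
        QueryCircuitBreaker breaker = new QueryCircuitBreaker(1_000_000L); // 1 MB limit
        System.out.println(breaker.allow(estimateCost(100, 2_000L)));      // small query: allowed
        System.out.println(breaker.allow(estimateCost(10_000, 2_000L)));   // 20 MB estimate: rejected
    }
}
```

The key design point, as in the Elasticsearch docs linked above, is that the estimate is made before any allocation happens, so an over-limit query fails fast with a clear error rather than destabilizing the node.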






[JENKINS] Lucene-Solr-repro - Build # 3585 - Unstable

2019-09-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-repro/3585/

[...truncated 28 lines...]
[repro] Jenkins log URL: 
https://builds.apache.org/job/Lucene-Solr-NightlyTests-8.x/199/consoleText

[repro] Revision: dd27d003a4e8cf58e53df1e8359b1c63f6c9278a

[repro] Ant options: -Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt
[repro] Repro line:  ant test  -Dtestcase=RollingRestartTest 
-Dtests.method=test -Dtests.seed=AAB71FED9054CD33 -Dtests.multiplier=2 
-Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=fi -Dtests.timezone=America/Guyana -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII

[repro] Repro line:  ant test  -Dtestcase=SimpleMLTQParserTest 
-Dtests.method=doTest -Dtests.seed=AAB71FED9054CD33 -Dtests.multiplier=2 
-Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=pl-PL -Dtests.timezone=America/Barbados -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII

[repro] git rev-parse --abbrev-ref HEAD
[repro] git rev-parse HEAD
[repro] Initial local git branch/revision: 
1862ffd6a4a651db0ef27e1ef4f82bc0702be59b
[repro] git fetch

[...truncated 7 lines...]
[repro] git checkout dd27d003a4e8cf58e53df1e8359b1c63f6c9278a

[...truncated 2 lines...]
[repro] git merge --ff-only

[...truncated 1 lines...]
[repro] ant clean

[...truncated 6 lines...]
[repro] Test suites by module:
[repro]solr/core
[repro]   SimpleMLTQParserTest
[repro]   RollingRestartTest
[repro] ant compile-test

[...truncated 3579 lines...]
[repro] ant test-nocompile -Dtests.dups=5 -Dtests.maxfailures=10 
-Dtests.class="*.SimpleMLTQParserTest|*.RollingRestartTest" 
-Dtests.showOutput=onerror -Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt
 -Dtests.seed=AAB71FED9054CD33 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=pl-PL -Dtests.timezone=America/Barbados -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII

[...truncated 1437704 lines...]
[repro] Setting last failure code to 256

[repro] Failures:
[repro]   1/5 failed: org.apache.solr.cloud.RollingRestartTest
[repro]   2/5 failed: org.apache.solr.search.mlt.SimpleMLTQParserTest
[repro] git checkout 1862ffd6a4a651db0ef27e1ef4f82bc0702be59b

[...truncated 2 lines...]
[repro] Exiting with code 256

[...truncated 6 lines...]


[GitHub] [lucene-solr] atris commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods

2019-09-02 Thread GitBox
atris commented on issue #816: LUCENE-8942: Tighten Up LRUQueryCache's Methods
URL: https://github.com/apache/lucene-solr/pull/816#issuecomment-527110529
 
 
   Could we merge this? Seems safe enough to merge?





[GitHub] [lucene-solr] atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
atris commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527109501
 
 
   > > I think the output I pasted above does mention the tasks run?
   > 
   > Hmm the first few `luceneutil` results you posted didn't seem to state 
which task (`wikimedium10k`)? Or maybe I missed it, sorry ... in general when 
posting benchmarks it's vital to give enough details and use/share public 
benchmark tools/data (like `luceneutil` and `wikimedia`'s corpus) so that 
someone else could go and replicate your results.
   
   +1, thanks for the input -- will ensure that henceforth.





[GitHub] [lucene-solr] mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early Termination In Parallel Search

2019-09-02 Thread GitBox
mikemccand commented on issue #823: LUCENE-8939: Introduce Shared Count Early 
Termination In Parallel Search
URL: https://github.com/apache/lucene-solr/pull/823#issuecomment-527103813
 
 
   > I think the output I pasted above does mention the tasks run?
   
   Hmm the first few `luceneutil` results you posted didn't seem to state which 
task (`wikimedium10k`)?  Or maybe I missed it, sorry ... in general when 
posting benchmarks it's vital to give enough details and use/share public 
benchmark tools/data (like `luceneutil` and `wikimedia`'s corpus) so that 
someone else could go and replicate your results.





[JENKINS] Lucene-Solr-Tests-master - Build # 3667 - Failure

2019-09-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/3667/

All tests passed

Build Log:
[...truncated 5126 lines...]
   [junit4] JVM J1: stdout was not empty, see: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/backward-codecs/test/temp/junit4-J1-20190902_093244_0126480541521229511094.sysout
   [junit4] >>> JVM J1 emitted unexpected output (verbatim) 
   [junit4] #
   [junit4] # A fatal error has been detected by the Java Runtime Environment:
   [junit4] #
   [junit4] #  SIGSEGV (0xb) at pc=0x7fa2feb2926c, pid=20250, tid=20285
   [junit4] #
   [junit4] # JRE version: Java(TM) SE Runtime Environment (11.0.1+13) (build 
11.0.1+13-LTS)
   [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (11.0.1+13-LTS, mixed 
mode, tiered, compressed oops, g1 gc, linux-amd64)
   [junit4] # Problematic frame:
   [junit4] # V  [libjvm.so+0xd4026c][thread 22682 also had an error]
   [junit4]   PhaseIdealLoop::split_up(Node*, Node*, Node*) [clone 
.part.39]+0x47c
   [junit4] #
   [junit4] # Core dump will be written. Default location: Core dumps may be 
processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/backward-codecs/test/J1/core.20250)
   [junit4] #
   [junit4] # An error report file with more information is saved as:
   [junit4] # 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/backward-codecs/test/J1/hs_err_pid20250.log
   [junit4] 
   [junit4] [timeout occurred during error reporting in step ""] after 30 s.
   [junit4] #
   [junit4] # If you would like to submit a bug report, please visit:
   [junit4] #   http://bugreport.java.com/bugreport/crash.jsp
   [junit4] #
   [junit4] <<< JVM J1: EOF 

[...truncated 3 lines...]
   [junit4] ERROR: JVM J1 ended with an exception, command line: 
/usr/local/asfpackages/java/jdk-11.0.1/bin/java -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/heapdumps
 -ea -esa --illegal-access=deny -Dtests.prefix=tests 
-Dtests.seed=FE673A0F7506B4 -Xmx512M -Dtests.iters= -Dtests.verbose=false 
-Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random 
-Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random 
-Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz 
-Dtests.luceneMatchVersion=9.0.0 -Dtests.cleanthreads=perMethod 
-Djava.util.logging.config.file=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false 
-Dtests.slow=true -Dtests.asserts=true -Dtests.multiplier=2 -DtempDir=./temp 
-Djava.io.tmpdir=./temp 
-Dcommon.dir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene
 
-Dclover.db.dir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/clover/db
 
-Djava.security.policy=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/tools/junit4/tests.policy
 -Dtests.LUCENE_VERSION=9.0.0 -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.src.home=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master 
-Djava.security.egd=file:/dev/./urandom 
-Djunit4.childvm.cwd=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/backward-codecs/test/J1
 
-Djunit4.tempDir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/lucene/build/backward-codecs/test/temp
 -Djunit4.childvm.id=1 -Djunit4.childvm.count=3 
-Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Dtests.filterstacks=true -Dtests.leaveTemporary=false -Dtests.badapples=false 
-classpath 

[jira] [Created] (SOLR-13732) Always traces requests contain configured header key

2019-09-02 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-13732:
---

 Summary: Always traces requests contain configured header key
 Key: SOLR-13732
 URL: https://issues.apache.org/jira/browse/SOLR-13732
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Sometimes users want to always trace requests that contain a {{special}} header 
(ignoring the sampling rate), for example traced requests that come from other 
backend services.

Users should be able to do that by setting a {{ClusterProperty}}
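The proposed rule is a simple predicate. A hypothetical sketch in plain Java (the method and header names are invented here and are not Solr's tracing code): trace unconditionally when the configured header is present, otherwise defer to the sampler.

```java
import java.util.Map;

public class TraceDecision {

    // headerKey would come from the cluster property in the real feature.
    static boolean shouldTrace(Map<String, String> headers, String headerKey, boolean sampled) {
        if (headerKey != null && headers.containsKey(headerKey)) {
            return true; // always trace, ignoring the sampling decision
        }
        return sampled;  // fall back to the normal sampling rate
    }

    public static void main(String[] args) {
        System.out.println(shouldTrace(Map.of("X-Trace", "1"), "X-Trace", false)); // forced on
        System.out.println(shouldTrace(Map.of(), "X-Trace", false));               // sampler says no
    }
}
```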






[JENKINS] Lucene-Solr-Tests-8.x - Build # 510 - Still Unstable

2019-09-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-8.x/510/

2 tests failed.
FAILED:  org.apache.solr.handler.TestContainerReqHandler.testPackageAPI

Error Message:
attempt: 9 Mismatch for value : '[requestHandler]' in response {   
"responseHeader":{ "status":0, "QTime":0},   "metadata":{"version":3},  
 "runtimeLib":[{   "name":"global",   
"url":"http://localhost:34099/jar3.jar",   
"sha256":"20e0bfaec71b2e93c4da9f2ed3745dda04dc3fc915b66cc0275863982e73b2a3",
   "znodeVersion":3}],   "requestHandler":[{   "name":"bar",   
"znodeVersion":3,   "class":"org.apache.solr.core.RuntimeLibReqHandler",
   "package":"global"}]}

Stack Trace:
java.lang.AssertionError: attempt: 9 Mismatch for value : '[requestHandler]' in 
response {
  "responseHeader":{
"status":0,
"QTime":0},
  "metadata":{"version":3},
  "runtimeLib":[{
  "name":"global",
"url":"http://localhost:34099/jar3.jar",
  
"sha256":"20e0bfaec71b2e93c4da9f2ed3745dda04dc3fc915b66cc0275863982e73b2a3",
  "znodeVersion":3}],
  "requestHandler":[{
  "name":"bar",
  "znodeVersion":3,
  "class":"org.apache.solr.core.RuntimeLibReqHandler",
  "package":"global"}]}
at __randomizedtesting.SeedInfo.seed([9C3BBC3A2BE2DEBD:EE34A97487B984AD]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.apache.solr.handler.TestContainerReqHandler.assertResponseValues(TestContainerReqHandler.java:121)
at org.apache.solr.handler.TestContainerReqHandler.testPackageAPI(TestContainerReqHandler.java:236)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[GitHub] [lucene-solr] atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache

2019-09-02 Thread GitBox
atris commented on issue #815: LUCENE-8213: Introduce Asynchronous Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/815#issuecomment-527039867
 
 
   Another set of runs on wikimedium all with concurrent searching enabled:
   
                     Task    QPS base (StdDev)    QPS cand (StdDev)    Pct diff (range)
                   Fuzzy1      47.29  (7.1%)       45.06 (11.1%)     -4.7% ( -21% -  14%)
             OrHighNotMed     405.86  (3.4%)      392.55  (2.2%)     -3.3% (  -8% -   2%)
            OrNotHighHigh     386.16  (4.7%)      373.54  (4.1%)     -3.3% ( -11% -   5%)
BrowseDayOfYearTaxoFacets    6003.62  (2.6%)     5808.73  (2.2%)     -3.2% (  -7% -   1%)
                  Prefix3     176.87 (10.1%)      172.28  (8.9%)     -2.6% ( -19% -  18%)
    BrowseMonthTaxoFacets    6190.97  (3.8%)     6044.46  (4.9%)     -2.4% ( -10% -   6%)
                MedPhrase      40.97  (5.1%)       40.06  (5.5%)     -2.2% ( -12% -   8%)
             OrNotHighMed     383.00  (3.3%)      374.82  (4.7%)     -2.1% (  -9% -   6%)
               AndHighLow     191.05  (3.4%)      187.88  (3.2%)     -1.7% (  -7% -   5%)
             OrHighNotLow     416.92  (4.4%)      411.50  (4.3%)     -1.3% (  -9% -   7%)
               AndHighMed      39.58  (2.2%)       39.17  (1.9%)     -1.0% (  -5% -   3%)
                 Wildcard      24.72  (7.9%)       24.49  (6.6%)     -0.9% ( -14% -  14%)
               HighPhrase      52.11  (5.6%)       51.63  (4.8%)     -0.9% ( -10% -  10%)
                LowPhrase      13.43  (2.7%)       13.33  (2.5%)     -0.8% (  -5% -   4%)
              AndHighHigh      12.68  (3.9%)       12.58  (3.2%)     -0.8% (  -7% -   6%)
                 HighTerm     717.09  (4.8%)      712.98  (5.2%)     -0.6% ( -10% -   9%)
                 PKLookup      91.70  (2.8%)       91.27  (3.8%)     -0.5% (  -6% -   6%)
                   IntNRQ      21.92 (17.9%)       21.83 (18.0%)     -0.4% ( -30% -  43%)
                  Respell      34.38  (3.3%)       34.24  (2.3%)     -0.4% (  -5% -   5%)
    HighTermDayOfYearSort      27.44  (3.2%)       27.33  (1.6%)     -0.4% (  -5% -   4%)
            OrHighNotHigh     463.40  (5.4%)      461.74  (4.3%)     -0.4% (  -9% -   9%)
     BrowseDateTaxoFacets       0.69  (0.4%)        0.69  (0.5%)     -0.2% (  -1% -   0%)
    BrowseMonthSSDVFacets       2.63  (1.6%)        2.63  (1.1%)     -0.1% (  -2% -   2%)
                  MedTerm     885.64  (5.0%)      885.56  (3.6%)     -0.0% (  -8% -   8%)
                   Fuzzy2      39.47  (7.1%)       39.47  (9.4%)     -0.0% ( -15% -  17%)
BrowseDayOfYearSSDVFacets       2.41  (0.4%)        2.41  (0.3%)     -0.0% (   0% -   0%)
             HighSpanNear       6.53  (1.0%)        6.53  (1.3%)      0.0% (  -2% -   2%)
          MedSloppyPhrase      31.76  (2.0%)       31.79  (1.6%)      0.1% (  -3% -   3%)
     HighIntervalsOrdered       6.10  (1.6%)        6.11  (2.1%)      0.1% (  -3% -   3%)
                OrHighLow     177.27  (2.2%)      177.50  (2.5%)      0.1% (  -4% -   5%)
          LowSloppyPhrase      30.62  (1.9%)       30.67  (1.7%)      0.2% (  -3% -   3%)
              MedSpanNear       7.63  (1.7%)        7.64  (2.0%)      0.2% (  -3% -   3%)
              LowSpanNear       8.34  (1.3%)        8.37  (1.8%)      0.3% (  -2% -   3%)
             OrNotHighLow     308.14  (2.0%)      309.75  (4.9%)      0.5% (  -6% -   7%)
                  LowTerm     861.94  (4.6%)      870.93  (2.8%)      1.0% (  -6% -   8%)
         HighSloppyPhrase       5.58  (3.2%)        5.64  (2.8%)      1.1% (  -4% -   7%)
                OrHighMed      10.81  (2.4%)       10.95  (2.6%)      1.4% (  -3% -   6%)
               OrHighHigh      10.28  (2.5%)       10.48  (3.2%)      1.9% (  -3% -   7%)
   
   
   Seems there is no degradation?
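
For readers not familiar with luceneutil output, the "Pct diff" column is simply the relative change of the candidate QPS versus the baseline QPS. A minimal sketch (class and method names are illustrative, not part of luceneutil; the sample values come from the Fuzzy1 row above):

```java
// Sketch: how the Pct diff column relates to the two QPS columns.
public class PctDiff {
    /** Relative change of the candidate run vs. the baseline run, in percent. */
    static double pctDiff(double baseQps, double candQps) {
        return (candQps - baseQps) / baseQps * 100.0;
    }

    public static void main(String[] args) {
        // Fuzzy1 row: base 47.29 QPS, candidate 45.06 QPS -> about -4.7%
        System.out.printf("%.1f%%%n", pctDiff(47.29, 45.06));
    }
}
```

Because the per-task standard deviations (the parenthesized percentages) are mostly larger than the diffs, the differences here are within noise, which supports the "no degradation" reading.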


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8403) Support 'filtered' term vectors - don't require all terms to be present

2019-09-02 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920659#comment-16920659
 ] 

Adrien Grand commented on LUCENE-8403:
--

bq. mucking with the higher level features to use a separate field for the term 
vector (e.g. in a highlighter)

We could do this quite transparently for highlighters by using a FilterReader 
that redirects term vector calls to the filtered field. This would still be a 
hack as the reader would not pass CheckIndex either, but a much more contained 
one that might be ok, and would avoid making highlighters too complex?
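
A minimal sketch of what such a redirecting reader could look like, assuming Lucene 8.x APIs (FilterLeafReader and its FilterFields helper); the redirects map and the filtered-field naming convention are hypothetical, not part of any existing API:

```java
import java.io.IOException;
import java.util.Map;
import org.apache.lucene.index.Fields;
import org.apache.lucene.index.FilterLeafReader;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.Terms;

/**
 * Hypothetical sketch: serve term vectors stored under a "filtered" field
 * (e.g. "body_filtered") when a consumer such as a highlighter asks for
 * vectors of the original field (e.g. "body").
 */
class TermVectorRedirectingReader extends FilterLeafReader {
  private final Map<String, String> redirects; // logical field -> field holding the vectors

  TermVectorRedirectingReader(LeafReader in, Map<String, String> redirects) {
    super(in);
    this.redirects = redirects;
  }

  @Override
  public Fields getTermVectors(int docID) throws IOException {
    Fields tvs = in.getTermVectors(docID);
    if (tvs == null) {
      return null;
    }
    return new FilterFields(tvs) {
      @Override
      public Terms terms(String field) throws IOException {
        // Redirect the lookup to the filtered field, if one is configured.
        return super.terms(redirects.getOrDefault(field, field));
      }
    };
  }

  @Override
  public CacheHelper getCoreCacheHelper() {
    return null; // content differs from the wrapped reader; don't share caches
  }

  @Override
  public CacheHelper getReaderCacheHelper() {
    return null;
  }
}
```

As noted, this would still not pass CheckIndex for the filtered field itself, but it keeps the hack contained in one wrapper instead of spreading field-mapping logic through every highlighter.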

> Support 'filtered' term vectors - don't require all terms to be present
> ---
>
> Key: LUCENE-8403
> URL: https://issues.apache.org/jira/browse/LUCENE-8403
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Braun
>Priority: Minor
> Attachments: LUCENE-8403.patch
>
>
> The genesis of this was a conversation and idea from [~dsmiley] several years 
> ago.
> In order to optimize term vector storage, we may not actually need all tokens 
> to be present in the term vectors - and if so, ideally our codec could just 
> opt not to store them.
> I attempted to fork the standard codec and override the TermVectorsFormat and 
> TermVectorsWriter to ignore storing certain Terms within a field. This 
> worked, however, CheckIndex checks that the terms present in the standard 
> postings are also present in the TVs, if TVs enabled. So this then doesn't 
> work as 'valid' according to CheckIndex.
> Can the TermVectorsFormat be made in such a way to support configuration of 
> tokens that should not be stored (benefits: less storage, more optimal 
> retrieval per doc)? Is this valuable to the wider community? Is there a way 
> we can design this to not break CheckIndex's contract while at the same time 
> lessening storage for unneeded tokens?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org